Splunk Lantern

Command line string length

Extremely long command lines can indicate malicious activity on your hosts and might warrant investigation.

You can run a simple or a more complex search to investigate these strings. The first search listed here returns a table of all command line logs in a given time period, along with the length of each command line string, as a starting point for investigation. If that search returns too many results, you can run the second search, which calculates the average and standard deviation of string length for each host and shows only the unusually long strings.

Procedures

Option 1 - Find the length of all command line strings

Run the following search. You can optimize it by specifying an index and adjusting the time range.

sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" CommandLine=* 
| table _time host CommandLine
| eval cl_length=len(CommandLine)

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" 

Search Sysmon operational data.

CommandLine=* 

Filter for logs with a value in the command line field.

| table _time host CommandLine 

Display the results in a table with columns in the order shown.

| eval cl_length=len(CommandLine)

Create a new field called cl_length that shows the length of each command line string the search returns.
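
As a rough sketch of the optimization mentioned above, the following variant adds an index and a time range to the same search. The index name your_sysmon_index is a placeholder and the 24-hour window is an assumed starting point, so substitute values that match your environment.

```spl
index=your_sysmon_index sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" CommandLine=* earliest=-24h
| table _time host CommandLine
| eval cl_length=len(CommandLine)
```

Restricting the index and time range up front reduces the number of events the search has to scan before any lengths are calculated.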

Next steps

False positives from this search may occur since some legitimate applications start with long command lines.

If your result set is not large, you might decide to read through the contents of the strings to see if anything looks suspicious. However, if the search returns a large number of events, you might decide to apply statistical methods to the data. The search in Option 2 provides an example of this so that you can better determine which strings are outliers that you might want to investigate. You could also use the sort and where commands to filter out data below your defined threshold and bring the longest (or shortest) strings to the top.
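
One way the sort and where approach might look is shown here. The 500-character cutoff is an arbitrary assumption for illustration, not a recommended value, so tune it to what is normal in your environment.

```spl
sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" CommandLine=*
| eval cl_length=len(CommandLine)
| where cl_length > 500
| sort - cl_length
| table _time host cl_length CommandLine
```

The where command drops command lines at or below the cutoff, and sort - cl_length lists the longest remaining strings first. To surface the shortest strings instead, use sort cl_length and reverse the comparison in the where command.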

Finally, you might be interested in other processes associated with the Detecting a ransomware attack or Monitoring command line interface actions use cases.

Option 2 - Find unusually long command line strings

  1. Content developed by the Splunk Security Research team requires the use of consistent, normalized data provided by the Common Information Model (CIM). For information on installing and using the CIM, see the Common Information Model documentation. To run this search, your deployment needs to be ingesting endpoint data that tracks process activity, including parent-child relationships, from your endpoints to populate the Endpoint data model in the Processes node. The command line arguments are mapped to the process field in the Endpoint data model.
  2. Run the following search. You can optimize it by specifying an index and adjusting the time range.
| tstats allow_old_summaries=true count, min(_time) AS firstTime, max(_time) AS lastTime FROM datamodel=Endpoint.Processes BY "Processes.user", "Processes.dest", "Processes.process_name", "Processes.process" 
| rename "Processes.*" AS "*" 
| convert timeformat="%Y-%m-%dT%H:%M:%S" ctime(firstTime) 
| convert timeformat="%Y-%m-%dT%H:%M:%S" ctime(lastTime) 
| eval processlen=len(process) 
| eventstats stdev(processlen) AS stdev, avg(processlen) AS avg BY dest 
| stats max(processlen) AS maxlen, values(stdev) AS stdevperhost, values(avg) AS avgperhost BY dest, user, process_name, process 
| eval threshold=3 
| where (maxlen > ((stdevperhost * threshold) + avgperhost))

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

| tstats allow_old_summaries=true count, min(_time) AS firstTime, max(_time) AS lastTime FROM datamodel=Endpoint.Processes BY "Processes.user", "Processes.dest", "Processes.process_name", "Processes.process"

Query the Endpoint data model at the Processes node to return user, dest, process name, and process information, along with the event count and the first and last time each combination was seen.

| rename "Processes.*" AS "*"

Rename data model fields for better readability.

| convert timeformat="%Y-%m-%dT%H:%M:%S" ctime(firstTime)
| convert timeformat="%Y-%m-%dT%H:%M:%S" ctime(lastTime)

Convert these times into readable strings.

| eval processlen=len(process)

Create a new field called processlen that shows the length of each command line string the search returns.

| eventstats stdev(processlen) AS stdev, avg(processlen) AS avg BY dest

Calculate the average and the standard deviation of string length for each destination host, and add those results to every row as avg and stdev.

| stats max(processlen) AS maxlen, values(stdev) AS stdevperhost, values(avg) AS avgperhost BY dest, user, process_name, process

For each combination of dest, user, process name, and process, find the maximum string length and carry over the per-host standard deviation and average as stdevperhost and avgperhost.

| eval threshold=3

Set the threshold multiplier, which is the number of standard deviations above the host average that a string length must exceed to count as unusual.

| where (maxlen > ((stdevperhost * threshold) + avgperhost))

Keep only results whose string length exceeds the host average plus three standard deviations.
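
To make the threshold arithmetic concrete, the following standalone sketch applies the same formula to made-up numbers. The values 80, 40, and 250 are invented for illustration, and the cutoff and is_outlier fields are hypothetical names that do not appear in the search above.

```spl
| makeresults count=1
| eval avgperhost=80, stdevperhost=40, threshold=3, maxlen=250
| eval cutoff=(stdevperhost * threshold) + avgperhost
| eval is_outlier=if(maxlen > cutoff, "yes", "no")
```

With these numbers the cutoff is (40 * 3) + 80 = 200, so a 250-character command line is flagged, while a 150-character command line on the same host would not be.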

Next steps

False positives from this search may occur because some legitimate applications start with long command lines.

If the search returns a large number of events, you might decide to apply further statistical functions, such as max or min, to these numeric values so that you can better determine which results are outliers that you might want to investigate. You could also use the sort and where commands to filter out data below your defined threshold and bring the longest strings to the top.
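
As one possible way to rank what remains, the lines below could be appended to the end of the Option 2 search. The stdevs_above_avg field name and the limit of 20 results are assumptions chosen for illustration.

```spl
| eval stdevs_above_avg=round((maxlen - avgperhost) / stdevperhost, 2)
| sort - stdevs_above_avg
| head 20
```

This expresses each result as a number of standard deviations above its host average, so the most extreme outliers appear first regardless of how long command lines typically are on that host.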

Finally, you might be interested in other processes associated with the Detecting a ransomware attack or Monitoring command line interface actions use cases.