Improving Splunk platform searches with the foreach command
When iterating over multivalue fields (for example, summing), you often have to create an expensive search that takes time to complete. You might also get a warning about search hitting the memory limit and search results getting capped. Some example use cases for this include:
- You have a spreadsheet with hundreds or thousands of rows and columns of your retail product sales, split out by product and date, and you want to sum the numerical values in each row.
- Your company has over 10,000 employees. You want to extract values (such as employee ID) for given key names from a JSON array so you can perform further operations on those values.
- You work at a large university and need to calculate grade averages across all students in a given major.
You can find more examples in the Splunk Documentation for the foreach command, or read on for two detailed examples.
Solution
Foreach
as a command isn’t new, but the mode
argument extends the existing command. The extension enables the command to support iteration over multivalue fields and field representations of JSON arrays. Let’s look at what it does.
Let’s say you have a multivalue field, as shown in the following sample search where 1, 2, and 3 are values of this multivalue field mv
.
| makeresults | eval mv=mvappend("1", "2", "3")
How would you sum these fields? You might try to use some multivalue eval
commands, but there isn't one for summing. So you might try to use mvexpand
and mvcombine
or you might convert the data to a JSON array and use spath.
Neither of these are convenient or efficient. You might also have tried something like the following search, which would have worked, but isn't efficient and clean. In addition, mvexpand
tends to become expensive if the result size is large, and it can hit memory limits.
| makeresults | eval mv=mvappend("1", "2", "3"), total=0 | mvexpand mv | eventstats sum(mv) AS total | mvcombine mv
So let's go back to our original search, but add the optional mode
argument. Mode
defaults to multifield, so in our search we’re going to specify multivalue
. We also specify the field we want to iterate over, which is mv
in this case. Lastly, rather than <<Field>>
, we’re going to use <<ITEM>>
and then run the search.
| makeresults
| eval mv=mvappend("1", "2", "3"), total=0
| foreach mode=multivalue mv [eval total = total + <<ITEM>>]
Another way to accomplish the same task is to replace the multivalue field with a JSON array, and also change the mode
.
| makeresults
| eval arr=json_array(1, 2, 3), total = 0
| foreach mode=json_array arr [eval total = total + <<ITEM>>]
In this case, we get the same result.
The following video shows the search described above. Note that the video has no sound.
Additional example
Let’s say we have a studentToGrade
JSON object and we want to compute the average. There’s a json_keys
command, but there’s no json_values
command to generate an array of the values.
| makeresults | eval studentToGrade=json_object("Adam", 80, "Samantha", 95, "Tina", 76), keys=json_keys(studentToGrade), grades=json_array()
Let's add a foreach
command to the search with the mode
argument. What this does is extract Adam as the key and append the key value to the grades object.
| makeresults
| eval studentToGrade=json_object("Adam", 80, "Samantha", 95, "Tina", 76), keys=json_keys(studentToGrade), grades=json_array()
| foreach mode=json_array keys [eval grades = json_append(grades,"",json_extract(studentToGrade, <<ITEM>>))]
Next, we add the table
command to see what we created.
| makeresults
| eval studentToGrade=json_object("Adam", 80, "Samantha", 95, "Tina", 76), keys=json_keys(studentToGrade), grades=json_array()
| foreach mode=json_array keys [eval grades = json_append(grades,"",json_extract(studentToGrade, <<ITEM>>))]
| table grades
Finally, we’ll add a few lines, with another foreach
to iterate through this new JSON array of grades. The eval
total will give us the total score, and dividing by the count will give us the average.
| makeresults
| eval studentToGrade=json_object("Adam", 80, "Samantha", 95, "Tina", 76), keys=json_keys(studentToGrade), grades=json_array()
| foreach mode=json_array keys [eval grades = json_append(grades,"",json_extract(studentToGrade, <<ITEM>>))]
| table grades
| eval total = 0, count = 0
| foreach mode=json_array grades [eval total = total + <<ITEM>>, count = count + 1]
| eval avgGrade = total / count
The following video shows the search described above. Note that the video has no sound.
Next steps
This new mode
argument - with the introduction of multivalue
and json_array modes
- allows fast and efficient iteration over fields, effectively creating a function on top of an array of elements. You can use it like a mapping function, that maps one collection of elements to a new collection or that reduces the collection to a single value.
Currently, only a single eval
expression is supported with foreach
command. Multiple eval
assignments are still permitted, but they must be under a single eval
pipeline.
These additional resources might help you understand and implement this guidance:
- Splunk Docs: foreach
- Splunk Docs: Evaluate and manipulate fields with multiple values
- Product Tips: Working with multivalue fields