Writing better queries in Splunk Search Processing Language
Poorly written queries can lead to slow, inefficient performance. Here are some best practices to improve them.
- Minimize the number of trips to the indexers.
One of the best ways to minimize the number of trips to the indexers is to avoid using the join and append commands. Although these commands are widely used, they’re not the most efficient.
This is because both commands make use of a subsearch (the content between the square brackets). Each subsearch adds trips to the indexers, which increases the communication and overhead involved.
Subsearches have additional limitations. By default, they time out after 60 seconds and return at most 50,000 events (see subsearch_maxtime and subsearch_maxout in limits.conf for Splunk Enterprise or Splunk Cloud Platform). When either limit is hit, the results are truncated, which often goes unnoticed and leads to incorrect answers.
So, what’s the solution?
Combine your subsearch with your primary search and accomplish the same result with the stats command instead. Here is an example. The first query uses join and a subsearch:
index=_internal sourcetype=splunkd component=Metrics | stats count AS metric_count BY host | join type=left host [search index=_audit sourcetype=audittrail | stats count AS audit_count BY host] | table host metric_count audit_count
The second query computes both counts in a single search with stats:
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail) | stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host
This technique can also be used in place of the append command.
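As a rough sketch of the append case, here is the same rewrite using the indexes and sourcetypes from the join example above. The output shape (one count per sourcetype and host) is an assumption chosen for illustration. An append-based search such as this one:

```spl
index=_internal sourcetype=splunkd component=Metrics
| stats count BY sourcetype, host
| append
    [search index=_audit sourcetype=audittrail
    | stats count BY sourcetype, host]
```

could be collapsed into a single search with no subsearch:

```spl
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| stats count BY sourcetype, host
```

Because the second form has no subsearch, the subsearch time and event limits described above do not apply to it.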
- Minimize the amount of data coming back from the indexers.
To lower the amount of data coming back from the indexers, many articles recommend filtering your data early on.
While this does cut down on the number of events (vertical) that are retrieved, you should also focus on cutting down the number of fields (horizontal) that are retrieved.
By using the fields streaming command early on within your SPL, you not only lower the amount of data being pulled from the indexers, but also the amount that has to be transferred to and processed by the search head.
Whenever possible, try using the fields command right after the first pipe of your SPL as shown below.
<base query> | fields <field list> | fields - _raw
Here’s a real-life example of how impactful using the fields command can be.
| | # of Fields | Disk Usage | Events | Time Spent |
| Query without use of fields | 155 | 18458240 | 498478 | 166s |
| Query with use of fields | 18 | 5681152 | 498478 | 103s |
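To make the pattern concrete, here is a minimal sketch against the splunkd metrics data used earlier in this article. It assumes the group, series, and kb fields are extracted from those events in your environment; substitute whichever fields your own search needs.

```spl
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields series, kb
| fields - _raw
| stats sum(kb) AS total_kb BY series
```

The first fields command keeps only the two fields that the stats command uses, and the second drops _raw, which is otherwise retained along with the other internal fields.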
- Perform calculations on the smallest amount of data.
It’s most efficient to save calculations that use commands like eval, lookup, and foreach until after your data set has been made as succinct as possible through the previous steps. It’s also most efficient to combine commands whenever possible. For example, observe how you could combine the following eval statements into one comma-delimited statement.
… | eval var1="value1" | eval var2="value2" | eval var3="value3" …
… | eval var1="value1", var2="value2", var3="value3" …
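Both ideas can be sketched in one search: summarize first, then run a single combined eval on the much smaller result. The field names group, series, and kb are assumptions about the splunkd metrics events, and total_mb and total_gb are illustrative names.

```spl
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields series, kb
| fields - _raw
| stats sum(kb) AS total_kb BY series
| eval total_mb=total_kb/1024, total_gb=round(total_mb/1024, 2), total_mb=round(total_mb, 2)
```

Here the eval runs once per series rather than once per event. The comma-separated expressions are processed left to right, so total_gb can use the total_mb value set just before it.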
- Use non-streaming commands as late in the query as possible.
An additional query best practice is to save non-streaming, transforming commands for last. These are the commands that really give you the answers you’re looking for, such as stats, chart, and timechart.
With the above tips in mind, here’s a sample query template to follow.
| Base query | <base query> |
| Minimize data | fields <list of fields> |
| Combine/Summarize data | use of stats for join/append/summarizations |
| Execute calculations | eval, lookup, etc. |
| Format the data | stats, chart, timechart, etc. |
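As one possible instance of the template, the sketch below reuses the indexes and sourcetypes from the earlier example. The calculated fields total_count and audit_pct are hypothetical, and sort and table stand in for the final formatting step.

```spl
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| fields host, sourcetype
| fields - _raw
| stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host
| eval total_count=metric_count + audit_count, audit_pct=round(100 * audit_count / total_count, 1)
| sort - total_count
| table host metric_count audit_count total_count audit_pct
```

Read top to bottom, the lines follow the template: base query, minimize fields, combine with stats instead of join, calculate on the summarized rows, then format.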
But remember — every query is different, so think of these tips as guidelines rather than rules.
If you've implemented the query writing tips in this article, but are still experiencing problems, try troubleshooting your queries using the Job Inspector. You can also read Optimizing search for advanced recommendations that go beyond inefficient search practices.
Need more help? Contact our Splunk Elite Partner, SP6. SP6 is a technology firm specializing in cybersecurity, CMMC compliance, and systems observability. SP6 has built North America’s largest Splunk Services team. Their team of cybersecurity and technology observability specialists ensures that the digital assets of customers are both protected and highly performant. SP6 delivers this expertise through both project-based Professional Services and Managed Services for organizations that can benefit from additional guidance.