Writing better queries in Splunk Search Processing Language

Poorly written queries can lead to slow, inefficient performance. Here are some best practices to improve them.

Solution

Minimize the number of trips to the indexers.
One of the best ways to minimize the number of trips to the indexers is to avoid using the join and append commands. Although these commands are widely used, they’re not the most efficient.

This is because both commands use a subsearch (the content between the square brackets). Each subsearch adds trips to the indexers, which increases communication and overhead.

Subsearches have additional limitations. With join, a subsearch by default times out after 60 seconds and returns at most 50,000 results (see subsearch_maxtime and subsearch_maxout in limits.conf for Splunk Enterprise or Splunk Cloud Platform). When either limit is hit, the results are truncated, which often goes unnoticed and leads to incorrect answers.

So, what’s the solution?

Combine your subsearch with your primary search and accomplish the join with a stats command instead. Here is an example:

Using join (before)
```
index=_internal sourcetype=splunkd component=Metrics 
| stats count AS metric_count BY host 
| join host type=left 
        [search index=_audit sourcetype=audittrail  
        | stats count AS audit_count BY host] 
| table host metric_count audit_count 
```
Using stats (after)
```
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail) 
| stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host 
```
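Note that the two versions are not strictly identical: the left join returns only hosts found in the primary search, while the stats version also returns hosts that appear only in _audit, with a metric_count of 0.

The same pattern (one base search that covers both data sets, followed by stats) can also be used in place of append, and in many cases in place of dedup and table.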
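
As a rough sketch of the append case, reusing the same internal indexes as the join example, the subsearch can be folded into the base search and a single stats can produce both sets of rows. This is an illustration under those assumptions, not the only way to replace append.

Using append (before)
```
index=_internal sourcetype=splunkd component=Metrics
| stats count BY host sourcetype
| append
        [search index=_audit sourcetype=audittrail
        | stats count BY host sourcetype]
```
Using stats (after)
```
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| stats count BY host sourcetype
```
Both versions return one row per host and sourcetype, although the row order can differ. The second version makes a single pass over the indexers and is not subject to the subsearch limits.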

Minimize the amount of data coming back from the indexers.

To lower the amount of data coming back from the indexers, many articles recommend filtering your data early on.

While this does cut down on the number of events (vertical) that are retrieved, you should also focus on cutting down the number of fields (horizontal) that are retrieved.

Because fields is a distributable streaming command, it runs on the indexers when you use it early in your SPL. This lowers not only the amount of data pulled from the indexers, but also the amount that has to be transferred to and processed by the search head.

Whenever possible, try using the fields command right after the first pipe of your SPL as shown below.

<base query>
| fields <field list>
| fields - _raw
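
For illustration, here is how that template might look against Splunk's own metrics data. The field names series and kb are assumed to be present in your metrics.log events; substitute the fields your search actually needs.

```
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields series kb
| fields - _raw
| stats sum(kb) AS total_kb BY series
```
The explicit fields - _raw is needed because fields keeps internal fields such as _raw and _time by default, even when they are not in the field list.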

Here’s a real-life example of how impactful using the fields command can be.

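| | # of Fields | Disk Usage | Events | Time Spent |
|---|---|---|---|---|
| Query without use of fields | 155 | 18458240 | 498478 | 166s |
| Query with use of fields | 18 | 5681152 | 498478 | 103s |

For the same 498,478 events, limiting the fields cut disk usage by about 69% and run time by about 38%.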

Perform calculations on the smallest amount of data.
It’s most efficient to save calculations that use commands like eval, lookup, and foreach until after the previous steps have made your data set as small as possible. It’s also more efficient to combine commands whenever possible. For example, you could combine the following eval statements into one comma-delimited eval statement.

Before
```
…
| eval var1="value1"
| eval var2="value2"
| eval var3="value3"
…
```
After
```
…
| eval var1="value1", var2="value2", var3="value3"
…
```
Use non-streaming commands as late in the query as possible.
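An additional query best practice is to save non-streaming, transforming commands for last. These are the commands that give you the answers you’re looking for, such as stats, chart, and timechart. Commands that follow the first non-streaming command run on the search head rather than in parallel on the indexers, so the more streaming work you do before that point, the better.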
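
A minimal sketch of that ordering, assuming the same metrics.log fields (series and kb) as an example data set:

```
index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
| fields series kb
| where kb > 0
| stats sum(kb) AS total_kb BY series
| eval total_mb = round(total_kb / 1024, 2)
| sort - total_kb
```
Here fields and where are streaming commands that can run on the indexers. The stats command is the first transforming command. The eval and sort that follow it run on the search head, but only against the already summarized rows.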

With the above tips in mind, here’s a sample query template to follow.

| Step | SPL |
|---|---|
| Base query | base query |
| Minimize data | `fields <list of fields>` |
| Combine/Summarize data | `stats` in place of `join`/`append`, and for summarization |
| Run calculations | `eval`, `lookup`, etc. |
| Format the data | `stats`, `chart`, `timechart`, etc. |
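
As one possible instance of this template, the join example from earlier could be extended as follows. The calculated fields total_count and audit_share are hypothetical names chosen for illustration.

```
(index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
| fields host sourcetype
| fields - _raw
| stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host
| eval total_count = metric_count + audit_count, audit_share = round(audit_count / total_count * 100, 1)
| sort - total_count
```
Each line maps to a row of the template: the base query, field reduction, a stats in place of join, a single comma-delimited eval on the summarized rows, and a final non-streaming command to order the output.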

But remember: every query is different, so treat these tips as guidelines rather than rules.

Next steps

If you've implemented the query writing tips in this article, but are still experiencing problems, try troubleshooting your queries using the Job Inspector. You can also read Optimizing search for advanced recommendations that go beyond inefficient search practices.

Need more help? Contact our Splunk Elite Partner, SP6. SP6 is a technology firm specializing in cybersecurity, CMMC compliance, and systems observability. SP6 has built North America’s largest Splunk Services team. Their team of cybersecurity and technology observability specialists ensures that the digital assets of customers are both protected and highly performant. SP6 delivers this expertise through both project-based Professional Services, as well as Managed Services for those organizations that can benefit from additional guidance.
