Splunk Lantern

Writing better queries in Splunk Processing Language

 

Poorly written queries can lead to slow, inefficient performance. Here are some best practices to improve them.

Solution

  1. Minimize the number of trips to the indexers.

    One of the best ways to minimize the number of trips to the indexers is to avoid using the join and append commands. Although these commands are widely used, they’re not the most efficient.

    This is because both commands make use of a subsearch (the content between the square brackets). Each subsearch requires additional trips to the indexers, which increases communication and overhead.

    Subsearches have additional limitations. By default, a subsearch used with join times out after 60 seconds and returns at most 50,000 results (see subsearch_maxtime and subsearch_maxout in limits.conf). When either limit is reached, the results are truncated without an obvious error, which often goes unnoticed and leads to incorrect answers.

    So, what’s the solution?

    Combine your subsearch with your primary search and accomplish the join with a stats command instead. Here is an example:

    Using join (before)

    index=_internal sourcetype=splunkd component=Metrics 
    | stats count AS metric_count BY host 
    | join type=left host
            [search index=_audit sourcetype=audittrail  
            | stats count AS audit_count BY host] 
    | table host metric_count audit_count 
    

    Using stats (after)

    (index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail) 
    | stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host 
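
    One caveat: the join version is a left join, so it only returns hosts that have Metrics events, while the stats version returns every host that appears in either data set. If that difference matters for your use case, one possible way to restore the left-join behavior is to filter on metric_count after the stats command. This is a minimal sketch that reuses only the field names from the example above:

    (index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
    | stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host
    | where metric_count > 0

    Hosts without audit events show an audit_count of 0 here, whereas the join version leaves that field empty.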
    

    This technique can also be used in place of the append, dedup, and table commands.
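
    As a hedged illustration of the dedup and table case, suppose you only need the most recent event time and component for each host. The output field name last_seen is an assumption made up for this sketch.

    Using dedup and table (before)

    index=_internal sourcetype=splunkd
    | dedup host
    | table host _time component

    Using stats (after)

    index=_internal sourcetype=splunkd
    | stats latest(_time) AS last_seen latest(component) AS component BY host

    The two are not strictly identical: dedup keeps the first event per host in the order events are returned, while latest() picks values by timestamp, and last_seen is returned as an epoch value. Check that the results match what you expect before swapping one for the other.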

  2. Minimize the amount of data coming back from the indexers.

    To lower the amount of data coming back from the indexers, many articles recommend filtering your data early on.

    While this does cut down on the number of events (vertical) that are retrieved, you should also focus on cutting down the number of fields (horizontal) that are retrieved.

    By using the fields streaming command early on within your SPL, you not only lower the amount of data being pulled from the indexers, but also the amount that has to be transferred to and processed by the search head.

    Whenever possible, try using the fields command right after the first pipe of your SPL as shown below.

    <base query>
    | fields <field list>
    | fields - _raw
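
    As a concrete sketch, the base query from the first example could be narrowed as follows. The field named group is an assumption about what the later commands need; substitute the fields your own search actually uses.

    index=_internal sourcetype=splunkd component=Metrics
    | fields host group
    | fields - _raw
    | stats count BY host group

    The second fields command is there because the first one keeps the internal fields _raw and _time by default, so _raw has to be removed explicitly.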
    

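    Here’s a real-life example of how impactful using the fields command can be.

                                   # of Fields   Disk Usage   Events    Time Spent
    Query without use of fields    155           18458240     498478    166s
    Query with use of fields       18            5681152      498478    103s

    In this example, limiting the search to 18 fields cut disk usage by roughly 69% and run time by roughly 38%, with the same number of events returned.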

  3. Perform calculations on the smallest amount of data.

    It’s most efficient to save calculations that use commands like eval, lookup, and foreach until after your data set has been made as small as possible through the previous steps. It’s also more efficient to combine commands whenever possible. For example, the following three eval statements can be combined into one comma-delimited eval statement.

    Before

    …
    | eval var1="value1"
    | eval var2="value2"
    | eval var3="value3"
    …

    After

    …
    | eval var1="value1", var2="value2", var3="value3"
    …

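
    The same idea applies to where a calculation sits relative to a summarizing command. The sketch below assumes the kb and series fields that splunkd writes for the per_sourcetype_thruput metrics group; total_kb and total_mb are names invented for the example.

    Before

    index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
    | eval mb=kb/1024
    | stats sum(mb) AS total_mb BY series

    After

    index=_internal sourcetype=splunkd component=Metrics group=per_sourcetype_thruput
    | fields series kb
    | stats sum(kb) AS total_kb BY series
    | eval total_mb=round(total_kb/1024, 2)

    In the second version, eval runs once per series rather than once per event. The totals agree apart from the rounding added in the last step.
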
  4. Use non-streaming commands as late in the query as possible.

    An additional query best practice is to save non-streaming, transforming commands for last. These are the commands that produce the final answers you’re looking for, such as stats, chart, and timechart.

Next steps

With the above tips in mind, here’s a sample query template to follow.

Step                      SPL
Base query                base query
Minimize data             fields <list of fields>
Combine/Summarize data    use of stats for join/append/summarizations
Execute calculations      eval, lookup, etc.
Format the data           stats, chart, timechart, etc.
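
Filled in with the internal indexes used earlier, the template might look like the following sketch. The audit_share field and the final sort are assumptions added only to illustrate the calculation and formatting steps; they are not part of the original examples.

    (index=_internal sourcetype=splunkd component=Metrics) OR (index=_audit sourcetype=audittrail)
    | fields host sourcetype
    | fields - _raw
    | stats count(eval(sourcetype="splunkd")) AS metric_count count(eval(sourcetype="audittrail")) AS audit_count BY host
    | eval audit_share=round(audit_count / (metric_count + audit_count), 3)
    | sort - audit_share

Each line maps to one row of the template: the base query, the field reduction, a stats command in place of a join, a calculation on the summarized rows, and a final ordering of the output.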

But remember — every query is different, so think of these tips as guidelines rather than rules.

If you've implemented the query writing tips in this article, but are still experiencing problems, try troubleshooting your queries using the Job Inspector. You can also read Optimizing search in Splunk Cloud Platform for advanced recommendations that go beyond inefficient search practices.

Need more help? Contact our Splunk Elite Partner, SP6. SP6 is a technology firm specializing in cybersecurity, CMMC compliance, and systems observability. SP6 has built North America’s largest Splunk Services team. Their team of cybersecurity and technology observability specialists ensures that the digital assets of customers are both protected and highly performant. SP6 delivers this expertise through project-based Professional Services and through Managed Services for organizations that can benefit from additional guidance.