Optimizing search in Splunk Enterprise
Slow searches can be caused by inefficient search practices, but they can also be caused by poor data quality. Problems such as incorrect event breaks and timestamp errors in the data make indexers work harder both when indexing data and when returning search results. Resolving these issues improves performance.
Use the Monitoring Console
Use the Monitoring Console dashboards to determine if any searches have performance issues that need attention. The Monitoring Console comes with preconfigured health checks in addition to platform alerts. You can modify existing health checks or create new ones. You can interpret results in the following dashboards to identify ways to optimize and troubleshoot your deployment.
- Search activity dashboards. The Search Activity: Instance and Search Activity: Deployment dashboards show search activity across your deployment with detailed information broken down by instance.
- Scheduler activity dashboards. The Scheduler activity: Deployment dashboard shows information about the past executions of scheduled searches and their success rates. If you have a search head cluster, the Search head clustering: Scheduler delegation dashboard shows how the captain orchestrates scheduler jobs.
- Indexing performance dashboards. The Indexing performance: Deployment and Indexing performance: Instance dashboards show indexing performance across the deployment.
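To complement the dashboards, you can run an ad hoc search for slow searches. The following is a minimal sketch, assuming your role can search the _audit index and that completed searches are logged there with the action, info, user, and total_run_time fields. The 24-hour window and the output field names avg_run_time and max_run_time are illustrative choices.

```spl
index=_audit action=search info=completed earliest=-24h
| stats count avg(total_run_time) AS avg_run_time max(total_run_time) AS max_run_time BY user
| sort - max_run_time
```

Users at the top of the results ran the longest searches in the window, which can point you to the searches worth reviewing first.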
Improve your searches
- Select an index in the first line of your search. The computational effort of a search is greatest at the beginning, so searching across all indexes (index=*) slows down a search significantly.
- Use the TERM directive. Breaker characters inside a search term, such as a period or an equal sign, split the term into smaller tokens, which increases the number of false positives. For example, searching for average=0.9* searches for 0 and 9*. Searching for TERM(average=0.9*) searches for average=0.9* as a single term. If you aren't sure what terms exist in your indexed data, you can use the walklex command (available in version 7.3 and higher) to inspect the terms in the index. You can use the TERM directive when searching raw data or when using the tstats command.
- Use the tstats command. The tstats command performs statistical queries on indexed fields, so it's much faster than searching raw data. The limitation is that because it requires indexed fields, you can't use it to search some data. However, if you are on version 8.0 or higher, you can use the PREFIX directive with the tstats command, instead of the TERM directive, to process data that is not stored in indexed fields. PREFIX matches a common string that precedes a certain value type.
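The practices in the list above can be sketched as five separate searches, shown below separated by blank lines. The index name example_index, the source type example_sourcetype, and the keyword error are hypothetical placeholders, and average=0.9 reuses the example term from the text. In order, the searches show: naming an index in the first line, matching a whole term with TERM, listing indexed terms with walklex, counting events with tstats and TERM, and grouping by the values that follow a common string with PREFIX.

```spl
index=example_index sourcetype=example_sourcetype error

index=example_index TERM(average=0.9*)

| walklex index=example_index type=term
| stats sum(count) AS occurrences BY term

| tstats count WHERE index=example_index TERM(average=0.9) BY sourcetype

| tstats count WHERE index=example_index BY PREFIX(average=)
```

Run each search on its own. Whether TERM or PREFIX finds anything depends on how your events are segmented, so checking the walklex output first is a reasonable way to confirm that a term such as average=0.9 exists in the index.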
Take additional steps
- Improve your source types. Review the data quality dashboards to identify and resolve data quality issues.
- Peek at pipelines. Review the indexing performance dashboards to identify any issues or load in a particular pipeline.
- Secure your Splunk deployment. Review the safeguards for risky commands in the Securing Splunk Enterprise manual.
- Use tokens to build high-performance dashboards. Searches saved in dashboards can use tokens to allow users to switch between commands. When the token is in a child search, only the child search is updated as the token input changes. The base search, which can contain the index and other costly functionality, only needs to run once, which speeds up the search overall.
- Update your configuration files. If you never use the TERM directive, you can turn off the major breakers in your segmenters.conf file by moving all the minor breakers to the major breakers field in the [search] section of this configuration file. Doing so reduces bucket size but increases CPU usage. In Splunk Enterprise version 7.2.x and higher, using the zstd compression algorithm in the indexes.conf file, rather than gzip, also makes buckets smaller, thereby increasing search speed. Finally, in Splunk Enterprise version 7.3.x and higher, you can set tsidxWritingLevel to 3 in the indexes.conf file. Doing so takes advantage of newer tsidx file formats for metrics and log events that decrease storage cost and increase performance.
- Preload expensive datasets using loadjob. The loadjob command uses the results of a previous search. If you run a lengthy search in one browser tab and keep the tab open, the results remain on the search head for some time. Eventually, the search job expires, but while it is available, you can run other searches based on that initial data by using the search job ID (SID).
- Test your search string performance. The Search Performance Evaluator dashboard allows you to evaluate your search strings on key metrics, such as run duration (shorter is better), the percentage of buckets eliminated from a search (higher is better), and the percentage of events dropped by schema on the fly (lower is better).
- Identify SVC utilization changes. The Splunk Chargeback App can be used to monitor Splunk Virtual Compute (SVC) consumption by business unit, department, or individual user. An unexpected increase in SVC consumption could indicate adoption of inefficient searches or dashboards.
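As an illustration of the loadjob step in the list above, the following minimal sketch reuses the results of an earlier search instead of running it again. The SID 1700000000.12345 is a hypothetical placeholder. Replace it with the SID of your own job, which you can find in the Job Inspector or on the Jobs page. The stats clause assumes the loaded results still contain the host field.

```spl
| loadjob 1700000000.12345
| stats count BY host
```

Because loadjob reads results that are already stored on the search head, the follow-up search does not go back to the indexers. It stops working once the original job expires.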
These additional Splunk resources might help you understand and implement these recommendations:
- .conf Talk: How to get the most out of your lexicon
- .conf Talk: Extend the Splunk Platform with custom search commands and setup pages