Note: This article applies only to Splunk Enterprise.
Splunk works fine out of the box. As you increase load on your system, though, you'll want to get familiar with ways to enhance its ability to handle that load. We’ll show you how to identify the cause of slow searches and review possible trouble spots in your deployment.
Do more with less through search optimization
Slow searches can be caused by inefficient search practices, but they can also be caused by poor data quality. You can achieve remarkable performance improvements by resolving issues such as incorrect event breaks and timestamp errors in the data. Inefficiencies like these can cause indexers to work overtime both when indexing data and when retrieving search results. If your searches run more efficiently, you can gain the following benefits:
- Faster-loading dashboards
- Better user experience from searches that complete faster
- Improved user productivity as run/test cycles are accelerated
- Better performance enabling more use cases
- Improvements of 10x and 100x allow users to attack new problems
- Users can examine weeks and months of data, instead of just hours and minutes
- Reduced need for precomputation (summaries)
- Reduced server load
- More users supported on less hardware
- Improved ROI on hardware investment
Use the Monitoring Console
Use the Monitoring Console dashboards to determine if any searches have performance issues that need attention. The Monitoring Console comes with preconfigured health checks in addition to platform alerts. You can modify existing health checks or create new ones. You can interpret results in these dashboards to identify ways to optimize and troubleshoot your deployment.
- Search activity dashboards. The Search activity: Instance and Search activity: Deployment dashboards show search activity across your deployment with detailed information broken down by instance.
- Scheduler activity dashboards. The Scheduler activity: Deployment dashboard shows information about past executions of scheduled searches and their success rates. If you have a search head cluster, the Search head clustering: Scheduler delegation dashboard shows how the captain orchestrates scheduler jobs.
- Indexing performance dashboards. The Indexing performance: Deployment and Indexing performance: Instance dashboards show indexing performance across the deployment.
Improve your searches
- Select an index in the first line of your search. The computational effort of a search is greatest at the beginning, so searching across all indexes (index=*) slows down a search significantly.
- Use the TERM directive. Major breakers, such as a comma or quotation mark, split your search terms, increasing the number of false positives. For example, searching for average=0.9* searches for 0 and 9*. Searching for TERM(average=0.9*) searches for average=0.9*. If you aren't sure what terms exist in your logs, you can use the walklex command (available in version 7.3 and higher) to inspect the logs. You can use the TERM directive when searching raw data or when using the tstats command.
- Use the tstats command. The tstats command performs statistical queries on indexed fields, so it's much faster than searching raw data. The limitation is that because it requires indexed fields, you can't use it on fields that are extracted only at search time. However, if you are on version 8.0 or higher, you can use the PREFIX directive instead of the TERM directive with the tstats command to process values that are not stored as indexed fields. PREFIX matches a common string that precedes a certain value type.
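As a rough illustration of the three recommendations above, the following searches show the general shape of each technique. The index names (web_logs, metrics_logs), the source type, and the field values are hypothetical placeholders, so substitute names from your own deployment.

Scope the search to a specific index in the first line, rather than using index=*:

```spl
index=web_logs sourcetype=access_combined status=500
| stats count BY host
```

Wrap a value that contains breakers in the TERM directive so that it is matched as a single term:

```spl
index=metrics_logs TERM(average=0.9*)
```

If you aren't sure which terms exist, a walklex search along these lines lists the terms in the index lexicon with their counts:

```spl
| walklex index=metrics_logs type=term
| stats sum(count) AS occurrences BY term
```

A basic tstats search over indexed fields such as sourcetype and host might look like this:

```spl
| tstats count WHERE index=web_logs BY sourcetype, host
```

On version 8.0 or higher, PREFIX can group by a value that appears in the raw events after a common string. This sketch assumes that the events contain literal group=<value> text:

```spl
| tstats count WHERE index=_internal BY PREFIX(group=)
```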
For more information on these recommendations, watch the 2020 .conf presentation, How to get the most out of your lexicon. Splunk's technical documentation also provides useful tips for search optimization.
Other steps to take
- Run a health check. Access and customize the health check to expose issues with source types, among other things.
- Improve your source types. Review the data quality dashboards to identify and resolve data quality issues.
- Peek at pipelines. Review the indexing performance dashboards to identify any issues or load in a particular pipeline.
- Secure your Splunk. Review the safeguards for risky commands in the Securing Splunk Enterprise manual.
- Use tokens to build high-performance dashboards. Searches saved in dashboards can use tokens to allow users to switch between commands. When the token is in a child search, only the child search is updated as the token input changes. The base search, which can contain the index and other costly functionality, only needs to run once, which speeds up the search overall.
- Update your configuration files. If you never use the TERM directive, you can turn off the minor breakers in your segmenters.conf file by moving all the minor breakers to the major breakers field in the [search] section of this configuration file. Doing so reduces bucket size but increases CPU usage. In Splunk Enterprise version 7.2.x and higher, using the zstd compression algorithm in the indexes.conf file, rather than gzip, also makes buckets smaller, thereby increasing search speed. Finally, you can also update the tsidxWritingLevel to 3 in the indexes.conf file in Splunk Enterprise version 7.3.x and higher. Doing so takes advantage of newer tsidx file formats for metrics and log events that decrease storage cost and increase performance.
- Preload expensive datasets using loadjob. The loadjob command uses the results of a previous search. If you run a lengthy search in one browser tab and keep that tab open, the results remain on the search head for some time. Eventually, the search job expires, but while it is available, you can run other searches based on that initial data by using the search job ID (SID).
- Test your search string performance. The Search Performance Evaluator dashboard allows you to evaluate your search strings on key metrics, such as run duration (faster is better), the percentage of buckets eliminated from a search (bigger is better), and the percentage of events dropped by schema on the fly (lower is better).
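Two of the steps above can be sketched briefly. Both examples are illustrations under stated assumptions, not tested recommendations for your environment.

For loadjob, a follow-up search might reuse the results of an earlier job as shown here. The SID value 1233886270.2 is a placeholder, so replace it with the SID of your own search job:

```spl
| loadjob 1233886270.2
| stats count BY host
```

For the indexes.conf changes, a stanza along these lines applies zstd compression and the newer tsidx format to a single index. The stanza name my_index is hypothetical, and the version requirements mentioned above still apply:

```
[my_index]
# Compress the rawdata journal with zstd instead of the default gzip
journalCompression = zstd
# Use the newer tsidx file format
tsidxWritingLevel = 3
```

Settings like these are generally expected to affect only buckets created after the change, so existing buckets likely keep their original format. Consider validating the change in a non-production environment first.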