Note: This article only applies to Splunk Cloud.
Splunk works fine out of the box. As you increase load on your system, though, you'll want to get familiar with ways to enhance its ability to handle that load. We’ll show you how to identify the cause of slow searches and review possible trouble spots in your deployment.
Do more with less through search optimization
Slow searches can be caused by inefficient search practices, but they can also be caused by poor data quality. Resolving issues such as incorrect event breaks and timestamp errors can yield remarkable performance improvements. Inefficiencies like these force indexers to work overtime, both when indexing data and when retrieving search results. When your searches run more efficiently, you gain the following benefits:
- Faster-loading dashboards
- Improved user experience, because searches complete faster
- Improved user productivity, because run/test cycles are accelerated
- Better performance that enables more use cases
- 10x and 100x improvements that let users attack new problems
- The ability to examine weeks or months of data, instead of just hours or minutes
- Reduced need for precomputation (summaries)
- Reduced server load
- More users supported on less hardware
- Improved ROI on your hardware investment
Use the Monitoring Console
Use Splunk Cloud Monitoring Console (CMC) dashboards to determine if any searches have performance issues that need attention. The CMC enables you to monitor Splunk Cloud deployment health and to enable platform alerts. You can modify existing alerts or create new ones. You can interpret results in these dashboards to identify ways to optimize and troubleshoot your deployment.
- Search Usage Statistics. This dashboard shows search activity across your deployment, with detailed information broken down by instance.
- Scheduler Activity. This dashboard shows information about scheduled search jobs (reports) and lets you configure the priority of scheduled reports.
- Forwarders: Instance and Forwarders: Deployment. These dashboards show information about forwarder connections and status. Read about how to troubleshoot forwarder/receiver connections in Forwarding Data.
Improve your searches
- Select an index in the first line of your search. The computational effort of a search is greatest at the beginning, so searching across all indexes (index=*) slows down a search significantly.
- Use the TERM directive. Major breakers, such as a comma or quotation mark, split your search terms, increasing the number of false positives. For example, searching for average=0.9* actually searches for 0 and 9*, while searching for TERM(average=0.9*) searches for the whole string average=0.9*. If you aren't sure which terms exist in your logs, use the walklex command to inspect the index's lexicon. You can use the TERM directive when searching raw data or when using the tstats command.
- Use the tstats command. The tstats command performs statistical queries on indexed fields, so it's much faster than searching raw data. The limitation is that, because it requires indexed fields, you can't use it on fields that are extracted only at search time. However, you can use the PREFIX directive with the tstats command to process values that have not been indexed as fields. PREFIX matches a common string that precedes a certain value type.
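As a rough sketch of these tips, the searches below put each one into practice. The index, sourcetype, and field names (web, access_combined, status) are placeholders, not taken from any specific deployment:

```spl
index=web sourcetype=access_combined error

index=web TERM(average=0.9*)

| walklex type=term index=web

| tstats count where index=web by sourcetype

| tstats count where index=web by PREFIX(status=)
```

The first search scopes to a single index instead of index=*; the second uses TERM to match average=0.9* as one indexed term; the third lists the terms in the index's lexicon with walklex; the last two run tstats against indexed data, with the final one extracting values that follow the string status= via PREFIX.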
For more information on these recommendations, watch the 2020 .Conf presentation, How to get the most out of your lexicon. Splunk's technical documentation also provides useful tips for search optimization.
Other steps to take
- Improve your source types. Review the data quality dashboards to identify and resolve data quality issues.
- Check the HTTP Event Collector status. If you have set up the HTTP Event Collector, you can use it to monitor the progress of a token.
- Use tokens to build high-performance dashboards. Searches saved in dashboards can use tokens so that users can switch between views or filters. When the token is used in a child search, only the child search reruns as the token input changes. The base search, which can contain the index and other costly operations, only needs to run once, which speeds up the dashboard overall.
- Preload expensive datasets using loadjob. The loadjob command uses the results of a previous search. If you run a lengthy search in one browser tab and keep that tab open, the results remain on the search head for some time. Eventually the search job times out, but while it is available, you can run other searches based on that initial data by referencing its search job ID (sid).
- Test your search string performance. The Search Performance Evaluator dashboard lets you evaluate your search strings against key metrics, such as run duration (shorter is better), the percentage of buckets eliminated from the search (higher is better), and the percentage of events dropped by schema-on-the-fly processing (lower is better).
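As a minimal sketch of the loadjob technique above, the sid 1671234567.892 is a made-up placeholder; copy the real sid from the Job Inspector or from the job's URL:

```spl
| loadjob 1671234567.892
| stats count by host
```

This reruns statistics over the cached results of the earlier search without touching the indexers again. You can also reference a scheduled report's most recent results with | loadjob savedsearch="owner:app:search_name".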