Using the Performance Insights for Splunk app: Installation and diagnostic process
Installation
PerfInsights is available on Splunkbase. To install PerfInsights, access your Splunk platform UI as an administrator and select Apps > Find More Apps. Select Generic, Utilities and search for “Performance Insights”. Locate the application and install it.
After installation, you will need to wait for the application to be replicated to the search heads before it is fully functional. You might see search errors if you try to use it too soon. If this happens, wait a few minutes and try again.
Diagnostic process
PerfInsights is typically used with either a reactive approach or a proactive approach to diagnose performance issues.
- Reactive diagnosis occurs when a problem has already appeared, and the goal is to discover its cause. In these cases, the system's actual behavior, from a user’s perspective, has deviated from expected behavior. An example might be searches reporting warnings that some data is missing.
- Proactive diagnosis involves using PerfInsights to examine system metrics and identify trends that could lead to future issues. For example, increasing search times could lead to increased search concurrency and, eventually, skipped searches.
Reactive diagnosis
To conduct a reactive diagnosis:
- Focus the tool's dashboards around the point in time when the issue first appeared.
- Observe what was happening in the system at that time to see what might have caused the unexpected behavior. Look for:
  - High or saturated resources (for example, CPU, memory, network I/O, disk I/O).
  - High or saturated queues (for example, search or indexing queues, replication queues).
  - Plateaus (for example, search or indexing throughput).
  - Sudden spikes (for example, search concurrency, error rates).
  - Anything that looks unusual.
- Separate cause and effect. Adjust the time range to determine the order of abnormal events.
- Investigate abnormal events starting from the earliest.
- For each abnormal event, determine if it is a contributing cause, a symptom, or an unrelated event. This often requires additional searches or tools beyond PerfInsights.
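The ordering step above can be illustrated outside the dashboards. The following is a minimal sketch, assuming you have exported a few chart series as (timestamp, value) pairs and chosen your own "abnormal" thresholds. The metric names, sample values, thresholds, and helper functions are all hypothetical and are not part of PerfInsights.

```python
from datetime import datetime, timedelta


def first_breach(samples, threshold):
    """Return the timestamp of the first sample at or above threshold, or None."""
    for ts, value in samples:
        if value >= threshold:
            return ts
    return None


def order_abnormal_events(metrics, thresholds):
    """Return (metric, first_breach_time) pairs sorted from earliest to latest."""
    onsets = []
    for name, samples in metrics.items():
        ts = first_breach(samples, thresholds[name])
        if ts is not None:
            onsets.append((name, ts))
    return sorted(onsets, key=lambda item: item[1])


# Hypothetical one-minute samples copied from three charts (assumed data).
start = datetime(2024, 1, 1, 12, 0)
minute = timedelta(minutes=1)
metrics = {
    "indexing_queue_fill_pct": [(start + i * minute, v) for i, v in enumerate([20, 35, 90, 98, 99])],
    "search_concurrency": [(start + i * minute, v) for i, v in enumerate([10, 11, 12, 40, 45])],
    "cpu_pct": [(start + i * minute, v) for i, v in enumerate([50, 55, 60, 65, 70])],
}
# Assumed thresholds; choose values that match what "abnormal" means in your deployment.
thresholds = {"indexing_queue_fill_pct": 90, "search_concurrency": 30, "cpu_pct": 95}

timeline = order_abnormal_events(metrics, thresholds)
for name, ts in timeline:
    print(f"{ts:%H:%M} {name}")
```

The earliest entry in the timeline is only the first candidate to investigate. Whether it is a cause, a symptom, or unrelated still has to be established with further searches.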
Proactive diagnosis
To conduct a proactive diagnosis:
- Start with a large time range, ending at the current time (for example, past 7 days to now).
- Look for charts that are trending upwards (for example, CPU, memory, search times, queue lengths).
- Look for spikes in charts that last longer or become more frequent (for example, search concurrency, bundle replication).
- Repeat with smaller time ranges (for example, past 1 day or 1 hour). Long time ranges show trends, but they can also hide granular data. Spikes that are aggregated away over a week can become visible over an hour.
- For each potentially problematic behavior observed, try to understand the cause. For example, if search times are increasing, you might also find that your data ingestion rate has been increasing. Whether this becomes an issue depends on the system's current state and any further increases in data ingestion rates.
- Proactive diagnosis is subject to the observer effect, where the act of observing the system affects its behavior.
  - Be cautious when observing unstable systems, as the extra search load could further destabilize the system.
  - When reading PerfInsights charts and tables, consider the additional search load the tool is adding.
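To make the trade-off between long and short time ranges concrete, here is a minimal sketch using made-up data. It assumes a per-minute search concurrency series for one day with a slow climb and a single five-minute spike. The function names and the data are hypothetical and do not come from PerfInsights.

```python
def linear_slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


def bucket_means(values, size):
    """Average consecutive groups of `size` samples, as a coarser chart would."""
    buckets = [values[i:i + size] for i in range(0, len(values), size)]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Assumed data: 1440 per-minute samples climbing slowly from 10 to about 20.
per_minute = [10 + i / 144 for i in range(1440)]
# Assumed five-minute spike starting at minute 700.
for i in range(700, 705):
    per_minute[i] = 60.0

hourly = bucket_means(per_minute, 60)

print(f"trend: {linear_slope(per_minute) * 60:.2f} per hour")
print(f"peak at one-minute resolution: {max(per_minute):.1f}")
print(f"peak at one-hour resolution: {max(hourly):.1f}")
```

In this example the upward trend is visible at either resolution, but the spike to 60 only shows at one-minute resolution because the hourly averages smooth it out.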
Deeper dives
PerfInsights aims to be a single application that helps identify all potential performance issues and provides as much detail as possible to resolve them. However, in some cases, PerfInsights might not provide sufficient information to complete the performance analysis. When this occurs, use PerfInsights as a starting point for further investigation.
Expand existing charts
Sometimes an existing chart is close to what you need but requires some adjustment. Selecting the magnifying glass link on a chart opens its search in a new window, where you can customize it. All dashboards are editable, so if you find yourself making the same customizations frequently, you can modify the search directly in the dashboard or add a new chart. Be aware that reinstalling the application will overwrite your changes. If your change could benefit others, consider sending a message to the support team with the suggestion.
Add system data to indexes
By default, not all OS-level logging is available in indexes, yet this information can be very useful when investigating performance issues. On Splunk Cloud Platform deployments, consider adding local forwarders on your servers, configured to read system logs and forward them for indexing.
Forming and testing hypotheses
Performance issues are often complex and multifaceted. Determining root causes can be tricky, and even sound and logical reasoning can lead to incorrect conclusions if not all data is considered. Resolving issues can be an iterative process.
After you’ve observed the situation and developed a reasonable explanation for the behavior, it’s time to test that hypothesis. Even if you started with a reactive diagnosis, you will iterate through proactive diagnoses here.
For each iteration:
- Start by identifying a minimal set of changes that could correct the issue.
  - Use Splunk Help, Splunk Support, or the Splunk Community to help you identify appropriate changes.
- Change only one thing at a time, if possible.
- Using the same charts that helped you build your hypothesis, verify that the change produced an improvement. Compare time periods similar to those that exposed the issue.
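The verification step above can be sketched as a before-and-after comparison. This is a minimal sketch, assuming you have collected the same metric over two comparable windows (for example, the same weekday and hour before and after the change). The sample values and the `relative_change` helper are hypothetical.

```python
from statistics import median


def relative_change(before, after):
    """Relative change in the median from the before window to the after window."""
    base = median(before)
    if base == 0:
        raise ValueError("baseline median is zero; relative change is undefined")
    return (median(after) - base) / base


# Assumed search run times in seconds, taken from comparable windows.
before = [42.0, 40.5, 44.0, 41.0, 43.5]
after = [30.0, 31.5, 29.0, 32.0, 30.5]

change = relative_change(before, after)
print(f"median search time changed by {change:+.1%}")
```

The median is used here so that a single outlier search does not dominate the comparison. If several changes were made at once, a result like this cannot say which of them was responsible, which is why changing one thing at a time matters.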

