Splunk Lantern

Using AI-assisted thresholding in Splunk ITSI

Splunk IT Service Intelligence (ITSI) provides AI-assisted capabilities to help you configure and maintain KPI thresholds more effectively. These features analyze historical data to recommend appropriate thresholds, and they detect when adaptive thresholds have gradually shifted over time in a way that could mask developing problems.

This article is part of The definitive guide to best practices for IT Service Intelligence, which provides essential guidelines to ensure optimal operations and an excellent end-user experience, helping you to unlock the full potential of ITSI.

Prerequisites

Not all of these might be required for every AI feature, but having them all installed ensures full functionality across the AI capabilities in ITSI.

How to use Splunk software for this use case

AI-assisted thresholding

AI-assisted thresholding analyzes your historical or back-filled KPI data to recommend appropriate threshold values. Previously, configuring time-based thresholds required manually specifying the high and low values for each point in the day. For example, you might define that at 10 AM your threshold should be X and at 10 PM it should be Y. This was time-consuming and error-prone, even when using standard deviation calculations. AI-assisted thresholding automates this process by learning patterns from your data.

The AI examines patterns in your KPI data over time and calculates threshold recommendations based on observed behavior. This is particularly useful for KPIs with cyclical patterns where manual threshold configuration would be tedious and difficult to maintain accurately.
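To make the manual, standard-deviation approach described above more concrete, here is a minimal Python sketch that derives a low and high threshold for each hour of the day from historical samples. It is only an illustration of the idea, not the algorithm ITSI uses. The function name `hourly_thresholds`, the multiplier `k`, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_thresholds(samples, k=2.0):
    """Return {hour: (low, high)} from (hour_of_day, value) samples.

    Assumption: each band is mean +/- k population standard deviations.
    This is an illustrative rule, not ITSI's recommendation logic.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    thresholds = {}
    for hour, values in sorted(by_hour.items()):
        mu = mean(values)
        sigma = pstdev(values)
        thresholds[hour] = (mu - k * sigma, mu + k * sigma)
    return thresholds


# Hypothetical history: busier at 10 AM (hour 10) than at 10 PM (hour 22).
history = [(10, 40.0), (10, 50.0), (10, 60.0),
           (22, 10.0), (22, 20.0), (22, 30.0)]
bands = hourly_thresholds(history, k=2.0)

for hour, (low, high) in bands.items():
    print(f"hour {hour:02d}: low={low:.2f} high={high:.2f}")
```

Even in this small form, every hour needs its own band and the bands must be recomputed as behavior changes, which is the maintenance burden that AI-assisted thresholding is meant to remove.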

When the AI generates threshold recommendations, it explains why these specific thresholds are recommended based on the analyzed data. An example of this is shown in the screenshot below.

[Screenshot: example explanation for AI-recommended thresholds]

It also provides a visual preview showing:

  • Where your current thresholds would have triggered alerts historically
  • Where the proposed thresholds would have triggered alerts over the same period

An example of this visual preview is shown in the screenshot below.

[Screenshot: visual preview comparing alerts under current and proposed thresholds]

Review AI recommendations before applying them. The visual preview is essential for validating whether the suggested thresholds are appropriate:

  • If the preview shows minimal red alerts during periods when you know the system was operating normally, the thresholds are likely appropriate.
  • If the preview shows excessive red alerts that shouldn't occur, the suggested thresholds are not suitable for your use case and will generate noise rather than actionable signals.

Only apply the recommended thresholds after confirming that the visual preview aligns with your expectations for normal and abnormal system behavior.
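The validation idea behind the preview can be sketched as a simple backtest: replay a period you know was healthy and count how often each threshold would have fired. The following Python example is a hedged illustration under simplifying assumptions (a single static upper threshold and made-up data). The names `count_breaches`, `known_normal`, `current_threshold`, and `proposed_threshold` are hypothetical and do not correspond to any ITSI API.

```python
def count_breaches(values, threshold):
    """Count samples strictly above a static upper threshold."""
    return sum(1 for v in values if v > threshold)


# Hypothetical KPI samples from a period known to be operating normally.
known_normal = [52, 55, 58, 61, 57, 54, 63, 59]

current_threshold = 56   # assumed existing threshold
proposed_threshold = 65  # assumed recommended threshold

current_alerts = count_breaches(known_normal, current_threshold)
proposed_alerts = count_breaches(known_normal, proposed_threshold)

print(f"current threshold would have alerted {current_alerts} times")
print(f"proposed threshold would have alerted {proposed_alerts} times")
```

With this data, the current threshold fires five times during normal operation and the proposed threshold fires zero times, which is the pattern that suggests a recommendation is reducing noise. A real review should also check a period with a known incident, to confirm the proposed threshold still fires when it should.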

Drift detection

Drift detection identifies situations where adaptive thresholds have gradually shifted over time, potentially masking developing problems.

When using adaptive thresholds, ITSI learns what "normal" looks like based on historical patterns and recalculates these thresholds periodically. However, if a metric slowly creeps upward over weeks or months, the adaptive threshold adjusts accordingly, continuously accepting the new values as normal.

For example, a system might originally operate at 60% CPU utilization. If utilization gradually increases to 70%, then 80%, then 90%, the adaptive threshold keeps adjusting and never triggers an alert. Eventually, the system treats 95% CPU utilization as "normal", leaving no headroom for spikes and putting you at risk of resource exhaustion without warning.

Drift detection monitors your adaptive thresholds for gradual shifts and creates a notable event when significant drift is detected. This alerts you to investigate whether the drift represents a legitimate change in system behavior that should be accepted or whether the drift indicates a developing problem that needs attention before it causes an outage.
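The CPU example above can be reproduced with a small simulation. The following Python sketch assumes a deliberately simple adaptive rule (trailing-window mean multiplied by a margin) and flags drift when the threshold has moved more than a set fraction away from its original baseline. These rules, and the names `adaptive_threshold`, `simulate`, `margin`, and `drift_limit`, are assumptions chosen for illustration and are not how ITSI implements adaptive thresholding or drift detection.

```python
from statistics import mean


def adaptive_threshold(window, margin=1.25):
    """Upper threshold = trailing-window mean times a margin (assumed rule)."""
    return mean(window) * margin


def simulate(values, window_size=4, drift_limit=0.2):
    """Return (alert_count, index_where_drift_first_flagged_or_None)."""
    baseline = adaptive_threshold(values[:window_size])
    alerts = 0
    drift_flagged_at = None
    for i in range(window_size, len(values)):
        # Recalculate the threshold from the most recent window only.
        threshold = adaptive_threshold(values[i - window_size:i])
        if values[i] > threshold:
            alerts += 1
        # Compare the moving threshold against the original baseline.
        drift = (threshold - baseline) / baseline
        if drift_flagged_at is None and drift > drift_limit:
            drift_flagged_at = i
    return alerts, drift_flagged_at


# Hypothetical CPU utilization creeping from 60% to 95%.
cpu = [60, 60, 60, 60, 62, 64, 66, 68, 70, 72, 74,
       76, 78, 80, 82, 84, 86, 88, 90, 92, 95]

alerts, flagged = simulate(cpu)
print(f"threshold alerts: {alerts}")
print(f"drift first flagged at sample {flagged} (cpu={cpu[flagged]}%)")
```

In this run the adaptive threshold never fires, because it keeps rising along with the metric, while the baseline comparison flags drift at sample 12, when utilization is 78%. That contrast is the reason a separate drift check is useful alongside adaptive thresholds.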


Additional resources

For more information on using the ITSI Configuration Assistant, see Splunk Help.

These resources might also help you understand and implement this guidance:

  • Splunk OnDemand Services: Use these credit-based services for direct access to Splunk technical consultants with a variety of technical services from a pre-defined catalog. Most customers have OnDemand Services per their Success Plan. Engage the ODS team at ondemand@cisco.com if you would like assistance.