A common use case for Splunk ITSI is to alert on degraded services. New users might think that the best way to configure those alerts is to use KPI alerting. However, meeting common alerting requirements this way takes a lot of configuration for each service. For example:
- Correlation. Group KPI alerts across services, as well as with alerts from external monitoring tools.
- Enrichment. Enrich alerts with metadata like runbook instructions or drilldown URLs.
- Routing. Configure who should be notified to allow teams to self-serve.
- Intelligent alerting. Create alerts based on abnormal or concerning service behavior, and throttle alert actions to reduce noise.
- Auto-clear. Configure individual alerts to clear and episodes to auto-clear when all alerts are cleared.
You need a faster, more automated method of creating and maintaining these essential alerts.
This article is part of the Definitive Guide to Best Practices for IT Service Intelligence. Splunk ITSI end users and administrators will benefit from adopting this practice as they work on Event Analytics.
The Content Pack for ITSI Monitoring and Alerting offers robust capabilities, is highly performant, and requires little configuration for Splunk ITSI admins. The content pack does the following:
- Correlates alerts within a service.
- Correlates alerts across multiple services and with alerts from external monitoring tools.
- Creates notable events for any KPI that remains unhealthy for 15 minutes.
- Generates clearing alerts when any service or KPI returns to normal.
- Raises or lowers episode severity and auto-clears the episode when all alerts have cleared.
- Alerts intelligently by only paging out when a critical notable event has been seen.
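To make the behaviors in this list concrete, the following Python sketch models three of them: flagging a KPI that stays unhealthy for 15 minutes, tracking episode severity as alerts arrive and clear, and paging only once a critical alert appears. It is an illustration only and does not show how the content pack is implemented. The severity scale, the `sustained_unhealthy` function, and the `Episode` class are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical severity scale for this sketch only; higher is worse.
SEVERITY = {"normal": 1, "low": 2, "medium": 3, "high": 4, "critical": 5}
SUSTAIN_WINDOW = timedelta(minutes=15)


def sustained_unhealthy(samples, now, window=SUSTAIN_WINDOW):
    """Return True if the KPI has been continuously unhealthy for `window`.

    `samples` is a list of (timestamp, severity) tuples, oldest first.
    """
    unhealthy_since = None
    for ts, sev in samples:
        if sev == "normal":
            unhealthy_since = None      # a healthy sample resets the streak
        elif unhealthy_since is None:
            unhealthy_since = ts        # start of a new unhealthy streak
    return unhealthy_since is not None and now - unhealthy_since >= window


@dataclass
class Episode:
    """Toy model of an episode that groups alerts from several sources."""
    active: dict = field(default_factory=dict)  # alert source -> severity
    status: str = "open"
    paged: bool = False

    def ingest(self, source, severity):
        if severity == "normal":
            self.active.pop(source, None)   # a clearing alert removes the source
        else:
            self.active[source] = severity
        # Auto-clear: close the episode once no active alerts remain.
        self.status = "open" if self.active else "closed"

    @property
    def severity(self):
        # Episode severity follows the worst alert that is still active.
        if not self.active:
            return "normal"
        return max(self.active.values(), key=SEVERITY.__getitem__)

    def should_page(self):
        # Page at most once, and only when a critical alert is active.
        if self.severity == "critical" and not self.paged:
            self.paged = True
            return True
        return False
```

In this sketch, an episode that receives a high alert and then a critical alert rises to critical and pages once. It drops back to high when the critical alert clears, and closes when the last alert clears.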
By using this content pack, you meet most of your service degradation alerting requirements with out-of-the-box functionality that just works. The basic steps are:
- Ensure you have installed the Splunk App for Content Packs.
- From the Splunk ITSI main menu, click Configuration > Data Integrations.
- Enable recommended aggregation policies. Typically, you'll want Episodes by ITSI Service and Episodes by Alert Group.
- Enable recommended pre-built correlation searches. Typically, you'll want Service Monitoring and Episode Monitoring flagged as (Recommended) in the UI.
- Configure remaining customizations, such as alert routing and enrichment.
For detailed installation instructions, see the Content Pack for ITSI Monitoring and Alerting manual.
This content comes from the .conf23 session, The Definitive List of Best Practices for Splunk® IT Service Intelligence: How to Configure, Administer, and Use ITSI for Optimal Results. In the session replay, you can watch Jason Riley and Jeff Wiedemann share the many awesome best practices they've amassed for designing key performance indicators (KPIs), services, episodes, and machine learning to maximize end-user experience and insights. Whether you're new or experienced, you'll come away with tactical guidance you can use right away.
You might also be interested in the following Splunk resources:
- Splunk Docs: Event analytics manual