Configuring the ITSI Notable Event Aggregation Policy
You want to configure and enable the Notable Event Aggregation Policy (NEAP) to process notable events so they can be grouped into meaningful Splunk ITSI episodes.
Before following these steps, make sure you have done the following:
- Integrated Splunk Observability Cloud alerts with Splunk Cloud Platform or Splunk Enterprise
- Normalized Splunk Observability Cloud alerts into the ITSI Universal Alerting schema
- Configured Universal Correlation Searches to create notable events
Solution
The diagram below shows the overarching architecture for the integration that's described in Managing the lifecycle of an alert: from detection to remediation. The scope for this article is indicated by the pink box in the diagram.
In this article, you'll learn how to configure the Notable Event Aggregation Policy (NEAP) within the Content Pack for ITSI Monitoring and Alerting. The NEAP does two things:
- Determines how notable events should be grouped into the episodes that are presented in Splunk ITSI.
- Processes action rules that control the behavior of the episode, such as when to create an incident in Splunk On-Call (formerly VictorOps), or when to auto-close the episode after all notable events in the episode have been cleared. Note that this use case doesn't cover the full configuration of these action rules. After you've completed the procedure laid out in this use case, you'll need to configure the Splunk On-Call integration with IT Service Intelligence, and then configure action rules in the ITSI Notable Event Aggregation Policy for Splunk On-Call Integration.
The resulting episodes are stored in a Splunk index called itsi_grouped_alerts. Episodes can reduce alert noise significantly: even when alerts originate separately from Splunk Real User Monitoring, Splunk Synthetic Monitoring, and Splunk Application Performance Monitoring, you end up with one actionable episode rather than many individual events.
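The noise reduction described above can be pictured with a minimal Python sketch. It is not how Splunk ITSI implements grouping; it only illustrates the idea of splitting notable events on a field such as app_name (the split field this policy uses). The sample events, the `origin` and `title` fields, and the function name `group_into_episodes` are hypothetical.

```python
from collections import defaultdict

# Hypothetical notable events. Only app_name comes from the article;
# the other field names and all values are invented for illustration.
notable_events = [
    {"app_name": "checkout", "origin": "Splunk RUM", "title": "High page load time"},
    {"app_name": "checkout", "origin": "Splunk APM", "title": "Error rate spike"},
    {"app_name": "checkout", "origin": "Splunk Synthetic Monitoring", "title": "Browser test failed"},
    {"app_name": "search", "origin": "Splunk APM", "title": "Latency above threshold"},
]


def group_into_episodes(events, split_field="app_name"):
    """Group events into one bucket per distinct value of split_field."""
    episodes = defaultdict(list)
    for event in events:
        episodes[event[split_field]].append(event)
    return dict(episodes)


episodes = group_into_episodes(notable_events)
for app, events in episodes.items():
    print(f"{app}: {len(events)} notable events -> 1 episode")
```

In this sketch, four notable events from three different products collapse into two episodes, one per application.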
Configuration
- If you haven't already, download this ITSI Backup file and use the ITSI Backup/Restore utility to restore the artifacts into your instance of Splunk ITSI.
- On the Notable Event Aggregation Policies page, find "Episodes by Application/SRC o11y" and click it.
- The first tab, as seen below, is "Filtering Criteria and Instructions". It defines which notable events the policy evaluates. Note the first grouping, where certain notable events are excluded. This exclusion is specific to this solution, because you don't want to evaluate Splunk ITSI service health or KPI-generated notable events. You only want to evaluate events coming in from your Splunk Observability Cloud alert detectors.
- Scroll down and review other policy settings. Specifically note the following two:
  - Split events by field. This is set to app_name, which is metadata from Splunk Observability Cloud that identifies the application.
  - Episode information > Episode Title. Fields enclosed in % characters are tokens that are resolved from variables or metadata coming over from a Splunk Observability Cloud alert (detector).
- Click the Action Rules tab to review the actions configured to create the Splunk On-Call (VictorOps) incident, as well as the closing of the Splunk On-Call incident when the Splunk Observability Cloud alert is cleared.
  - The first rule looks for a source name of "Episode Monitoring - Trigger OnCall Incident" within the notable alerts (itsi_tracked_alerts index). The source corresponds to the originating correlation search, which indicates it is time to create a new Splunk On-Call incident. When a match is found, a comment is added and the "Create VictorOps Incident" integration is called.
  - The second rule looks for a source name of "Episode Monitoring - Set Episode to Highest Alarm Severity o11y" within the notable alerts (itsi_tracked_alerts index). The source corresponds to the originating correlation search, which indicates it is time to close the Splunk On-Call incident. The rule also requires "set_episode_status" to equal 5. When both conditions are true, the episode status is changed to "Closed", a comment is added to indicate closure, and the "Create VictorOps Incident" integration is called with a value of "RECOVERY" to indicate the incident should be closed. This keeps the Splunk ITSI episode and the Splunk On-Call incident synchronized.
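The title tokens and the two action rules reviewed above can be sketched in Python to make their logic concrete. This is only an illustration under stated assumptions, not the ITSI rules engine: the function names, the tuple-based action format, the comment text, the title template, and the "CRITICAL" message type for the trigger rule are all hypothetical. Only the two source names, the set_episode_status value of 5, the "Closed" status, and the "RECOVERY" value come from the policy described here.

```python
import re

# Source names quoted in the two action rules.
TRIGGER_SOURCE = "Episode Monitoring - Trigger OnCall Incident"
CLOSE_SOURCE = "Episode Monitoring - Set Episode to Highest Alarm Severity o11y"


def render_title(template, event):
    """Replace each %field% token with the event's value; unknown tokens stay as-is."""
    return re.sub(
        r"%(\w+)%",
        lambda match: str(event.get(match.group(1), match.group(0))),
        template,
    )


def evaluate_action_rules(event):
    """Return the actions the two rules would take for one notable event."""
    source = event.get("source")
    if source == TRIGGER_SOURCE:
        return [
            ("add_comment", "Creating Splunk On-Call incident"),
            # "CRITICAL" is an assumed message type; the article does not state it.
            ("create_victorops_incident", "CRITICAL"),
        ]
    # The close rule needs both conditions: matching source AND status equal to 5.
    if source == CLOSE_SOURCE and str(event.get("set_episode_status")) == "5":
        return [
            ("set_episode_status", "Closed"),
            ("add_comment", "Alert cleared; closing episode"),
            ("create_victorops_incident", "RECOVERY"),
        ]
    return []


clear_event = {"source": CLOSE_SOURCE, "set_episode_status": 5, "app_name": "checkout"}
print(render_title("%app_name% episode", clear_event))
for action in evaluate_action_rules(clear_event):
    print(action)
```

A notable event from the closing correlation search whose set_episode_status is not 5 produces no actions in this sketch, which mirrors the requirement that both conditions hold before the episode is closed.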
Next steps
Now that you’ve successfully reviewed the configuration of the Splunk ITSI Notable Event Aggregation Policy (NEAP), continue to the next article to configure ITSI correlation searches for monitoring episodes.
Still having trouble? Splunk has many resources available to help get you back on track.
- Splunk Answers: Ask your question to the Splunk Community, which has provided over 50,000 user solutions to date.
- Splunk Customer Support: Contact Splunk to discuss your environment and receive customer support.
- Splunk Observability Training Courses: Comprehensive Splunk training to fully unlock the power of Splunk Observability Cloud.