Splunk Lantern

Using On-Call reporting to improve your team performance

 

You probably already know how Splunk On-Call helps your organization manage the incident response process from start to finish. You have used Splunk On-Call to streamline your responders' schedules and reduce on-call fatigue. Now that the up-front processes are running more smoothly, you can use the data and lessons learned from your incident response to improve how your organization operates.

Solution

Reporting in Splunk On-Call gives you clear, valuable information to improve your processes, teams, and business operations. Use each of the reports listed below to refine your internal processes. You can access them under User Interface > Reports.

Post-Incident Review Report

The Post-Incident Review report helps you analyze the events surrounding either a single incident or a range of time. For a single incident, it serves as a post-mortem that contains:

  • A detailed timeline of events related to this incident
  • Notes for events in the timeline
  • Alert details and payloads
  • Action items you have noted

This is an interactive reporting tool that you can use to review and tune your incident response policies and procedures. In addition to providing an audit trail of events, it allows you to track:

  • Decisions that were made, reasons for making those decisions, and tool sets used
  • Detailed steps that were taken to resolve the incident
  • Contributing factors identified and details about the affected system
  • Action items and prioritization of tasks

Response Metrics Report

Metrics such as mean time to acknowledge (MTTA) and mean time to recover (MTTR) from an incident provide a way to score your procedures and your team's actions. Use the Response Metrics Report to view these scores.
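To make the two metrics concrete, here is a minimal sketch of how they could be computed by hand. The incident records and field names (triggered, acknowledged, resolved) are hypothetical and do not reflect the Splunk On-Call export format. The sketch also assumes that both metrics are measured from the moment the incident was triggered and that resolution marks recovery.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records for illustration only.
# The field names are assumptions, not the Splunk On-Call schema.
incidents = [
    {"triggered": "2024-01-01T10:00:00",
     "acknowledged": "2024-01-01T10:04:00",
     "resolved": "2024-01-01T10:34:00"},
    {"triggered": "2024-01-02T22:00:00",
     "acknowledged": "2024-01-02T22:10:00",
     "resolved": "2024-01-02T23:00:00"},
]


def minutes_between(start, end):
    """Return the elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mean_time_to(records, end_field):
    """Average the minutes from trigger to the given end field."""
    return mean(minutes_between(r["triggered"], r[end_field]) for r in records)


mtta = mean_time_to(incidents, "acknowledged")  # mean time to acknowledge
mttr = mean_time_to(incidents, "resolved")      # mean time to recover

print(f"MTTA: {mtta:.1f} min, MTTR: {mttr:.1f} min")
```

With the sample data, the acknowledgment times are 4 and 10 minutes and the recovery times are 34 and 60 minutes, so the averages are 7 and 47 minutes.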

If you have high values for these metrics, here are some suggestions for improving your response score:

  • Review team organization and responsibilities
  • Re-examine and update contact methods for your responders
  • If MTTA is too high, survey responders to understand the reason for slow starts
  • Research if and how teams work together to resolve an issue
  • Determine if responders have the necessary tools or access to systems
  • Reduce alert overhead by finding sets of alerts that can be grouped into a single incident

On-Call Review Report

The On-Call Review Report provides an overview of both your team's workload and each individual's workload. This report shows how much time is spent on-call and the incidents worked during that time. With this information, you can validate that team members are allocated wisely. An individual's workload can be measured by:

  • The number of on-call hours
  • Involvement in specific incidents in Splunk On-Call

The information in this report can be useful for making decisions regarding:

  • Staffing needs
  • Compensation needs
  • Workload balance within or across teams
  • Team member performance
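As a rough illustration of the two workload measures, the sketch below totals on-call hours and counts incident involvement per responder. The shift list, the responder names, and the data layout are all hypothetical. In practice you would read these figures from the report itself rather than compute them.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical on-call shifts: (user, shift start, shift end).
shifts = [
    ("alice", "2024-01-01T09:00:00", "2024-01-01T17:00:00"),
    ("bob", "2024-01-01T17:00:00", "2024-01-02T09:00:00"),
    ("alice", "2024-01-02T09:00:00", "2024-01-02T17:00:00"),
]

# Hypothetical list with one entry per incident a responder worked on.
incident_responders = ["alice", "bob", "bob", "bob"]


def on_call_hours(shift_list):
    """Sum the on-call hours for each user across all shifts."""
    totals = defaultdict(float)
    for user, start, end in shift_list:
        delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        totals[user] += delta.total_seconds() / 3600
    return dict(totals)


hours = on_call_hours(shifts)
incident_counts = Counter(incident_responders)

for user in sorted(hours):
    print(f"{user}: {hours[user]:.0f} h on-call, "
          f"{incident_counts[user]} incidents")
```

In this sample, both responders carry 16 on-call hours, but bob worked three incidents to alice's one. This is the kind of imbalance that on-call hours alone would hide.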

Incident Frequency Report

The Incident Frequency Report helps you understand patterns in the frequency of incidents over larger time periods: days, weeks, or even months. You can visualize the number of incidents grouped by type, team, service, integration, and other characteristics. Use this report to help identify:

  • How often and from where your incidents originate
  • Parts of your platform that need attention
  • Incident trends over time that are impacting your team
  • Incident distribution across responding teams
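The grouping behind this kind of view can be sketched in a few lines. The example below counts incidents per ISO week and service. The incident list and the service names are hypothetical, and the report itself may bucket data differently.

```python
from collections import Counter
from datetime import date

# Hypothetical incidents: (date the incident started, originating service).
incidents = [
    (date(2024, 1, 1), "checkout"),
    (date(2024, 1, 3), "checkout"),
    (date(2024, 1, 4), "search"),
    (date(2024, 1, 9), "checkout"),
]


def weekly_counts(records):
    """Count incidents keyed by (ISO year, ISO week, service)."""
    counts = Counter()
    for day, service in records:
        iso = day.isocalendar()  # (ISO year, ISO week number, weekday)
        counts[(iso[0], iso[1], service)] += 1
    return counts


frequency = weekly_counts(incidents)

for (year, week, service), total in sorted(frequency.items()):
    print(f"{year}-W{week:02d} {service}: {total}")
```

Swapping the service field for a team or integration name gives the same breakdown along a different characteristic.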

Next steps

If you found this article useful and want to advance your skills, Splunk Education offers a 4.5-hour, instructor-led course on Splunk On-Call Administration. The hands-on labs in the course will teach you how to:

  • Create new policies and schedules
  • Create teams and add users and managers using both the UI and API
  • Create a routing key using best practices
  • Configure Splunk On-Call integrations
  • Differentiate between the types of reports
  • Track flow of incidents after the fact using the Incident Frequency report
  • Use the Alert Rules Engine to add annotations to an incident and transform an alert
  • Create outgoing Webhooks to extend product functionality
  • Use the public API portal to find details on the public API

See the Splunk Education course catalog to read the details about this and other Splunk On-Call courses and to register.