Using SRE golden signals for KPIs
Splunk ITSI administrators often struggle to extract meaningful KPIs from service owners when building new Business Services. Just as often, the administrators themselves don't understand the service well enough to propose meaningful KPIs. In both cases, a framework that helps identify meaningful KPIs is needed.
This article is part of the Definitive Guide to Best Practices for IT Service Intelligence. ITSI end users will benefit from adopting this practice as they work on Service Insights.
Solution
In Adopting monitoring frameworks - LETS, we discuss the SRE Golden Signals, which establish benchmarks for each metric that show when the system is healthy, helping to ensure positive customer experiences and uptime. However, these Golden Signals aren't just valuable to SREs. You can apply a business lens to them to create meaningful, business-centric KPIs that are well suited to ITSI. The following table shows how to apply a business lens to each golden signal, using a login service as the example. Common data sources for measuring each one are listed after the table.
Golden Signal | Business Service Context |
---|---|
Response time. Is the service running slower than usual? | Login response time |
Volume. Are we experiencing much higher or lower traffic than usual? | Login volume |
Error rate. Is the service producing more errors than usual? | Login error rate |
Saturation. Will the service slow down or break if we get more volume? | Concurrent users |
While a team could always monitor more metrics or logs across the system, the four golden signals are the essential building blocks for any effective monitoring strategy. Common data sources for these golden signals are as follows:
- Access logs (Apache/IIS access, AWS CloudTrail, Linux Secure)
- Database records (from custom tables using Splunk DB Connect)
- Custom application logs
- APM tools
- Synthetic monitoring
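To make the mapping from golden signals to login KPIs concrete, here is a minimal Python sketch that derives the four KPIs from parsed access-log events over one time window. It is an illustration only, not how ITSI computes KPIs (in ITSI you would define each KPI with a search). The `LoginEvent` fields, the `golden_signal_kpis` function, the rule that any HTTP status of 400 or higher counts as a login error, and the use of distinct users in the window as a proxy for concurrent users are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LoginEvent:
    """One parsed login request. Field names are hypothetical."""
    timestamp: float    # epoch seconds when the request arrived
    user: str           # user attempting to log in
    status: int         # HTTP status code returned
    duration_ms: float  # time taken to serve the request


def golden_signal_kpis(events, window_start, window_end):
    """Compute the four login KPIs for events in [window_start, window_end)."""
    in_window = [e for e in events if window_start <= e.timestamp < window_end]
    volume = len(in_window)
    if volume == 0:
        # No traffic: averages and rates are undefined for this window.
        return {"response_time_ms": None, "volume": 0,
                "error_rate": None, "concurrent_users": 0}
    # Assumption: any 4xx/5xx response counts as a login error.
    errors = sum(1 for e in in_window if e.status >= 400)
    return {
        "response_time_ms": mean(e.duration_ms for e in in_window),  # response time
        "volume": volume,                                            # volume
        "error_rate": errors / volume,                               # error rate
        # Assumption: distinct users in the window approximate concurrency.
        "concurrent_users": len({e.user for e in in_window}),        # saturation
    }


events = [
    LoginEvent(0, "ana", 200, 120.0),
    LoginEvent(10, "bo", 200, 180.0),
    LoginEvent(20, "ana", 401, 60.0),
    LoginEvent(70, "cy", 200, 100.0),  # falls outside the 60-second window
]
kpis = golden_signal_kpis(events, window_start=0, window_end=60)
print(kpis)
```

For the sample window, three of the four events qualify, so the login volume is 3, the average response time is 120 ms, one of the three attempts failed, and two distinct users were active. Comparing these values against what is usual for the service is what turns them into health indicators.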
Next steps
You might also be interested in the following Splunk resources:
- Splunk Docs: Service insights manual
- Blog: SRE Metrics: Four golden signals of monitoring