Summarizing high-cardinality metrics by using metrics pipeline management
You are a site reliability engineer (SRE) for your organization in charge of monitoring observability ingest usage for your team. You need to make sure you stay within your company’s budget.
You notice your team's metrics usage has recently increased. You obtain a detailed metrics usage report that gives you insights into the metrics volume, high cardinality dimensions, usage of the metrics in charts and detectors, and distribution of metrics.
The metrics usage report shows your team sends 128 metric time series (MTS) for the k8s.container.restarts metric to Splunk Observability Cloud. You know from discussions with your team that not all the data is necessary at full granularity. To understand more about the cardinality of different dimensions, you review the report and notice that the container.id dimension is the highest cardinality dimension for k8s.container.restarts.
You know your team cares most about Kubernetes (k8s) container names when it comes to k8s restarts, so they only need to monitor the k8s.container.names dimension. The container.id dimension is not information they need to monitor.
You need to discard the container.id dimension from the data being sent to Splunk Observability Cloud.
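To make the idea of a "highest cardinality dimension" concrete, here is a minimal Python sketch that counts the distinct values of each dimension across a set of MTS. The sample data, the function names, and the container IDs are all hypothetical; this is not how the metrics usage report is computed, only an illustration of what it summarizes.

```python
from collections import defaultdict

# Hypothetical sample: each dict is the dimension set of one MTS
# for the k8s.container.restarts metric.
sample_mts = [
    {"k8s.container.names": "api", "container.id": "c-001"},
    {"k8s.container.names": "api", "container.id": "c-002"},
    {"k8s.container.names": "api", "container.id": "c-003"},
    {"k8s.container.names": "worker", "container.id": "c-004"},
    {"k8s.container.names": "worker", "container.id": "c-005"},
]


def dimension_cardinality(mts_list):
    """Return {dimension: number of distinct values} across all MTS."""
    values = defaultdict(set)
    for dims in mts_list:
        for key, value in dims.items():
            values[key].add(value)
    return {key: len(vals) for key, vals in values.items()}


def highest_cardinality_dimension(mts_list):
    """Return the dimension with the most distinct values."""
    counts = dimension_cardinality(mts_list)
    return max(counts, key=counts.get)


cardinality = dimension_cardinality(sample_mts)
top_dimension = highest_cardinality_dimension(sample_mts)
print(cardinality)
print(top_dimension)
```

In this made-up sample, container.id takes a new value for every container instance, while k8s.container.names takes only two values, which is the same pattern the report surfaces in the scenario.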
In Splunk Observability Cloud, you can use metrics pipeline management to create an aggregation rule that reduces the cardinality of k8s.container.restarts by keeping the k8s.container.names dimension and discarding container.id.
- In the left navigation pane, click Metrics Pipeline Management, then + Create new rules.
- Search for the k8s.container.restarts metric and click OK.
- Click Add aggregation rule.
- Under Show related dimensions, select container.id.
- In Specify dimensions, select Drop.
- In the New aggregated metric name field, enter k8s.container.restarts_name and click Generate Name.
- Next, download the list of charts and detectors that use the k8s.container.restarts metric. Click View list of charts and detectors and then, in the pop-up window, click Download.
- For each chart and detector identified in the list, replace k8s.container.restarts with k8s.container.restarts_name by editing the associated chart or detector in Splunk Observability Cloud. You now have a new aggregated k8s.container.restarts_name metric that yields an acceptable MTS level.
- You can now drop the unaggregated raw metric that the team no longer needs to monitor. Do this by selecting k8s.container.restarts on the Metrics Pipeline Management page to view current rules for the metric, and change Keep data to Drop data. Then click Save.
- Verify the new metric volume after dropping the data you don’t need, and save the rules.
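The effect of the steps above can be sketched in a few lines of Python. This is an illustrative model only: it assumes the aggregation rule rolls values up with a sum, and the datapoints, container IDs, and the `aggregate` helper are hypothetical rather than part of any Splunk API.

```python
from collections import defaultdict


def aggregate(datapoints, keep_dimensions):
    """Sum values across MTS that share the same kept dimensions.

    Dimensions not listed in keep_dimensions (here, container.id)
    are discarded, so MTS that differ only in those dimensions collapse.
    """
    rolled_up = defaultdict(int)
    for dims, value in datapoints:
        key = tuple((name, dims[name]) for name in keep_dimensions)
        rolled_up[key] += value
    return dict(rolled_up)


# Hypothetical raw datapoints for k8s.container.restarts:
# (dimensions of the MTS, restart count).
raw = [
    ({"k8s.container.names": "api", "container.id": "c-001"}, 2),
    ({"k8s.container.names": "api", "container.id": "c-002"}, 0),
    ({"k8s.container.names": "api", "container.id": "c-003"}, 1),
    ({"k8s.container.names": "worker", "container.id": "c-004"}, 4),
    ({"k8s.container.names": "worker", "container.id": "c-005"}, 1),
]

# Model of the aggregated k8s.container.restarts_name metric.
aggregated = aggregate(raw, keep_dimensions=["k8s.container.names"])

print(f"MTS before aggregation: {len(raw)}")
print(f"MTS after aggregation:  {len(aggregated)}")
for key, total in aggregated.items():
    print(dict(key), total)
```

In this toy data, five raw MTS collapse to two aggregated MTS, one per container name. The same reasoning applies when you verify the new metric volume: the aggregated metric should carry roughly one MTS per distinct combination of the dimensions you kept.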
By combining aggregation and data dropping rules, you have successfully summarized a high-cardinality metric, creating a more focused monitoring experience for your team while minimizing storage costs for the company.
These resources might help you understand and implement this guidance:
- Product tip: Using high-cardinality metrics in monitoring systems
Still need help with this use case? Most customers have OnDemand Services per their license support plan. Engage the ODS team at OnDemand-Inquires@splunk.com if you require assistance.