As DevOps teams develop applications in the cloud, they need visibility into the performance of their apps and infrastructure. Without a centralized observability team or a platform engineering team, each DevOps team is likely to acquire its own monitoring tools, which has two negative outcomes:
- When a performance issue or an outage occurs, it’s hard for individual teams to know whether their service caused it. To resolve the issue, all the teams join a call and share what they see on their dashboards. However, each dashboard presents a different picture, and there is no single source of truth for the state of IT. The result is a long MTTR.
- With each team monitoring its own services in a silo, it is common for developers to instrument and send more data than needed, or to turn on monitoring features and then forget about them. As a result, a single team can easily burn through the organization’s entire monitoring budget in a very short time.
Many organizations try to resolve these problems by creating a centralized observability team or a platform engineering team to provide DevOps with the tools they need. However, most existing monitoring tools on the market have one or both of the following limitations:
- They do not support sharing of dashboards, alerts, and best practices, making it difficult for different DevOps teams to create a single source of truth. Because of this, MTTRs remain long.
- They offer few or no cost controls, so individual teams can still burn through the entire monitoring budget.
How can Splunk help?
Splunk is one of the largest contributors to OpenTelemetry and uses OpenTelemetry natively as its main source of telemetry. The Splunk Distribution of the OpenTelemetry Collector provides a number of capabilities to help customers build their own observability practice.
- Configuration of the OpenTelemetry Collector includes integrations with the major cloud vendors and applications. You can walk through a wizard of options that constructs the exact command needed to deploy your Collector. When the integration is complete, the data is collected in the form of metrics, traces, and logs that populate out-of-the-box content.
- Navigators help you explore your tech stack. You can use navigators that Splunk Observability Cloud provides or create your own customizations that pre-filter or show custom views of the data.
- Splunk Observability Cloud provides a number of AutoDetect detectors that are applied automatically to your environment and can be configured in many ways. For example, AutoDetect detectors run automatically, but when one fires it notifies an external entity, a user, or a system only if that recipient is subscribed. Users can also create alert customizations, either for a subset of the environment or universally.
- Metrics pipeline management helps customers control their metric usage. You can aggregate metrics or drop metrics that are not being fully utilized without updating configurations on your servers.
- Splunk Observability Cloud provides a number of ways to automate processes with rich APIs and a Terraform provider.
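To make the aggregate-or-drop idea behind metrics pipeline management concrete, the following is a minimal conceptual sketch in plain Python. It does not call any Splunk API. The metric names, the dimensions, and the `apply_rules` and `count_series` helpers are all hypothetical, and in the product such rules are applied on the platform side rather than in code that you run on your servers.

```python
from collections import defaultdict

# Hypothetical datapoints: (metric name, dimensions, value).
datapoints = [
    ("http.requests", {"service": "cart", "pod": "cart-1"}, 120),
    ("http.requests", {"service": "cart", "pod": "cart-2"}, 80),
    ("http.requests", {"service": "checkout", "pod": "checkout-1"}, 50),
    ("debug.cache.probe", {"service": "cart", "pod": "cart-1"}, 7),
]


def count_series(points):
    """Count distinct (metric, dimensions) combinations in the raw data."""
    return len({(name, tuple(sorted(dims.items()))) for name, dims, _ in points})


def apply_rules(points, keep_dimensions, dropped_metrics):
    """Illustrative stand-in for aggregation and drop rules.

    Metrics listed in dropped_metrics are discarded. For every other
    metric, only the dimensions in keep_dimensions are retained and the
    values that collapse onto the same key are summed.
    """
    totals = defaultdict(int)
    for name, dims, value in points:
        if name in dropped_metrics:
            continue  # drop rule: this metric is not being used
        kept = tuple(sorted((k, v) for k, v in dims.items() if k in keep_dimensions))
        totals[(name, kept)] += value  # aggregation rule: roll up by kept dimensions
    return dict(totals)


aggregated = apply_rules(
    datapoints,
    keep_dimensions={"service"},
    dropped_metrics={"debug.cache.probe"},
)

print("series before:", count_series(datapoints))
print("series after: ", len(aggregated))
for (name, dims), value in sorted(aggregated.items()):
    print(name, dict(dims), value)
```

In this toy data set, rolling up the per-pod series to the service level and dropping the unused debug metric reduces the number of distinct series that would be stored, which is the effect that this kind of rule is meant to have on metric usage.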
With Splunk Observability Cloud, platform engineers can give software engineers observability tools that offer all the DevOps teams a single source of truth, enable them to share best practices across the organization, and help them collaborate to minimize MTTR. At the same time, the platform engineers can monitor and maintain access and cost controls for these observability tools so that everyone operates within budget.