Splunk Lantern

Monitoring Kubernetes pods


You've got your Kubernetes data into Splunk Observability Cloud, and now you're not sure what to do next. There are seemingly limitless ways to use Splunk software to achieve different use cases, and you need to start learning how to get value from the platform.

Some questions you might have about using Splunk Infrastructure Monitoring to monitor your Kubernetes environment include:

  • How can I identify which pods are failing or stuck in a pending state?
  • How can I ensure that the number of running instances matches what I expect?
  • How do I know if appropriate resource limits have been applied, and if any pods are exceeding those limits?

This article is part of the Splunk Use Case Explorer for Observability, which is designed to help you identify and implement prescriptive use cases that drive incremental business value. It explains the solution using a fictitious example company, called CSCorp, that hosts a cloud native application called Online Boutique. In the AIOps lifecycle described in the Use Case Explorer, this article is part of Infrastructure monitoring.

Data required

Kubernetes data

How to use Splunk software for this use case

Splunk experts have recorded five-minute videos on some high-value foundational use cases. Each video is self-contained, so you can pick the capabilities most relevant to you and your organization. 

We recommend that new users complete at least two or three of these capabilities to get comfortable with the Splunk Observability Cloud platform and its basic functionality.

By completing these capabilities, you will learn by doing and acquire the following skills:

  • Interpret the built-in dashboards / Infrastructure Navigator to understand your Kubernetes environment
  • Build custom detectors / visualizations
  • Set up custom alerts and notifications
  • Interpret detectors, visualizations, and dashboards to complete basic troubleshooting

Video 1 - Detect Kubernetes nodes running out of resources or pods that are in a pending phase

Access this video here.

In organizations that use the Kubernetes container management platform, it is common for nodes to run out of resources, rendering your applications unable to scale. It is imperative that Kubernetes nodes are monitored carefully to ensure you can take action quickly when this happens. In this video, you'll learn how to create a detector in Splunk Infrastructure Monitoring that monitors for this situation.
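The idea behind such a detector can be sketched in plain Python: flag any pod whose phase has stayed Pending for longer than a chosen duration. This is only an illustration of the logic, not the detector built in the video. The `PhaseSample` type, the function name, and the numeric phase encoding (1 for Pending) are assumptions made for this sketch.

```python
from dataclasses import dataclass

# Assumed phase encoding for this sketch: 1=Pending, 2=Running, 3=Succeeded, 4=Failed, 5=Unknown
PENDING = 1


@dataclass
class PhaseSample:
    """One hypothetical pod phase reading."""
    pod: str
    timestamp: int  # seconds since an arbitrary start
    phase: int


def pods_pending_too_long(samples, min_duration_s):
    """Return pods whose current unbroken Pending run has lasted at least min_duration_s."""
    pending_since = {}  # pod -> timestamp at which the current Pending run started
    latest = {}         # pod -> timestamp of the most recent sample
    for s in sorted(samples, key=lambda s: s.timestamp):
        latest[s.pod] = s.timestamp
        if s.phase == PENDING:
            pending_since.setdefault(s.pod, s.timestamp)
        else:
            # Any other phase ends the Pending run
            pending_since.pop(s.pod, None)
    return sorted(
        pod for pod, start in pending_since.items()
        if latest[pod] - start >= min_duration_s
    )


samples = [
    PhaseSample("cartservice", 0, 1),
    PhaseSample("cartservice", 120, 1),
    PhaseSample("cartservice", 300, 1),
    PhaseSample("frontend", 0, 1),
    PhaseSample("frontend", 60, 2),
]
stuck = pods_pending_too_long(samples, min_duration_s=300)
```

Requiring a minimum duration keeps the check from firing on pods that are Pending only briefly while they are being scheduled.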

Video 2 - How to monitor CPU utilization for no-limit pod configuration situations

Access this video here.

In the first video, you configured a Splunk Infrastructure Monitoring detector for pods that are pending because nodes have run out of resources. When CPU limits are not set on a pod, it can consume more CPU than intended. In this video, you'll learn how to monitor node CPU usage in that situation so you can prevent impact to your customers.
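As a rough illustration of the two checks involved, the sketch below finds containers that have no CPU limit in a pod manifest and lists nodes whose CPU utilization exceeds a threshold. The manifest layout follows the standard Kubernetes `spec.containers[].resources.limits` structure. The function names, the sample data, and the 80 percent threshold are assumptions for this example, not values taken from the video.

```python
def containers_without_cpu_limit(pod):
    """Return the names of containers in a pod manifest that set no CPU limit."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        limits = (container.get("resources") or {}).get("limits") or {}
        if "cpu" not in limits:
            missing.append(container["name"])
    return missing


def nodes_over_threshold(cpu_percent_by_node, threshold_percent=80.0):
    """Return nodes whose CPU utilization is above the threshold (hypothetical default)."""
    return sorted(
        node for node, pct in cpu_percent_by_node.items()
        if pct > threshold_percent
    )


# Hypothetical pod manifest, already parsed into a dict
pod = {
    "metadata": {"name": "cartservice"},
    "spec": {
        "containers": [
            {"name": "server", "resources": {"limits": {"cpu": "500m", "memory": "128Mi"}}},
            {"name": "redis", "resources": {"limits": {"memory": "256Mi"}}},
        ]
    },
}
unlimited = containers_without_cpu_limit(pod)
hot_nodes = nodes_over_threshold({"node-a": 91.5, "node-b": 42.0})
```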

Video 3 - How to create alerting – Splunk On-Call, Email, Slack

Access this video here.

When a pending state is detected, every second counts. You'll need to quickly alert the Kubernetes support staff who can resolve the incident. Each organization is unique in how it alerts its support team members, whether that happens over email, Slack, or Splunk On-Call. In this video, you'll learn how to configure alerts for all of these platforms so your support team can take action straight away.
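Before configuring notifications, it can help to write down which channel each alert severity should reach. The sketch below models one possible routing policy as a lookup table. The severity names, channel labels, and fallback are all hypothetical choices for this example. They are not Splunk settings, and the code does not call any Splunk, Slack, or email API.

```python
# Hypothetical routing policy: severity -> notification channels
ROUTES = {
    "critical": ["splunk-on-call", "slack", "email"],
    "major": ["slack", "email"],
    "minor": ["email"],
}


def recipients_for(severity):
    """Return the channels for a severity, falling back to email for unknown levels."""
    return ROUTES.get(severity.lower(), ["email"])


def format_alert(detector, severity, pod):
    """Build a short, human-readable alert message."""
    return f"[{severity.upper()}] {detector}: pod {pod} needs attention"


channels = recipients_for("Critical")
message = format_alert("Pending pods", "critical", "cartservice-abc")
```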

Video 4 - How to review pod status in the Kubernetes navigator: running vs desired # of pods, pods in pending status, failed pods

Access this video here.

In this video for Kubernetes administrators, you'll learn how to view pod status so that you can troubleshoot quickly and reduce your mean time to repair (MTTR).
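The running versus desired comparison can be sketched as follows, assuming Deployment objects that have already been parsed into dicts with the standard Kubernetes `spec.replicas` and `status.readyReplicas` fields. The function name and sample data are illustrative only. The navigator presents this information without any code.

```python
def replica_shortfalls(deployments):
    """Return {deployment name: missing replicas} where ready pods are below the desired count."""
    shortfalls = {}
    for d in deployments:
        # Kubernetes defaults spec.replicas to 1 when it is omitted
        desired = d.get("spec", {}).get("replicas", 1)
        # readyReplicas is typically absent from status when no pods are ready
        ready = d.get("status", {}).get("readyReplicas", 0)
        if ready < desired:
            shortfalls[d["metadata"]["name"]] = desired - ready
    return shortfalls


# Hypothetical deployments for the Online Boutique example
deployments = [
    {"metadata": {"name": "frontend"}, "spec": {"replicas": 3}, "status": {"readyReplicas": 3}},
    {"metadata": {"name": "cartservice"}, "spec": {"replicas": 2}, "status": {"readyReplicas": 1}},
    {"metadata": {"name": "adservice"}, "spec": {}, "status": {}},
]
gaps = replica_shortfalls(deployments)
```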

Next steps

Still having trouble? Splunk has many resources available to help get you back on track.