Splunk Lantern

Monitoring Kubernetes pods

Applicability

  • Product: Splunk Infrastructure Monitoring
  • Feature: Kubernetes integration

Problem

You've got your Kubernetes data into Splunk Observability Cloud, and now you're not sure what to do next. Splunk supports a seemingly limitless range of use cases, and you need to start learning how to get value from the platform.

Some questions you might have about using Splunk Infrastructure Monitoring to monitor your Kubernetes environment include:

  • How can I identify which pods are failing or stuck in a pending state?
  • How can I ensure that the number of running instances matches what I expect?
  • How do I know if appropriate resource limits have been applied, and if any pods are exceeding those limits?

Solutions

Splunk experts have recorded short, 5-minute videos on some high-value foundational use cases. Each video is self-contained, so you can pick the capabilities most relevant to you and your organization. 

We recommend that new users complete at least 2-3 of these capabilities to get comfortable with the Splunk Observability Cloud platform and its basic functionality.

By completing these capabilities, you will "learn by doing" and acquire the following skills:

  • Interpret the built-in dashboards / Infrastructure Navigator to understand your Kubernetes environment
  • Build custom detectors / visualizations
  • Set up custom alerts and notifications
  • Interpret detectors, visualizations, and dashboards to complete basic troubleshooting

Video 1 - Detect Kubernetes nodes running out of resources or pods that are in a pending phase

Access this video here.

In organizations that use the Kubernetes container management platform, it is common for nodes to run out of resources, rendering your applications unable to scale. It is imperative that Kubernetes nodes are monitored carefully to ensure you can take action quickly when this happens. In this video, you'll learn how to create a detector in Splunk Infrastructure Monitoring that monitors for this situation.
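The kind of condition such a detector watches for can be illustrated outside Splunk. The following is a minimal Python sketch, not the detector built in the video: it flags pods whose most recent unbroken run of Pending samples has lasted at least a chosen duration. The names `PodSample` and `pods_pending_too_long`, the five-minute default, and the numeric phase encoding (1 = Pending) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed phase encoding for illustration: 1=Pending, 2=Running,
# 3=Succeeded, 4=Failed, 5=Unknown.
PENDING = 1


@dataclass(frozen=True)
class PodSample:
    pod: str        # pod name
    timestamp: int  # sample time in seconds
    phase: int      # numeric pod phase at that time


def pods_pending_too_long(samples, min_duration_s=300):
    """Return sorted pod names whose latest unbroken run of Pending
    samples spans at least min_duration_s seconds."""
    by_pod = {}
    for s in sorted(samples, key=lambda s: s.timestamp):
        by_pod.setdefault(s.pod, []).append(s)

    flagged = []
    for pod, series in by_pod.items():
        if series[-1].phase != PENDING:
            continue  # pod is not pending right now
        # Walk backwards to find where the current Pending run started.
        start = series[-1].timestamp
        for s in reversed(series):
            if s.phase != PENDING:
                break
            start = s.timestamp
        if series[-1].timestamp - start >= min_duration_s:
            flagged.append(pod)
    return sorted(flagged)
```

Requiring the condition to hold for a duration, rather than firing on a single Pending sample, avoids alerting on pods that are briefly pending during normal scheduling.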

Video 2 - How to monitor CPU utilization for no-limit pod configuration situations

Access this video here.

In the first video, you configured your Splunk Infrastructure Monitoring detector for pods that are in a pending state because nodes have run out of resources. When a pod has no CPU limit set, it can consume more CPU than intended. In this video, you'll learn how to monitor node CPU usage in that situation so you can prevent impact to your customers.
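To make the no-limit scenario concrete, here is a small Python sketch under stated assumptions. It sums per-pod CPU usage on a node, compares the total against node capacity, and lists the pods that have no CPU limit, heaviest first. The `PodCpu` type, the `node_cpu_report` function, and the 80% threshold are hypothetical and are not taken from the video.

```python
from typing import NamedTuple, Optional


class PodCpu(NamedTuple):
    name: str
    usage_cores: float
    limit_cores: Optional[float]  # None means no CPU limit is configured


def node_cpu_report(pods, node_capacity_cores, threshold_pct=80.0):
    """Summarize node CPU utilization and list pods running without a limit."""
    total_usage = sum(p.usage_cores for p in pods)
    utilization_pct = 100.0 * total_usage / node_capacity_cores
    # Pods without a limit are the ones that can grow unchecked,
    # so list them from highest to lowest current usage.
    unlimited = sorted(
        (p for p in pods if p.limit_cores is None),
        key=lambda p: p.usage_cores,
        reverse=True,
    )
    return {
        "utilization_pct": utilization_pct,
        "alert": utilization_pct > threshold_pct,
        "unlimited_pods": [p.name for p in unlimited],
    }
```

Watching the node as a whole matters here because a pod without a limit has no per-pod ceiling to alert against.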

Video 3 - How to create alerting: Splunk On-Call, email, Slack

Access this video here.

When a pending state is detected, every second counts. You'll need to quickly alert the Kubernetes support staff who can resolve the incident. Each organization alerts its support team differently, whether over email, Slack, or Splunk On-Call. In this video, you'll learn how to configure alerts for each of these channels so your support team can take action straight away.

Video 4 - How to review pod status in the Kubernetes navigator: running vs desired number of pods, pods in pending status, failed pods

Access this video here.

In this video for Kubernetes administrators, you'll learn how to view pod status so that you can troubleshoot quickly and reduce your mean time to repair (MTTR).
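The running-versus-desired comparison shown in the navigator amounts to a simple difference per workload. As a rough Python sketch (the function name `replica_shortfalls` and the input shape are assumptions for illustration, not a Splunk API):

```python
def replica_shortfalls(deployments):
    """Given {name: (desired, available)}, return {name: missing_count}
    for every workload running fewer pods than desired."""
    return {
        name: desired - available
        for name, (desired, available) in deployments.items()
        if available < desired
    }
```

For example, a workload that wants 5 pods but has only 2 available would be reported with a shortfall of 3, while one that has all of its desired pods is omitted.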


What to do if you get stuck: 

Still having trouble? Splunk has many resources available to help get you back on track. We recommend the following:

Splunk OnDemand Services: Credit-based services that allow direct access to Splunk technical consultants for a variety of technical services from a pre-defined catalog. Many Splunk customers already have OnDemand credits included as part of their software license. To request OnDemand Services, file a ticket through the Support Portal.

At this stage of your journey, the following OnDemand tasks may be most helpful:

  • Assist with building a simple dashboard / chart
  • Create a simple detector

Splunk Answers: Ask your question to the Splunk Community, which has provided over 50,000 user solutions to date.

Splunk Customer Support: Contact Splunk to discuss your environment and receive customer support.

Splunk education resources: 

To fully unlock the power of Splunk, we strongly recommend our comprehensive Splunk training. At this stage in your journey, we recommend the following courses:

Next steps 

Now that you're doing more with your Kubernetes data, get even more value by implementing use cases, or find out how to get data in from additional data sources.