Troubleshooting AWS CloudWatch metrics observability

 

You have configured the Amazon Web Services (AWS) integration to connect your AWS account to Splunk Observability Cloud, but you are still experiencing some problems with your data.

Where are your metrics?

Check for AWS API call throttling

AWS limits the number of API calls that can be made against its endpoints. When your account hits those limits, calls are throttled and the affected CloudWatch metrics can arrive late or not at all. Check for throttling as follows (a programmatic sketch follows the list):

  1. Plot the sf.org.num.awsServiceCallThrottles metric in a new chart.
  2. If data appears, click Data Table beneath the chart to get details.
  3. Use the account, region, namespace, and method information to determine where the problem is.
  4. If you are being throttled, work with AWS to increase your API limits.
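
If you prefer to run this check programmatically instead of in the chart builder, the following is a minimal sketch using the signalfx-python client's SignalFlow API. The realm in the stream endpoint, the access token, and the grouping dimension names are placeholders for your environment, and the client usage should be confirmed against the library's documentation.

    # Minimal sketch: stream sf.org.num.awsServiceCallThrottles with the
    # signalfx-python SignalFlow client. Realm, token, and grouping dimension
    # names are assumptions -- match them to what the chart's data table shows.
    import signalfx

    PROGRAM = (
        "data('sf.org.num.awsServiceCallThrottles')"
        ".sum(by=['namespace', 'method'])"
        ".publish()"
    )

    sfx = signalfx.SignalFx(stream_endpoint='https://stream.us1.signalfx.com')
    with sfx.signalflow('YOUR_ACCESS_TOKEN') as flow:
        computation = flow.execute(PROGRAM)
        for msg in computation.stream():
            # Each DataMessage maps a time series id to its value at one timestamp;
            # nonzero values mean AWS is throttling your API calls.
            if isinstance(msg, signalfx.signalflow.messages.DataMessage):
                print(msg.logical_timestamp_ms, msg.data)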

Check for metric time series creation throttling

Splunk Observability Cloud limits the number of metric time series (MTS) you can create per minute; the limit is 6,000 per minute or higher, depending on your subscription. This throttle acts as a funnel rather than a hard stop, so new metric time series are eventually created, but their creation can be delayed. You can do any of the following to understand your limits (a combined SignalFlow sketch follows the list):

  • Plot the sf.org.limit.metricTimeSeriesCreatedPerMinute metric to see your limit.
  • Plot the sf.org.numMetricTimeSeriesCreated metric to see the number of MTS created.
  • Plot the sf.org.numThrottledMetricTimeSeriesCreateCallsByToken metric to see the number of creations that were throttled.
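
The three metrics can also be combined in one SignalFlow program to show how much creation headroom you have. This is a sketch only; the arithmetic (limit minus created) is one convenient way to view the data, not a documented formula.

    # Sketch: plot the per-minute creation limit, the MTS actually created,
    # the throttled creations, and the remaining headroom (limit - created).
    CREATION_HEADROOM_PROGRAM = """
    limit = data('sf.org.limit.metricTimeSeriesCreatedPerMinute')
    created = data('sf.org.numMetricTimeSeriesCreated').sum()
    throttled = data('sf.org.numThrottledMetricTimeSeriesCreateCallsByToken').sum()
    limit.publish(label='creation limit per minute')
    created.publish(label='MTS created')
    throttled.publish(label='throttled creations')
    (limit - created).publish(label='headroom')
    """

You can run this with the same flow.execute() pattern shown earlier, or paste the program body into a chart's SignalFlow editor.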

Check your active metric time series limits

Splunk Observability Cloud limits the number of active metric time series (MTS) you can have, based on your subscription. New metric time series are not created until you are back under your limit. You can do any of the following to understand your limits (a combined SignalFlow sketch follows the list):

  • Plot the sf.org.limit.activeTimeSeries metric to see your limit.
  • Plot the sf.org.numActiveTimeSeries metric to see your number of active MTS.
  • Plot the sf.org.numLimitedMetricTimeSeriesCreateCalls metric to see whether new MTS creations are limited.
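
As with creation throttling, these metrics can be combined into a single view of how close you are to the limit. The percentage calculation below is an assumption about a convenient presentation, not part of the product's documentation.

    # Sketch: show active MTS, the active-MTS limit, and the percentage of the
    # limit currently in use; new MTS are not created while you are over the limit.
    ACTIVE_MTS_PROGRAM = """
    active = data('sf.org.numActiveTimeSeries').sum()
    limit = data('sf.org.limit.activeTimeSeries')
    active.publish(label='active MTS')
    limit.publish(label='active MTS limit')
    (active / limit * 100).publish(label='percent of limit used')
    """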

Where are your tags?

  1. Use Splunk documentation to verify that Splunk Observability Cloud syncs AWS metadata and tags for that service.
  2. Be patient. It can take up to 15 minutes for metadata to sync to your Splunk Infrastructure Monitoring data.
  3. Plot the sf.org.num.awsServiceCallThrottles metric to see whether AWS API calls are being throttled for the listTags or listTagsForResource methods (see the sketch after this list).
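
A filtered variant of the earlier throttling check can narrow this down to the tag-listing calls. The method dimension name and its values are assumptions based on the method names in step 3; confirm them against the chart's data table.

    # Sketch: restrict the throttling metric to the tag-listing API methods.
    TAG_THROTTLE_PROGRAM = (
        "data('sf.org.num.awsServiceCallThrottles',"
        " filter=filter('method', 'listTags', 'listTagsForResource'))"
        ".sum(by=['namespace'])"
        ".publish(label='tag call throttles')"
    )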

Why is your data late?

Data on charts built from AWS CloudWatch metrics can be 4-10 minutes behind. This is usually due to latency in the AWS APIs themselves, and nothing can be done in Splunk Observability Cloud to eliminate it. Verify that latency is the issue by using the lag rollup:

  1. Plot the metric that is showing up late.
  2. Remove any Analytics Functions.
  3. Click Configure plot.
  4. From the Rollup dropdown select Lag.

The plot shows how late the data was when it arrived in Splunk Observability Cloud. The lag is reported in milliseconds (ms); for example, 400,000 ms is over 6 minutes.
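
If it helps to sanity-check the conversion, the sketch below requests a metric with the lag rollup in SignalFlow and converts a lag reading from milliseconds to minutes. The metric name is a placeholder for whichever CloudWatch metric is arriving late.

    # Sketch: request a CloudWatch metric with the 'lag' rollup, then convert
    # the reported lag from milliseconds to minutes.
    LAG_PROGRAM = (
        "data('your.cloudwatch.metric', rollup='lag')"  # placeholder metric name
        ".max().publish(label='ingest lag (ms)')"
    )

    def lag_ms_to_minutes(lag_ms: float) -> float:
        """Convert a lag value reported in milliseconds to minutes."""
        return lag_ms / 60_000

    print(lag_ms_to_minutes(400_000))  # 400,000 ms is about 6.7 minutes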

Next steps

These additional Splunk resources might help you understand and implement these recommendations: