Splunk Lantern

Identifying database performance issues

Applicability

  • Product: Splunk APM
  • Feature: Analysis
  • Function: Using APM

Problem

Databases have always been the backbone of applications, both web and enterprise. Overall database statistics are not enough: you also need to understand how database performance interacts with the network, operating system, servers, configuration, and even third-party dependencies.

To resolve issues with your databases, you increasingly need to identify long-running, unoptimized, poorly indexed, or resource-intensive queries and isolate the source of the issue quickly.

Solution

With Splunk APM, you can find database issues quickly and view them in context with service performance. This speeds your time to resolution in distributed systems, all without requiring instrumentation of your database.

Here’s an example of how to identify database-related problems in Splunk APM.

In this example, the APM service map shows high latency between the order processor and the database, which impacts checkout. Clicking the database gives you context on the most problematic query in that database, and the same approach works for any database on the map. By expanding the Database Query Performance feature (on the bottom right-hand side of the screen), you can investigate the scope and radius of the impact.

Database Query Performance shows how your slowest or most frequently executed queries perform over time and compared with historical time periods, which helps you isolate the problematic query.
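To make the idea of ranking queries by slowness or by execution count concrete, here is a minimal Python sketch. It is not how Splunk APM implements Database Query Performance. The span records, the field names (`query`, `duration_ms`), and the `rank_queries` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical span records. The field names are illustrative assumptions,
# not a Splunk APM schema.
spans = [
    {"query": "SELECT * FROM orders WHERE id = ?", "duration_ms": 12.0},
    {"query": "SELECT * FROM orders WHERE id = ?", "duration_ms": 15.0},
    {"query": "SELECT * FROM orders WHERE id = ?", "duration_ms": 9.0},
    {"query": "UPDATE inventory SET qty = qty - ? WHERE sku = ?", "duration_ms": 850.0},
    {"query": "UPDATE inventory SET qty = qty - ? WHERE sku = ?", "duration_ms": 910.0},
]


def rank_queries(records, sort_key="total_ms", top=5):
    """Aggregate records per query text and return the top entries by sort_key."""
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for record in records:
        entry = stats[record["query"]]
        entry["count"] += 1
        entry["total_ms"] += record["duration_ms"]
        entry["max_ms"] = max(entry["max_ms"], record["duration_ms"])
    # Highest value first, so the most expensive or busiest query leads the list.
    ranked = sorted(stats.items(), key=lambda item: item[1][sort_key], reverse=True)
    return ranked[:top]


slowest = rank_queries(spans, sort_key="total_ms")
busiest = rank_queries(spans, sort_key="count")

print("Most total time:", slowest[0][0])
print("Most executions:", busiest[0][0])
```

The two rankings can disagree, as they do in this sample data: a query that runs rarely but slowly can cost more total time than one that runs often and quickly, which is why it helps to look at both views.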

The Tag Spotlight section on the bottom right of the Database Query Performance screen provides directed troubleshooting to help you understand the issue further. Here you get additional context, such as the table that the slow queries were acting on, or the services and business workflows affected by the performance degradation of this query. In this example, the Checkout Business Workflow, identified by the API call to the checkout service, was severely impacted by this issue, which degraded the end-user experience.
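The kind of breakdown described above, where one slow query is split by table, service, and workflow, can be sketched in a few lines of Python. This is only an illustration of the grouping idea, not Tag Spotlight's implementation. The records, the tag names (`table`, `service`, `workflow`), and the `tag_breakdown` helper are hypothetical.

```python
from collections import Counter

# Hypothetical slow-query records with tags. The names and values are
# assumptions for illustration only.
slow_spans = [
    {"query": "UPDATE inventory SET qty = qty - ? WHERE sku = ?",
     "tags": {"table": "inventory", "service": "order-processor", "workflow": "Checkout"}},
    {"query": "UPDATE inventory SET qty = qty - ? WHERE sku = ?",
     "tags": {"table": "inventory", "service": "order-processor", "workflow": "Checkout"}},
    {"query": "UPDATE inventory SET qty = qty - ? WHERE sku = ?",
     "tags": {"table": "inventory", "service": "restock-job", "workflow": "Restock"}},
    {"query": "SELECT * FROM orders WHERE id = ?",
     "tags": {"table": "orders", "service": "order-processor", "workflow": "Checkout"}},
]


def tag_breakdown(records, query, tag_names):
    """Count how often each tag value appears on records for the given query."""
    breakdown = {name: Counter() for name in tag_names}
    for record in records:
        if record["query"] != query:
            continue
        for name in tag_names:
            value = record["tags"].get(name)
            if value is not None:
                breakdown[name][value] += 1
    return breakdown


result = tag_breakdown(
    slow_spans,
    query="UPDATE inventory SET qty = qty - ? WHERE sku = ?",
    tag_names=["table", "service", "workflow"],
)

for name, counts in result.items():
    print(name, dict(counts))
```

Reading the counts per tag shows which table the query touches and which services and workflows issue it most often, which is the same question the breakdown in the product helps answer.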

You can confirm this by focusing the service map on only the Checkout workflow to see the impact on the external client during the time window of the incident.

Additional resources

The content in this guide comes from a previously published blog, one of the thousands of Splunk resources available to help users succeed. In addition, these Splunk resources might help you understand and implement this use case:
