Skip to main content
 
Splunk Lantern

Top ten highest network utilization queries

 

Queries that result in large transfers of data into or out of the database may be a sign of an inefficient design and can place a strain on the networking infrastructure between the client and the database. You want to create a list of the top ten so that you can work with the owners to refactor such queries.

Data required   

Database data

Procedure 

  1. Verify that you have deployed Splunk Stream.
  2. Run the following search. You can optimize it by specifying an index and adjusting the time range.
|stats count sum(bytes) AS total_bytes BY query
|eval avg_bytes_per_execution = round(total_bytes/count, 2)
|sort - total_bytes
|head 10

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

|stats count sum(bytes) AS total_bytes BY query

Count each query, as well as the total bytes used by that query. 

|eval avg_bytes_per_execution = round(total_bytes/count, 2)

Return the average bytes per query execution (rounded to two decimal points) by dividing the total bytes by the number of queries. 

|sort - total_bytes

Sort the results according to total bytes with the largest value first.

|head 10

Display the top 10 results in a table.

Next steps

The output from the query includes the text for each of the SQL statements, the count of the number of times the statement appears, the total bytes in the result set, and the average number of bytes for each statement. The following table is a sample result.

query count total_bytes avg_bytes_per_execution

SELECT * FROM hr_employee_info

1700

16994436

9996.73

SELECT password FROM SYS.USER$ WHERE name='SYS'

887

678370

764.79

SELECT * FROM customers WHERE customer_uid=3bf48b89-77b0-4cc2-ab5b-afbe08dc1e96

22

52722

2396.45

SELECT * FROM customers WHERE customer_uid=b56a5c74-87a3-4593-9eb5-2d2b1c38a0d8

22

45145

2052.05

This sample output is telling in several ways. First, large result sets, such as the SELECT * FROM result in the first row, should not be expected. If they occur often, they might be indicative of inefficient SQL. Owners should be notified when their queries are in the top ten largest data transfer queries. When possible, and if appropriate, owners should refactor the queries to transfer less data. They may also indicate a security concern if the data is of a sensitive nature, such as human resources data and passwords. The example above shows both. 

Potential next steps are to correlate this database activity with applications and users to put it into a business context. If user or application id is in the stream, you could add it to the split by clause of the stats command like this:

| stats count sum(bytes) AS total_bytes BY query user

Lastly, you could vary the sort by fields to look at count or avg_bytes_per_execution and use the results to determine the most beneficial troubleshooting steps to take. 

Finally, you might also want to look at similar searches in our article Analyzing wire data from databases.