Synthetic checks for URL response times
Certain websites and URLs, both internal and external, are critical for employees and customers. Basic synthetic checks that confirm URLs respond quickly, or within an expected SLA, can help you detect problems before they are reported to the help desk.
Data required
Procedure
- Verify that the Website Monitoring app is installed on a search head or heavy forwarder that can reach the URLs to be checked.
- In the Website Monitoring app, select Create Inputs from the navigation bar and create one or more inputs. Adjust additional configurations as needed, such as which index to send data to, whether to use a proxy server, and which results constitute failure. For more information, see the Wiki for this project.
- Run the following search. You can optimize it by specifying an index and adjusting the time range.
index="<name of web ping index>" sourcetype="web_ping" title=* | eval response_code=(case((timed_out == "True"),"Connection timed out",(isnull(response_code) OR (response_code == "")),"Connection failed",true(),response_code) . coalesce((" " . has_expected_string),"")) | eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800), status=case(((((((((response_code >= 400) OR (total_time >= error_threshold)) OR (response_code == "*false")) OR (response_code == "Connection failed")) OR (response_code == "Connection timed out")) OR (timed_out == True)) OR (response_code == "")) OR (has_expected_string == "false")),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK") | stats sparkline(avg(total_time)) AS response_time_trend_ms avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms latest(response_code) AS response_code latest(status) AS status by title, url
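As a rough illustration of the optimization mentioned in the last step, the base search can be narrowed to one index, a subset of checks, and a bounded time range. This is only a sketch: the index name website_monitoring and the title pattern Intranet* are hypothetical placeholders, and the four-hour window is an arbitrary assumption. Substitute the values from your own inputs.

```spl
index="website_monitoring" sourcetype="web_ping" title="Intranet*" earliest=-4h latest=now
| stats avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms BY title, url
```

Restricting the index and time range in the first clause limits how many events are read before the eval and stats stages run.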
Search explanation
The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.
Splunk Search | Explanation |
---|---|
index="<name of web ping index>" sourcetype="web_ping" title=* | Search the index where the Website Monitoring app is configured to store results from synthetic checks. For the index name, you can use the macro. Optionally, you can replace title=* with a more limited filter on the specific URLs you want to view and alert on, as seen in the demo data SPL. |
| eval response_code=(case((timed_out == "True"),"Connection timed out",(isnull(response_code) OR (response_code == "")),"Connection failed",true(),response_code) . coalesce((" " . has_expected_string),"")) | eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800), status=case(((((((((response_code >= 400) OR (total_time >= error_threshold)) OR (response_code == "*false")) OR (response_code == "Connection failed")) OR (response_code == "Connection timed out")) OR (timed_out == True)) OR (response_code == "")) OR (has_expected_string == "false")),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK") | Build a descriptive response_code that flags timeouts and failed connections, default error_threshold to 1000 and warning_threshold to 800 when they are not set, and then set status to Failed, Warning, or OK based on those thresholds. |
| stats sparkline(avg(total_time)) AS response_time_trend_ms avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms latest(response_code) AS response_code latest(status) AS status BY title, url | Format the results in a table to see how each check is performing, renaming the fields as shown and using a sparkline to show trends. |
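
To see how the thresholds translate into a status without waiting for real check results, you could try the logic on a few generated events. The sketch below is a simplified version of the status calculation, not the full search: it only considers response_code and total_time, and the three titles and their values are invented for illustration.

```spl
| makeresults count=3
| streamstats count AS n
| eval title=case(n==1,"fast site",n==2,"slow site",n==3,"down site"), total_time=case(n==1,120,n==2,900,n==3,1500), response_code=case(n==1,200,n==2,200,n==3,503)
| eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800)
| eval status=case((response_code >= 400) OR (total_time >= error_threshold),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK")
| table title total_time response_code status
```

With the default thresholds, the 120 ms check should come out as OK, the 900 ms check as Warning, and the 503 response as Failed. Changing the coalesce defaults is a quick way to preview the effect of different thresholds.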
Next steps
To alert when a synthetic check takes too long, you can use the SPL in this procedure to configure an alert. You can filter the most recent results in several ways to obtain the list of URLs that require action, but the simplest approach is to add | where status!="OK" to the end of the SPL to alert on any URL that is taking too long, timing out, or returning an unexpected status code.
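Put together, an alert search might look like the following sketch. It is the search from the procedure, split across lines for readability, with the filter appended. The OK value is quoted because the where command treats an unquoted word as a field name rather than a string. The index placeholder is left as in the procedure, and the time range and schedule are left for you to choose when you save the alert.

```spl
index="<name of web ping index>" sourcetype="web_ping" title=*
| eval response_code=(case((timed_out == "True"),"Connection timed out",(isnull(response_code) OR (response_code == "")),"Connection failed",true(),response_code) . coalesce((" " . has_expected_string),""))
| eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800), status=case(((((((((response_code >= 400) OR (total_time >= error_threshold)) OR (response_code == "*false")) OR (response_code == "Connection failed")) OR (response_code == "Connection timed out")) OR (timed_out == True)) OR (response_code == "")) OR (has_expected_string == "false")),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK")
| stats sparkline(avg(total_time)) AS response_time_trend_ms avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms latest(response_code) AS response_code latest(status) AS status by title, url
| where status!="OK"
```

Saved as an alert that triggers when the number of results is greater than zero, this returns one row per URL that currently needs attention.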
Finally, you might be interested in other processes associated with the Managing web server performance use case.