Splunk Lantern

Synthetic checks for URL responses

You might want to monitor the response time or status code of one or more critical URLs within your organization.

Prerequisites

In order to execute this procedure in your environment, the following data, services, or apps are required:

  • Website Monitoring app

Example

Certain websites and URLs, both internal and external, are critical for employees and customers. Using basic synthetic checks to ensure that URLs are returning the appropriate status (typically 200) and are within the appropriate response time to meet your SLAs can help detect problems before they are reported to the help desk. 

To optimize the search shown below, you should specify an index and a time range. 

  1. Verify that you installed the Website Monitoring app on your search head or heavy forwarder, depending on the availability of the URLs to be checked.
  2. Using the Create Inputs option on the navigation bar of the Website Monitoring app, create one or more inputs. Adjust additional configuration as needed, such as which index to send data to, whether to use proxy servers, and what results constitute a failure. For more information, see the Wiki for this project.
  3. Run the following search: 
    index="<name of web ping index>" sourcetype="web_ping" title=*
    | eval response_code=(case((timed_out == "True"),"Connection timed out",(isnull(response_code) OR (response_code == "")),"Connection failed",true(),response_code) . coalesce((" " . has_expected_string),""))
    | eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800), status=case(((((((((response_code >= 400) OR (total_time >= error_threshold)) OR
     (response_code == "*false")) OR (response_code == "Connection failed")) OR (response_code == "Connection timed out")) OR (timed_out == True)) OR 
    (response_code == "")) OR (has_expected_string == "false")),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK") 
    | stats sparkline(avg(total_time)) AS response_time_trend_ms avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms latest(response_code) AS response_code latest(status) AS status BY title, url
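To see what one of these checks involves outside Splunk, here is a minimal standalone sketch in Python. This is a hypothetical illustration only; the Website Monitoring app performs the equivalent work through its inputs and writes the results to the index you configured.

```python
import time
import urllib.request
import urllib.error


def check_url(url, timeout=5):
    """Fetch a URL and return (status_code, total_time_ms).

    status_code is None when the connection fails or times out,
    mirroring the "Connection failed" / "Connection timed out"
    cases in the SPL above.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as err:
        code = err.code                 # 4xx/5xx responses still carry a code
    except (urllib.error.URLError, OSError):
        code = None                     # connection failed or timed out
    total_time_ms = (time.monotonic() - start) * 1000
    return code, total_time_ms
```

In practice you would run such a check on a schedule and index each result, which is exactly what the app's inputs automate for you.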
    

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search | Explanation
index="<name of web ping index>" sourcetype="web_ping" title=*

Search the index where the Website Monitoring application is configured to store results from synthetic checks. 

For the index name, you can use the macro website_monitoring_search_index, which is shipped with the Website Monitoring app.

Optionally, you can replace title=* with a more specific filter on the URLs you wish to view and alert on.

| eval response_code=(case((timed_out == "True"),"Connection timed out",(isnull(response_code) OR (response_code == "")),"Connection failed",true(),response_code) . coalesce((" " . has_expected_string),""))

Create a response_code field that reports connection failures and timeouts, appending the result of the expected-string check when one is configured.

| eval error_threshold=coalesce(error_threshold,1000), warning_threshold=coalesce(warning_threshold,800), status=case(((((((((response_code >= 400) OR (total_time >= error_threshold)) OR
 (response_code == "*false")) OR (response_code == "Connection failed")) OR (response_code == "Connection timed out")) OR (timed_out == True)) OR 
(response_code == "")) OR (has_expected_string == "false")),"Failed",(total_time >= warning_threshold),"Warning",true(),"OK") 

Create a status field based on the response code and the error and warning thresholds (in milliseconds).
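The same threshold logic can be mirrored in Python, as a minimal sketch assuming the SPL defaults above (error at 1000 ms, warning at 800 ms):

```python
def status(response_code, total_time_ms,
           error_threshold=1000, warning_threshold=800):
    """Mirror the SPL case(): Failed on errors or timeouts,
    Warning when slow, otherwise OK. response_code is None when
    the connection failed or timed out."""
    if response_code is None or response_code >= 400:
        return "Failed"                 # connection failure, timeout, or HTTP error
    if total_time_ms >= error_threshold:
        return "Failed"                 # slower than the error threshold
    if total_time_ms >= warning_threshold:
        return "Warning"                # slower than the warning threshold
    return "OK"
```

Adjust the two thresholds to match the SLAs for the URLs you monitor, just as you would override error_threshold and warning_threshold in the search.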
| stats sparkline(avg(total_time)) AS response_time_trend_ms avg(total_time) AS avg_response_time_ms max(total_time) AS max_response_time_ms latest(response_code) AS response_code latest(status) AS status BY title, url

Format the results in a table to see how each check is performing, renaming the fields as shown and using a sparkline to show trends.

Result

To alert when a synthetic check takes too long or does not provide the expected response code, you can use the SPL in this procedure to configure a Core Splunk alert. You can filter the most recent results in several different ways to obtain the list of URLs that require action, but the simplest approach is to add | where status!="OK" to the end of the SPL to alert on any URL that is taking too long, timing out, or returning an unexpected status code.
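To illustrate what that final filter does, here is a sketch in Python over hypothetical stats output; the field names match the SPL, but the titles and URLs are placeholders, not values from this article.

```python
# Hypothetical rows as produced by the stats command (field names match the SPL).
results = [
    {"title": "Intranet", "url": "https://intranet.example.com", "status": "OK"},
    {"title": "Portal",   "url": "https://portal.example.com",   "status": "Warning"},
    {"title": "API",      "url": "https://api.example.com",      "status": "Failed"},
]

# Equivalent of appending | where status!="OK" to the search:
# keep only the URLs that need action.
needs_action = [row for row in results if row["status"] != "OK"]
```

An alert configured on the filtered search fires only when this list is non-empty, so healthy URLs generate no noise.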

You might also want to monitor trends in your URL response codes over time.
