First time seen stack trace
Application problems introduced by regular software releases, system patching, configuration changes, code deployments, and other production changes can go undetected if they're not monitored, only to be reported later by end users. You want to be proactive during production changes by monitoring the logs for stack traces that were first seen within the last 24 hours.
Data required
Procedure
Run the following search. You can optimize it by specifying an index and adjusting the time range.
host=<host to look at> linecount>3 (unhandled OR exception OR traceback OR stacktrace) | rex field=_raw "(?<FirstLine>(.*){1})\n(?<SecondLine>(.*){1})" | stats min(_time) AS firstTimeSeenEpoch, max(_time) AS lastTimeSeenEpoch first(_raw) AS stacktrace count by index sourcetype linecount SecondLine | where (lastTimeSeenEpoch-firstTimeSeenEpoch) < (60*60*24*1) | convert ctime(firstTimeSeenEpoch) AS firstTimeSeen, ctime(lastTimeSeenEpoch) AS lastTimeSeen | table index sourcetype stacktrace firstTimeSeen lastTimeSeen count | sort - count
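The optimization mentioned above could look like the following sketch. The index name `app_logs` and the seven-day window are placeholder assumptions, not values from the original search. Substitute the index that holds your application logs and a time range that suits your release cadence.

```spl
index=app_logs host=<host to look at> earliest=-7d latest=now linecount>3 (unhandled OR exception OR traceback OR stacktrace)
| rex field=_raw "(?<FirstLine>(.*){1})\n(?<SecondLine>(.*){1})"
| stats min(_time) AS firstTimeSeenEpoch, max(_time) AS lastTimeSeenEpoch first(_raw) AS stacktrace count by index sourcetype linecount SecondLine
| where (lastTimeSeenEpoch-firstTimeSeenEpoch) < (60*60*24*1)
| convert ctime(firstTimeSeenEpoch) AS firstTimeSeen, ctime(lastTimeSeenEpoch) AS lastTimeSeen
| table index sourcetype stacktrace firstTimeSeen lastTimeSeen count
| sort - count
```

Restricting the search to one index and a bounded time range reduces the number of events the `stats` command has to process.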
Search explanation
The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.
Splunk Search | Explanation |
---|---|
host=<host to look at> | Search a specific host. |
linecount>3 | Search for a line count greater than three. Stack traces are multiline messages or events. |
(unhandled OR exception OR traceback OR stacktrace) | Find events that contain specific words, such as “unhandled”, “exception”, “traceback”, or “stacktrace”. |
| rex field=_raw "(?<FirstLine>(.*){1})\n(?<SecondLine>(.*){1})" | Extract the first and second lines of the stack trace so that occurrences can be grouped. The grouping assumes that repeated occurrences of the same stack trace have the same number of lines and the same second line. |
| stats min(_time) AS firstTimeSeenEpoch, max(_time) AS lastTimeSeenEpoch first(_raw) AS stacktrace count by index sourcetype linecount SecondLine | Calculate the first and last time each stack trace was seen, keep one sample of the raw event, and count the occurrences, grouped by index, source type, line count, and second line. |
| where (lastTimeSeenEpoch-firstTimeSeenEpoch) < (60*60*24*1) | Include only results where the difference between last time seen and first time seen is less than 1 day. |
| convert ctime(firstTimeSeenEpoch) AS firstTimeSeen, ctime(lastTimeSeenEpoch) AS lastTimeSeen | Convert the first and last seen epoch times into readable strings. |
| table index sourcetype stacktrace firstTimeSeen lastTimeSeen count | Display the results in a table with columns in the order shown. |
| sort - count | Sort the results by count in descending order. |
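The `where` step in the table keeps stack traces whose first and last occurrences are less than one day apart. If you run the search over a longer window and also want to require that the first occurrence falls within the last 24 hours, one possible variation is sketched below. It is an assumption layered on top of the original search, not part of it, and it would be added after the `stats` command, before the `convert` step.

```spl
| where firstTimeSeenEpoch >= relative_time(now(), "-24h")
```

With this filter in place, a stack trace that first appeared several days ago is excluded even if all of its occurrences were clustered within a single day.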
Next steps
Sample results for this search are shown in the table below. You can see the index where the stack trace was found, the source type that generated it, the text of the trace, the first and last time it was seen, and the total count.
index | sourcetype | stacktrace | firstTimeSeen | lastTimeSeen | count |
---|---|---|---|---|---|
_internal | web_availability_modular_input | 2020-11-16 14:10:46,451 ERROR Exception generated when attempting to get the proxy configuration stanza=web_ping://https://expired.badssl.com/ Traceback (most recent call last): File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 959, in run_ping self.get_proxy_config(input_config.session_key, conf_stanza) File "/opt/splunk/etc/apps/website_monitoring/bin/modular_input.zip/modular_input/shortcuts.py", line 31, in wrapper return function(*args, **kwargs) File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 817, in get_proxy_config website_monitoring_config = self.get_app_config(session_key, stanza) File "/opt/splunk/etc/apps/website_monitoring/bin/modular_input.zip/modular_input/shortcuts.py", line 31, in wrapper return function(*args, **kwargs) File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 747, in get_app_config server_response, server_content = splunk.rest.simpleRequest('/servicesNS/nobody/website_monitoring/admin/website_monitoring/' + stanza + '?output_mode=json', sessionKey=session_key) File "/opt/splunk/lib/python2.7/site-packages/splunk/rest/__init__.py", line 553, in simpleRequest raise splunk.AuthenticationFailed AuthenticationFailed: [HTTP 401] Client is not authenticated | 17:52.9 | 33:13.3 | 15 |
Additionally, detecting first-time-seen stack traces is one part of the broader task of using stack traces to detect application errors.