First time seen stack trace

Application problems introduced by regular software releases, system patching, configuration changes, code deployments, and other production changes might go undetected if they're not monitored, only to be reported later by end users. You want to be proactive during production changes by monitoring the logs for stack traces that have appeared for the first time within the last 24 hours.

Procedure

Run the following search. You can optimize it by specifying an index and adjusting the time range.

host=<host to look at>
linecount>3 (unhandled OR exception OR traceback OR stacktrace)
| rex field=_raw "(?<FirstLine>(.*){1})\n(?<SecondLine>(.*){1})"
| stats min(_time) AS firstTimeSeenEpoch, max(_time) AS lastTimeSeenEpoch first(_raw) AS stacktrace count by index sourcetype linecount SecondLine
| where (lastTimeSeenEpoch-firstTimeSeenEpoch) < (60*60*24*1)
| convert ctime(firstTimeSeenEpoch) AS firstTimeSeen, ctime(lastTimeSeenEpoch) AS lastTimeSeen
| table index sourcetype stacktrace firstTimeSeen lastTimeSeen count
| sort - count

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation
host=<host to look at> Search a specific host.
linecount>3 Search for a line count greater than three. Stack traces are multiline messages or events. 
(unhandled OR exception OR traceback OR stacktrace) Find events with specific words in them, such as “unhandled”, “exception”, “traceback”, or “stacktrace”.
| rex field=_raw "(?<FirstLine>(.*){1})\n(?<SecondLine>(.*){1})" Extract the first and second lines of each event. The second line is used to group stack traces, on the assumption that occurrences of the same stack trace have the same number of lines and an identical second line.
| stats min(_time) AS firstTimeSeenEpoch, max(_time) AS lastTimeSeenEpoch first(_raw) AS stacktrace count by index sourcetype linecount SecondLine Calculate the first and last time each stack trace was seen, keep one sample event as stacktrace, and count the occurrences in each group.
| where (lastTimeSeenEpoch-firstTimeSeenEpoch) < (60*60*24*1) Include only results where the difference between last time seen and first time seen is less than 1 day. 
| convert ctime(firstTimeSeenEpoch) AS firstTimeSeen, ctime(lastTimeSeenEpoch) AS lastTimeSeen Convert the epoch times into readable strings.
| table index sourcetype stacktrace firstTimeSeen lastTimeSeen count Display the results in a table with columns in the order shown.
| sort - count Sort the results by count in descending order.
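To make the grouping and filtering steps easier to reason about outside of the Splunk platform, here is a minimal Python sketch of the same logic. It is not how the search runs internally. The function name `first_time_seen` and the input shape (a list of dicts with `index`, `sourcetype`, `_time`, and `_raw` keys) are assumptions made for illustration, the keyword filter is assumed to have been applied already, and the epoch-to-string conversion is left out.

```python
from collections import defaultdict

# Same arithmetic as the where clause: 60*60*24*1 = 86400 seconds.
ONE_DAY_SECONDS = 60 * 60 * 24 * 1


def first_time_seen(events, window=ONE_DAY_SECONDS):
    """Return one row per stack trace group seen only within `window` seconds.

    `events` is a list of dicts with the keys index, sourcetype, _time
    (epoch seconds), and _raw (the multiline event text). This input shape
    is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for event in events:
        lines = event["_raw"].split("\n")
        if len(lines) <= 3:  # approximates linecount>3
            continue
        # Mirrors "by index sourcetype linecount SecondLine".
        key = (event["index"], event["sourcetype"], len(lines), lines[1])
        groups[key].append(event)

    rows = []
    for (index, sourcetype, _linecount, _second_line), members in groups.items():
        first_seen = min(e["_time"] for e in members)
        last_seen = max(e["_time"] for e in members)
        if last_seen - first_seen < window:  # mirrors the where clause
            rows.append({
                "index": index,
                "sourcetype": sourcetype,
                "stacktrace": members[0]["_raw"],  # first event encountered
                "firstTimeSeenEpoch": first_seen,
                "lastTimeSeenEpoch": last_seen,
                "count": len(members),
            })
    # Mirrors "sort - count".
    rows.sort(key=lambda row: row["count"], reverse=True)
    return rows
```

The sketch makes one property of the filter visible: a group is kept when all of its occurrences fall within a 24-hour span of each other, so the time range of the search decides how far back those occurrences can be.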

Next steps

Sample results for this search are shown in the table below. You can see the index where the stack trace was found, the sourcetype that generated it, the text of the trace, the first and last time seen, and the total count.

index: _internal
sourcetype: web_availability_modular_input
stacktrace:
2020-11-16 14:10:46,451 ERROR Exception generated when attempting to get the proxy configuration stanza=web_ping://https://expired.badssl.com/
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 959, in run_ping
    self.get_proxy_config(input_config.session_key, conf_stanza)
  File "/opt/splunk/etc/apps/website_monitoring/bin/modular_input.zip/modular_input/shortcuts.py", line 31, in wrapper
    return function(*args, **kwargs)
  File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 817, in get_proxy_config
    website_monitoring_config = self.get_app_config(session_key, stanza)
  File "/opt/splunk/etc/apps/website_monitoring/bin/modular_input.zip/modular_input/shortcuts.py", line 31, in wrapper
    return function(*args, **kwargs)
  File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", line 747, in get_app_config
    server_response, server_content = splunk.rest.simpleRequest('/servicesNS/nobody/website_monitoring/admin/website_monitoring/' + stanza + '?output_mode=json', sessionKey=session_key)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/rest/__init__.py", line 553, in simpleRequest
    raise splunk.AuthenticationFailed
AuthenticationFailed: [HTTP 401] Client is not authenticated
firstTimeSeen: 17:52.9
lastTimeSeen: 33:13.3
count: 15
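To see which value the rex step would use as the grouping key for an event like this one, the following Python sketch applies the same pattern to the first lines of the sample. Python's `re` module spells named groups as `(?P<name>...)` rather than `(?<name>...)`, so the pattern is adapted accordingly. The sketch assumes the event text starts with the timestamped log line and that consecutive lines are separated by a single newline.

```python
import re

# Same pattern as the rex command, with Python's (?P<name>...) group syntax.
PATTERN = re.compile(r"(?P<FirstLine>(.*){1})\n(?P<SecondLine>(.*){1})")

# First four lines of the sample event, joined with single newlines (assumed).
raw = (
    "2020-11-16 14:10:46,451 ERROR Exception generated when attempting to get "
    "the proxy configuration stanza=web_ping://https://expired.badssl.com/\n"
    "Traceback (most recent call last):\n"
    '  File "/opt/splunk/etc/apps/website_monitoring/bin/web_ping.py", '
    "line 959, in run_ping\n"
    "    self.get_proxy_config(input_config.session_key, conf_stanza)"
)

match = PATTERN.search(raw)
first_line = match.group("FirstLine")
second_line = match.group("SecondLine")

print(second_line)  # Traceback (most recent call last):
```

Under these assumptions, the second line of an event logged this way is the generic Python traceback header, so different tracebacks with the same index, sourcetype, and line count would likely fall into one group. If that is too coarse for your data, grouping on a more specific line might be worth testing.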

Additionally, you might need to detect a first time seen stack trace when you are using stack traces to detect application errors.