While metrics help you isolate which hosts are having problems and when those problems began, logs and events generally contain the information you need to get to the true cause of the issue. Use Splunk to isolate the logs and events coming from the affected host and look for common indicators of trouble, such as “error” or “failed”.
System log data
- Run the following search. You can optimize it by specifying an index and adjusting the time range.
host=* sourcetype=* (error OR fail*)
The following table explains what each part of this search does. You can adjust this query based on the specifics of your environment.
|Search|Explanation|
|host=*|Search any host in your deployment.|
|sourcetype=*|Search any source type in your deployment.|
|(error OR fail*)|Search for events that contain "error" or any term beginning with "fail", such as "failed" or "failure".|
These search results give you easy and quick visibility into which hosts and data sources you need to investigate.
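As one possible refinement, the search can be scoped to a single index and a fixed time window, then summarized so that the noisiest hosts and source types appear first. The sketch below assumes the data is in an index named main and that the last 24 hours is the window of interest. Both values are placeholders, so substitute the index and time range that apply to your deployment.

```
index=main earliest=-24h latest=now host=* sourcetype=* (error OR fail*)
| stats count by host, sourcetype
| sort - count
```

The stats command counts matching events for each combination of host and source type, and sort - count lists the highest counts first. To drill into one host, replace host=* with that host's name.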
You might be interested in other processes associated with the Recovering lost visibility of IT infrastructure use case.