About Splunk Infrastructure Monitoring
Splunk Infrastructure Monitoring covers the lower layers of the service stack. These include servers, databases, container systems, app servers, and storage.
Monitoring these systems is critical to ensuring that the applications running on top of them are available and performing well. An error in any one component can negatively affect users. For example, if a server is running out of available memory, an end user trying to make a purchase on a web store hosted on that server could experience slow response times.
These types of issues must be resolved quickly. The average time it takes to detect and repair an issue is called Mean-Time-to-Repair (MTTR). A key goal of an organization's support team is to reduce MTTR to a level where the impact on users is minimized or avoided altogether.
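As a rough illustration of the MTTR definition above, the following Python sketch averages the time from when an issue begins to when it is repaired. The incident timestamps and the function name mean_time_to_repair are hypothetical and are not part of any Splunk product or API; they only show the arithmetic.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average the (resolved - started) duration across incidents.

    `incidents` is a list of (started, resolved) datetime pairs.
    This is an illustrative helper, not a Splunk API.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual durations, starting from a zero-length timedelta.
    total = sum((resolved - started for started, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident records: (time the issue began, time it was repaired)
incidents = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 45)),      # 45 minutes
    (datetime(2024, 1, 12, 14, 0), datetime(2024, 1, 12, 15, 30)),  # 90 minutes
    (datetime(2024, 2, 1, 22, 0), datetime(2024, 2, 1, 22, 15)),    # 15 minutes
]

mttr = mean_time_to_repair(incidents)
print(mttr)  # 0:50:00
```

Because the measured interval starts when the issue begins rather than when it is noticed, faster detection lowers this figure as much as faster repair does.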
When Splunk Infrastructure Monitoring detects an issue, the detection triggers the incident management lifecycle. Beyond detection, Splunk Infrastructure Monitoring is also used in the investigation step of that lifecycle. Depending on the issue, it may be one of several investigation tools in use; for example, Splunk Log Observer and Splunk ITSI could also be used.
The incident management lifecycle
Ready to begin? To get started, select the next step that applies to you: