Scenario: You work for a large retailer that relies on Apache servers to support its ecommerce. Your organization is growing quickly and you have some concerns about your infrastructure and its ability to keep up with customer demand. You'd like to create some searches to monitor performance and be able to correlate that information with usage.
How Splunk software can help
You can use Splunk software to track page performance, including response codes and response times, and to track user data, including active users, geographic location, and user activity. Splunk alerts can signal conditions that require action from the web server administrator, such as troubleshooting or performance optimization.
What you need
To succeed in implementing this use case, you need the following dependencies, resources, and information.
The best person to implement this use case is a system administrator who is familiar with the installation and configuration of the particular web server software (Apache or IIS), as well as the host operating system. Key skills include familiarity with network and encryption protocols such as TCP, HTTP, and TLS, as well as load balancing, file permissions, and performance troubleshooting. This person might come from your team, a Splunk partner, or Splunk OnDemand Services.
Preparing the data needed to manage a web server using Splunk software can take up to a couple of hours. Deciding which conditions to alert on can take longer, because it requires knowledge of your specific web applications and of the business goals they support.
The following technologies, data, and integrations are useful in successfully implementing this use case:
- Splunk Enterprise or Splunk Cloud
- Data sources onboarded
How to use Splunk software for this use case
You can run many searches with Splunk software to manage a web server. Depending on what information you have available, you might find it useful to identify some or all of the following:
- Top ten slowest web pages on a web server
- Top five most common web browsers
- Long-term website performance trends
- Trends in web server response codes
- Web hosts with HTTP error status codes
- Distribution of web traffic across servers
- Web access and web error log correlation
- Long-term trends in web server user load
- Number of current users on a website
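To make the first and fourth items in the list concrete, the following Python sketch prototypes the underlying logic outside of Splunk software: averaging response time per page and counting response codes. It is an illustration under stated assumptions, not the search itself. It assumes Apache access logs in the combined format with the request duration in microseconds (the %D format directive) appended as the last field. The sample lines, the function names, and the field names are hypothetical.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Assumption: Apache "combined" log format with %D (request duration in
# microseconds) appended as the final field. Adjust to match your LogFormat.
LOG_PATTERN = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)" (?P<micros>\d+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["response_ms"] = int(record["micros"]) / 1000.0
    return record


def slowest_pages(records, limit=10):
    """Return (uri, average response time in ms) pairs, slowest first."""
    times_by_uri = defaultdict(list)
    for record in records:
        times_by_uri[record["uri"]].append(record["response_ms"])
    averages = {uri: mean(times) for uri, times in times_by_uri.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:limit]


def status_counts(records):
    """Count how many requests returned each HTTP status code."""
    return Counter(record["status"] for record in records)


# Hypothetical sample data, including one malformed line that is skipped.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /cart HTTP/1.1" 200 2326 "-" "Mozilla/5.0" 850000',
    '203.0.113.6 - - [10/Oct/2023:13:55:37 +0000] "GET /cart HTTP/1.1" 200 2326 "-" "Mozilla/5.0" 450000',
    '203.0.113.7 - - [10/Oct/2023:13:55:38 +0000] "POST /checkout HTTP/1.1" 500 512 "-" "Mozilla/5.0" 1200000',
    '203.0.113.8 - - [10/Oct/2023:13:55:39 +0000] "GET /home HTTP/1.1" 200 1024 "-" "Mozilla/5.0" 100000',
    'this line is not a valid access log entry',
]

records = [r for r in map(parse_line, SAMPLE_LINES) if r is not None]

for uri, avg_ms in slowest_pages(records, limit=10):
    print(f"{uri}: {avg_ms:.0f} ms")
print(dict(status_counts(records)))
```

The same grouping and sorting steps map naturally onto a search that aggregates response time by page and keeps the top results, provided your web server is configured to log response time.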
Other steps you can take
To maximize their benefit, the how-to articles linked in the previous section likely need to tie into existing processes at your organization or become new standard processes. These processes commonly impact success with this use case:
- Monitoring the application server functions, such as log in, checkout, and form submission, as well as database performance. Dynamic web content comes from a combination of business logic and database tables, so both are necessary to get an end-to-end view of the responsible processes.
- Monitoring the infrastructure that supports the web and app servers to prevent additional problems that might affect the user experience with your web content. You might want to track metrics such as percent busy for compute and storage usage.
- Monitoring the quality of the content available on your websites. For example, are files up to date? Are broken links found and quickly corrected? Are the certificates current?
This use case is also included in the IT Essentials Learn app, which provides more information about how to implement the use case successfully in your IT maturity journey. In addition, these Splunk resources might help you understand and implement this use case:
- Conf Talk: Addressing customer issues with Splunk
- Blog: Decoding IIS logs
- Blog: Launching websites rapidly, without compromise
- Tech Talk: Splunk Fundamentals: Working with web server data Part 1
- Tech Talk: Splunk Fundamentals: Working with web server data Part 2
How to assess your results
Measuring impact and benefit is critical to assessing the value of IT operations. The following are example metrics that can be useful to monitor when implementing this use case:
- Response time: A reduction in outliers for the time taken by a server to deliver content to the requesting client.
- Count: A reduction in error status codes.
- Performance: A reduction in performance problems correlated with user load.
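As a rough illustration of how the first two metrics could be compared between two time periods, the following Python sketch counts response-time outliers and computes the error rate before and after a change. It is a minimal sketch under stated assumptions: the 1000 ms outlier threshold, the function names, and the sample values are all hypothetical and should be replaced with figures that fit your own service-level targets.

```python
def error_rate(status_codes):
    """Fraction of requests that returned an HTTP error status (400 or above)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 400)
    return errors / len(status_codes)


def outlier_count(response_ms, threshold_ms):
    """Number of requests slower than the chosen threshold."""
    return sum(1 for value in response_ms if value > threshold_ms)


def compare_periods(before, after, threshold_ms=1000.0):
    """Return the change in each metric; negative values mean improvement."""
    return {
        "outlier_change": outlier_count(after["response_ms"], threshold_ms)
        - outlier_count(before["response_ms"], threshold_ms),
        "error_rate_change": error_rate(after["status"]) - error_rate(before["status"]),
    }


# Hypothetical measurements for two periods of equal length.
before = {"response_ms": [120, 1500, 2200, 300], "status": [200, 500, 404, 200]}
after = {"response_ms": [110, 140, 1300, 250], "status": [200, 200, 500, 200]}

result = compare_periods(before, after)
print(result)
```

A fixed threshold is used here only because it is easy to reason about. A percentile-based threshold derived from a baseline period is a reasonable alternative when no explicit target exists.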