Web servers, such as Apache, Nginx, and Microsoft IIS, write log files that capture the state of the server and of any extensions that have been loaded. Apache, for example, can have modules for security (mod_security), program execution (mod_cgi), and proxy or gateway capabilities (mod_proxy). The most commonly captured web server data is the access logs, which record all requests processed by the server. These requests vary depending on the function of the websites that the server hosts. Access logs follow a standard format called access_common. Other formats build on the common format by adding fields. A popular example is access_combined, which is identical to common but with two additional fields, referrer and user-agent. Custom formats are possible, but common and combined are the most widely used.
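To make the relationship between the two formats concrete, here is a minimal Python sketch that parses a line in either the common or the combined format. The function name `parse_access_line`, the group names, and the sample lines are illustrative assumptions, not part of Splunk or of any web server. The pattern also assumes the request string contains no escaped quotes, so treat it as a starting point rather than a complete parser.

```python
import re

# Common format: host ident user [timestamp] "request" status bytes
# Combined format: the same, followed by "referrer" "user-agent".
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)")?$'
)


def parse_access_line(line):
    """Return a dict of fields for a common or combined log line, or None."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # A "-" in the bytes position means no response body was sent.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["status"] = int(fields["status"])
    return fields


# Illustrative sample lines, not taken from a real server.
common_line = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
combined_line = (
    common_line
    + ' "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    print(parse_access_line(common_line))
    print(parse_access_line(combined_line))
```

For a common-format line, the `referrer` and `user_agent` keys are present but set to `None`, which mirrors the fact that combined only adds fields to common.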
Data visibility
The access logs are the richest source to ingest into Splunk because they track all the interactions with each website hosted by the server. You can track client IP addresses, users, content served, bytes transferred, status codes, HTTP methods, and more. For further details, see the documentation for the web server you are interested in.
Data application
When your Splunk deployment is ingesting web server data, you can use the data to achieve objectives related to the following use cases:
High-value fields
In the Common Information Model, web server data is typically mapped to the Web data model. This data model has many available fields, but users typically derive the most value from the fields listed here.
http_method
HTTP method sent from the client, typically GET or POST.
http_referrer
HTTP request header that reports the site the client was referred from. It is present in the combined log format but not in the common format.
http_user_agent
HTTP request header that contains identifying information the client reports about itself.
src
IP address of the client that made the request.
status
Status code the server sends back to the client. This valuable information reveals whether the request resulted in a successful response, a redirect, or an error.
time stamp
Time the client request was received.
user
User name of the remote user. This data point is not always available, and even when present, may be unreliable.
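As an illustration of how the status and src fields can be put to work, the following Python sketch groups events by status class and by client address. The event dictionaries, their values, and the helper names `status_class` and `summarize` are hypothetical and only reuse the field names listed above. This is not how Splunk computes anything internally.

```python
from collections import Counter


def status_class(status):
    """Classify an HTTP status code by its leading digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    return classes.get(int(status) // 100, "unknown")


# Hypothetical events that already carry the field names listed above.
events = [
    {"src": "10.0.0.5", "http_method": "GET", "status": 200, "user": "-"},
    {"src": "10.0.0.5", "http_method": "GET", "status": 404, "user": "-"},
    {"src": "10.0.0.9", "http_method": "POST", "status": 302, "user": "alice"},
    {"src": "10.0.0.9", "http_method": "GET", "status": 500, "user": "alice"},
]


def summarize(events):
    """Count events per status class and per client address."""
    by_class = Counter(status_class(e["status"]) for e in events)
    by_src = Counter(e["src"] for e in events)
    return by_class, by_src


by_class, by_src = summarize(events)

if __name__ == "__main__":
    print(dict(by_class))
    print(dict(by_src))
```

A sudden rise in the "client error" or "server error" counts for a single src value is the kind of pattern these fields make visible.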
Known data sources and source types
Guidance for onboarding data can be found in the Splunk documentation, Getting Data In.
Data Source | Sourcetype | Recommended Add-Ons |
---|---|---|
Apache | sourcetype="access_common" | Built into Splunk. More information about Apache access logs is available in the Apache documentation. For additional capabilities, see the Splunk Add-on for Apache Web Server. |
NGINX | sourcetype="nginx:plus:access", sourcetype="nginx:plus:kv", sourcetype="nginx:plus:error" | |
Microsoft IIS | sourcetype="ms:iis:auto", sourcetype="ms:iis:default" | |