Using ingest actions to filter AWS VPC Flow Logs
AWS VPC Flow Logs can pose challenges for Splunk Cloud Platform users because of their volume and complexity. These logs, while rich in information, can inundate cloud environments, producing a lot of noise and leading to inefficiencies. This article introduces a strategic approach to filter AWS VPC Flow Logs using ingest actions to ensure that only relevant data reaches your cloud environment, which will enhance query efficiency and speed up delivery of results.
Ingest actions offer a sophisticated toolkit for refining data at the point of ingestion. This feature allows you to selectively process logs based on predefined criteria, significantly improving data quality and operational efficiency.
For comprehensive details and guidelines on the prerequisites and capabilities of ingest actions, see Ingest actions requirements.
To follow the steps below, make sure the Splunk Add-on for Amazon Web Services is installed and configured, and verify that your role has the permissions required to use ingest actions.
When filtering AWS VPC Flow Logs, make sure you filter low-value data and keep high-value data. Let’s walk through a VPC Flow Log field format first to understand what you will be filtering out and why.
VPC Flow Logs fields format [source]
version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
Focus on the log-status field. This field tells you the logging status of the flow log record, and its value indicates whether the record is valuable or not.
Here are the values that the log-status field can have:
- OK - Data is logging normally
- NODATA - No network traffic
- SKIPDATA - Flow log records were skipped
In this case, drop NODATA and SKIPDATA records due to their low-fidelity nature, and keep the records that are logging normally.
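To make the keep-or-drop decision concrete, here is a minimal Python sketch that splits a record in the default field format shown above and checks its log-status value. It only illustrates the logic and is not how ingest actions works internally. The sample records, account ID, and interface IDs are illustrative, and the names parse_record and is_low_value are hypothetical helpers introduced for this example.

```python
# Field names of the default VPC Flow Logs format, in order.
FIELDS = (
    "version account-id interface-id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log-status"
).split()

# Status values this article treats as low value.
LOW_VALUE_STATUSES = {"NODATA", "SKIPDATA"}


def parse_record(raw):
    """Split one space-delimited flow log record into a dict keyed by field name."""
    values = raw.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def is_low_value(raw):
    """Return True when the record's log-status marks it as NODATA or SKIPDATA."""
    return parse_record(raw)["log-status"] in LOW_VALUE_STATUSES


# Illustrative sample records (assumed values, default format).
SAMPLES = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - "
    "1431280876 1431280934 - NODATA",
    "2 123456789010 eni-11111111aaaaaaaaa - - - - - - - "
    "1431280876 1431280934 - SKIPDATA",
]

for raw in SAMPLES:
    status = parse_record(raw)["log-status"]
    decision = "drop" if is_low_value(raw) else "keep"
    print(f"{status}: {decision}")
```

Note that NODATA and SKIPDATA records carry placeholder dashes in most fields, which is part of why they add little value once indexed.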
Configuring ingest actions
For Splunk Cloud Platform, perform these steps to create a ruleset:
- Navigate to Settings > Data > Ingest Actions.
- Click to create a New Ruleset and provide a Ruleset Name. Two examples of this are:
AWS_VPC_FLOW_LOGS_NODATA_RULESET
AWS_VPC_FLOW_LOGS_SKIPDATA_RULESET
- Identify and select the appropriate source type for your logs, such as aws:cloudwatchlogs:vpcflow.
  - If you're using Splunk Data Manager, select Select source type from list, and in the next option, titled Source Type, select the source type assigned in the input created from Data Manager. The ingest actions rule is then applied to the source type assigned to the data coming in from Data Manager, and drops the matching events.
- Select +Add Rule > Filter > Filter with regular expression to start the filtering process.
- Set Source Field as _raw, and in the Drop Events Matching Regular Expression field, paste the log-status value that matches the data you want to filter, for example NODATA or SKIPDATA.
- Click Apply to apply the settings and review results.
- Ensure that the filter selects only the events that contain NODATA or SKIPDATA. These events will be filtered out before being ingested.
- Click Save to save the ruleset.
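Before pasting a pattern into the Drop Events Matching Regular Expression field, it can help to try it against a few sample events offline. The sketch below uses Python's re module as a stand-in, so it only approximates how Splunk evaluates the expression. The sample events are illustrative, apply_drop_rule is a hypothetical helper, and the anchored pattern is an optional tightening that assumes log-status is the last field, as it is in the default format; a custom flow log format may place it elsewhere.

```python
import re

# Broad pattern: matches the status value anywhere in the raw event.
BROAD = re.compile(r"NODATA|SKIPDATA")

# Assumed stricter variant: only matches when the status is the final field.
ANCHORED = re.compile(r"\s(?:NODATA|SKIPDATA)\s*$")

# Illustrative raw events in the default VPC Flow Logs format.
EVENTS = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - "
    "1431280876 1431280934 - NODATA",
    "2 123456789010 eni-11111111aaaaaaaaa - - - - - - - "
    "1431280876 1431280934 - SKIPDATA",
]


def apply_drop_rule(events, pattern):
    """Return the events that survive a drop-on-match filter."""
    return [event for event in events if not pattern.search(event)]


for name, pattern in (("broad", BROAD), ("anchored", ANCHORED)):
    kept = apply_drop_rule(EVENTS, pattern)
    print(f"{name}: kept {len(kept)} of {len(EVENTS)} events")
```

For these samples both patterns keep only the OK record. The preview pane that appears after you click Apply remains the authoritative check, because it runs the rule against your own data.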
You have now configured ingest actions to filter low-value NODATA and SKIPDATA records out of AWS VPC Flow Logs before the data enters Splunk Cloud Platform.
Next steps
These resources might help you understand and implement this guidance:
- Splunk Docs: Use ingest actions to improve the data input process