Reducing Palo Alto Networks log volume with the SPL2 template
Many organizations face challenges in managing the continuous stream of log data from Palo Alto Networks (PAN). This network data arrives as syslog at high volume, which can result in storage constraints, slower processing, and difficulty in identifying relevant information amidst the noise. Additionally, the increased log volume drives up management costs and complicates compliance with data retention policies.
Splunk Edge Processor and Splunk Ingest Processor are designed to handle syslog-based ingestion protocols, making them highly effective at managing complex and excessive data. They can be deployed as a comprehensive solution for processing syslog feeds, including PAN logs. With built-in capabilities to function as a syslog receiver, they can process, transform, and route log data to supported destinations.
By removing unnecessary fields and eliminating redundant timestamps, the SPL2 template for Palo Alto Networks gives you greater control over log management. You decide which fields to keep or remove and can route the data to specific indexes, while ensuring compatibility with Splunk Add-on for Palo Alto Networks, Palo Alto Networks Add-on for Splunk, and the Splunk Common Information Model (CIM).
The template description, displayed at the top of the pipeline template, includes details about template versioning and its compatibility with add-ons. Be sure to review it for a better understanding.
Prerequisites
Before you start using an SPL2 template to reduce log size, you should have the following:
- Splunk Cloud Platform with Splunk Edge Processor or Splunk Ingest Processor enabled
- A Splunk destination instance configured to index data after processing it through the SPL2 pipeline
Splunk Edge Processor and Splunk Ingest Processor are included with your Splunk platform. Learn more about the requirements to use them (Edge Processor or Ingest Processor) and how to request access if you do not already have it. If this is your first time using these features, see the getting started content (Edge Processor or Ingest Processor).
Solution
Getting data into EP/IP
You can send data from Palo Alto Networks to Edge Processor (EP) via syslog using either of these two methods:
- Send all data to a single port.
- Send specific data for a single source type to a single port.
For the limitations and benefits of these two methods, review the reference documentation. The template is compatible with both approaches.
Create an Edge Processor Service
Follow the steps in this article to create an Edge Processor Service.
Create source types
- Log into the Data Management UI.
- Go to Edge Processor > Source types.
- Do one of the following, depending on the method you selected for getting data in:
  - If you will use a single port for all source types, create a single source type named pan:log or pan_log.
  - If you will use a single port for a single source type, create a source type for each PAN source type (pan:threat, pan:traffic, etc).
Create ports in your Edge Processor
- Log into the Data Management UI.
- Go to Edge Processor > Shared settings.
- Under the Syslog section, add the port according to the method you chose.
Start sending data from Palo Alto Networks
- Follow the instructions in Configure the firewall to send logs from a PAN. After you have completed those instructions, your Edge Processor should be receiving logs.
- If you need to send data to Ingest Processor, follow the instructions in Use SC4S to get syslog data into an Edge Processor, which also apply to Ingest Processor.
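Before pointing a firewall at the new port, it can help to confirm that the port is reachable by sending a single test event. The following is a minimal sketch, not part of the template: it assumes the syslog port accepts UDP, and the host name, port number, and event body shown in the usage comment are hypothetical placeholders. The timestamp format is a simplified syslog-style stamp.

```python
import socket
from datetime import datetime, timezone


def build_syslog_message(hostname, body, pri=14):
    """Build a simple syslog-style line: <PRI>TIMESTAMP HOSTNAME BODY.

    PRI 14 = facility 1 (user-level) * 8 + severity 6 (informational).
    """
    timestamp = datetime.now(timezone.utc).strftime("%b %d %H:%M:%S")
    return f"<{pri}>{timestamp} {hostname} {body}".encode("utf-8")


def send_test_event(host, port, message):
    """Send one UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(message, (host, port))


# Example usage (hypothetical host, port, and made-up event body):
# msg = build_syslog_message("pan-fw-01", "1,2024/01/01 00:00:00,TRAFFIC,test")
# send_test_event("edge-processor.example.com", 5514, msg)
```

If the event does not show up, check that the port and protocol match what you configured under Shared settings before troubleshooting the firewall itself.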
Using the template
Before using the template, review this article to familiarize yourself with SPL2 template best practices. They will help you ensure a smooth transition of transformed events into production, mitigate potential risks, and help maintain the integrity, reliability, and efficiency of the production environment.
With PAN logs arriving via syslog, you can take control of them with the ultimate goal of improving search performance. You can do so by:
- reducing event size
- removing unnecessary and “noisy” fields
- routing a full-fidelity copy of the data to AWS S3, where it is maintained for compliance purposes
All these steps reduce ingestion and storage costs. To accomplish these goals, the next sections walk you through the following:
- Create an SPL2 pipeline from the out-of-the-box template provided for reducing the log size for PAN.
- Modify the pipeline according to your needs. Take control of what fields you want to keep/remove.
- Apply the pipeline in your Edge Processor to transform the raw logs.
If you prefer to see these steps in action, watch this video demo.
Use the template to create a pipeline
- Go to Pipelines > Templates.
- Search for Palo Alto Network logs: Reduce log size.
- From the three dots menu, select Create Edge Processor pipeline from template.
- In the left column, there will be a list of functions that are used in the template.
- In the right column, there will be a source type that is already applied and some sample data. The sample includes not only _raw but also host, source, and sourcetype. The pipeline works with _raw events.
- Click Preview Pipeline to check the results before actually using the pipeline for transforming the data.
- Select a destination from the right side console. This will send the transformed logs to the selected destination. For information on how to create a destination, refer to the section See the results in the Splunk destination.
- Click Save Pipeline and provide the appropriate name. This generates your pipeline.
Modify the template
There are two types of reduction functions available for every source type:
- The minimum reduction function removes a limited number of fields.
- The maximum reduction function removes a larger set of fields from the original event.
By default, the minimum reduction function is selected. If you want to switch to maximum reduction, you can select it in the pipeline.
If you want to retain any of the fields already listed in the min/max reduction function, you can remove its entry from the function. Likewise, to remove additional fields beyond those currently removed, add your entries to the respective function.
Make sure to test the pipeline in the preview to validate the latest changes you have made.
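To make the difference between the two reduction levels concrete, here is a small Python sketch of the general idea. It is not the template's SPL2 code: the field positions in MIN_REDUCTION and MAX_REDUCTION are hypothetical, and the sketch assumes that removed fields are blanked while their commas are kept, so the remaining comma-separated fields stay in the positions an add-on expects.

```python
import csv
import io

# Hypothetical 0-based field positions, NOT the template's real lists.
MIN_REDUCTION = {0, 5}
MAX_REDUCTION = MIN_REDUCTION | {1, 6}


def reduce_event(raw, positions_to_blank):
    """Blank the given comma-separated positions but keep the commas."""
    fields = next(csv.reader([raw]))
    reduced = [
        "" if index in positions_to_blank else value
        for index, value in enumerate(fields)
    ]
    out = io.StringIO()
    csv.writer(out, lineterminator="").writerow(reduced)
    return out.getvalue()


sample = "a,b,c,d,e,f,g,h"  # made-up event, not a real PAN log
print(reduce_event(sample, MIN_REDUCTION))
print(reduce_event(sample, MAX_REDUCTION))
```

Retaining a field corresponds to removing its position from the set, and removing an extra field corresponds to adding its position, which mirrors how you edit the entries in the template's reduction functions.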
Apply the pipeline to the Edge Processor instance
- From the main UI, select Edge Processor.
- Select the Edge Processor that you created.
- Select Apply/remove pipelines and select the pipeline that you created.
See the results in the Splunk destination
In order to see the transformed logs, you need to create a destination Splunk instance.
- If you don’t have a destination already created, use Add or manage destinations to learn about the different destination types and how to create them.
- Create the required indexes in that Splunk destination. The pipeline routes the data into several indexes: netfw, netproxy, netops, netauth, and main. For more details, look into the _set_event_fields function in the pipeline.
- Make sure to select the destination that you created as the destination of the SPL2 pipeline that you applied in your Edge Processor.
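As a quick pre-flight check, the sketch below compares the index names listed above against the indexes you already have, and shows the general shape of source-type-based routing. The mapping in EXAMPLE_INDEX_BY_SOURCETYPE and the fallback to main are assumptions for illustration only; the authoritative routing logic is whatever the template's _set_event_fields function contains.

```python
# Index names this article lists for the template.
REQUIRED_INDEXES = {"netfw", "netproxy", "netops", "netauth", "main"}

# Hypothetical mapping, for illustration only. Check _set_event_fields
# in the pipeline for the real source-type-to-index assignments.
EXAMPLE_INDEX_BY_SOURCETYPE = {
    "pan:traffic": "netfw",
    "pan:threat": "netproxy",
}


def missing_indexes(existing):
    """Return required index names that the destination does not have yet."""
    return sorted(REQUIRED_INDEXES - set(existing))


def pick_index(sourcetype):
    """Look up an index for a source type, falling back to main (assumed)."""
    return EXAMPLE_INDEX_BY_SOURCETYPE.get(sourcetype, "main")


# Example: a destination that only has two of the five indexes so far.
print(missing_indexes(["main", "netfw"]))
```

Creating every index before applying the pipeline avoids events being sent to an index that does not exist in the destination.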
After the data passes through the Edge Processor, you can see the reduction statistics in the Edge Processor UI.
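If you want to sanity-check a reduction figure yourself, the arithmetic is a simple ratio of bytes removed to bytes received. The helper below is a hypothetical sketch with made-up byte counts; the UI may compute or round its statistics differently.

```python
def reduction_percent(bytes_in, bytes_out):
    """Percentage of volume removed between pipeline input and output."""
    if bytes_in <= 0:
        raise ValueError("bytes_in must be positive")
    return round(100 * (bytes_in - bytes_out) / bytes_in, 1)


# Made-up numbers: 1000 bytes received, 650 bytes sent to the destination.
print(reduction_percent(1000, 650))
```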
Resources
To ensure a smooth transition of transformed events into production and mitigate potential risks, follow best practices and guidelines when using an SPL2 template. Doing so helps maintain the integrity, reliability, and efficiency of the production environment.
These additional Splunk resources might help you understand and implement this use case:
- Splunk Docs: SPL2 Search Manual
- Splunk Docs: About the Edge Processor solution
- Splunk Blog: Introducing Edge Processor: Next gen data transformation