Transforming Palo Alto Networks firewall data with Data Management Pipeline Builders
Whether you’re filtering out verbose traffic logs or eliminating unnecessary events, Splunk Edge Processor (customer-hosted) and Splunk Ingest Processor (Splunk-hosted SaaS) let you filter, transform, and optimize Palo Alto Networks (PAN) firewall logs before routing them to the Splunk platform or Amazon S3 for cost-effective storage.
With just a few clicks, you can apply pre-built templates to:
- Reduce excessive log ingestion and optimize Splunk license consumption.
- Enhance search performance by keeping only the most valuable security events.
- Route processed data directly to the Splunk platform for real-time insights or to Amazon S3 for low-cost, long-term retention (with the option to query later via Federated Search for Amazon S3 if needed).
Data required
Palo Alto Networks firewall data
Prerequisites
Before you start writing an SPL2 pipeline to process and transform incoming logs, configure Edge Processor to natively listen for events coming over syslog by:
- Opening a port to listen for syslog traffic on the Edge Processor node;
- Configuring your device/application to send syslog data to Edge Processor; and
- Configuring Edge Processor to listen for the syslog feed on the opened port.
You'll also need to verify that you have access to Splunk Data Management and can log in to your tenant.
How to use Splunk software for this use case
Watch this video or follow the written steps below to use a pre-built pipeline template to filter PAN logs.
Step 1: Confirm the current state of PAN logs
Run a quick search for incoming PAN logs using syslog. You'll likely see:
- All logs share a generic source type (example: pan:firewall)
- No proper field extractions
- Minimal search-time knowledge (only metadata fields like _index, _sourcetype, etc.)
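To confirm this baseline, you can run an ad-hoc search like the following sketch. The index name and time range are assumptions; adjust them to wherever your raw syslog feed is currently landing.

```spl
index=main sourcetype=pan:firewall earliest=-15m
| stats count by index, sourcetype
```

If everything reports a single generic source type and the "Interesting Fields" panel is nearly empty, your logs are arriving unclassified and unparsed.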

Step 2: Open Ingest Processor and create a pipeline
- Go to Data Management > Ingest Processor > Receive Data.
- Confirm that PAN firewall events are being ingested.
- Navigate to Pipelines, and click Create Pipeline.
Step 3: Use a PAN template to classify logs
- In the template picker, search for Palo Alto.
- Select the PAN Classification Template (example: pan_classify_events_template).
- Set the partition to the actual source type seen in your PAN logs (example: pan:firewall).
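Conceptually, a classification pipeline partitions on the generic source type and reassigns a more specific one based on the log type embedded in the raw event. The following SPL2 is a simplified sketch of that idea, not the template's actual contents; the source type values and matching logic are illustrative assumptions.

```spl2
/* Sketch only: reassign sourcetype based on the PAN log type
   field that appears in the comma-separated raw event. */
$pipeline = | from $source
    | eval sourcetype = case(
        like(_raw, "%,TRAFFIC,%"), "pan:traffic",
        like(_raw, "%,THREAT,%"), "pan:threat",
        like(_raw, "%,SYSTEM,%"), "pan:system",
        true, sourcetype)
    | into $destination;
```

The pre-built template handles this mapping for you; the sketch is just to show what "classification" means in pipeline terms.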
Step 4: Capture a live snapshot for previewing
- Choose Live Capture and click Capture New Snapshot.
- Name it something like pan_snapshot and set a short capture duration.
- Proceed to the next step with the snapshot selected.

Step 5: Skip metrics, set destination index
- Skip the metrics destination step unless you're routing to a metrics index.
- Select the Splunk indexer destination for PAN logs (use an existing index or create one like pan_logs_index).
- Leave other options as default.
Step 6: Preview and apply the pipeline
Click Preview to review how the template transforms your data. You’ll now see:
- Correct source types applied
- Index reassignment based on the classification
- Fields extracted using the installed PAN TA
Click Save, name it something like pan_classify, and confirm when prompted to apply the pipeline.

Step 7: Validate the classification results
Wait a few minutes for new data to flow in, then refresh your dashboard or search. You should now see:
- PAN events split across correct source types
- Field extractions visible in “Interesting Fields”
- Events routed to proper indexes
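A quick way to validate the split is a search against the destination index from Step 5. The index name below is the example used earlier; substitute your own.

```spl
index=pan_logs_index earliest=-15m
| stats count by sourcetype
```

You should see counts spread across specific PAN source types rather than a single generic one.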
Step 8: (Optional) Optimize log size with another template
- Remove the classification-only pipeline if needed.
- Create a new pipeline using the PAN Optimize Log Size Template (example: pan_optimize_log_size_template).
- Set the source type and snapshot as before.
- Preview (if applicable), then click Save and Apply.
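For context, log-size optimization pipelines typically filter out low-value events or trim verbose fields before indexing. The SPL2 below is a simplified sketch of that pattern, assuming (hypothetically) that you want to keep only denied or dropped traffic events; it is not the actual template logic.

```spl2
/* Sketch only: keep non-traffic events plus traffic events
   that indicate a deny or drop action. */
$pipeline = | from $source
    | where sourcetype != "pan:traffic"
        OR match(_raw, "deny|drop")
    | into $destination;
```

Whatever filtering the template applies, use the preview pane to confirm that events you care about still pass through before applying it.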
Step 9: Review optimization results
- Open your log size analysis dashboard (or create one).
- Compare event size before and after optimization.
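If you don't have a log size analysis dashboard yet, an ad-hoc search like this sketch can approximate per-event size over time (the index name is the earlier example; adjust as needed):

```spl
index=pan_logs_index
| eval raw_bytes = len(_raw)
| timechart span=1h avg(raw_bytes) AS avg_event_size count
```

Comparing the hours before and after the optimization pipeline was applied should show a drop in average event size, event count, or both.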

Next steps
These resources might help you understand and implement this guidance:
- Splunk Lantern Article: Getting started with Splunk Data Management Pipeline Builders
- Splunk Adoption Hub: Data Management
- Splunk Help: Quick references for SPL2 commands
- Splunk Help: Get Syslog data into Edge Processor
- Join the #dm-pipelines-builders channel on the Splunk Community Slack for direct support (request access: http://splk.it/slack)


