Getting started with Splunk Data Management Pipeline Builders

Splunk’s Data Management Pipeline Builders are Splunk's newest data processing capability. They offer more efficient, flexible data transformation, helping you reduce noise, optimize costs, and gain visibility and control over your data in motion.

Splunk Data Management offers two pipeline builders with a choice of deployment model:

Edge Processor is a customer-hosted offering for greater control over data before it leaves your network boundaries. You can use it to filter, mask, and transform your data close to its source before routing the processed data to the environment of your choice.
Ingest Processor is a Splunk-hosted SaaS offering ideal for customers who are all-in on cloud and prefer that Splunk manage the infrastructure for them. In addition to filtering, masking and transforming data, it enables a new capability - converting logs to metrics. Ingest Processor is currently only available in Splunk Cloud Platform.

Both Edge Processor and Ingest Processor let you configure SPL2-based pipelines to filter, mask, transform, and route data to destinations. They support most SPL2 commands for pre-ingest data processing (for example, regex and eval). Learn more about SPL2 profiles and view a command compatibility matrix by product for SPL2 commands and eval functions.

Data Management Pipeline Builders allow you to:

Filter: Easily filter low-value or noisy data, such as DEBUG logs, heartbeat messages or repetitive health check messages, and focus on data that matters the most.
Mask PII: Ensure organizational data compliance and data privacy by masking or encrypting Personally Identifiable Information (PII).
Enrich and extract: Enrich events with contextual data before sending them to Splunk for high-value search, monitoring, and analysis by ITOps and SecOps teams.
Route: Route different “slices” of data to the Splunk platform, Splunk Observability Cloud*, and Amazon S3 for low-cost storage, with granular control over your data placement.
Logs-to-Metrics*: Transform your logs into real-time metrics for faster MTTD and MTTR.

*Currently applies to Ingest Processor only.
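
To make the filter, mask, and route stages above concrete, here is a minimal Python sketch of the same three steps applied to a handful of sample events. It is only an illustration of the processing model, not SPL2 and not a Splunk API: the event fields, the SSN-style pattern, and the destination names `splunk_index` and `s3_archive` are all assumptions made up for this example.

```python
import re

# Hypothetical sample events; field names and values are illustrative only.
EVENTS = [
    {"sourcetype": "app", "_raw": "2024-05-01 DEBUG cache warm"},
    {"sourcetype": "app", "_raw": "2024-05-01 ERROR payment failed ssn=123-45-6789"},
    {"sourcetype": "health", "_raw": "2024-05-01 INFO heartbeat ok"},
    {"sourcetype": "audit", "_raw": "2024-05-01 INFO login user=alice"},
]

# Assumed PII shape for this sketch: a US Social Security number.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def keep(event):
    """Filter: drop DEBUG logs and heartbeat messages."""
    raw = event["_raw"]
    return " DEBUG " not in raw and "heartbeat" not in raw


def mask(event):
    """Mask: return a copy of the event with SSN-shaped values redacted."""
    return {**event, "_raw": SSN.sub("XXX-XX-XXXX", event["_raw"])}


def route(event):
    """Route: choose a (made-up) destination name based on sourcetype."""
    return "s3_archive" if event["sourcetype"] == "audit" else "splunk_index"


def run_pipeline(events):
    """Apply filter, then mask, then route; group events by destination."""
    routed = {}
    for event in events:
        if not keep(event):
            continue
        masked = mask(event)
        routed.setdefault(route(masked), []).append(masked)
    return routed


RESULT = run_pipeline(EVENTS)
```

In this sketch the DEBUG and heartbeat events are discarded, the payment error is redacted before it reaches the index destination, and the audit event goes to the archive destination. A real pipeline would express the same stages as SPL2 commands.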

Benefits and value of Data Management Pipeline Builders

Reduce data noise and costs
Gain increased visibility into streaming data
More efficient, flexible data transformation
Accelerate MTTD with real-time metrics
Centralized control through cloud control plane
Leverage SPL2 for advanced data processing
A guided pipeline builder to simplify data routing
Faster processing with fewer compute resources than ingest actions or heavy forwarders
Reduce search time

Edge Processor is included with your Splunk Cloud Platform or Splunk Enterprise subscription at no additional cost. The Ingest Processor “Essentials” tier is also included with Splunk Cloud Platform subscriptions. Learn more about the requirements to use Edge Processor or Ingest Processor and how to request access if needed.

How Splunk Edge Processor works

Splunk Edge Processor combines Splunk-managed cloud services, on-premises data processing software, and SPL2 to support data processing at the edge of your network. It allows you to ingest data into the Splunk platform, Amazon S3, or other systems. The service is delivered through the cloud control plane, while Edge Processor nodes installed and managed in your own infrastructure process the data (the data plane). Learn more about the Edge Processor system architecture.

Using simple-to-deploy nodes, Splunk Edge Processor allows you to filter, route and process data generated by Splunk Forwarders and other sources before it is ingested into Splunk Enterprise or Splunk Cloud Platform. You define where you want to deploy the edge processor nodes, as well as the Edge Processor node name, description, and tags.

Figure: Splunk Data Management Edge Processor diagram

When Splunk Edge Processor nodes are deployed, you control where your Edge Processors and pipelines send data. You can also configure a “default destination” per Edge Processor node to route unprocessed data. If you don't specify a default destination, Edge Processors drop unprocessed data.
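
The default-destination rule can be sketched in a few lines of Python. This is a simplified model, not Edge Processor's actual implementation: it assumes "unprocessed" means an event that no pipeline selects, and the `dispatch` function and the `(matches, transform, destination)` pipeline shape are hypothetical names invented for this example.

```python
def dispatch(event, pipelines, default_destination=None):
    """Return a list of (destination, event) pairs for a single event.

    pipelines is a list of (matches, transform, destination) tuples.
    This shape is a hypothetical stand-in for configured pipelines.
    """
    outputs = []
    for matches, transform, destination in pipelines:
        if matches(event):
            # A pipeline selected this event: process it and send it on.
            outputs.append((destination, transform(event)))
    if outputs:
        return outputs
    if default_destination is not None:
        # No pipeline selected the event: forward it unchanged.
        return [(default_destination, event)]
    # No pipeline and no default destination: the event is dropped.
    return []
```

The last branch is the one to watch when planning a deployment: without a default destination, anything your pipelines do not select is discarded rather than forwarded.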

As of Splunk Cloud Platform version 9.2.2403, the capabilities and limitations of Edge Processor are:

Supported actions: Filtering, transforming, masking, routing (stateless, lightweight operations), lookups, and cryptographic functions
On the roadmap: Dedup, logs to metrics, metrics processing, and summarizing
Not supported: Data decryption

How Splunk Ingest Processor works

Splunk Ingest Processor combines Splunk-hosted cloud services and SPL2 to support processing of data that has been ingested into your Splunk Cloud Platform deployment. Ingest Processor is a cloud service offering that provides a centralized console for managing Ingest Processor pipelines.

By using Ingest Processor, you can process, manage and monitor your data ingest ecosystem from a Splunk-hosted cloud service. This requires no infrastructure setup, making it easy to get started. You can also collect, pre-process, and route metrics to the Splunk Observability Cloud for infrastructure and application monitoring.
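
As a rough illustration of what converting logs to metrics means, the Python sketch below parses a numeric value and a few dimensions out of each log line and emits a metric data point. It is not Ingest Processor's implementation or Splunk's metric schema: the log format, the metric name `http.latency_ms`, and the dictionary keys are assumptions chosen for this example.

```python
import re

# Hypothetical access-log lines; the format is invented for this sketch.
LOG_LINES = [
    "GET /cart status=200 latency_ms=120",
    "GET /cart status=500 latency_ms=480",
    "GET /pay status=200 latency_ms=200",
]

PATTERN = re.compile(
    r"^(?P<method>\w+) (?P<path>\S+) "
    r"status=(?P<status>\d+) latency_ms=(?P<latency>\d+)$"
)


def to_metric(line):
    """Turn one log line into a metric data point, or None if it does not parse."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return {
        "metric_name": "http.latency_ms",  # assumed metric name
        "value": float(m["latency"]),
        "dimensions": {"path": m["path"], "status": m["status"]},
    }


# Keep only the lines that parsed successfully.
METRICS = [point for point in map(to_metric, LOG_LINES) if point is not None]
```

Each data point carries one number plus a small set of dimensions, which is far more compact than the original log line and is the form a metrics backend can aggregate in real time.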

Figure: Splunk Data Management Ingest Processor diagram

When to use which data processing capability

	Edge Processor	Ingest Processor	Ingest Actions
Capabilities	Filter, mask, and route data before indexing	Filter, mask, and route data before indexing	Filter, mask, and route data before indexing
Processing method	SPL2-based pipelines	SPL2-based pipelines	UI over props and transforms
Availability	Splunk Cloud Platform (AWS) and Splunk Enterprise	Splunk Cloud Platform (AWS)	Splunk Cloud Platform (AWS/GCP) and Splunk Enterprise
Deployment model	Process data on customer-hosted edge using SPL2 processing engine	Process data using Splunk-managed SPL2 processing engine	Process data on heavy forwarder (HWF) or indexer using rulesets
Supported sources (ingest data from)	S2S, HEC, RawHEC, and Syslog	Any Splunk Cloud Platform (Victoria) input	Any Splunk supported data input
Data preview	Copy/paste samples, upload samples	Live capture, copy/paste samples, upload samples	Live capture, indexed
Supported destinations (route to)	Splunk Cloud Platform, Splunk Enterprise, Amazon S3	Splunk Cloud Platform, Amazon S3, Splunk Observability Cloud	Splunk Cloud Platform, Splunk Enterprise, Amazon S3, File System
Cost	No additional costs with Splunk Enterprise or Splunk Cloud Platform	<500GB: No additional costs with Splunk Cloud Platform	No additional costs with Splunk Enterprise or Splunk Cloud Platform

How to get started

For Splunk Cloud Platform customers, log in to Splunk Cloud Platform and navigate to the Splunk Data Management console to start using Edge Processor or Ingest Processor.

In Splunk Web, from the home page, click Settings > Add data > Data Management Experience.
You can also navigate directly to Data Management using the following link: https://px.scs.splunk.com/<your Splunk cloud tenant name>

For Splunk Enterprise customers, you can set up a dedicated machine or leverage an existing one in your management tier that can accommodate additional workloads. If you are a Splunk administrator, then after installing (or upgrading to) Splunk Enterprise 10.0, you’ll see “Data Management” in the apps list after you log in.

If you are the first user on your Edge Processor or Ingest Processor tenant, you need to complete the first-time setup instructions to allow your tenant to access Splunk Cloud Platform indexes for storing the logs and metrics passing through the processors.

First-time setup instructions for Edge Processor on Splunk Cloud Platform.
First-time setup instructions for Edge Processor on Splunk Enterprise.
First-time setup instructions for Ingest Processor.

The first-time setup instructions are very similar for both pipeline builders. However, because Splunk Edge Processor is customer-hosted, you must also set up the Edge Processor node in your environment, which Splunk has simplified to running one command on a Linux machine.

Next steps

Review the additional resources below, then click the Next step button to learn how to configure and deploy your Splunk Data Management Pipeline Builders with step-by-step guidance.

Join the #edge-processor channel on the Splunk Community Slack for direct support (SSO access: http://splk.it/slack)
Splunk Resource: Data Management resource hub
Splunk Blog: Introducing Edge Processor: Next gen data transformation
Splunk Tech Talk: Introducing Edge Processor
Splunk .Conf Talk: Getting data in more efficiently using the Splunk Edge Processor (session slides)
Splunk Blog: Data preparation made easy: SPL2 for Edge Processor
Splunk Blog: Addition of Syslog in Splunk Edge Processor supercharges security operations with Palo Alto Firewall log reduction
Splunk Blog: Splunk Edge Processor Enhancements offer greater data access and improve data management
Stay up-to-date with release notes for Edge Processor and Ingest Processor

Next step