Using Edge Processor to mask or truncate cardholder data for PCI DSS compliance
As a Splunk admin, your duties involve managing data ingestion, ensuring data security, and maintaining compliance with industry standards. Your organization uses Splunk Cloud Platform to monitor and analyze diverse data sources, including those that potentially contain cardholder information. It's essential to ensure that card data, such as the primary account number (PAN), cardholder name, expiry date, and other related information, is appropriately stored in Splunk Cloud Platform.
Splunk Edge Processor (EP) offers the capability to filter, mask, and transform your data close to its source before routing the processed data to external environments. This article shows you how to use Splunk Edge Processor to ensure that sensitive cardholder data is masked or truncated before being ingested into Splunk Cloud Platform, allowing you to maintain PCI DSS compliance. You'll complete the following steps to do this:
- Learn PCI DSS requirements to understand what PCI DSS compliance entails.
- Build the pipeline using Splunk Edge Processor.
- Test the pipeline to validate that it performs as intended.
- Monitor data ingestion to continuously verify output and metrics to ensure long-term compliance.
Whether you choose to filter, mask, or truncate data depends on the type of data being handled. You can learn more about this in the Learn PCI DSS requirements section of this article. This article deals specifically with the masking or truncation of data before ingestion into Splunk Cloud Platform. To learn how to filter out data completely before ingestion, see Using Edge Processor to filter out cardholder data for PCI DSS compliance.
Data required
- Financial data for credit card transactions
- A CSV or KV lookup file of credit card transactions
Prerequisites
- Splunk Cloud Platform version 9.xx or higher
- A Splunk Edge Processor tenant with a paired Splunk Cloud Platform stack
- A Splunk Edge Processor instance running on a machine with an accessible URL
- A credit card data source (or you can use this sample input script to generate synthetic data)
- Splunk Edge Processor pipelines support Search Processing Language 2 (SPL2). If the SPL2 syntax is new to you, review the SPL2 Search Reference documentation.
- Splunk Edge Processor supports Regular Expression 2 (RE2) syntax instead of PCRE syntax. RE2 and PCRE accept different syntax for named capture groups. For more information on this, see Regular expression syntax for Edge Processor pipelines.
- For more information on sending data to Splunk Edge Processor, refer to the documentation on sending data from a forwarder or using the HTTP Event Collector (HEC).
- This example uses credit card data and shows how to mask or truncate a primary account number (PAN). However, you can apply the same process to any data that needs to comply with PCI DSS standards.
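If you don't have a credit card data source to hand, the sketch below shows one way to produce synthetic events for testing. It is not the sample input script mentioned above. The issuer prefixes, the `transaction_type` values, and all function names are assumptions made for illustration. The field names come from this article, and the output is wrapped in the JSON shape that HEC accepts (`event` and `sourcetype` keys). Sending the payload, including the HEC URL and token, is left out.

```python
import json
import random

# Assumed issuer prefixes and lengths, for synthetic test data only.
CARD_SPECS = {
    "Visa": ("4", 16),
    "MasterCard": ("51", 16),
    "AMEX": ("34", 15),
}


def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes partial + digit pass the Luhn checksum."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def synthetic_pan(cc_type: str, rng: random.Random) -> str:
    """Build a random, Luhn-valid number of the right length for cc_type."""
    prefix, length = CARD_SPECS[cc_type]
    filler = "".join(str(rng.randint(0, 9)) for _ in range(length - len(prefix) - 1))
    body = prefix + filler
    return body + luhn_check_digit(body)


def synthetic_event(rng: random.Random) -> dict:
    """Return one fake transaction using field names from this article."""
    cc_type = rng.choice(sorted(CARD_SPECS))
    return {
        "cc_type": cc_type,
        "cc_number": synthetic_pan(cc_type, rng),
        "transaction_type": rng.choice(["purchase", "refund"]),  # assumed values
        "cardholder_name": "Test Cardholder",
    }


def to_hec_payload(event: dict, sourcetype: str = "cc_transactions") -> str:
    """Wrap an event in the JSON envelope used by the HTTP Event Collector."""
    return json.dumps({"sourcetype": sourcetype, "event": event})


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so test runs are repeatable
    for _ in range(3):
        print(to_hec_payload(synthetic_event(rng)))
```

Generated numbers pass the Luhn checksum so that they resemble real PANs, but they are random and should only be used in test environments.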
How to use Splunk software for this use case
Learn PCI DSS requirements
PCI DSS requirement 3 states that cardholder data elements like the primary account number (PAN), cardholder name, service code, and expiration date must be protected whenever they are stored, transmitted, or processed. This is in contrast to other sensitive authentication data elements like full magnetic stripe data, card verification value (CVV2), card validation code (CVC2), and card identification number (CID), as well as the PIN/PIN block. Storage of those elements is not permitted, even if encrypted.
In this example, we'll explore how to mask or truncate the PAN to comply with the requirement that this data is protected whenever it is stored, transmitted, or processed.
Whether you decide to truncate or mask the PAN depends on how the data is being used. You should truncate the PAN if you're only storing the data, whereas if you're looking to display the PAN, you'd mask it instead. The requirement and implementation practices for each are:
- Truncation
    - Requirement: Truncate PAN for storage.
    - Implementation: Retain only the last four digits of the PAN and remove the remaining digits (for example, XXXX-XXXX-XXXX-0123).
- Masking
    - Requirement: Mask PAN for display.
    - Implementation: Ensure that systems and processes are configured to display only the first six and last four digits of the PAN, masking the digits in between (for example, 0123-45XX-XXXX-2345).
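To make the difference between the two approaches concrete, here is a minimal Python sketch. It only illustrates the two rules and is not the Edge Processor pipeline itself. The function names and the 13-digit minimum length are assumptions made for this example.

```python
MIN_PAN_DIGITS = 13  # assumed lower bound for this sketch


def _digits(pan: str) -> str:
    """Strip separators and return only the digits of a PAN."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    if len(digits) < MIN_PAN_DIGITS:
        raise ValueError("value is too short to be a PAN")
    return digits


def _group4(value: str) -> str:
    """Format a string in hyphen-separated groups of four characters."""
    return "-".join(value[i:i + 4] for i in range(0, len(value), 4))


def truncate_pan(pan: str) -> str:
    """Truncation for storage: keep only the last four digits."""
    digits = _digits(pan)
    return _group4("X" * (len(digits) - 4) + digits[-4:])


def mask_pan(pan: str) -> str:
    """Masking for display: keep the first six and last four digits."""
    digits = _digits(pan)
    return _group4(digits[:6] + "X" * (len(digits) - 10) + digits[-4:])


if __name__ == "__main__":
    print(truncate_pan("0123-4567-8901-2345"))  # XXXX-XXXX-XXXX-2345
    print(mask_pan("0123-4567-8901-2345"))      # 0123-45XX-XXXX-2345
```

In both cases the original digits are replaced rather than hidden, so the full PAN cannot be recovered from the stored value.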
 
After you've identified the data that needs to be masked or truncated, you can begin to build your pipeline.
Build the pipeline
- Ensure that the correct source types are available for identifying card data. The list below outlines recommended source types, associated fields, and typical data sources for credit card data across financial applications.
    - Suggested source type: cc_transaction_logs
    - Fields: cc_number, pan_number, cc_type, ccv, expiry_data, pin, pin_block, transaction_type, cardholder_name, track1, track2, track3
    - Sources:
        - Payment gateways: Stripe, PayPal, Clover
        - Banking systems: Core banking platforms
        - E-commerce platforms: Shopify, Magento
        - Point of Sale (POS) systems: Retail transaction systems
- Create an SPL2 pipeline to truncate or mask the data before it's stored or displayed. The pipeline below takes in credit card transaction data, does a regex match on the PAN to confirm that the data matches a known format (for example, Amex, Visa, or Mastercard credit card number formats), and then masks or truncates the card number. The pipeline then sends the cleaned data to an index in Splunk Cloud Platform.
    Pipeline definition (SPL2)

        function process_pan($source, $display: boolean = false)
        {
            return
            | from $source
            // Expected digits-only format of pan_number for each card type.
            | eval pan_regex = case(
                cc_type == "Visa" or cc_type == "MasterCard" or cc_type == "UnionPay", "^[0-9]{16}$",
                cc_type == "Discover", "^[0-9]{17}$",
                cc_type == "AMEX", "^[0-9]{15}$"
              )
            // Expected format of cc_number, allowing space or hyphen separators.
            | eval cc_regex = case(
                cc_type == "Visa" or cc_type == "MasterCard" or cc_type == "UnionPay", "^[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}$",
                cc_type == "Discover", "^[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{1}$",
                cc_type == "AMEX", "^[0-9]{4}[- ]?[0-9]{6}[- ]?[0-9]{5}$"
              )
            // Keep only events whose card number matches a known format.
            | eval is_pan_valid = if(pan_regex != "", match(pan_number, pan_regex), 0)
            | eval is_cc_valid = if(cc_regex != "", match(cc_number, cc_regex), 0)
            | where is_pan_valid=="True" or is_cc_valid=="True"
            // Choose the capture pattern: masking when $display is true, truncation otherwise.
            | eval mask_pattern = case(
                (cc_type == "Visa" or cc_type == "MasterCard" or cc_type == "UnionPay") and $display,
                "^([0-9]{4}[- ]?[0-9]{2})[0-9]{6}([- ]?[0-9]{4})$",
                (cc_type == "Visa" or cc_type == "MasterCard" or cc_type == "UnionPay") and not $display,
                "^([0-9]{4}[- ]?)[0-9]{8}([- ]?[0-9]{4})$",
                cc_type == "Discover" and $display,
                "^([0-9]{4}[- ]?[0-9]{2})[0-9]{7}([- ]?[0-9]{4})$",
                cc_type == "Discover" and not $display,
                "^([0-9]{4}[- ]?)[0-9]{9}([- ]?[0-9]{4})$",
                cc_type == "AMEX" and $display,
                "^([0-9]{4}[- ]?[0-9]{2})[0-9]{4}([- ]?[0-9]{5})$",
                cc_type == "AMEX" and not $display,
                "^([0-9]{4}[- ]?)[0-9]{6}([- ]?[0-9]{5})$"
              )
            // Apply the mask or truncation to cc_field.
            | eval masked_value = if(mask_pattern != "",
                replace(cc_field, mask_pattern, if($display, "$1XX-XXXX$2", "XXXX-XXXX-XXXX$2")),
                cc_field)
            | eval cc_field = masked_value
            // Remove the helper fields before the event is sent on.
            | fields - is_pan_valid, is_cc_valid, pan_regex, cc_regex, mask_pattern, masked_value
        }

        $pipeline = | from $source
            | process_pan display=false
            | eval index="cc_transactions_index"
            | into $destination;

    sourcetype=cc_transactions

    The SPL2 above contains the following parameters:
    - $source - Implicit; takes input from the preceding pipe.
    - $display - Boolean tunable parameter that controls whether the PAN data should be truncated or masked.
        - True - Apply masking if PAN data will be stored and displayed.
        - False - Apply truncation if PAN data will only be stored.

    The allowed credit card formats by card type are:
    - 15-digit card variations for Amex
    - 16-digit card variations for Visa, Mastercard, Discover, and UnionPay
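Before applying a pipeline like this, it can help to try the format checks against sample values offline. The Python sketch below mirrors the separator-tolerant card number check for a subset of card types (the 15- and 16-digit formats listed above). It is an illustration only: it uses Python's `re` module rather than RE2, and the function and dictionary names are assumptions. These patterns use only syntax that both engines share.

```python
import re
from typing import Optional

SIXTEEN_DIGIT = r"^[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}$"
FIFTEEN_DIGIT = r"^[0-9]{4}[- ]?[0-9]{6}[- ]?[0-9]{5}$"

# Card type to expected card number format (subset, for illustration).
CC_REGEX = {
    "Visa": SIXTEEN_DIGIT,
    "MasterCard": SIXTEEN_DIGIT,
    "UnionPay": SIXTEEN_DIGIT,
    "AMEX": FIFTEEN_DIGIT,
}


def is_cc_valid(cc_type: str, cc_number: str) -> bool:
    """Return True if cc_number matches the expected format for cc_type."""
    pattern: Optional[str] = CC_REGEX.get(cc_type)
    if pattern is None:
        return False  # unknown card types never pass the check
    return re.search(pattern, cc_number) is not None


if __name__ == "__main__":
    samples = [
        ("Visa", "4111-1111-1111-1111"),
        ("AMEX", "3782 822463 10005"),
        ("Visa", "4111-1111-1111"),
    ]
    for cc_type, cc_number in samples:
        print(cc_type, cc_number, is_cc_valid(cc_type, cc_number))
```

Events that fail this kind of check do not reach the masking step, so it is worth confirming that your real card number formats pass before relying on the pipeline.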
 Your final pipeline should look like this:  
- After you have constructed your pipeline, follow these instructions to save and apply your pipeline:
    - Test your pipeline rule. In the top right corner of the screen, click the blue Preview button.
    - Set the Data destination to the appropriate index, for example: cc_transactions_index.
    - To save the destination, click Apply.
    - In the top right corner of the screen, click Save pipeline.
    - Give your pipeline a suitable name, for example: cc_transactions_pci_clean_storage.
    - To save your pipeline, click Save.
    - To try out the new pipeline, click Pipelines on the top left of the page.
    - Locate the pipeline you just created, click the three dots next to your new pipeline, and select Apply/remove.
    - Select the Splunk Edge Processor you created earlier and click Save. You will see a brief message stating that your changes are being saved.
 
Test the pipeline
To verify that your pipeline has been successfully applied:
- Log into the Splunk platform and open the Search app.
- Run the following search and verify that you see the events coming from this pipeline:
    index=cc_transactions_index sourcetype=cc_transactions 
- Verify that no prohibited cardholder data elements (for example, a full PAN or other unmasked numbers) appear in the results. Cross-check with the pipeline rules you implemented to ensure compliance.
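Checking results by eye does not scale well, so one option is to export the search results as text and scan them for anything that still looks like a full PAN. The sketch below is a hypothetical helper, not a Splunk feature. It looks for runs of 13 to 19 digits, optionally separated by spaces or hyphens, and uses the Luhn checksum to reduce false positives. The function names and the length range are assumptions.

```python
import re

# 13 to 19 digits, each optionally preceded by a single space or hyphen.
PAN_CANDIDATE = re.compile(r"(?<![0-9])[0-9](?:[ -]?[0-9]){12,18}(?![0-9])")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_unmasked_pans(text: str) -> list[str]:
    """Return substrings of text that look like complete, unmasked PANs."""
    hits = []
    for match in PAN_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(match.group())
    return hits


if __name__ == "__main__":
    exported = [
        "cc_type=Visa cc_number=XXXX-XXXX-XXXX-1111",
        "cc_type=Visa cc_number=4111-1111-1111-1111",
    ]
    for line in exported:
        print(line, "->", find_unmasked_pans(line))
```

An empty result for every exported event suggests that the pipeline is masking or truncating as intended. Any hit is worth investigating, although long numeric identifiers can occasionally pass the checksum by chance.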
Monitor data ingestion
To ensure your data ingestion pipelines comply with PCI DSS requirements, it is critical to perform ongoing monitoring. You should regularly review data metrics to ensure that incoming and outgoing data aligns with expected volumes and formats. For more information, see Monitor system health and activity for Edge Processor.
Next steps
To enhance your use cases and maintain compliance, the Splunk App for PCI Compliance, used with Splunk Enterprise or Splunk Enterprise Security, can help you address challenges related to PCI audits. This app, developed and supported by Splunk, is designed to help organizations meet PCI DSS 4.0 requirements. It evaluates and monitors the effectiveness and current status of PCI compliance technical controls in real time. Additionally, it can identify and prioritize control areas that might need attention, allowing you to promptly address auditor reports or data requests.
The Splunk App for PCI Compliance provides businesses with the advanced tools to meet PCI DSS requirements and automate monitoring. As cyber threats change, utilizing the Splunk platform means being ready for any challenge while maintaining compliance.
In addition, these resources might help you understand and implement this guidance:
- Splunk Resource: Splunk PCI App Demo
- Splunk Lantern Article: Using Splunk Enterprise Security to ensure PCI compliance
- Splunk Lantern Article: Auditing with the Splunk App for PCI Compliance