Analyzing nested manufacturing QA data
Manufacturing Quality Assurance (QA) systems generate complex, hierarchical test data that can be challenging to analyze in traditional formats. Each product's test results can span hundreds to thousands of lines in a single event. This data contains both high-level test run information and detailed individual measurement results in a nested structure.
This article shows you how to configure the Splunk platform to ingest, parse, and transform nested manufacturing QA data into actionable insights. You'll learn how to handle multi-line events that contain both top-level metadata and repeating nested test measurements, then flatten this structure. This enables detailed analysis of individual measurements, correlation with product identifiers and test conditions, and QA performance trend monitoring.
Data required
About QA test data formatting
Manufacturing QA test results from automated test equipment arrive as long, multi-line events, each representing a single product's complete test results. These events can span hundreds to thousands of lines and contain two types of data:
- Top-level data attributes: Key-value pairs with overall test run information (timestamp, unit serial number, process location/ID). These serve as metadata for the processing unit.
- Nested data instances: Individual test measurement results enclosed in XML tags, JSON payloads "{ }", or other customer-specific formats. Each instance contains multiple readings and process run statuses. The number of instances varies by unit, creating a hierarchical structure beneath the top-level attributes.
Sample data structure
Start_Test
Start_Header
TIME_START 10/10/2025 08:07:00
UNIT_ID 1234567890
PROD_TYPE |ZTT177|01.04.98
VERSION 876543_00 | 876543
PROCESS_ID 11-56
OVERALL_RESULT PASS
End_Header
Start_Units
UNIT_NAME ANON_UNIT_NAME_1
{ANON_TEST_1, Bad material, 0.0 , 0.0 , 0.0 , 1.0 , PASS}
{ANON_TEST_2, CONT, 0.0 , 0.0 , 0.0 , 0.0 , PASS}
{ANON_TEST_3, CONT, 0.0 , 0.0 , 0.0 , 0.0 , PASS}
{ANON_TEST_4, 1, 0.0 Ohm, 15.0 Ohm, 10.0 Ohm, 100.0 Ohm, PASS}
{ANON_TEST_5, CONT1, 5.0 Ohm, 1.8967 Ohm, Open, 10.0 Ohm, PASS}
{ANON_TEST_6, CONT1, 5.0 Ohm, 1.217 Ohm, Open, 20.0 Ohm, PASS}
{ANON_TEST_7, CONT1, 5.0 Ohm, 1.1515 Ohm, Open, 20.0 Ohm, PASS}
{ANON_TEST_8, OPEN, 10.0 KOhm, 9.143 KOhm, 5.0 KOhm, Open, PASS}
{ANON_TEST_9, CONT, 3.0 Ohm, 1.1191 Ohm, Open, 5.0 Ohm, PASS}
{ANON_TEST_10, CONT, 2.0 Ohm, 1.0938 Ohm, Open, 3.0 Ohm, PASS}
{ANON_TEST_11, CONT, 2.0 Ohm, 1.1913 Ohm, Open, 3.0 Ohm, PASS}
{ANON_TEST_12, CONT, 2.0 Ohm, 0.9805 Ohm, Open, 3.0 Ohm, PASS}
{ANON_TEST_13, CONT, 2.0 Ohm, 1.0428 Ohm, Open, 3.0 Ohm, PASS}
End_Units
End_Test
The challenge lies in parsing these large events to extract both top-level context and detailed nested measurements.
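Before wiring this into the Splunk platform, it can help to see the two-layer structure in isolation. The following standalone Python sketch (not part of the Splunk pipeline; the event is abridged from the sample above and the regexes are simplified for illustration) separates one event into top-level attributes and nested records:

```python
import re

# Abridged version of the sample event shown above.
SAMPLE_EVENT = """Start_Test
Start_Header
TIME_START 10/10/2025 08:07:00
UNIT_ID 1234567890
OVERALL_RESULT PASS
End_Header
Start_Units
UNIT_NAME ANON_UNIT_NAME_1
{ANON_TEST_1, Bad material, 0.0 , 0.0 , 0.0 , 1.0 , PASS}
{ANON_TEST_2, CONT, 0.0 , 0.0 , 0.0 , 0.0 , PASS}
End_Units
End_Test"""

def parse_event(raw: str):
    """Split one test event into top-level key-value pairs and nested records."""
    # Top-level attributes: KEY <spaces> VALUE on one line; block markers like
    # Start_Test have no value token, and brace lines are excluded.
    header = dict(
        re.findall(r"^[ \t]*([^\{\s]\S*)[ \t]+(.+)$", raw, flags=re.MULTILINE)
    )
    # Nested instances: everything enclosed in curly braces.
    nested = re.findall(r"\{[^\}]+\}", raw)
    return header, nested

header, nested = parse_event(SAMPLE_EVENT)
print(header["UNIT_ID"])  # 1234567890
print(len(nested))        # 2
```

The same separation is what the props.conf and transforms.conf settings below perform at ingestion time.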
How to use Splunk software for this use case
First, you'll apply some specific configurations to the props.conf and transforms.conf files within your environment. You'll then run a search to transform the data into a flat, tabular format suitable for analysis within the Splunk platform while preserving relationships between metadata and nested records.
props.conf configuration
The props.conf file controls how the Splunk platform processes raw data at ingestion. Add these settings under the [qa_data_nested_hierarchies] stanza:
[qa_data_nested_hierarchies]
BREAK_ONLY_BEFORE = Start_Test
MAX_EVENTS = 10000
TRUNCATE = 0
LINE_BREAKER = Start_Test
TIME_PREFIX = TIME_START\s+
REPORT-qa_data_nested_kv_extract = qa_data_nested_kv_extract
REPORT-qa_data_nested_01 = qa_data_nested_all
Setting explanations
- BREAK_ONLY_BEFORE = Start_Test: Treats lines starting with Start_Test as new event boundaries. Ensures proper event breaking if multiple test blocks appear in one file.
- MAX_EVENTS = 10000: Sets the maximum number of lines per event.
- TRUNCATE = 0: Ensures that the Splunk platform ingests the entire event regardless of length. Prevents truncation of events with 5,000+ lines.
- LINE_BREAKER = Start_Test: Regular expression defining event break locations. Works with BREAK_ONLY_BEFORE to establish event boundaries.
- TIME_PREFIX = TIME_START\s+: Identifies the string preceding timestamps, enabling correct time parsing.
- REPORT-qa_data_nested_kv_extract = qa_data_nested_kv_extract: Applies the qa_data_nested_kv_extract transform to extract initial key-value pairs.
- REPORT-qa_data_nested_01 = qa_data_nested_all: Applies the qa_data_nested_all transform to extract nested data blocks.
transforms.conf configuration
The transforms.conf file defines custom transformations for events. Add these settings under the [qa_data_nested_kv_extract] and [qa_data_nested_all] stanzas:
[qa_data_nested_kv_extract]
KV_MODE = auto
REGEX = ^\s*(?<_KEY_1>[^\{]\S+)\s+(?<_VAL_1>\S+)$
[qa_data_nested_all]
REGEX = (?P<qa_data_nested_all>\{[^\}]+\})
MV_ADD = true
Transform explanations
[qa_data_nested_kv_extract]: Extracts higher-level metadata attached to each nested record.
- KV_MODE = auto: Automatically extracts key-value pairs using the REGEX pattern.
- REGEX = ^\s*(?<_KEY_1>[^\{]\S+)\s+(?<_VAL_1>\S+)$: Captures key-value pairs at line beginnings, excluding lines starting with curly braces.
[qa_data_nested_all]: Extracts individual nested records.
- REGEX = (?P<qa_data_nested_all>\{[^\}]+\}): Identifies and extracts all text within curly braces {...}, placing the extracted content into the qa_data_nested_all field.
- MV_ADD = true: Critical for nested data handling. Creates a multi-value field when multiple {...} blocks exist, preserving all nested instances for processing.
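To check how these two patterns behave before deploying them, you can exercise them outside the Splunk platform. This Python sketch translates the transforms.conf regexes into Python's named-group syntax (in Splunk, the `_KEY_1`/`_VAL_1` pair becomes a dynamic field name, which Python does not replicate; the sample lines are taken from the data above):

```python
import re

# transforms.conf patterns, rewritten with Python's (?P<name>...) group syntax.
KV_PATTERN = re.compile(r"^\s*(?P<_KEY_1>[^\{]\S+)\s+(?P<_VAL_1>\S+)$")
NESTED_PATTERN = re.compile(r"(?P<qa_data_nested_all>\{[^\}]+\})")

lines = [
    "UNIT_ID 1234567890",
    "PROCESS_ID 11-56",
    "{ANON_TEST_2, CONT, 0.0 , 0.0 , 0.0 , 0.0 , PASS}",
]

# Key-value extraction: matches plain KEY VALUE lines, skips brace lines.
for line in lines:
    m = KV_PATTERN.match(line)
    if m:
        print(m.group("_KEY_1"), "->", m.group("_VAL_1"))

# MV_ADD = true behavior: every brace-delimited block is kept, not just the first.
event = " ".join(lines)
print(NESTED_PATTERN.findall(event))
```

Note that the key-value pattern requires a single-token value, so multi-token values (such as the TIME_START timestamp) are handled by the timestamp settings in props.conf rather than this transform.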
SPL commands for processing nested data
Run the following search to refine the data, flatten the nested structure, and extract individual fields from measurements. You can optimize it by specifying an index and adjusting the time range.
sourcetype="mfg_qa_nested_event"
| rex field=source "(?<file_name>[^\/]+)\.(mdl|tdx)"
| stats values(*) AS * BY file_name
| table file_name SerialnumberDC UName BID TS FIX qa_data_nested_all
| mvexpand qa_data_nested_all
| rex field=qa_data_nested_all "{(?<field_1>[^\,]+),(?<field_2>[^\,]+),(?<field_3>[^\,]+),(?<field_4>[^\,]+),(?<field_5>[^\,]+),(?<field_6>[^\,]+),(?<field_7>[^\,\}]+)}"
| eval _raw=file_name+"|"+SerialnumberDC+"|"+UName+"|"+BID+"|"+TS+"|"+FIX+"|"+qa_data_nested_all
| fields - qa_data_nested_all _raw
Search explanation
The table provides an explanation of what each part of this search achieves. You can adjust this search based on the specifics of your environment.
| Splunk Search | Explanation |
|---|---|
| `sourcetype="mfg_qa_nested_event"` | Filters events from the manufacturing QA data source type. |
| `\| rex field=source "(?<file_name>[^\/]+)\.(mdl\|tdx)"` | Extracts the `file_name` field from the source file path. |
| `\| stats values(*) AS * BY file_name` | Groups events by `file_name`, consolidating all extracted field values for each file. |
| `\| table file_name SerialnumberDC UName BID TS FIX qa_data_nested_all` | Selects relevant fields for processing. `qa_data_nested_all` contains the raw nested data blocks extracted by transforms.conf. |
| `\| mvexpand qa_data_nested_all` | The pivotal flattening command. Takes the multi-value field `qa_data_nested_all` and creates a separate row for each value, duplicating the top-level fields onto each row. |
| `\| rex field=qa_data_nested_all "{(?<field_1>[^\,]+),(?<field_2>[^\,]+),(?<field_3>[^\,]+),(?<field_4>[^\,]+),(?<field_5>[^\,]+),(?<field_6>[^\,]+),(?<field_7>[^\,\}]+)}"` | Parses each `{...}` block, splitting its comma-separated contents into `field_1` through `field_7`. |
| `\| eval _raw=file_name+"\|"+SerialnumberDC+"\|"+UName+"\|"+BID+"\|"+TS+"\|"+FIX+"\|"+qa_data_nested_all` | Reconstructs the `_raw` field as a pipe-delimited string combining the top-level metadata with each nested record. |
| `\| fields - qa_data_nested_all _raw` | Removes the original `qa_data_nested_all` and `_raw` fields from the final results. |
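The mvexpand-plus-rex flattening step can be mimicked outside the Splunk platform to reason about what it produces. This Python sketch uses a hypothetical grouped event (field values are illustrative, mirroring the stage just before `mvexpand`):

```python
import re

# Hypothetical grouped event: top-level metadata plus a multi-value nested field.
row = {
    "file_name": "1010023314361000-demo",
    "UNIT_ID": "1234567890",
    "qa_data_nested_all": [
        "{ANON_TEST_4, 1, 0.0 Ohm, 15.0 Ohm, 10.0 Ohm, 100.0 Ohm, PASS}",
        "{ANON_TEST_9, CONT, 3.0 Ohm, 1.1191 Ohm, Open, 5.0 Ohm, PASS}",
    ],
}

# Equivalent of the rex in the search: split each {...} block into field_1..field_7.
REC = re.compile(
    r"\{(?P<field_1>[^,]+),(?P<field_2>[^,]+),(?P<field_3>[^,]+),"
    r"(?P<field_4>[^,]+),(?P<field_5>[^,]+),(?P<field_6>[^,]+),(?P<field_7>[^,\}]+)\}"
)

flattened = []
for block in row["qa_data_nested_all"]:  # mvexpand: one output row per value
    m = REC.match(block)
    # Duplicate the top-level fields onto each flattened row.
    record = {k: v for k, v in row.items() if k != "qa_data_nested_all"}
    record.update({k: v.strip() for k, v in m.groupdict().items()})
    flattened.append(record)

print(len(flattened))                                    # 2
print(flattened[0]["field_1"], flattened[0]["field_7"])  # ANON_TEST_4 PASS
```

Each nested record becomes its own row while retaining the unit's metadata, which is exactly the shape shown in the results below.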
Results
The search produces a flattened dataset where each row represents an individual measurement with preserved top-level attributes:
| # | file_name | SerialnumberDC | UName | BID | TS | FIX | field_1 | field_2 | field_3 | field_4 | field_5 | field_6 | field_7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1010023314361000-... | 1010023314361000 | Unit_101... | 1010023314361000 | HWS\|MTS300... | HWBA\|I99;... | Panel SetSerialNumbers\|MesItac_DoMerge | Wrong material? | 0.0 | 0.0 | 0.0 | 1.0 | PASS |
| 49 | 1010023314361000-... | 1010023314361000 | Unit_101... | 1010023314361000 | HWS\|MTS300... | HWBA\|I99;... | CONT DCDC_3V3 @P6 P102 P133 | CONT | 0.0 | 0.0 | 0.0 | 0.0 | PASS |
| 189 | 1010023314361000-... | 1010023314361000 | Unit_101... | 1010023314361000 | HWS\|MTS300... | HWBA\|I99;... | NET short test | 1 | 0.0 Ohm | 15.0 Ohm | 10.0 Ohm | 100.0 Ohm | PASS |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
This structure enables direct analysis of test parameters (field_1 to field_7) for specific products using the preserved header information (file_name, SerialnumberDC, BID, etc.).
Next steps
Now that your data has been restructured and transformed, you can perform further analysis, such as:
- Statistical analysis: Calculate averages, standard deviations, or ranges for measurements (field_3 to field_7) to identify deviations from norms.
- Trend monitoring: Track test result changes over time for products or batches to detect performance degradation or process shifts.
- Anomaly detection: Create alerts for test failures (field_7 = "FAIL") or out-of-spec measurements (field_4, field_5, field_6 outside thresholds).
- Root cause analysis: Correlate test results with operational data (machine logs, operator actions) to identify quality issue causes.
- Dashboard creation: Build interactive dashboards to visualize QA performance, highlight failing tests, and provide quality overviews.
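As a minimal illustration of the anomaly-detection idea (using hypothetical flattened rows and an illustrative threshold, outside the Splunk platform):

```python
# Hypothetical flattened rows, mirroring the search output above.
rows = [
    {"field_1": "ANON_TEST_4", "field_4": "15.0 Ohm", "field_7": "PASS"},
    {"field_1": "ANON_TEST_5", "field_4": "1.8967 Ohm", "field_7": "PASS"},
    {"field_1": "ANON_TEST_X", "field_4": "250.0 Ohm", "field_7": "FAIL"},
]

# Alert condition 1: explicit test failures.
failures = [r["field_1"] for r in rows if r["field_7"] == "FAIL"]

# Alert condition 2: out-of-spec measured values (100.0 Ohm limit is illustrative).
def ohms(value: str) -> float:
    """Parse a '<number> Ohm'-style reading into a float."""
    return float(value.split()[0])

out_of_spec = [r["field_1"] for r in rows if ohms(r["field_4"]) > 100.0]

print(failures)     # ['ANON_TEST_X']
print(out_of_spec)  # ['ANON_TEST_X']
```

In practice you would express these conditions as SPL `where` clauses feeding a scheduled alert rather than external code.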
In addition, these resources might help you understand and implement this guidance:
- Splunk Lantern Article: Analyzing nested JSON manufacturing QA data
- Splunk Lantern Article: Analyzing nested XML manufacturing QA data

