EDI: Finding repeated failure patterns
This article shows you how to build a search that looks for patterns of repeated transmission or acknowledgment errors, which might indicate a systemic issue. This is a key performance indicator for companies that need to monitor electronic data interchange (EDI) transmission and acknowledgment.
EDI plays a critical role in ensuring that data flows seamlessly across various stakeholders—suppliers, manufacturers, logistics providers, and retailers—without manual intervention. It is a core technology for automating supply chain processes. By continuously monitoring EDI transmissions and acknowledgments, businesses can proactively identify and resolve issues, ensuring smooth communication between systems and uninterrupted supply chain operations.
KPI search example
Message send retries occur when an EDI document (for example, a purchase order or an advance ship notice) is not acknowledged by the receiving system, prompting the sender to resend it. These retries can indicate issues such as communication failures, timeouts, or system errors that prevent the successful transmission or receipt of the document. Symptoms include delayed processing, missing acknowledgments (for example, EDI 997), and a buildup of queued messages waiting for confirmation.
Monitoring message retries is critical for maintaining smooth supply chain operations. Unchecked retries can lead to transmission delays, which in turn disrupt procurement, production, or shipping timelines. Additionally, repeated failed attempts might result in duplicate orders or transactions, causing confusion and operational inefficiencies. By monitoring retries, companies can proactively address communication issues before they escalate into more severe disruptions.
It is important to monitor for:
- High retry rates: A spike in message retries can indicate persistent transmission issues, such as network outages or system incompatibility.
- Repeated errors for specific documents: If a particular type of document (for example, an 850 purchase order) or a particular trading partner frequently experiences retries, there might be configuration problems or partner-specific communication issues.
- Unresolved retries: Continuous retries without resolution might lead to message failures, signaling a need for immediate intervention.
Using Splunk SPL, it's easy to identify recurring event patterns. First, we count the events for each combination of transaction ID and EDI code, excluding 997 acknowledgments. Any count greater than one means the same EDI code repeats within a transaction. The search below uses a stricter threshold (evt_cnt>2), so it returns only codes that appear three or more times in the same transaction, signaling a potential loop.
index=supply_chain_edi sourcetype="edi:x12" source=edi_quantumline_customer_full NOT edi_code=997 | eval edi_ack_status_combo=edi_code+"-"+edi_code_ack+"-"+edi_ack_status | eval edi_code_groupby=if(isnull(edi_code_ack), edi_code, edi_code_ack) | strcat edi_code "-" edi_ack_status edi_event_pattern_combo | stats count AS evt_cnt BY edi_tr_id edi_code | search evt_cnt>2
The results show that EDI code 856, the Advance Ship Notice (ASN), appears multiple times for certain transactions. This indicates that multiple ASNs were sent without receiving any acknowledgment, meaning the system is repeatedly attempting to send the same message, creating a loop.
As an additional step, we can now verify the event patterns for the affected transactions. By using a subsearch, we can automatically filter for transactions that exhibit repetition. The subsearch is enclosed in square brackets "[ ]". It runs first and returns the transaction IDs where repeated EDI codes have occurred, and the outer search then retrieves every event for those transaction IDs. Here is the complete search with the subsearch applied.
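To try the counting step without access to the EDI index, the following is a minimal sketch that generates six synthetic events with makeresults and applies the same stats and threshold logic. The transaction IDs (TR-1001, TR-1002) and the mix of codes are hypothetical values invented for illustration; only the field names edi_tr_id and edi_code come from the search above.

```spl
| makeresults count=6
| streamstats count AS n
| eval edi_tr_id=if(n<=4, "TR-1001", "TR-1002")
| eval edi_code=case(n<=3, "856", n<=5, "850", true(), "810")
| stats count AS evt_cnt BY edi_tr_id edi_code
| where evt_cnt > 2
```

With this sample data, only the combination of TR-1001 and code 856 has three events, so it is the only row that passes the evt_cnt > 2 filter. Lowering the threshold to evt_cnt > 1 would flag any code that appears twice.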
index=supply_chain_edi sourcetype="edi:x12" NOT edi_code=997 [ search index=supply_chain_edi sourcetype="edi:x12" source=edi_quantumline_customer_full NOT edi_code=997 | eval edi_ack_status_combo=edi_code+"-"+edi_code_ack+"-"+edi_ack_status | eval edi_code_groupby=if(isnull(edi_code_ack), edi_code, edi_code_ack) | strcat edi_code "-" edi_ack_status edi_event_pattern_combo | stats count AS evt_cnt BY edi_tr_id edi_code | search evt_cnt>2 | table edi_tr_id ] | table _time edi_tr_id edi_code edi_type edi_ack_status edi_buyer edi_code_ack edi_cont_num edi_date edi_flag edi_requestor edi_responder edi_seller edi_sequence edi_time | sort edi_tr_id, _time
The result identifies two transactions with repeated EDI 856 (ASN) messages, indicating that the system is or was stuck in a loop sending these messages.
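To judge whether a loop is still running or has already stopped, it can help to see how long the repetition lasted. The following sketch extends the counting search with the first and last event time per transaction and code. It assumes that _time reflects when each message was sent, which the article does not state, and the output field names (first_seen, last_seen, loop_duration_sec) are illustrative.

```spl
index=supply_chain_edi sourcetype="edi:x12" NOT edi_code=997
| stats count AS evt_cnt min(_time) AS first_seen max(_time) AS last_seen BY edi_tr_id edi_code
| where evt_cnt > 2
| eval loop_duration_sec = last_seen - first_seen
| eval first_seen = strftime(first_seen, "%Y-%m-%d %H:%M:%S"), last_seen = strftime(last_seen, "%Y-%m-%d %H:%M:%S")
| sort - loop_duration_sec
```

A last_seen value close to the current time suggests the sender is still retrying, while an older value suggests the loop has ended on its own or was stopped manually.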
Search explanations
Splunk search | Explanation |
---|---|
`index=supply_chain_edi sourcetype="edi:x12" NOT edi_code=997` | Select EDI X12 events by filtering on the edi:x12 sourcetype, and exclude 997 acknowledgments. |
`[ search index=supply_chain_edi sourcetype="edi:x12" source=edi_quantumline_customer_full NOT edi_code=997` | Subsearch that looks for repeated EDI messages per transaction ID. It returns the transaction IDs with repeated EDI messages, which indicates a loop in the process. |
`\| table _time edi_tr_id edi_code edi_type edi_ack_status edi_buyer edi_code_ack edi_cont_num edi_date edi_flag edi_requestor edi_responder edi_seller edi_sequence edi_time` | Show the results in a table, so that the EDI messages stuck in a loop are easy to see. |
`\| sort edi_tr_id, _time` | Sort by transaction ID, then by event time. |
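
The monitoring list earlier in this article also calls out repeated errors for specific documents or trading partners. As a rough sketch of how that breakdown might look, the following search counts looping transactions per EDI code and seller. It assumes that edi_seller identifies the sending trading partner; the field appears in the table command above, but its exact meaning is an assumption here. Events without an edi_seller value are dropped by the BY clause.

```spl
index=supply_chain_edi sourcetype="edi:x12" NOT edi_code=997
| stats count AS evt_cnt BY edi_tr_id edi_code edi_seller
| where evt_cnt > 2
| stats dc(edi_tr_id) AS looping_transactions sum(evt_cnt) AS repeated_events BY edi_code edi_seller
| sort - looping_transactions
```

A single partner or document type at the top of this list points toward a partner-specific configuration problem rather than a general network or platform issue.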
Next steps
When you have this search running in your Splunk platform, return to the Monitoring electronic data interchange transmission and acknowledgement use case to learn how to share the results with stakeholders and to find other KPIs you might want to measure. You can also review the Solution Accelerator for Supply Chain Optimization for more great use cases to help you use the Splunk platform to be successful in your supply chain operations.