Onboarding useable and purposeful security data
A common security data workflow at many companies is to recognize that data exists, ingest it into a platform, and then perhaps find a use case for it six months later. The reasoning is that all data is valuable, but in practice this approach creates more problems than it solves. Ingestion without a goal creates a data graveyard: paying to store unnecessary data wastes resources, and the endless updates and maintenance lead to data onboarding fatigue rather than value.
To move beyond this basic level 1 strategy toward a more mature data onboarding process, follow the five steps described in this article.
Solution
Step 1: Ingest request intake
This is a standard first step at any organization, regardless of maturity level. Keep the following in mind for your request process:
- Keep the request form simple.
- Document and track every communication.
- Get a business justification.
- Prioritize the request. After reading the business justification, you might, as the platform owner, assign a different priority than the data owner would.
- Respond.
Step 2: Onboarding questionnaire interviews
During this stage, you want to identify what you are onboarding and why, so that you can prove the value and prioritize the request appropriately.
- A call often works better than a form. People tend to give short, incomplete answers on forms, so you often need to follow up with a call anyway.
- Focus your questions. The following are examples of the answers you should seek:
  - What am I onboarding and why? Consider the background and end state, what has been done already, whether monitoring is standard or bespoke, and whether this is a response to a specific threat or incident.
  - What security effects do I need to achieve? How would you like us to respond to an alert/incident on your service?
  - What is the one event that must be actioned immediately? What about this service would do the most damage if compromised? Which events should get people out of bed?
- Admin questions
  - People: Who is the main contact? Service manager? Stakeholders? Responders?
  - Timeline: What date is onboarding required by? Why is this timing important?
  - Access: Who can and cannot access this data?
- Technical questions
  - Ask for service decompositions, high-level designs, and network diagrams.
  - Describe existing monitoring and data aggregation options.
  - At a high level, identify preferences, patterns, and quick wins.
  - Scale/volume: GB/day, EPS, number of devices?
  - Planned changes, growth, and projects.
  - How are they expecting to send data?
- Administrative details are as important as technical details.
- Understand what’s important and why.
- What are their expectations and desired end state?
- Keep “solutionizing” to a minimum.
Step 3: Threat modeling
At this stage, you begin to engage other functions around a security service, such as threat modeling. This helps you design and develop use cases to make sure that the data you onboard provides value.
- There are hundreds of models, so pick one that suits your organization. The National Cyber Security Centre provides one example.
- Use models to generate use cases (detections) and responses (who to involve and what actions to take) by decomposing each service.
- Document appropriately.
- If in doubt, use Standard Monitoring: Authentication, Network Traffic, Change and Endpoint Protection.
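One way to keep the output of threat modeling actionable is to record each use case together with the data sources it depends on and the response it triggers, and then check whether any proposed data source is not referenced by a use case. The following Python sketch is only an illustration: the `UseCase` class, its field names, and the sample entries are hypothetical and are not part of any Splunk product.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """A detection paired with its response (all fields are illustrative)."""
    name: str
    category: str            # for example, one of the Standard Monitoring areas
    data_sources: set[str]   # data sources the detection depends on
    responders: list[str]    # who to involve when the detection fires
    action: str              # what action to take


def unlinked_sources(proposed: set[str], use_cases: list[UseCase]) -> set[str]:
    """Return proposed data sources that no use case depends on."""
    covered: set[str] = set()
    for use_case in use_cases:
        covered |= use_case.data_sources
    return proposed - covered


# Hypothetical sample entries, for illustration only.
use_cases = [
    UseCase(
        name="Repeated failed admin logins",
        category="Authentication",
        data_sources={"firewall_auth_logs"},
        responders=["SOC on-call"],
        action="Lock the account and contact the service manager",
    ),
    UseCase(
        name="Outbound traffic to a blocked destination",
        category="Network Traffic",
        data_sources={"firewall_traffic_logs"},
        responders=["SOC on-call", "Network team"],
        action="Isolate the source host and open an incident",
    ),
]

proposed = {"firewall_auth_logs", "firewall_traffic_logs", "printer_debug_logs"}
print(sorted(unlinked_sources(proposed, use_cases)))
```

In this sample, `printer_debug_logs` is reported because no use case references it, which makes it a candidate for a conversation about value before any ingestion work starts.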
Step 4: Data source evaluation
For each data source identified during the threat modeling stage, you will need to gather the following information:
- Technology, type, and format
- Get the facts
- Ingestion method/pattern (whether an add-on exists or you need to build one)
- Normalization effort
Here is an example evaluation:
| Data Source Evaluation | |
|---|---|
| Technology details | Malvern Hills Network Firewalls MAL-OS Version 10.2 |
| Data source location | Two hardware firewalls located in primary and secondary data centers |
| Data volume estimate | 80 GB/day; ~12,000 events per second (EPS) at peak |
| Data format | Syslog (default key-value pair format) |
| Proposed ingestion method | Firewalls will forward syslog data to the central Splunk Connect for Syslog (SC4S) server |
| Splunkbase TA available | Yes, but not a good one |
| Splunk supported | No |
| Needs CIM | Yes, Network Traffic data model |
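Daily volume and events per second are linked through the average event size, so a quick calculation can help sanity-check an estimate like the one in the example evaluation. The sketch below is illustrative only: the 500-byte average event size and the use of decimal gigabytes are assumptions, not figures from the evaluation, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1_000_000_000  # decimal gigabytes (an assumption)


def average_eps(gb_per_day: float, avg_event_bytes: float) -> float:
    """Average events per second implied by a daily volume."""
    if avg_event_bytes <= 0:
        raise ValueError("avg_event_bytes must be positive")
    return gb_per_day * BYTES_PER_GB / avg_event_bytes / SECONDS_PER_DAY


def gb_per_day_at(eps: float, avg_event_bytes: float) -> float:
    """Daily volume if the given EPS were sustained for 24 hours."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / BYTES_PER_GB


# 80 GB/day and a 12,000 EPS peak come from the example evaluation;
# the 500-byte average event size is assumed for illustration.
avg = average_eps(80, 500)
print(f"average EPS: {avg:.0f}")
print(f"peak-to-average ratio: {12_000 / avg:.1f}")
print(f"GB/day if peak were sustained: {gb_per_day_at(12_000, 500):.0f}")
```

Under these assumptions, 80 GB/day works out to an average of roughly 1,850 EPS, so a 12,000 EPS peak is about 6.5 times the average. Sustaining that peak all day would mean about 518 GB/day, which is why it is worth recording both the daily volume and the peak rate.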
Step 5: Test, adjust, and promote
In an ideal world, you would have a test environment where you can test the ingestion process end-to-end, including the detections and responses.
- Test
  - Data and compliance
  - Role-based access control
  - Documentation. Make a checklist that includes elements such as the following:
    - Onboarding sheet
    - Linked to use cases
    - CIM compliant enough
    - RBAC
    - Known coverage
    - Analysts briefed/trained
    - Tested in pre-prod
    - Risk owner happy
    - Sent to live
    - Onboarding complete
  - Detections, use cases, and incident response
- Promote to live
  - Analyst walkthroughs: take analysts through the new data source, the use cases it applies to, and any baseline monitoring you have.
  - Add to lifecycle review
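A checklist like the one above can double as a simple gate before promotion. The following Python sketch is one hypothetical way to track it: the item names mirror the checklist entries that come before "Sent to live", and the function names are assumptions made for illustration.

```python
# Checklist items that should be complete before a data source is sent to live.
PRE_PROMOTION_CHECKLIST = [
    "Onboarding sheet",
    "Linked to use cases",
    "CIM compliant enough",
    "RBAC",
    "Known coverage",
    "Analysts briefed/trained",
    "Tested in pre-prod",
    "Risk owner happy",
]


def outstanding_items(completed: set[str]) -> list[str]:
    """Return checklist items not yet completed, in checklist order."""
    return [item for item in PRE_PROMOTION_CHECKLIST if item not in completed]


def ready_to_promote(completed: set[str]) -> bool:
    """A data source is ready only when nothing is outstanding."""
    return not outstanding_items(completed)


# Hypothetical progress for one data source.
done = {
    "Onboarding sheet",
    "Linked to use cases",
    "CIM compliant enough",
    "RBAC",
    "Tested in pre-prod",
}
print(outstanding_items(done))
print(ready_to_promote(done))
```

Keeping the outstanding items in checklist order makes the result easy to paste into the onboarding sheet or a status update.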
Summary
- Start with "Why": Sort your use cases first to make sure the data you are bringing in adds value, and then worry about onboarding data.
- Document everything: But don’t get carried away. Make sure there's a reason for the documentation and that you have a maintenance plan for it. If possible, create the documentation inside the Splunk platform - such as dashboards and data inventories - so that analysts can access it easily.
- Make patterns: Onboarding patterns simplify your work. But also expect to break them. The larger your organization gets, the less likely every service you have will fit the pattern.
- CIM compliant enough: 100% compliance is unachievable, but compliant "enough" is usually very achievable. The goal is to make sure your data fits your use cases and you can take advantage of the out-of-the-box content in security tools.
- Connect data to a response: Every data source should be linked to a use case and a clear action.
Next steps
Now that you know the steps to take to achieve level 2 maturity in security data onboarding, watch the full .conf25 talk, From request to response: Mastering security data onboarding. In the talk, you'll learn about levels 3 and 4 and what it takes to further refine your onboarding process.
- Splunk Community: Best practices for defining source types
- Splunk Lantern: Conducting a SIEM use case development workshop
- Splunk Lantern: Conducting an insider threat workshop in your organization
- Splunk Lantern: Following data onboarding best practices

