Onboarding useable and purposeful security data

A common security data workflow at many companies is to recognize that they have data, ingest it into their software, and then maybe find a use case for it six months later. The reasoning is that all data is valuable, but in reality this approach creates more problems than it solves. Goal-less ingestion creates a data graveyard: paying to store unnecessary data wastes resources and leads to data onboarding fatigue, with endless updates and maintenance that deliver no value.

To move beyond this basic level 1 strategy to a more mature data onboarding process, use the five steps described in this article.

Solution

Step 1: Ingest request intake

This is a standard first step at any organization, regardless of maturity level. Keep the following in mind for your request process:

Keep the request form simple.
Document and track every communication.
Get a business justification.
Prioritize the request. After reading the business justification, your request prioritization level as the platform owner might not be the same as that of the data owner.
Respond.
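
The intake points above can be captured in a small record so that every request carries its justification, both priority views, and a communication log. The following is a minimal sketch, not a prescribed schema: the class name, field names, priority levels, and sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import IntEnum
from typing import List, Optional


class Priority(IntEnum):
    """Hypothetical three-level priority scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class IngestRequest:
    """One data onboarding request. All field names are illustrative."""
    data_source: str
    requester: str
    business_justification: str
    requester_priority: Priority
    # Set by the platform owner after reading the justification.
    platform_priority: Optional[Priority] = None
    communications: List[str] = field(default_factory=list)

    def log(self, when: date, note: str) -> None:
        """Record a dated note so every communication is tracked."""
        self.communications.append(f"{when.isoformat()}: {note}")

    def is_ready_for_interview(self) -> bool:
        """Ready once a justification exists and the platform owner has prioritized."""
        has_justification = bool(self.business_justification.strip())
        return has_justification and self.platform_priority is not None


# Example usage with made-up values.
request = IngestRequest(
    data_source="Network firewalls",
    requester="network-team@example.com",
    business_justification="Detect unauthorized outbound connections",
    requester_priority=Priority.HIGH,
)
request.log(date(2025, 1, 6), "Request received and acknowledged")
request.platform_priority = Priority.MEDIUM  # may differ from the requester's view
```

Keeping the requester's priority and the platform owner's priority as separate fields makes any disagreement visible instead of silently overwriting one with the other.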

Step 2: Onboarding questionnaire interviews

During this stage, you want to identify what you are onboarding and why, so that you can prove the value and prioritize the request appropriately.

A call often works better than a form. People tend to give short, incomplete answers on forms, so you often need to follow up with a call anyway.
Focus your questions. The following are examples of what answers you should be seeking:
- What am I onboarding and why? Background and end state, what's been done already, standard or bespoke monitoring, is this a response to a specific threat or incident?
- What security effects do I need to achieve? How would you like us to respond to an alert/incident on your service?
- What is the one event that must be actioned immediately? What about this service would do the most damage if compromised? Which events should get people out of bed?
- Admin questions
  - People: Who is the main contact? Service manager? Stakeholders? Responders?
  - Timeline: Date onboarding is required by? Why are these times important?
  - Access: Who can and cannot access this data?
- Technical questions
  - Ask for service decompositions, high level designs, and network diagrams
  - Describe existing monitoring, data aggregation options
  - At a high level, identify preferences, patterns, and quick wins
  - Scale/volume: GB/day, EPS, number of devices?
  - Planned changes, growth, projects
  - How do they expect to send data?
Administrative details are as important as technical details.
Understand what’s important and why.
Understand their expectations and desired end state.
Keep “solutionizing” to a minimum.

Step 3: Threat modeling

At this stage, you begin to engage other functions around a security service, such as threat modeling. This helps you design and develop use cases to make sure that the data you onboard provides value.

There are hundreds of models, so pick one that suits your organization. One example comes from the National Cyber Security Centre.
Use models to generate use cases (detections) and responses (who to involve and what actions to take) by decomposing each service.
Document appropriately.
If in doubt, use Standard Monitoring: Authentication, Network Traffic, Change and Endpoint Protection.
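
The guidance above, generating a detection and a response for each service and falling back to Standard Monitoring when in doubt, can be sketched as follows. This is an illustration under assumptions rather than a threat modeling method: the `UseCase` fields, the service names, the responder roles, and the fallback action text are hypothetical. Only the four Standard Monitoring categories come from the list above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# The four Standard Monitoring categories named in the step above.
STANDARD_MONITORING = (
    "Authentication",
    "Network Traffic",
    "Change",
    "Endpoint Protection",
)


@dataclass(frozen=True)
class UseCase:
    """A detection paired with its response. Field names are illustrative."""
    service: str
    detection: str
    responders: Tuple[str, ...]
    action: str


def use_cases_for(service: str, modeled: List[UseCase]) -> List[UseCase]:
    """Return the modeled use cases for a service, or the standard fallback."""
    specific = [uc for uc in modeled if uc.service == service]
    if specific:
        return specific
    # No model output for this service: fall back to Standard Monitoring.
    return [
        UseCase(
            service=service,
            detection=f"Standard monitoring: {category}",
            responders=("SOC analyst",),          # hypothetical default responder
            action="Triage and escalate per runbook",  # hypothetical default action
        )
        for category in STANDARD_MONITORING
    ]


# Example usage with made-up services.
modeled = [
    UseCase(
        service="vpn",
        detection="Successful login from a new country",
        responders=("SOC analyst", "Service manager"),
        action="Confirm with the user and disable the account if unverified",
    ),
]
vpn_cases = use_cases_for("vpn", modeled)
payroll_cases = use_cases_for("payroll", modeled)
```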

Step 4: Data source evaluation

For each data source identified during the threat modeling stage, you will need to gather the following information:

Technology, type, and format
Get the facts
Ingestion method/pattern (whether an add-on exists or you need to build one)
Normalization effort

Here is an example evaluation:

Data Source Evaluation
Technology details	Malvern Hills Network Firewalls MAL-OS Version 10.2
Data source location	Two hardware firewalls located in primary and secondary data centers
Data volume estimate	80 GB/day; ~12,000 events per second (EPS) at peak
Data format	Syslog (default key-value pair format)
Proposed ingestion method	Firewalls will forward syslog data to the central Splunk Connect for Syslog (SC4S) server
Splunkbase TA available	Yes, but not a good one
Splunk supported	No
Needs CIM	Yes, Network traffic data model
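
The volume figures in an evaluation like this one can be sanity-checked with simple arithmetic. The sketch below converts the daily volume to an average rate and compares it with the peak rate. It assumes decimal gigabytes (10^9 bytes) and an average event size of 300 bytes. The event size is a hypothetical placeholder and not part of the evaluation, so measure real events before relying on the result.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_bytes_per_second(gb_per_day: float) -> float:
    """Average ingest rate, assuming 1 GB = 10**9 bytes."""
    return gb_per_day * 1e9 / SECONDS_PER_DAY


def estimated_average_eps(gb_per_day: float, avg_event_bytes: float) -> float:
    """Average events per second implied by daily volume and event size."""
    return average_bytes_per_second(gb_per_day) / avg_event_bytes


def peak_bytes_per_second(peak_eps: float, avg_event_bytes: float) -> float:
    """Throughput the ingestion path must absorb at peak."""
    return peak_eps * avg_event_bytes


# Figures from the example evaluation.
GB_PER_DAY = 80
PEAK_EPS = 12_000
# Assumption: hypothetical average event size, not taken from the evaluation.
ASSUMED_EVENT_BYTES = 300

avg_rate = average_bytes_per_second(GB_PER_DAY)
avg_eps = estimated_average_eps(GB_PER_DAY, ASSUMED_EVENT_BYTES)
peak_rate = peak_bytes_per_second(PEAK_EPS, ASSUMED_EVENT_BYTES)

print(f"Average rate: {avg_rate / 1e6:.2f} MB/s")
print(f"Estimated average EPS: {avg_eps:.0f}")
print(f"Peak rate: {peak_rate / 1e6:.2f} MB/s")
print(f"Peak-to-average ratio: {PEAK_EPS / avg_eps:.1f}")
```

A large gap between the peak and average rates suggests sizing the syslog collection tier for the peak rather than for the daily average.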

Step 5: Test, adjust, and promote

In an ideal world, you would have a test environment where you can test the ingestion process end-to-end, including the detections and responses.

Test
- Data and compliance
- Role-based access control
- Documentation. Make a checklist that includes elements such as the following:
  - Onboarding sheet
  - Linked to use cases
  - CIM compliant enough
  - RBAC
  - Known coverage
  - Analysts briefed/trained
  - Tested in pre-prod
  - Risk owner happy
  - Sent to live
  - Onboarding complete
- Detections, use cases and incident response
Promote to live
- Analyst walkthroughs: take analysts through the new data source, the use cases it applies to, and any baseline monitoring you have
- Add to lifecycle review

Summary

Start with "Why": Sort your use cases first to make sure the data you are bringing in adds value, and then worry about onboarding data.
Document everything: But don’t get carried away. Make sure there's a reason for the documentation and that you have a maintenance plan for it. If possible, create the documentation inside the Splunk platform, for example as dashboards and data inventories, so that analysts can access it easily.
Make patterns: Onboarding patterns simplify your work. But also expect to break them. The larger your organization gets, the less likely every service you have will fit the pattern.
CIM compliant enough: 100% compliance is unachievable, but compliant "enough" is usually very achievable. The goal is to make sure your data fits your use cases and you can take advantage of the out-of-the-box content in security tools.
Connect data to a response: Every data source should be linked to a use case and a clear action.

Next steps

Now that you know the steps to take to achieve level 2 maturity in security data onboarding, watch the full .conf25 talk, From request to response: Mastering security data onboarding. In the talk, you'll learn about levels 3 and 4 and what it takes to further refine your onboarding process.

Splunk Community: Best practices for defining source types
Splunk Lantern: Conducting a SIEM use case development workshop
Splunk Lantern: Conducting an insider threat workshop in your organization
Splunk Lantern: Following data onboarding best practices