Data optimization
The amount of security log data is increasing, and the logs are becoming more complex. Security analysts struggle to balance the value of that data against the cost of storing it. New options like distributed data storage and federated warehouses seem appealing, but it is challenging to route data efficiently to suitable storage while ensuring access based on search demand. Many security monitoring tools restrict the types of data they can ingest, which makes data ingestion cumbersome; they lack enrichment capabilities; and they don't use a common data format to normalize data. These issues prevent teams from performing end-to-end investigations.
The Splunk platform is a vital tool to help overcome these problems and maintain foundational visibility. When planning how to optimize your data sources for best use in the Splunk platform, you'll need to plan a few key tasks: normalizing your data, enriching it, and ensuring that your data availability and retention practices are robust.
Normalization
Data normalization means ingesting and storing your data in the Splunk platform in a common format, for consistency and efficiency. When you onboard a new data source or review existing data, check whether it follows the same format as other data of similar types and categories. Consistent formatting saves time and improves performance for both ad hoc and scheduled searches.
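For example, you can map vendor-specific field names to their Common Information Model (CIM) equivalents at search time. The following props.conf sketch is illustrative only: the `vendor:firewall` sourcetype and the field names `src_address`, `dst_address`, and `act` are hypothetical, so substitute whatever fields your data source actually produces.

```
# props.conf -- illustrative sketch; sourcetype and field names are hypothetical
[vendor:firewall]
# Alias vendor fields to the CIM field names used by network data models
FIELDALIAS-cim_src  = src_address AS src
FIELDALIAS-cim_dest = dst_address AS dest
# Normalize the vendor's action values to the values the CIM expects
EVAL-action = case(act=="allow", "allowed", act=="deny", "blocked", true(), "unknown")
```

In practice, Splunk-supported technology add-ons handle most of this mapping for you; you typically only need to write your own aliases for custom or unsupported sources.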
Enrichment
Enriching your data in the Splunk platform adds context and information to your events, helping to reduce your mean time to respond (MTTR).
Manually looking up additional information about an event takes time. The Splunk platform can enrich events automatically with information from other sources, making that process much quicker.
For example, you can use threat intelligence feeds to enrich a notable event in Splunk Enterprise Security. Enriching a notable event with threat intelligence adds valuable information that gives you actionable insights without manual lookups, improving your analyst workflows and speeding up your time to respond.
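As a simplified illustration of the same idea outside Splunk Enterprise Security, a lookup can add threat context to events at search time. This sketch assumes a hypothetical lookup named `threat_intel_iocs` with fields `ioc`, `threat_category`, and `threat_source`, and a hypothetical proxy index:

```
index=proxy sourcetype=web_proxy
| lookup threat_intel_iocs ioc AS dest_ip OUTPUT threat_category threat_source
| where isnotnull(threat_category)
| table _time src_ip dest_ip threat_category threat_source
```

Within Splunk Enterprise Security itself, the threat intelligence framework performs this kind of matching for you automatically.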
Data availability and retention
Data availability describes whether your data is accessible when you need it. During an active security incident, not having the correct data, or the correct time frame of data, ready in the Splunk platform can have severe consequences. Furthermore, unexpected issues or interruptions in data management are inevitable, so your system should be able to work around them while still allowing you to access the data you need.
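One way to catch interruptions early is to alert on sources that have gone silent. The following search is a sketch of one possible approach, not a built-in feature; the one-hour threshold is an arbitrary assumption you should tune per source:

```
| tstats latest(_time) AS last_seen WHERE index=* BY index, sourcetype
| eval minutes_silent = round((now() - last_seen) / 60, 0)
| where minutes_silent > 60
| sort - minutes_silent
```

Scheduled as an alert, a search like this turns silent forwarders or broken inputs into actionable notifications instead of gaps you discover mid-investigation.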
Establishing and maintaining a secure, successful Splunk deployment starts with having the right data. For the right data to be in place, you'll need to plan and implement common framework structures around both the system and the data itself. Define your requirements, then develop policies that meet those requirements, following a structured framework for event management activities: event generation, transmission, storage, analysis, retention, and disposal.
What are the benefits of normalization, enrichment, and effective data availability and retention?
Correct data ingestion produces many benefits that make implementing other solutions easier. It allows your team to focus on the analysis and prioritization tasks that are most important to your organization. Enriching data within your security incident review provides valuable additional insight into the events and speeds up time to resolution. While some businesses are more time-sensitive than others, maintaining data availability is essential for the performance and business continuity of any organization. Some additional benefits are:
- Data can be formatted in a reliable and consistent way.
- Alerts and correlation rules can be easier to implement.
- Apps and add-ons can be easier to implement.
- Confidence and data integrity can be increased.
- Insights can be added automatically to security events and correlation searches.
- Manual analyst tasks can be automated.
- Security notable events can be aggregated into a single dashboard.
- Data silos can be broken down.
- Operational efficiency can be improved.
- Noise can be filtered from intelligence sources to automatically improve alert prioritization.
- Threat intelligence data can be shared across teams, tools, and sharing partners.
- Enrichment based on normalized intelligence can make investigation and response more efficient.
- The data lifecycle can be better controlled with appropriate storage and management.
- Continuity of service can be improved, with highly available deployments allowing for "always on" access.
- Storage costs and the overall cost of managing and maintaining data can be reduced, based on usage or demand patterns.
- Data can be better secured, preventing misuse and keeping it protected at rest and in transit.
What are data optimization best practices?
Splunk recommends the following best practices:
- Use the Common Information Model (CIM).
- Onboard data to the Splunk platform using Data Manager or forwarders.
- Use the OCSF-CIM add-on for Splunk.
- Create a data tiering strategy to organize data according to its age, frequency of search, and the use case it supports.
- Improve data pipeline management by filtering, masking, transforming, and routing data with the Splunk Data Management pipeline builders (Edge Processor and Ingest Processor).
- Create a data storage and retention policy for your Splunk Enterprise or Splunk Cloud Platform deployment, as sketched after this list.
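To make the retention item concrete, here is a minimal indexes.conf sketch for Splunk Enterprise, assuming a hypothetical `security_logs` index and a 90-day retention requirement. (In Splunk Cloud Platform, retention is configured through the index management UI rather than by editing indexes.conf directly.)

```
# indexes.conf -- illustrative sketch; index name and limits are assumptions
[security_logs]
homePath   = $SPLUNK_DB/security_logs/db
coldPath   = $SPLUNK_DB/security_logs/colddb
thawedPath = $SPLUNK_DB/security_logs/thaweddb
# Roll buckets to frozen (deleted by default, or archived) after ~90 days
frozenTimePeriodInSecs = 7776000
# Also cap the total index size so retention cannot exhaust disk
maxTotalDataSizeMB = 500000
```

Buckets freeze when either limit is reached, whichever comes first, so size the cap generously if time-based retention is the requirement you need to guarantee.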
To help ensure your data is available and managed appropriately, monitor the following areas of your environment:
- Idle connections. Idle connections use resources, congest networks, and impact system performance. They can also indicate problems and reveal gaps in data availability.
- Long-running queries, commands, or jobs. This applies not just to database queries or jobs, but also to commands and backups. Long-running operations can indicate poor system health, slow disk speeds, CPU or other resource contention, or deeper systemic problems.
- Disk input/output. Disk I/O refers to the system's read and write operations against disk. Tracking disk I/O can help identify bottlenecks, poor hardware configurations, improperly sized disks, or disk layouts that are poorly tuned for a given workload.
- Memory. Monitoring memory helps you spot bottlenecks or leaks, identify improperly sized systems, understand load, and see spikes in activity. In addition, knowing about memory-intensive patterns can help you anticipate availability demands.
- Disk space. Disk space monitoring is available in many forms, and using it as a metric can prevent unnecessary problems and costly, last-minute efforts to add more space.
- Errors and alerts. Errors, alerts, and recovery messages in logs are another good metric to consider. Adding log monitoring for FATAL, PANIC, and key ERROR messages can help you identify issues that your availability solution is frequently recovering from, such as system or application crashes, core dumps, or errors requiring system downtime. A sketch of such a search follows this list.
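As a starting point, you can monitor the Splunk platform's own internal logs this way. This sketch is one possible approach rather than a prescribed search; it surfaces the components logging the most errors:

```
index=_internal sourcetype=splunkd log_level IN (ERROR, FATAL)
| stats count BY component, log_level
| sort - count
```

The same pattern extends to application and system logs: schedule a search like this as an alert so that crashes and recovery loops surface before they become availability gaps.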
What data optimization practices should I put in place?
These additional resources will help you implement this guidance:
- Product Tip: Managing data models in Enterprise Security
- Product Tip: Normalizing Enterprise Security data with technology add-ons
- Product Tip: Onboarding data to Splunk Enterprise Security
- Product Tip: Using the Splunk Enterprise Security assets and identities framework
- Product Tip: Properly securing Splunk indexes
- Getting Started Guide: Getting data into ES
- Getting Started Guide: Unified App: Use Case - Splunk Intelligence Management (TruSTAR)
- .Conf session: Data onboarding: Where do I begin?
- Splunk Success Framework: Data lifecycle management
- Identifying high-value assets and data sources: Learn how to prepare for attacks that specifically target your organization's high-value assets, preventing disruption to business continuity and reducing reputational or regulatory risk.
- Masking IP addresses from a specific range: With Edge Processor, there are multiple ways to mask IP addresses from your internal range in your web server data.
- Prescriptive Adoption Motion - Data sources and normalization: Learn how to normalize log data, making it ready for correlation and use in Splunk Enterprise Security.
- Reducing PAN and Cisco security firewall logs with Splunk Edge Processor: You can deploy Splunk Edge Processor as an end-to-end solution for handling syslog feeds such as PAN and Cisco logs, including the functionality to act as a syslog receiver, process and transform logs, and route the data to supported destinations.
- Routing root user events to a special index: Create a pipeline to filter specific events to a designated index.