Data optimization


Security log volumes keep growing, and the logs themselves are becoming more complex. Security analysts struggle to balance the value of this data against the cost of collecting and storing it. New options such as distributed data storage and federated warehouses seem appealing, but it is challenging to route data efficiently to suitable storage while keeping it accessible based on search demand. Many security monitoring tools restrict the types of data they can ingest, which makes data ingestion cumbersome; they lack enrichment capabilities; and they don't use a common data format to normalize data. These issues prevent teams from performing end-to-end investigations.

The Splunk platform is a vital tool to help overcome these problems and maintain foundational visibility. When optimizing your data sources for best use in the Splunk platform, plan for a few key tasks: normalizing your data, enriching it, and ensuring that your data availability and retention practices are robust.

Normalization

Data normalization means ingesting and storing your data in the Splunk platform in a common format, for consistency and efficiency. When you onboard a new data source, or when you review existing data, consider whether it follows the same format as data of similar types and categories. That consistency saves time and improves performance when you run ad hoc or scheduled searches.
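
In practice, normalization in the Splunk platform usually means aligning field names with the Splunk Common Information Model (CIM) at search time. The props.conf stanza below is a minimal sketch: the sourcetype vendor:firewall and the fields src_address and dst_address are hypothetical, while FIELDALIAS is a standard props.conf setting.

    # props.conf (search-time normalization; sourcetype and field names are illustrative)
    [vendor:firewall]
    # Alias vendor-specific field names to the common CIM names
    FIELDALIAS-normalize_src  = src_address AS src
    FIELDALIAS-normalize_dest = dst_address AS dest

Once similar sources share the same field names, a single search such as index=network sourcetype=vendor:firewall | stats count by src, dest works across all of them, and CIM-based data models, apps, and correlation searches can use the data without source-specific logic.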

Enrichment

Enrichment of your data in the Splunk platform adds context and information to events, helping to reduce your mean time to respond (MTTR).

When you're investigating an event, manually looking up additional information about it takes time. The Splunk platform can enrich events automatically with information from other sources, making that process much quicker.

For example, you can use threat intelligence feeds to enrich a notable event in Splunk Enterprise Security. Enriching the notable event with threat intelligence adds context that gives you actionable insight without manual lookups, improving your analyst workflows and speeding up your time to respond.
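
A common way to automate this kind of enrichment is the SPL lookup command. The search below is a minimal sketch: the index, sourcetype, and the lookup threat_intel_ips (with fields ip, threat_category, and threat_source) are all hypothetical, while lookup, where, and table are standard SPL commands.

    index=proxy sourcetype=web_proxy
    | lookup threat_intel_ips ip AS dest_ip OUTPUT threat_category threat_source
    | where isnotnull(threat_category)
    | table _time src_ip dest_ip threat_category threat_source

You can also define the lookup as automatic in props.conf so matching events are enriched without an explicit lookup command, or save a search like this as a correlation search that generates notable events.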

Data availability and retention

Data availability describes whether your data is ready to use when you need it. During an active security incident, not having the correct data, or the correct time frame of data, ready in the Splunk platform can have severe consequences. Unexpected issues and interruptions in data management are inevitable, so your system should be able to work around them while still giving you access to the data you need.

Establishing and maintaining a secure, successful Splunk deployment starts with having the right data. To get the right data in place, plan and implement a common framework around both the system and the data itself. Define your requirements, then develop policies that meet them, following a structured framework for event management activities: event generation, transmission, storage, analysis, retention, and disposal.
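
Retention in the Splunk platform is controlled per index. The indexes.conf stanza below is a minimal sketch, assuming a hypothetical index named security with a 90-day retention requirement; frozenTimePeriodInSecs, maxTotalDataSizeMB, and the path settings are standard indexes.conf settings.

    # indexes.conf (index name and sizing values are illustrative)
    [security]
    homePath   = $SPLUNK_DB/security/db
    coldPath   = $SPLUNK_DB/security/colddb
    thawedPath = $SPLUNK_DB/security/thaweddb
    # Roll buckets to frozen (deleted by default) after 90 days (90 x 86400 seconds)
    frozenTimePeriodInSecs = 7776000
    # Cap total index size as a safety net
    maxTotalDataSizeMB = 500000

If aged-out events must be archived rather than deleted, also set coldToFrozenDir (or a coldToFrozenScript) so frozen buckets are copied to archive storage instead of being removed.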

What are the benefits of normalization, enrichment, and effective data availability and retention?

Correct data ingestion produces many benefits that make implementing other solutions easier. It allows your team to focus on the analysis and prioritization tasks that are most important to your organization. Enriching data within your security incident review provides valuable additional insight into the events and speeds up time to resolution. While some businesses are more time-sensitive than others, maintaining data availability is essential for the performance and business continuity of any organization. Some additional benefits are:

  • Data can be formatted in a reliable and consistent way.
  • Alerts and correlation rules can be easier to implement.
  • Apps and add-ons can be easier to implement.
  • Confidence in your data and its integrity can be increased.
  • Insights can be added automatically to security events and correlation searches.
  • Manual analyst tasks can be automated.
  • Security notable events can be aggregated into a single dashboard.
  • Data silos can be broken down.
  • Operational efficiency can be improved.
  • Noise can be filtered from intelligence sources to automatically improve alert prioritization.
  • Threat intelligence data can be shared across teams, tools, and sharing partners.
  • Enrichment based on normalized intelligence can make investigations more efficient.
  • The data lifecycle can be better controlled with appropriate storage and management.
  • Continuity of service can be improved, with highly available deployments allowing for "always on" access.
  • Storage costs and the overall cost of managing and maintaining data can be reduced, based on usage or demand patterns.
  • Data can be better secured, preventing misuse and keeping it protected at rest and in transit.

What are data optimization best practices?

Splunk recommends monitoring the following specific areas of your environment to help ensure your data is properly available and being managed appropriately. An example monitoring search follows the list.

  • Idle connections. Idle connections use resources, congest networks, and impact system performance. They can also indicate problems and reveal gaps in data availability.
  • Long-running queries, commands, or jobs. This applies not just to database queries or jobs, but also to commands and backups. Long-running operations can indicate poor system health, slow disk speeds, CPU or other resource contention, or deeper systemic problems.
  • Disk input/output. Disk I/O typically refers to the input/output operations of the system related to disk activity. Tracking disk I/O can help identify bottlenecks, poor hardware configurations, improperly sized disks, or poorly tuned disk layouts for a given workload.
  • Memory. Monitoring memory helps you spot bottlenecks and leaks, identify improperly sized systems, understand load, and see spikes in activity. In addition, knowing about memory-intensive patterns can help you anticipate availability demands.
  • Disk space. Disk space monitoring is available in many forms, and using it as a metric can prevent avoidable outages and costly, last-minute efforts to add capacity.
  • Errors and alerts. Errors, alerts, and recovery messages in logs are another good metric to consider. Adding log monitoring for FATAL, PANIC, and key ERROR messages can help you identify issues that your availability solution is frequently recovering from, such as system or application crashes, core dumps, or errors requiring system downtime.
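
As a sketch of the last bullet, the following search watches the Splunk platform's own internal logs for high-severity messages. The index _internal and the log_level and component fields are standard in Splunk deployments; the threshold of 10 is illustrative.

    index=_internal log_level IN (ERROR, FATAL)
    | stats count AS error_count BY component
    | where error_count > 10
    | sort - error_count

Saved as a scheduled alert, a search like this can notify you when a component starts repeatedly failing or recovering; similar searches over your application logs can watch for PANIC or application-specific error strings.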

What data optimization practices should I put in place?

These additional resources will help you implement this guidance: