Splunk Lantern

Recovering from an incident using SOAR

 

During the recovery phase of an incident, the priority is to get the business back up and running quickly, because any incident can have catastrophic results. This is especially true when an organization has no recovery processes in place. A written recovery playbook and a SOAR process are essential to keeping the organization functioning. Because products like Splunk SOAR automate workflows and speed up recovery, this article discusses using recovery playbooks, applying the SOAR maturity model, and taking advantage of lessons learned.

Recovery playbooks

Playbooks provide the digital equivalent of a traditional human incident response plan, so it's important to verify the actions associated with them before relying on them during an actual incident. Below are some areas to consider:

  • Identify playbook gaps and weaknesses. Testing helps identify gaps or weaknesses embedded in the playbook, such as logic errors caused by incorrect or invalid actions. The testing phase should cover as many expected real-world scenarios as possible so that these errors are eliminated before the playbook is put into production.
  • Automated response interrogation. Misconfigured responses based on faulty logic can have dire consequences. For example, if a playbook is configured to delete malware and it deletes the wrong files, business operations can be disrupted.
  • Phishing response. When a phishing attack is reported or automatically detected, the playbook guides you through the necessary steps using prebuilt actions. You can also customize actions to fit your organization. Additional enhancements include integration with a dispatch identifier reputational analysis playbook, which automatically enriches an incident with reputation details from VirusTotal and PhishTank.
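
The testing advice above can be illustrated with a small scenario table. The following is a minimal Python sketch, not Splunk SOAR code: the FileFinding type, the decide_file_action function, the quarantine fallback, and the protected path prefixes are all hypothetical names chosen for illustration. It shows one way to check a playbook's decision logic against expected real-world cases, including the case where deleting a file would disrupt operations, before the playbook is allowed to act automatically.

```python
from dataclasses import dataclass

# Hypothetical list of locations where automatic deletion is never allowed.
PROTECTED_PREFIXES = ("/etc/", "/usr/bin/", "C:\\Windows\\")


@dataclass(frozen=True)
class FileFinding:
    path: str
    verdict: str  # assumed values: "malicious", "suspicious", "benign"


def decide_file_action(finding: FileFinding) -> str:
    """Return "delete", "quarantine", or "ignore" for a single finding."""
    if finding.verdict == "benign":
        return "ignore"
    # Guard against the "wrong files get deleted" failure mode:
    # anything in a protected location is held for human review.
    if finding.path.startswith(PROTECTED_PREFIXES):
        return "quarantine"
    if finding.verdict == "malicious":
        return "delete"
    return "quarantine"


# Each scenario pairs an input with the action the team expects.
SCENARIOS = [
    (FileFinding("/tmp/dropper.bin", "malicious"), "delete"),
    (FileFinding("/etc/passwd", "malicious"), "quarantine"),
    (FileFinding("C:\\Windows\\System32\\kernel32.dll", "suspicious"), "quarantine"),
    (FileFinding("/home/alice/report.pdf", "benign"), "ignore"),
    (FileFinding("/home/alice/macro.docm", "suspicious"), "quarantine"),
]


def run_scenarios():
    """Return a list of (finding, expected, actual) for every mismatch."""
    failures = []
    for finding, expected in SCENARIOS:
        actual = decide_file_action(finding)
        if actual != expected:
            failures.append((finding, expected, actual))
    return failures


if __name__ == "__main__":
    failures = run_scenarios()
    for finding, expected, actual in failures:
        print(f"FAIL {finding.path}: expected {expected}, got {actual}")
    print(f"{len(SCENARIOS) - len(failures)} of {len(SCENARIOS)} scenarios passed")
```

Keeping the decision logic in a plain function, separate from the code that performs the action, is a design choice that lets the scenarios run without touching any real files or systems.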

Using SOAR maturity models

Security processes and workflows are constantly changing for security operations centers (SOCs), and the Splunk SOAR maturity model can help you identify the options that best fit your organization. The four stages of the SOAR maturity model are as follows:

  1. Mostly reactive and highly manual. This is the most basic and most labor-intensive of the four stages in which SOAR is deployed. Because the approach is ad hoc, there is a heavy emphasis on alerting and triage. Common playbooks include ticket creation and reputation management. A managed SOC is often used at this stage because the organization might not have the people to support all roles. Security operations and engineering might be on the same team or be outsourced.
  2. Reactive/proactive. Stage two of the maturity journey is for organizations that require more people and processes to support the SOC. Security operations and security engineering often function as separate internal or external teams. More advanced Splunk Enterprise Security features, such as risk-based alerting (RBA) and enterprise security content updates (ESCU), are employed. SOAR playbooks often used at this stage include endpoint alert enrichment and cloud resource management.
  3. Mostly proactive. As your organization matures, stage three of the maturity journey enhances the capabilities of the SOC with activities such as threat hunting, forensics, and malware analysis. This is also the stage in which you conduct a lessons-learned analysis after an incident, as discussed in the next section. Security operations are often part of a computer security incident response team (CSIRT), alongside forensics, threat hunting, and purple teams. SOC team members develop custom-built Splunk Enterprise Security detections and use advanced RBA techniques. Common stage three SOAR playbooks include file removal and process termination.
  4. Fully proactive. At this stage, security teams fully understand the organization's operations and their ability to affect business continuity, and they employ mature processes and procedures when supporting advanced investigations. Deep intelligence analysis is often used to support forensic investigations and to analyze attacks at all levels. Stage four playbooks include custom threat hunting, end-to-end phishing chains, quality assurance, and customized workflows.
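
The playbook examples named in the four stages above can be kept as a simple lookup for planning purposes. The following Python sketch is illustrative only: the playbook names come from the list above, but the dictionary layout and the functions stage_for_playbook and next_stage_playbooks are assumptions, not part of any Splunk product.

```python
from typing import List, Optional

# Stage number -> example playbooks named in the maturity model above.
STAGE_PLAYBOOKS = {
    1: ["ticket creation", "reputation management"],
    2: ["endpoint alert enrichment", "cloud resource management"],
    3: ["file removal", "process termination"],
    4: [
        "custom threat hunting",
        "end-to-end phishing chains",
        "quality assurance",
        "customized workflows",
    ],
}


def stage_for_playbook(playbook: str) -> Optional[int]:
    """Return the stage that lists this playbook, or None if it is not listed."""
    wanted = playbook.strip().lower()
    for stage, playbooks in STAGE_PLAYBOOKS.items():
        if wanted in playbooks:
            return stage
    return None


def next_stage_playbooks(current_stage: int) -> List[str]:
    """Return the example playbooks for the stage after current_stage."""
    # A copy is returned so callers cannot modify the lookup table.
    return list(STAGE_PLAYBOOKS.get(current_stage + 1, []))


if __name__ == "__main__":
    print(stage_for_playbook("File removal"))
    print(next_stage_playbooks(1))
```

A team at stage one could call next_stage_playbooks(1) to see which playbook types the model associates with the next step of the journey.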

Taking advantage of lessons learned

As part of incident activities, responders create a report that records what actually happened at each step of the incident response plan. The goal is to learn from the incident, so the report should cover every facet of the investigation in detail. It should describe the vulnerabilities that attackers exploited as well as the steps that responders took to mitigate the problem. Responders might also cover supporting aspects of the response, such as the effectiveness of the response team and the performance of the tools used. Finally, the report should detail any recommendations for improvement.
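
The report contents described above can be captured in a small structure that flags incomplete sections before the wrap-up meeting. This is a minimal Python sketch under stated assumptions: the LessonsLearnedReport class, its field names, and the choice of which sections to treat as required are hypothetical and would need to match your own incident response plan.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed required sections; team and tool notes are treated as optional.
REQUIRED_SECTIONS = (
    "what_happened",
    "vulnerabilities_exploited",
    "mitigation_steps",
    "recommendations",
)


@dataclass
class LessonsLearnedReport:
    what_happened: List[str] = field(default_factory=list)
    vulnerabilities_exploited: List[str] = field(default_factory=list)
    mitigation_steps: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)
    team_effectiveness: str = ""  # optional supporting detail
    tool_performance: str = ""    # optional supporting detail


def missing_sections(report: LessonsLearnedReport) -> List[str]:
    """Return the names of required sections that are still empty."""
    return [name for name in REQUIRED_SECTIONS if not getattr(report, name)]


if __name__ == "__main__":
    draft = LessonsLearnedReport(
        what_happened=["Phishing email reported by a user"],
        mitigation_steps=["Blocked sender domain", "Reset affected credentials"],
    )
    print("Sections still to complete:", missing_sections(draft))
```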

The post-incident wrap-up should be conducted with everyone who was involved in the incident. Primary stakeholders should also be present so that any opportunities to improve policies or procedures can be addressed.

Next steps

These resources might help you understand and implement this guidance:

Splunk OnDemand Services: Use these credit-based services for direct access to Splunk technical consultants with a variety of technical services from a pre-defined catalog. Most customers have OnDemand Services per their license support plan. Engage the ODS team at ondemand@splunk.com if you would like assistance.