Incident response is not about NIST 800-61’s governance-focused “Incident Response Process.” It’s about bytes over the wire, hands on keyboard, tactical steps that responders use during an investigation. As information security programs embrace detection and response, they are building repeatable processes for incident response. As with any process, inefficiencies must be identified and ruthlessly optimized.
What is today’s incident response workflow? When your network device throws an alert, a user calls the helpdesk or the call comes in from law enforcement, what do you do?
Step 1: Initial triage: log review & context – If you’re lucky, a related indicator is captured in a log file and those logs are centralized and searchable in a SIEM. If you don’t have those logs, they’re not searchable, or the indicator is not covered by your logging, your only option is to move on to step two.
Step 2: Deep investigation – After initial triage, responders need to answer four questions:
- Where did it come from?
- What did it do?
- Is it still on my network?
- How do I stop it from happening again?
We typically answer these questions with a continuum of digital forensics techniques, each providing better answers at the cost of commensurately more time: live response, memory forensics and disk forensics.
Step 3: Cleanup – After investigation, it’s time to clean up. It may be a single machine or a larger remediation event coordinated across multiple systems. Most organizations simply reimage the computers.
Incident Response is Broken
For many years, we have relied on these techniques as the core of incident response. We have worked hard to make the process more efficient both on single machines and at “enterprise scale” across thousands of machines at a time.
Optimizing this process to be efficient enough for daily operations is daunting for many information security teams. There are fundamental, insurmountable problems:
- The high cost of digital forensics: Even if you have forensics experts in house, the acquisition and analysis of memory and disk images is too time-consuming to be part of your day-to-day operations.
- Increasing globalization and the erosion of the network perimeter: Most responders depend on unfiltered network access to the devices under investigation. If the machine is across the world or in a coffee shop down the street, the investigation becomes more challenging.
- “Unknown unknowns”: By definition, incident response is investigating events that followed a then-unknown profile. If those events had followed a known pattern, the prevention layers of the security stack would have stopped the activity.
As an industry, we cannot continue to rely on digital forensics as the only tool in the response tool bag. Continued evolutionary efficiency improvements in digital forensics will never be sufficient. We must reimagine incident response to match the scale of today’s enterprise operations.
Reimagining a New Incident Response
Too many IR professionals artificially limit their vision, biased by what we do today. Give yourself more freedom, unhindered by traditional practices. Unconstrained, how would you answer these four questions better, faster and cheaper?
- Given a list of malware hashes, how do you know if they were on your network?
- Given the domain name of a confirmed malware C2 server, how do you know if any machine on your network communicated with it?
- Given a list of registry keys used by malware, how do you know if it was active on your network?
- Given a malicious process on a computer, how do you know where it came from? How long has it been here? What did it do? How widespread is it? Is it still active right now?
Responses usually come in two flavors:
- Scanning – “I would scan my network and look for all instances of X. Then I would use digital forensics to answer the remaining questions on the few machines I discover.” This approach suffers from three challenges: it takes weeks to scan large enterprises, gives you only a snapshot in time, and is limited by the high cost of digital forensics.
- Signatures – “I would configure my systems to watch for X, and then record all information about it.” This approach suffers from the same problem antivirus software has had for decades. Signature-based detection will never revolutionize our processes.
When asked to take their ideas to the next step and propose solutions that answer these questions and overcome the challenges of scanning and signatures, responders rapidly converge on two themes:
- Continuously record everything – If we can’t use signatures to define threats in advance, there is only one choice: record everything, all the time and store it for as long as practical. When the industry catches up to the attackers, we can review historical activity.
- Centralized storage – If it takes too long to talk to all the computers in your network, there is only one choice: centralize all the records in advance. Searches can be completed in seconds on a centralized index.
These are lessons we learned 20 years ago about the network, but have never applied to endpoints. We need from our endpoints the same continuously recorded, centrally stored data we already rely on at the network level.
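To make the two themes concrete, here is a minimal sketch of what "record everything, store it centrally" could look like from the responder's side. It is an illustration under stated assumptions, not a description of any particular product: the `events` schema, the event kinds, the host names, and the indicator values are all hypothetical, and an in-memory SQLite database stands in for whatever centralized index an organization actually runs.

```python
import sqlite3

# Hypothetical schema: one row per event recorded on an endpoint and
# shipped to the central store. Column names are assumptions.
SCHEMA = """
CREATE TABLE events (
    ts        INTEGER,  -- event time, seconds since epoch
    host      TEXT,     -- endpoint that recorded the event
    kind      TEXT,     -- 'process', 'netconn' or 'regmod' (illustrative)
    process   TEXT,     -- process responsible for the event
    parent    TEXT,     -- parent of that process
    indicator TEXT      -- file hash, domain name or registry key
);
CREATE INDEX idx_indicator ON events (indicator);
"""

# Made-up sample data standing in for continuously recorded activity.
SAMPLE_ROWS = [
    (1000, "ws-01", "process", "invoice.exe", "outlook.exe", "hash-aaa"),
    (1005, "ws-01", "netconn", "invoice.exe", "outlook.exe", "c2.example.test"),
    (1010, "ws-01", "regmod", "invoice.exe", "outlook.exe",
     r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"),
    (2000, "ws-07", "process", "invoice.exe", "explorer.exe", "hash-aaa"),
    (2600, "ws-07", "netconn", "invoice.exe", "explorer.exe", "c2.example.test"),
    (3000, "ws-02", "process", "notepad.exe", "explorer.exe", "hash-bbb"),
]


def build_store(rows):
    """Load recorded events into an in-memory stand-in for the central index."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", rows)
    return db


def sightings(db, indicators):
    """For each indicator, list (host, first_seen, last_seen, count) per host."""
    result = {}
    for indicator in indicators:
        cursor = db.execute(
            "SELECT host, MIN(ts), MAX(ts), COUNT(*) FROM events "
            "WHERE indicator = ? GROUP BY host ORDER BY MIN(ts), host",
            (indicator,),
        )
        result[indicator] = cursor.fetchall()
    return result


def origin(db, host, process):
    """Return (ts, parent) of the earliest recorded event for a process on a host."""
    return db.execute(
        "SELECT ts, parent FROM events WHERE host = ? AND process = ? "
        "ORDER BY ts LIMIT 1",
        (host, process),
    ).fetchone()


if __name__ == "__main__":
    db = build_store(SAMPLE_ROWS)
    # Hashes, domains and registry keys are all answered by the same lookup.
    print(sightings(db, ["hash-aaa", "c2.example.test"]))
    # "Where did it come from?" for one host.
    print(origin(db, "ws-01", "invoice.exe"))
```

Because the data was recorded before anyone knew which indicators mattered, a hash, domain, or registry key that surfaces later becomes a lookup against history rather than a new scan of every machine.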
With specialized logging on the computers in the enterprise, combined with traditional log aggregation and indexing, incident responders have the information they need in advance and can avoid digital forensics altogether. That is the first step in reimagining the very nature of incident response.