The IR process
Cybersecurity incidents follow a general path during their lifetime. An organization with a mature IR capability will have taken measures to ensure it is prepared to address an incident at each stage of that path. Each incident starts the first time the organization becomes aware of an event, or series of events, indicative of malicious activity. This detection can come in the form of a security control alert or an external party informing the organization of a potential security issue. Once alerted, the organization moves from analyzing the incident, through containment measures, to bringing the information system back to normal operations. There is no single mandated IR process, but one widely used standard is the National Institute of Standards and Technology (NIST) IR process. The following diagram, taken from NIST Special Publication (SP) 800-61, shows how the NIST process flows in a cycle, with preparation as the starting point. A closer examination reveals that every incident is used to better prepare the organization for future ones, as the output of post-incident activity feeds back into preparation:
Figure 1.1 – NIST IR process
The IR process can be broken down into the following six distinct phases, each with a set of actions the organization can take to address the incident:
- Preparation: Without good preparation, any subsequent IR is going to be disorganized and has the potential to make the incident worse. One of the critical components of preparation is the creation of an IR plan. Once a plan is in place with the necessary staffing, the organization should ensure that personnel assigned IR duties are properly trained. This training covers the processes, procedures, and any additional tools necessary for the investigation of an incident. In addition to a plan, tools such as forensics hardware and software should be acquired and incorporated into the overall process. Finally, regular exercises should be conducted to ensure that the organization is trained and familiar with the process.
- Detection: The detection of potential incidents is a complex endeavor. Depending on its size, an organization may generate over 100 million separate events per day. These events can be records of legitimate actions taken during the normal course of business or indicators of potentially malicious activity. Couple this mountain of event data with other security controls constantly alerting on activity, and analysts are inundated with data and must sift the valuable pieces of signal from the vastness of network noise. Even today’s cutting-edge Security Information and Event Management (SIEM) tools lose their effectiveness if they are not properly maintained with regular updates to the rulesets that identify which events qualify as a potential incident. The detection phase is the part of the IR process where the organization first becomes aware of a set of events that possibly indicates malicious activity. Events that have been detected and are indicative of malicious behavior are then classified as an incident. For example, a security analyst may receive an alert that a specific administrator account was in use while the administrator was on vacation. Detection may also come from external sources. An internet service provider (ISP) or law enforcement agency may detect malicious activity originating in an organization’s network and contact the organization to advise it of the situation. In other instances, users may be the first to indicate a potential security incident. This may be as simple as an employee contacting the help desk to report that they opened an Excel spreadsheet received from an unknown source and that the files on their local system are now being encrypted.
In each case, the organization would have to escalate these events to the level of an incident (which we will cover a little later in this chapter) and begin the reactive process to investigate and remediate.
- Analysis: Once an incident has been detected, personnel from the organization or a trusted third party will begin the analysis phase. In this phase, personnel begin the task of collecting evidence from systems, such as running memory, log files, network connections, and running software processes. Depending on the type of incident, this collection can take from as little as a few hours to several days. Once evidence is collected, it then needs to be examined. There are a variety of tools to conduct this analysis, many of which are explored in this book. With these tools, analysts attempt to ascertain what happened, what it affected, whether any other systems were involved, and whether any confidential data was removed. The ultimate goal of the analysis is to determine the root cause of the incident and reconstruct the actions of the threat actor, from initial compromise to detection.
- Containment: Once there is a solid understanding of what the incident is and which systems are involved, organizations can then move into the containment phase. In this phase, organizations take measures to limit the ability of threat actors to continue compromising other network resources, communicating with command and control (C2) infrastructures, or exfiltrating confidential data. Containment strategies can range from locking down ports and Internet Protocol (IP) addresses on a firewall to simply removing the network cable from the back of an infected machine. Each type of incident involves its own containment strategy, but having several options allows personnel to stop the bleeding at the source if they are able to detect a security incident before or while threat actors are pilfering data.
- Eradication and recovery: During the eradication phase, the organization removes the threat actor from the impacted network. In the case of a malware infection, the organization may run an enhanced anti-malware solution. Other times, infected machines must be wiped and reimaged. Other activities include removing or changing compromised user accounts. If an organization has identified a vulnerability that was exploited, vendor patches are applied or software updates are made. Recovery activities are very closely aligned with those that may be found in an organization’s business continuity (BC) or disaster recovery (DR) plans. In this phase of the process, organizations reinstall fresh operating systems or applications. They will also restore data on local systems from backups. As a due diligence step, organizations will also audit their existing user and administrator accounts to ensure that no accounts have been enabled by threat actors. Finally, a comprehensive vulnerability scan is conducted so that the organization is confident that any exploitable vulnerabilities have been removed.
- Post-incident activity: The incident process concludes with a complete review of the incident, and of all actions taken during it, with all principal stakeholders. What worked and, more importantly, what did not work are important topics for discussion. These reviews are important because they may highlight specific tasks and actions that had either a positive or negative impact on the outcome of the IR. It is during this phase of the process that a written report is completed. Documenting the actions taken during the incident is critical, both to capture what occurred and because the incident may one day see the inside of a courtroom. For documentation to be effective, it should be detailed and show a clear chain of events with a focus on the root cause, if it was determined. Personnel involved in the preparation of this report should realize that stakeholders outside of IT might read it. As a result, technical jargon or concepts should be explained.
Finally, organizational personnel should update their own IR processes with any new information developed during the post-incident debrief and reporting. This incorporation of lessons learned is important as it makes future responses to incidents more effective.
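The detection example given earlier, an administrator account in use while its owner is on vacation, can be sketched as a simple rule. This is a minimal illustration only: the `LoginEvent` record, the `LEAVE_CALENDAR` lookup, and the account and host names are all hypothetical, and a real SIEM would express this logic in its own rule language against its own data sources.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class LoginEvent:
    """Hypothetical, simplified authentication log record."""
    account: str
    timestamp: datetime
    source_host: str


# Hypothetical leave calendar: account -> (first day of leave, last day of leave).
LEAVE_CALENDAR = {
    "adm-jsmith": (date(2024, 7, 1), date(2024, 7, 14)),
}


def is_suspicious(event, leave_calendar):
    """Return True when an account is used while its owner is on leave."""
    window = leave_calendar.get(event.account)
    if window is None:
        return False  # No leave recorded for this account.
    first_day, last_day = window
    return first_day <= event.timestamp.date() <= last_day


def flag_events(events, leave_calendar):
    """Keep only the events that should be raised to an analyst."""
    return [e for e in events if is_suspicious(e, leave_calendar)]


events = [
    LoginEvent("adm-jsmith", datetime(2024, 7, 3, 2, 15), "ws-042"),   # during leave
    LoginEvent("adm-jsmith", datetime(2024, 7, 20, 9, 0), "ws-017"),   # after leave
    LoginEvent("adm-mlee", datetime(2024, 7, 3, 10, 30), "ws-101"),    # no leave on file
]

flagged = flag_events(events, LEAVE_CALENDAR)
for event in flagged:
    print(f"ALERT: {event.account} used from {event.source_host} at {event.timestamp}")
```

A rule like this only surfaces an event. Whether that event is escalated to an incident remains a decision for the analyst and the organization's IR plan.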
The role of digital forensics
People unfamiliar with the realm of IR often hold the misconception that IR is merely a digital forensics issue. As a result, they will often conflate the two terms. While digital forensics is a critical component of IR (and for this reason, we have included a number of chapters in this book that address digital forensics), there is more to addressing an incident than examining hard drives. It is best to think of forensics as a supporting function of the overall IR process. Digital forensics serves as the mechanism for understanding the technical aspects of an incident, potentially identifying the root cause, and discovering unidentified access or other malicious activity. For example, some incidents, such as denial-of-service (DoS) attacks, will require little to no forensic work. On the other hand, a network intrusion involving the compromise of an internal server and C2 traffic leaving the network will require extensive examination of logs, traffic analysis, and examination of memory. From this analysis, the root cause may be derived. In both cases, the impacted organization would have to respond to the incident, but forensics plays a much more important role in the latter case.
IR is an information security function that uses the methodologies, tools, and techniques of digital forensics but goes beyond what digital forensics provides by addressing additional elements. These elements include containing possible malware or other exploits, identifying and remediating vulnerabilities, and managing various technical and non-technical personnel. Some incidents may require the analysis of host-based evidence or memory, while others may only require a firewall log review, but in each case the responders will follow the IR process.
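The idea that the forensic workload varies by incident while the IR process stays the same can be illustrated with a small lookup. This is a sketch under stated assumptions: the `IncidentType` values and the `EVIDENCE_PLAN` mapping are hypothetical, drawn from the two examples above, and pairing the DoS case with a firewall log review is an illustrative choice rather than a rule.

```python
from enum import Enum


class IncidentType(Enum):
    """Hypothetical incident categories taken from the examples in the text."""
    DENIAL_OF_SERVICE = "denial-of-service"
    NETWORK_INTRUSION = "network intrusion"


# Illustrative mapping only: real evidence plans depend on the environment
# and on what the organization's IR plan prescribes.
EVIDENCE_PLAN = {
    IncidentType.DENIAL_OF_SERVICE: ["firewall logs"],
    IncidentType.NETWORK_INTRUSION: [
        "server and network logs",
        "network traffic captures",
        "memory images",
    ],
}


def evidence_for(incident_type):
    """Return the evidence sources to examine for a given incident type."""
    return EVIDENCE_PLAN.get(incident_type, [])


for incident_type in IncidentType:
    sources = ", ".join(evidence_for(incident_type))
    print(f"{incident_type.value}: {sources}")
```

The amount of evidence differs between the two entries, but both would still pass through the same phases of the IR process.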