Understanding forensic imaging
Imaging a storage drive is a process where details matter. This section provides a solid foundation on forensic imaging, how it is accomplished, the various types of digital imaging processes, and the various proprietary file formats.
Having a solid understanding of the facets of forensic imaging is important for incident response analysts. Understanding the tools, techniques, and procedures involved ensures that evidence is handled properly and that analysts have confidence in the evidence they’ve acquired. In addition, understanding the necessary terminology allows analysts to accurately prepare reports and testify as to their findings if the need arises.
Image versus copy
One of the first concepts that should be understood is the difference between forensic imaging and copying. Copying files from a suspect hard drive or another medium only provides analysts with the actual data associated with those files. Imaging, on the other hand, allows the analyst to capture the entire drive. This includes areas such as slack space and unallocated space, where deleted files may still be recoverable. Imaging also maintains metadata on the volume, including file timestamps. This becomes critical if a timeline analysis is conducted to determine when specific files were accessed or deleted.
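The difference between a copy and an image can be illustrated with a toy model. The following is a minimal sketch, not a forensic tool: the "disk" is a single in-memory cluster, and the cluster size, file contents, and the copy_file and image_disk helpers are all assumptions made for the illustration.

```python
CLUSTER_SIZE = 4096  # assumed cluster size for this toy example

# A toy "disk" made of a single cluster, initially zeroed.
disk = bytearray(CLUSTER_SIZE)

# An older file once occupied most of the cluster. It was later deleted,
# but deleting it did not erase its bytes.
old_file = b"SECRET-PAYROLL-DATA;" * 200  # 4,000 bytes
disk[:len(old_file)] = old_file

# A new, much shorter file is then written to the same cluster.
new_file = b"meeting notes"
disk[:len(new_file)] = new_file


def copy_file(disk, length):
    """File-level copy: returns only the bytes that belong to the file."""
    return bytes(disk[:length])


def image_disk(disk):
    """Imaging: returns every byte of the medium, allocated or not."""
    return bytes(disk)


copied = copy_file(disk, len(new_file))
image = image_disk(disk)

# Everything after the end of the new file is slack space in this model.
slack = image[len(new_file):]

print("bytes in copy: ", len(copied))
print("bytes in image:", len(image))
print("old data visible in copy: ", b"PAYROLL" in copied)
print("old data visible in image:", b"PAYROLL" in slack)
```

In this model, the copy contains only the 13 bytes of the new file, while the image still holds remnants of the older file in the slack space of the cluster.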
The terms cloning and imaging are often used interchangeably, which is a common misuse of terminology in the IT realm. When cloning a drive, a one-to-one copy of the drive is made. This means that the drive can then be inserted into a system and booted. Cloning a drive is often done to make a fully functional backup of a critical drive. While a cloned drive contains all the necessary files, it is cumbersome to work with, especially with forensic tools. For this reason, an image file is taken instead. An image of a drive contains all the necessary files, and its format permits a detailed examination with forensic tools.
Logical versus physical volumes
The second concept that needs to be understood is the types of volumes that can be imaged. Volumes can be separated into physical or logical volumes. Physical volumes can be thought of as containing the entirety of a hard drive. This includes any partitions, as well as the master boot record (MBR). When imaging a physical volume, the analyst captures all of this data. In contrast, a logical volume is a part of the overall hard drive. For example, in a hard drive that is divided into the MBR and two partitions, a logical volume would be the D: drive. When imaging a logical volume, the analyst would only capture data contained within the D: drive.
The following diagram illustrates data that is captured while imaging either a physical or logical volume:
Figure 8.1 – Physical versus logical volumes
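To make the relationship between the two concrete, the following sketch reads the classic MBR partition table from the first sector of a physical image and reports the byte range that each logical volume occupies. It assumes a 512-byte sector size and the standard MBR layout (four 16-byte entries starting at offset 446, followed by the 0x55AA signature). The sample sector is synthetic, and the parse_mbr and make_entry helpers are hypothetical names used only for this example.

```python
import struct

SECTOR_SIZE = 512  # assumed sector size
ENTRY_FORMAT = "<B3xB3xII"  # status, CHS start, type, CHS end, start LBA, sector count


def parse_mbr(sector0: bytes):
    """Return the non-empty primary partition entries from an MBR sector."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        entry = sector0[446 + 16 * index: 446 + 16 * (index + 1)]
        status, ptype, start_lba, num_sectors = struct.unpack(ENTRY_FORMAT, entry)
        if ptype == 0:  # a type of 0x00 marks an unused entry
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "byte_offset": start_lba * SECTOR_SIZE,
            "byte_length": num_sectors * SECTOR_SIZE,
        })
    return partitions


def make_entry(ptype, start_lba, num_sectors, bootable=False):
    """Build a synthetic 16-byte partition entry (CHS fields left as zero)."""
    status = 0x80 if bootable else 0x00
    return struct.pack(ENTRY_FORMAT, status, ptype, start_lba, num_sectors)


# Synthetic MBR describing two partitions, standing in for a real first sector.
mbr = bytearray(512)
mbr[446:462] = make_entry(0x07, 2048, 204800, bootable=True)
mbr[462:478] = make_entry(0x07, 206848, 409600)
mbr[510:512] = b"\x55\xaa"

parts = parse_mbr(bytes(mbr))
for part in parts:
    print(part)
```

A physical image would contain every byte of the drive, including the MBR itself, whereas a logical image of the second partition would contain only the bytes from its byte_offset for byte_length bytes.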
The type of incident that is being investigated largely dictates the type of imaging that is conducted. For example, if an analyst can identify a potentially malicious file being executed from the D: drive and is intent on only capturing that data, it might be faster to acquire a logical image of only that volume. Furthermore, a logical acquisition may be necessary in cases where Full Disk Encryption (FDE) is being used. Without the encryption key, logically acquiring files while the system is running is often the only option that’s available.
The key drawback of a logical image is that it will not capture unallocated data or data that is not part of the filesystem. Deleted files and other trace evidence will not be part of a logical image. In cases where an activity such as employee misconduct is suspected, the analyst will need to trace as much activity as possible, so a full image of the physical volume should be taken. Time is not the deciding factor here.
In Chapter 5, we discussed the acquisition of evidence such as log files and running memory from a live or powered-up system. In much the same way, incident response analysts have the capability to obtain a logical volume from a running system. This technique is referred to as live imaging. Live imaging may be the best option if a potentially compromised system cannot be taken offline—say, in a high availability (HA) production server—and potential evidence is located within a logical volume.
Dead imaging is performed on a system that has been powered down and the hard drive removed. In this type of imaging, the analyst can capture the entire disk, including all the volumes and the MBR. This may become necessary in incidents where analysts want to ensure that they capture the entirety of the source evidence so that there is no location that hasn’t been examined.
Types of image files
Another aspect of forensic imaging that an analyst should have knowledge of is the types of image files that can be created and leveraged during an investigation. There are several types of image files, some of which are very specialized, but for the purposes of this book, we will focus on the three most common types of evidence files that analysts will most likely create and work with during an incident:
- Raw images: A raw image file contains only the data from the imaged volume. No additional data is provided in this type of image, although some imaging tools, such as FTK Imager, include a separate file with imaging information. Raw image outputs include the .raw, .img, and .dd extensions. Some software, such as the Linux dd command, provides a flexible option when speed and compatibility with forensic tools may be an issue.
- Advanced Forensics File Format (AFF4): AFF4 is an open source format for image files. First proposed in 2009, the format is used to support several tools, such as Google Rapid Response (GRR).
- EnCase evidence files: An EnCase evidence file, or E01 or EX01 file, is a proprietary file format that was developed by Guidance Software (now part of OpenText) as part of its EnCase forensic tools in 1998. This format was based on the Expert Witness Format (EWF), which was found in ASR Data’s Expert Witness Compression Format. The E01 file contains metadata about the image. This metadata, which is contained in both the header and footer, stores information about the drive type, operating system, and timestamps. Another key feature of an E01 file is the inclusion of a Cyclic Redundancy Check (CRC). This CRC is a file integrity verification that takes place after every 64 KB of data is written to the image file. Each CRC verifies the integrity of the preceding block of data, so the checks together cover the entire image file. Finally, an E01 file contains a Message Digest 5 (MD5) hash within the footer of the file. The following diagram illustrates which components of an E01 file are created during the imaging process:
Figure 8.2 – E01 file format
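The two integrity mechanisms described for evidence files, a checksum per block and a hash over the whole image, can be sketched in a few lines. The following is only an illustration of the idea: it does not write or read the actual E01 layout, and it uses Python's zlib.crc32 as a stand-in for the format's own checksum. The 64 KB block size follows the description above, the source is an in-memory stand-in for a drive, and the acquire and verify functions are hypothetical names.

```python
import hashlib
import io
import zlib

BLOCK_SIZE = 64 * 1024  # block size taken from the description in the text


def acquire(source, dest, block_size=BLOCK_SIZE):
    """Copy source to dest block by block.

    Returns a list with one CRC per block and the MD5 digest of all data.
    """
    md5 = hashlib.md5()
    block_crcs = []
    while True:
        block = source.read(block_size)
        if not block:
            break
        dest.write(block)
        block_crcs.append(zlib.crc32(block))
        md5.update(block)
    return block_crcs, md5.hexdigest()


def verify(image, block_crcs, expected_md5, block_size=BLOCK_SIZE):
    """Re-read an image; return (indexes of damaged blocks, MD5 matches?)."""
    md5 = hashlib.md5()
    damaged = []
    for index, expected_crc in enumerate(block_crcs):
        block = image.read(block_size)
        md5.update(block)
        if zlib.crc32(block) != expected_crc:
            damaged.append(index)
    return damaged, md5.hexdigest() == expected_md5


# In-memory stand-in for a 256 KB source drive.
source = io.BytesIO(bytes(range(256)) * 1024)
image = io.BytesIO()
crcs, digest = acquire(source, image)

# Verifying the untouched image.
image.seek(0)
clean_result = verify(image, crcs, digest)

# Flipping a single byte in the third block of a copy of the image.
tampered = bytearray(image.getvalue())
tampered[2 * BLOCK_SIZE + 10] ^= 0xFF
tampered_result = verify(io.BytesIO(bytes(tampered)), crcs, digest)

print("clean image:   ", clean_result)
print("tampered image:", tampered_result)
```

The per-block checksums localize damage to a specific block, while the single hash over the whole image confirms, or in the tampered case refutes, that the image as a whole is unchanged.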
SSD versus HDD
Another key facet of imaging that incident response analysts need to understand is how to image specific storage media—specifically, understanding the difference between a hard disk drive (HDD) and a solid-state drive (SSD). Understanding this difference has become critical, with SSDs being much more common, especially in endpoints such as laptops and desktop computers.
The main difference between the two comes down to how information is physically stored. Traditional HDDs store information by changing the magnetic polarity of regions on spinning platters. This is why digital forensic and incident response analysts need to be aware of magnetic fields when handling evidence, and why dropping an HDD often proves fatal for the disk.
One aspect of HDDs that is of particular interest to digital forensic examiners is how data is handled. Data is written to the disk and stays within the sectors that have been assigned to it. When a user deletes a file, the data is not immediately removed. It may remain intact or be partially overwritten, and with proper imaging tools, this data can be located and reconstructed for analysis.
Because of how HDDs work and the potential for reconstructing data, it is good practice to create a physical disk image. This involves powering down the system by removing the source of power rather than through the operating system shutdown. Doing so preserves the state of the drive as it was when the analyst accessed it. In those circumstances where this is not possible, a logical image can be taken from a live, powered-up system.
SSDs, on the other hand, hold information based on the state of electrons in cells in the SSD. This type of data storage uses instructions found on the printed circuit board (PCB) that controls how data is handled on the SSD. The first set of instructions is garbage collection. When a user deletes a file, the operating system sends instructions to the chipset indicating that the file is marked for deletion, and the PCB can reset the electrons in that space to neutral, thereby removing the data. This operating system instruction is often referred to as TRIM. For example, entering the fsutil behavior query disabledeletenotify command into Windows’ Command Prompt will produce the following output if TRIM is enabled:
Figure 8.3 – TRIM operations enabled
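The same check can be scripted so that the TRIM setting is recorded alongside other collection notes. The following is a minimal sketch, assuming that the command prints one or more lines containing DisableDeleteNotify = 0 (TRIM enabled) or DisableDeleteNotify = 1 (TRIM disabled). The exact wording of the output varies between Windows versions, so the sample text below is illustrative rather than captured from a real system, and both function names are hypothetical.

```python
import re
import subprocess
import sys


def parse_trim_status(output: str):
    """Map each 'DisableDeleteNotify = N' line to True if TRIM is enabled.

    A value of 0 means delete notifications (TRIM) are NOT disabled.
    Lines without a filesystem prefix are stored under the key 'default'.
    """
    status = {}
    for line in output.splitlines():
        match = re.search(r"(?:(\w+)\s+)?DisableDeleteNotify\s*=\s*(\d)", line)
        if match:
            filesystem = match.group(1) or "default"
            status[filesystem] = match.group(2) == "0"
    return status


def query_trim_status():
    """Run the fsutil query on a live Windows system and parse its output."""
    if sys.platform != "win32":
        raise OSError("fsutil is only available on Windows")
    result = subprocess.run(
        ["fsutil", "behavior", "query", "disabledeletenotify"],
        capture_output=True, text=True, check=True,
    )
    return parse_trim_status(result.stdout)


# Illustrative sample text (an assumption, not real captured output).
sample = "NTFS DisableDeleteNotify = 0\nReFS DisableDeleteNotify = 1\n"
print(parse_trim_status(sample))
```

Keeping the parsing separate from the command execution allows the parser to be checked against saved output on an analysis workstation, without touching the system under investigation.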
Another feature of SSD chipsets that is of concern to digital forensic and incident response analysts is wear leveling. The cells in an SSD have a finite lifespan and can only be written and erased a limited number of times. If the operating system only used the first 100 GB of the drive, it could wear that section out, making the drive useless. To prevent this, SSDs make use of wear leveling, where data is continually moved to different locations on the drive, thereby reducing the potential of making the drive unusable.
These two features mean that traditional write blockers, which are used to ensure that no changes are made to the disk during imaging, cannot guarantee this for an SSD, as the PCB manages how data is written and leveled on the drive. While imaging is possible, analysts cannot ensure that no changes were made to the disk during the process. To limit these changes, analysts should image an SSD in the condition in which they found it. If the system is powered off, remove the drive from the system and image it. If the system is on, it has to be imaged live.