CH10: Anti-Forensics & Data Hiding

Chapter Overview

In Chapter 9, we explored how attackers hide malicious payloads within email using encoding schemes like Base64. Now, we expand that concept to the entire operating system.

In previous chapters, we operated under the assumption that the evidence was waiting to be found. We assumed that if we looked in the right place—the MFT, the Registry, or the Prefetch folder—the truth would reveal itself. However, sophisticated suspects rarely play fair.

Anti-Forensics is the practice of manipulating, erasing, or obfuscating data to defeat digital investigation. It is a digital game of cat and mouse. As forensic tools become more powerful, criminals develop new methods to hide their tracks. This chapter explores the "Counter-Forensics" landscape, from the mathematical wall of encryption and the art of steganography to the subtle manipulation of file timestamps and file system structures. As an investigator, your goal is not just to find data, but to recognize the absence of data or the anomalies that suggest someone is trying to hide.

Learning Objectives

By the end of this chapter, students will be able to:

Define Anti-Forensics and categorize techniques into destruction, hiding, and obfuscation.
Differentiate between Full Disk Encryption (FDE) and Container-based encryption (e.g., VeraCrypt).
Utilize entropy analysis tools like DensityScout to identify hidden encrypted containers.
Explain Steganography carrier methods (Image, Audio, Video) and the tools used to create them.
Perform forensic password cracking using John the Ripper and Hashcat, including hash identification and syntax construction.
Demonstrate how Alternate Data Streams (ADS) in NTFS are used to hide files behind other files.
Perform recursive searches using Command Prompt and PowerShell to locate hidden ADS artifacts.
Analyze file headers to detect File Extension Mismatches (e.g., an EXE renamed as a JPG).
Identify artifacts of log wiping, specifically Windows Event ID 1102.
Identify Timestomping by comparing the $STANDARD_INFORMATION and $FILE_NAME attributes in the MFT.

10.1 Introduction: The Digital Arms Race

When a user deletes a file, they usually want it gone. When a criminal "wipes" a drive, they want to ensure you can never recover what was on it. Anti-forensics encompasses any technique used to compromise the availability or usefulness of evidence.

For the investigator, anti-forensics presents a paradox: the act of hiding data often leaves more evidence than the data itself. A pristine, empty hard drive in a house full of technology is suspicious. A computer with perfect, linear timestamps and no "slack space" noise suggests manipulation.

We categorize these threats into three pillars:

Data Destruction: Rendering data unrecoverable (Wiping).
Data Hiding: Making data invisible to standard tools (Steganography, ADS, Hidden Partitions).
Data Obfuscation: Making data unreadable (Encryption).

10.2 Encryption: The Mathematical Barrier

Encryption is the most effective anti-forensic tool available. If implemented correctly, it transforms evidence into high-entropy noise that is computationally infeasible to recover without the key.

10.2.1 Full Disk Encryption (FDE)

FDE encrypts the entire physical drive (sector by sector). Without the key, which is typically released by pre-boot authentication (PIN, password, or USB key) or by the TPM, the operating system cannot boot and the file system cannot be mounted.

BitLocker (Windows): The industry standard for Windows. It often utilizes the TPM (Trusted Platform Module) chip on the motherboard to store keys.
- Forensic Impact: If you pull the plug on a running BitLocker machine, the data is locked. This is why Live Memory Acquisition (Chapter 8) is critical—you might capture the decryption key in RAM before powering down.
FileVault (macOS): Similar to BitLocker but for Apple devices. Modern Macs use the T2 or M-series chips to enforce hardware-based encryption that is extremely difficult to bypass.

10.2.2 Container Encryption (The "Hidden Vault")

Unlike FDE, which locks the front door, container encryption creates a hidden safe inside the house. Tools like VeraCrypt (the successor to TrueCrypt) create a single file that acts as a virtual encrypted disk.

To the naked eye, a 5GB VeraCrypt container might look like a random video_data.dat file. When the user mounts it with a password, it appears as a new drive letter (e.g., Z:) where they store illegal material. When unmounted, it is just a blob of unreadable data.

10.2.3 Detection: Entropy and DensityScout

What is Entropy? Before we can detect encrypted files, we must understand the concept of Shannon Entropy. In digital forensics, entropy is simply a measure of randomness.

Imagine looking at a page of a book written in English. The letters are not random; they follow rules. "Q" is almost always followed by "U"; "The" is a common word; sentences end with periods. Because there is a predictable structure, we say this data has Low Entropy.

Now, imagine a file filled with completely random, garbled characters generated by a dice roll. There is no pattern, no grammar, and no predictability. This data has High Entropy.

Entropy and Encryption: Encryption algorithms are designed to take structured data (Low Entropy) and scramble it so thoroughly that it becomes mathematically indistinguishable from random noise (High Entropy).

Scale: Entropy is measured in bits per byte, on a scale of 0 to 8.
Standard Text: Usually scores between 3.5 and 4.5.
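Compressed Data (ZIP/JPG): Compression removes repetitive patterns, so these files score higher, typically 7.0 or above. Well-compressed archives can approach 8.0, which is why an entropy score alone cannot separate compressed data from encrypted data.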
Encrypted Data: A perfectly encrypted file will have an entropy score extremely close to the maximum, typically 7.99 to 8.0.
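The scale above can be made concrete with a short calculation. The following is a minimal Python sketch of Shannon entropy in bits per byte; the function name shannon_entropy is illustrative and does not come from any forensic tool.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # H = sum(p * log2(1 / p)) over every byte value that occurs in the data
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


# A run of one repeated byte is perfectly predictable, so it scores the minimum.
print(shannon_entropy(b"A" * 1024))
# Every byte value appearing equally often reaches the maximum of 8 bits.
print(shannon_entropy(bytes(range(256)) * 4))
# Random bytes stand in for encrypted data and score close to 8.
print(shannon_entropy(os.urandom(65536)))
```

Small inputs give unreliable scores, since a short sample cannot contain enough distinct byte values to approach 8.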

The Detective's Tool: DensityScout. As an investigator, you cannot manually calculate the math for every file on a hard drive. Instead, you use automated tools like DensityScout to scan the drive for you.

The Process: You point DensityScout at a suspect directory (or the entire drive). It recursively scans every file and calculates its entropy score.
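The Filter: You tell the tool to show only the most random files, the equivalent of an entropy higher than 7.5. (DensityScout reports its own "density" value, on which lower numbers mean more random data, so the scores in this section are expressed as entropy for consistency.)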
The Result: If DensityScout returns a file named system_backup.dat with a score of 7.999, but the file header doesn't match a known compressed format (like ZIP), you have almost certainly found an encrypted container.
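The process, filter, and result steps can be sketched in Python. This is not how DensityScout itself is implemented; it is a simplified stand-in that uses the 7.5 threshold from the text. The names find_high_entropy_files and KNOWN_COMPRESSED_HEADERS, the short header list, and the 1 MiB sample size are all assumptions made for illustration.

```python
import math
import os
from collections import Counter

# Headers of a few common compressed formats (a deliberately short, illustrative list).
KNOWN_COMPRESSED_HEADERS = (
    b"PK\x03\x04",    # ZIP
    b"\x1f\x8b",      # GZIP
    b"\xff\xd8\xff",  # JPEG
)


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of data in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def find_high_entropy_files(root: str, threshold: float = 7.5, sample_size: int = 1 << 20):
    """Yield (path, entropy) for files above the threshold with no known compressed header."""
    for folder, _subdirs, names in os.walk(root):  # os.walk recurses into every sub-folder
        for name in names:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    sample = handle.read(sample_size)  # first 1 MiB only, to keep the scan fast
            except OSError:
                continue  # unreadable file: skipped in this sketch
            score = entropy_bits_per_byte(sample)
            if score > threshold and not sample.startswith(KNOWN_COMPRESSED_HEADERS):
                yield path, score
```

Calling list(find_high_entropy_files(some_folder)) on an exported evidence folder returns the candidates worth a manual look. A real workflow would use a much larger signature list before treating a hit as an encrypted container.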

10.3 Steganography: Hiding in Plain Sight

Cryptography protects the content of a message (making it unreadable). Steganography protects the existence of the message (making it invisible). The goal is to hide a secret file inside a "carrier" file so that no one even knows to look for it.

10.3.1 Carrier Methods

While images are the most famous carrier, modern steganography can utilize various media types to obscure data.

Images (BMP, PNG, JPG): This is the most common form.
- Technique: Least Significant Bit (LSB) Insertion. Digital images are made of pixels, and each pixel has a color value (e.g., RGB). By changing the very last bit of a pixel's color code, the color shifts so slightly that the human eye cannot detect it. However, a computer reading the bits can extract the hidden message. For lossy formats such as JPG, tools alter the compressed image coefficients instead, because pixel-level changes would not survive compression.
Audio (WAV, MP3): Audio files are excellent carriers due to their size.
- Technique: Similar to images, LSB insertion replaces the last bit of the audio samples. In high-bitrate files (like WAV), this introduces a barely audible "hiss" that is indistinguishable from background noise to the casual listener.
Video (MP4, AVI): The "Holy Grail" of carriers.
- Technique: Because video is a sequence of thousands of images (frames) combined with audio, the storage capacity is massive. Attackers can hide entire gigabytes of stolen data within a single movie file by distributing the payload across I-frames or the audio track.
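The LSB technique described for image and audio carriers can be illustrated with a few lines of Python. This is a minimal sketch that treats the carrier as a plain sequence of bytes (standing in for raw pixel or sample values); it does not parse any real image format, and the function names embed_lsb and extract_lsb are hypothetical.

```python
def embed_lsb(carrier: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of each carrier byte."""
    # Expand the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for payload")
    stego = bytearray(carrier)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit  # clear the last bit, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, payload_length: int) -> bytes:
    """Read payload_length bytes back out of the least significant bits."""
    out = bytearray()
    for start in range(0, payload_length * 8, 8):
        value = 0
        for byte in stego[start:start + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


# Stand-in for raw pixel data: 64 bytes of mid-grey colour values.
pixels = bytes([128] * 64)
hidden = embed_lsb(pixels, b"key")
print(extract_lsb(hidden, 3))
# Largest change to any single carrier byte after embedding.
print(max(abs(a - b) for a, b in zip(pixels, hidden)))
```

Each carrier byte holds one payload bit, so a carrier must be at least eight times the size of the payload. This is why large audio and video files are attractive carriers.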

10.3.2 Tools and Encryption

Steganography tools do not just "stuff" the file inside; they also protect it. Modern tools almost always encrypt the payload before embedding it. Even if an investigator detects the steganography, they still need a password to extract the payload.

Common Tools:

Steghide: A classic command-line tool that embeds data in images (JPEG, BMP) and audio (WAV). It uses a password to encrypt the data before hiding it.
OpenStego: A GUI-based tool popular for its ease of use. It supports rigorous encryption algorithms (like AES-256) alongside the hiding process.
Xiao Steganography: A Windows tool often found in academic or beginner environments that supports BMP, WAV, and MP3.

10.3.3 Steganalysis (Detection)

Detecting steganography is difficult because you are looking for a needle in a haystack.

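Visual/Auditory Attacks: Tools can strip away the most significant bits and render only the LSBs. In an untouched image, the LSB plane usually still shows faint outlines of the picture; a region that looks like uniform random static instead suggests that hidden (and typically encrypted) data has overwritten it.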
Statistical Analysis: Tools like StegExpose analyze the statistical profile of a file. A normal photo has smooth color transitions; a photo with hidden data often has artificial "spikes" in its color histogram.
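The visual attack can be sketched in Python as follows. The code assumes the pixel values have already been decoded into a flat sequence of bytes; it does not read image files, and it is not the method StegExpose uses. The function names are hypothetical.

```python
def lsb_plane(pixels: bytes) -> bytes:
    """Map each byte to 0 (black) or 255 (white) according to its least significant bit."""
    return bytes(255 if value & 1 else 0 for value in pixels)


def lsb_ones_ratio(pixels: bytes) -> float:
    """Fraction of bytes whose least significant bit is 1."""
    if not pixels:
        return 0.0
    return sum(value & 1 for value in pixels) / len(pixels)


sample = bytes([10, 11, 12, 13, 200, 201])
print(list(lsb_plane(sample)))
print(lsb_ones_ratio(sample))
```

Rendering the output of lsb_plane as a black-and-white image lets the analyst inspect the LSB layer by eye. The ratio is only a crude indicator: encrypted payloads push it toward 0.5, but many untouched images sit near 0.5 as well, so it cannot confirm hidden data on its own.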

10.4 Forensic Password Cracking

When you encounter an encrypted ZIP file, a password-protected PDF, or a hashed password dump from a database, your investigation hits a wall. To proceed, you must break the lock. This process is Password Cracking.

It is important to understand that we are rarely "decrypting" the data directly. Instead, we are taking a guess, hashing that guess, and comparing it to the target hash. If they match, we know the password.
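The guess, hash, and compare loop can be shown in a few lines of Python. This is a minimal sketch for unsalted MD5 only, using the standard hashlib module; the function name dictionary_attack and the four-word list are illustrative assumptions, and real tools are far faster and support many more formats.

```python
import hashlib


def dictionary_attack(target_hash: str, candidates):
    """Return the first candidate whose MD5 digest matches target_hash, else None."""
    target = target_hash.strip().lower()
    for guess in candidates:
        digest = hashlib.md5(guess.encode("utf-8")).hexdigest()
        if digest == target:
            return guess  # the guess hashes to the target, so it is the password
    return None


# Tiny stand-in for a real wordlist.
wordlist = ["123456", "letmein", "password", "qwerty"]
# The target below is the MD5 digest of the word "password".
print(dictionary_attack("5f4dcc3b5aa765d61d8327deb882cf99", wordlist))
```

Nothing is ever decrypted here: the hash is never reversed, only reproduced from a correct guess.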

10.4.1 Step 1: Identify the Hash

Before you can crack a lock, you must know what kind of key it uses. You cannot feed an MD5 hash into a tool expecting SHA-256.

Online Resources: The most effective method is often checking the Hashcat Wiki example page. It lists hundreds of hash types with examples.
Tools: Utilities like hash-identifier (built into Kali Linux) can analyze the length and character set of a string to guess the format. You can also use an online identifier at https://hashes.com/en/tools/hash_identifier.
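Identification by length and character set can be sketched in Python. This is not the logic of hash-identifier or any other named tool; the table covers only four unsalted hex formats and the names are hypothetical.

```python
import string

# Hex digest length -> algorithms that produce that length (not exhaustive).
CANDIDATES_BY_HEX_LENGTH = {
    32: ["MD5", "NTLM"],
    40: ["SHA-1"],
    64: ["SHA-256"],
    128: ["SHA-512"],
}


def guess_hash_types(value: str):
    """Return candidate algorithm names for a hex string, or an empty list."""
    value = value.strip()
    if not value or any(ch not in string.hexdigits for ch in value):
        return []  # not plain hex, so this sketch cannot say anything
    return CANDIDATES_BY_HEX_LENGTH.get(len(value), [])


print(guess_hash_types("5f4dcc3b5aa765d61d8327deb882cf99"))
```

A 32-character hex string returns two candidates. Length and character set narrow the field but cannot settle it, so the source of the hash (a Windows account database versus a web application, for example) usually decides which candidate to try first.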

The "Echo" Preparation: Cracking tools usually require the hash to be in a file. You will often use the command line to prepare this:

echo '5f4dcc3b5aa765d61d8327deb882cf99' > hash.txt

10.4.2 John the Ripper (The CPU King)

John the Ripper (JtR) is one of the oldest and most versatile cracking tools. While JtR does have an auto-detect feature, relying on it is considered poor forensic practice. It frequently misidentifies hashes (for example, several unrelated algorithms produce identical-looking 32-character hex strings), which can lead to hours of wasted processing time.

The Golden Rule: Always explicitly tell John which algorithm to use.

Identifying the Format Name: First, you need to know how John names the hash type you identified in Step 1. You can list all supported formats using:

john --list=formats

Tip: If you know it is an MD5 hash, you can grep the list without regard to case: john --list=formats | grep -i md5

Constructing the Command: Once you have the format name (e.g., Raw-MD5), you construct your command using the --format flag. This ensures accuracy and speed.

Syntax: john --format=[FormatName] --wordlist=[PathToWordlist] [HashFile]

Example:

john --format=Raw-MD5 --wordlist=/usr/share/wordlists/rockyou.txt hash.txt

Viewing Results: John stores cracked passwords in its own database (so it doesn't have to crack them again). To see what you have cracked:

john --show --format=Raw-MD5 hash.txt

10.4.3 Hashcat (The GPU Beast)

Hashcat is the industry standard for high-performance cracking. It utilizes the GPU (Graphics Processing Unit) rather than the CPU. Because GPUs are designed to do math for millions of pixels simultaneously, they can try millions (or billions) of passwords per second.

Hashcat is more manual than John; you must tell it exactly what it is looking at.

Key Flags:

-m: Hash Mode. This is a numeric code identifying the hash type (e.g., 0 = MD5, 1000 = NTLM).
-a: Attack Mode.
- 0: Dictionary (Wordlist).
- 3: Brute Force (Mask).

Example 1: Dictionary Attack. Crack an MD5 hash (-m 0) using a dictionary (-a 0) and the RockYou wordlist.

hashcat -m 0 -a 0 hash.txt /usr/share/wordlists/rockyou.txt

Example 2: Brute Force Mask Attack. Crack an NTLM hash (-m 1000) using Brute Force (-a 3). We assume the password is exactly 6 characters long and lowercase (?l?l?l?l?l?l).

hashcat -m 1000 -a 3 hash.txt '?l?l?l?l?l?l'

Note: Brute force is guaranteed to succeed eventually only if the mask covers the real password's length and character set, and the time required grows exponentially as password length increases.
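The growth in brute-force time can be worked out with simple arithmetic. The Python sketch below counts the candidates a mask generates, using the sizes of hashcat's built-in character sets. The rate of one billion guesses per second is an assumed round number for illustration, not a benchmark of any hardware, and the function names are hypothetical.

```python
# Character-set sizes for hashcat's built-in mask placeholders.
CHARSET_SIZES = {"?l": 26, "?u": 26, "?d": 10, "?s": 33, "?a": 95}


def mask_keyspace(mask: str) -> int:
    """Number of candidates generated by a mask made of two-character placeholders."""
    total = 1
    for i in range(0, len(mask), 2):
        total *= CHARSET_SIZES[mask[i:i + 2]]
    return total


def worst_case_seconds(mask: str, guesses_per_second: float) -> float:
    """Time to exhaust the whole mask at the given guessing rate."""
    return mask_keyspace(mask) / guesses_per_second


ASSUMED_RATE = 1e9  # hypothetical guesses per second

for length in (6, 8, 10):
    mask = "?l" * length
    print(length, mask_keyspace(mask), worst_case_seconds(mask, ASSUMED_RATE))
```

Each extra lowercase character multiplies the work by 26, and each extra character from the full printable set (?a) multiplies it by 95.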

10.5 Data Hiding Techniques

Criminals often use the file system's own features against the investigator.

10.5.1 Alternate Data Streams (ADS)

Alternate Data Streams (ADS) are a feature of the NTFS file system originally designed for compatibility with Apple's HFS resource forks, but frequently abused to hide malware. In NTFS, files can have multiple "streams" of data attached to a single filename.

The Command: An attacker can hide malware.exe behind readme.txt:

type malware.exe > readme.txt:hidden.exe

Windows Explorer will show readme.txt as a normal file with no size change, but the malware is attached.

Forensic Detection: Standard dir commands hide this. You must use the /r switch to "Report" streams.

dir /r

Recursive Searching: To find ADS artifacts across an entire drive, you need to search Recursively (meaning the tool automatically dives into every sub-folder, and every sub-folder inside those).

CMD: dir /r /s (The /s flag enables recursion).
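PowerShell: Get-ChildItem -Recurse | Get-Item -Stream * | Where-Object Stream -ne ':$DATA' (The filter drops the default data stream that every file has, leaving only alternate streams.)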
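On a large drive the recursive listing is long, so it helps to filter it programmatically. The Python sketch below parses text saved from dir /r /s and keeps only the alternate-stream lines. It assumes those lines have the shape "size name:stream:$DATA"; the sample text is hand-written to illustrate that shape and was not captured from a real system. The names ADS_LINE and find_ads_entries are hypothetical.

```python
import re

# Matches lines shaped like "   45,056 readme.txt:hidden.exe:$DATA" (assumed layout).
ADS_LINE = re.compile(r"^\s*([\d,.]+)\s+(.+):([^:]+):\$DATA\s*$")


def find_ads_entries(dir_output: str):
    """Return (file_name, stream_name, size_in_bytes) for each alternate stream listed."""
    entries = []
    for line in dir_output.splitlines():
        match = ADS_LINE.match(line)
        if match:
            size = int(re.sub(r"[,.]", "", match.group(1)))  # drop thousands separators
            entries.append((match.group(2), match.group(3), size))
    return entries


# Hand-written illustration of the assumed layout, not real captured output.
SAMPLE = """\
01/01/2024  10:00 AM                12 readme.txt
                                 45,056 readme.txt:hidden.exe:$DATA
01/01/2024  10:05 AM             2,048 report.docx
"""

print(find_ads_entries(SAMPLE))
```

Ordinary file lines are ignored because they do not end in :$DATA. Many downloaded files carry a benign Zone.Identifier stream, so hits still need manual review.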

10.5.2 File Extension Mismatch

This is the simplest form of hiding: renaming a file from stolen_data.xlsx to family_photo.jpg.

Forensic Detection: Investigators rely on Magic Bytes (File Signatures).

A JPEG always starts with hex FF D8 FF.
A ZIP (or a modern Office document such as .docx or .xlsx, which is a ZIP archive internally) starts with 50 4B 03 04.
Forensic tools automatically flag files where the internal header does not match the external extension.
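The header check can be sketched in Python using the two signatures quoted above. The signature table is intentionally tiny and the function names are hypothetical; commercial forensic suites ship databases with many more signatures.

```python
import os

# File signatures quoted in the text, mapped to the extensions they normally carry.
SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
}


def extension_mismatch(file_name: str, header: bytes):
    """Return the expected extensions if the header contradicts the name, else None."""
    extension = os.path.splitext(file_name)[1].lower()
    for magic, expected in SIGNATURES.items():
        if header.startswith(magic):
            return None if extension in expected else sorted(expected)
    return None  # unknown signature: this sketch makes no claim either way


def check_file(path: str):
    """Read the first bytes of a file on disk and test them against its extension."""
    with open(path, "rb") as handle:
        return extension_mismatch(path, handle.read(8))


# An Office document renamed to look like a photo still begins with "PK".
print(extension_mismatch("family_photo.jpg", b"PK\x03\x04\x14\x00\x06\x00"))
```

A return value of None means either a match or an unrecognised header, so a fuller implementation would report those two cases separately.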

10.5.3 Slack Space Hiding

Files are stored in fixed-size Clusters (usually 4096 bytes). If a file is only 2000 bytes, the remaining 2096 bytes in that cluster are "Slack Space." Tools can hide fragmented data in these gaps. Since the OS ignores slack space, investigators must run keyword searches against the raw (physical) disk image rather than the mounted file system to recover this data.
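The slack calculation is simple modular arithmetic, shown here as a short Python sketch. The function name slack_bytes is hypothetical, and the sketch ignores special cases such as very small files that NTFS stores inside the MFT record itself.

```python
def slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Unused bytes between the end of the file and the end of its last cluster."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder


print(slack_bytes(2000))   # the 2000-byte file from the text
print(slack_bytes(10000))  # a file that spills into a third cluster
print(slack_bytes(8192))   # a file that fills its clusters exactly
```

Summed over every file on a volume, these small gaps add up to a meaningful amount of space that the file system never shows to the user.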

10.6 Wiping and Destruction

10.6.1 The Recycle Bin vs. Wiping

Deletion: Marks the file entry as "Unallocated" but leaves the data on the disk. Easily recoverable.
Wiping: Actively overwrites the specific sectors with zeros or random characters (e.g., DoD 5220.22-M standard).

10.6.2 Detecting Wiping

Cleanliness: A normal drive has "noise" in unallocated space. A wiped drive has perfect zeros or uniform random patterns.
Tool Artifacts: Look for traces of tools like eraser.exe or CCleaner in the Registry or Prefetch.
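The "cleanliness" check can be approximated in Python by classifying blocks of unallocated space by their fill pattern. This is a minimal sketch with hypothetical function names. It detects only zero fills and single-byte fills; a random-pattern wipe would need an entropy measurement instead. A never-used region of a drive also reads as zeros, so a high score is a lead, not proof of wiping.

```python
def classify_block(block: bytes) -> str:
    """Label a block of unallocated space by its fill pattern."""
    if not block:
        return "empty"
    distinct = set(block)
    if distinct == {0}:
        return "zero-filled"
    if len(distinct) == 1:
        return "single-byte fill"
    return "mixed"


def wiped_fraction(data: bytes, block_size: int = 4096) -> float:
    """Fraction of blocks that are zero-filled or filled with one repeated byte."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 0.0
    wiped = sum(classify_block(block) != "mixed" for block in blocks)
    return wiped / len(blocks)


# One zeroed block followed by one block of varied content.
print(wiped_fraction(b"\x00" * 4096 + bytes(range(256)) * 16))
```

On a drive that has seen ordinary use, unallocated space normally contains a mix of old file fragments, so a fraction near 1.0 across a large area stands out.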

10.6.3 Log Wiping (Clearing Tracks)

Attackers may try to cover their tracks by clearing Windows Event Logs.

The Command: wevtutil cl Security
The Artifact: Paradoxically, clearing the log creates a new event. Event ID 1102 ("The audit log was cleared"), written to the Security log, is a massive red flag indicating intentional anti-forensic activity. Clearing other logs records Event ID 104 in the System log.

10.7 Timestomping: Manipulating History

Suspects may alter the "Date Modified" timestamp to provide an alibi.

10.7.1 The Two Timestamps of NTFS

NTFS stores timestamps in two MFT attributes:

$STANDARD_INFORMATION ($SI): Exposed to the user (and easily modified).
$FILE_NAME ($FN): Managed by the kernel (difficult to modify).

10.7.2 Detecting the Manipulation

If an attacker uses a tool to backdate a file to 2010, they usually only change the $SI attribute. If you see a file where $SI says 2010 but $FN says 2023, you have a strong indicator of Timestomping that should be corroborated with other artifacts.
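Once an MFT parser has exported both sets of timestamps, the comparison is a one-line check. The Python sketch below assumes the creation times have already been parsed into datetime objects; the function name and the exact dates are hypothetical, chosen to mirror the 2010 versus 2023 example.

```python
from datetime import datetime


def looks_timestomped(si_created: datetime, fn_created: datetime) -> bool:
    """Flag a file whose $SI creation time is earlier than its $FN creation time."""
    return si_created < fn_created


# Made-up values mirroring the example in the text.
si = datetime(2010, 1, 1, 0, 0, 0)
fn = datetime(2023, 6, 15, 14, 32, 7)
print(looks_timestomped(si, fn))
```

Running this check across every MFT record produces a shortlist for manual review rather than a verdict.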

10.8 Chapter Summary

Anti-forensics transforms a digital investigation from a treasure hunt into a strategic battle. While tools like encryption, steganography, and wiping can effectively destroy or hide data, the very act of using them leaves a forensic footprint.

We learned that Steganography uses carriers like images and audio to hide data, often protecting it with encryption. We explored Password Cracking, understanding that tools like John the Ripper (CPU) and Hashcat (GPU) require us to first identify the hash type and then launch dictionary or brute-force attacks. Finally, we examined how criminals hide data in ADS, Slack Space, or by Timestomping metadata, and how investigators can leverage anomalies in the file system to uncover the truth.

Key Terms Review

Anti-Forensics: Techniques to compromise digital evidence.
Entropy: Measure of randomness; used to detect encryption.
Steganography: Hiding data within carrier files (Image, Audio, Video).
LSB (Least Significant Bit): A common technique for embedding data in media files.
John the Ripper / Hashcat: Standard tools for forensic password cracking.
Dictionary Attack: Attempting to crack passwords using a wordlist (e.g., RockYou.txt).
ADS (Alternate Data Streams): Hiding data behind a filename in NTFS.
Event ID 1102: Indicator that the Windows Security audit log was cleared.