Fuzzing Against the Machine

You're reading from Fuzzing Against the Machine: Automate vulnerability research with emulated IoT devices on QEMU

Product type Paperback
Published in May 2023
Publisher Packt
ISBN-13 9781804614976
Length 238 pages
Edition 1st Edition
Authors (3):
  • Antonio Nappa
  • Eduardo Blázquez
  • Eduardo Blazquez
Table of Contents (18 chapters)

  • Preface
  • Part 1: Foundations
    • Chapter 1: Who This Book is For
    • Chapter 2: History of Emulation
    • Chapter 3: QEMU From the Ground
  • Part 2: Emulation and Fuzzing
    • Chapter 4: QEMU Execution Modes and Fuzzing
    • Chapter 5: A Famous Refrain: AFL + QEMU = CVEs
    • Chapter 6: Modifying QEMU for Basic Instrumentation
  • Part 3: Advanced Concepts
    • Chapter 7: Real-Life Case Study: Samsung Exynos Baseband
    • Chapter 8: Case Study: OpenWrt Full-System Fuzzing
    • Chapter 9: Case Study: OpenWrt System Fuzzing for ARM
    • Chapter 10: Finally Here: iOS Full System Fuzzing
    • Chapter 11: Deus Ex Machina: Fuzzing Android Libraries
    • Chapter 12: Conclusion and Final Remarks
  • Index
  • Other Books You May Enjoy

Getting a primer

Vulnerability analysis and software exploitation are related and well-known topics in cybersecurity. The purpose of this book is to look for security bugs in embedded firmware through emulation, and then to search for a way to exploit (take advantage of) these vulnerabilities. There are various types of security flaws. The best known, and often exploitable, is the buffer overflow, where a missing or incorrect bounds check lets user-provided data overrun a program buffer and, in some cases, allows that user to execute code inside the process memory. In the cybersecurity world, the code that’s injected and run by exploiting such a vulnerability is known as shellcode. While shellcode often spawns a shell to run commands, this isn’t the only option: an attacker can be creative and execute other code to gain a foothold on a machine.
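To make the idea of a missing bounds check concrete, here is a minimal sketch in Python. It is only a toy model: real overflows happen in languages such as C, and the memory layout, the sizes, and the function names (new_memory, unsafe_copy, safe_copy, return_address) are assumptions made for illustration.

```python
# Toy model of process memory: an 8-byte buffer followed directly by a
# 4-byte saved "return address". Layout and sizes are illustrative only.
BUF_SIZE = 8
RET_SIZE = 4


def new_memory() -> bytearray:
    """Create the toy memory with a legitimate return address stored in it."""
    memory = bytearray(BUF_SIZE + RET_SIZE)
    memory[BUF_SIZE:] = (0x00401000).to_bytes(RET_SIZE, "little")
    return memory


def unsafe_copy(memory: bytearray, user_data: bytes) -> None:
    """Copy user data into the buffer without checking its length (the bug)."""
    for i, byte in enumerate(user_data):
        memory[i] = byte  # nothing stops i from going past BUF_SIZE


def safe_copy(memory: bytearray, user_data: bytes) -> None:
    """Same copy, but with the bounds check that the buggy version lacks."""
    if len(user_data) > BUF_SIZE:
        raise ValueError("input is larger than the buffer")
    memory[:len(user_data)] = user_data


def return_address(memory: bytearray) -> int:
    """Read back the value stored right after the buffer."""
    return int.from_bytes(memory[BUF_SIZE:], "little")


if __name__ == "__main__":
    memory = new_memory()
    # 8 filler bytes fill the buffer; the next 4 overwrite the return address.
    attack = b"A" * BUF_SIZE + (0xDEADBEEF).to_bytes(RET_SIZE, "little")
    unsafe_copy(memory, attack)
    print(hex(return_address(memory)))  # prints 0xdeadbeef
```

In a real program, the overwritten value would decide where execution continues, which is how injected shellcode ends up running.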

Not all bugs are created equal

A bug is a software flaw. In many cases, bugs do not lead to security breaches or exploits. They just exhibit a behavior that is not expected by the user or the developer. In other cases, a bug may also be a software vulnerability, meaning that it may generate security issues, such as data leakages, denial of service, or exploitation. Exploiting a vulnerability normally leads to privilege escalation or to taking control of the CPU to execute arbitrary code.

Since the first document explaining this process, Smashing the Stack for Fun and Profit, was published (http://phrack.org/issues/49/14.html#article), many countermeasures have been created to stop attackers from exploiting a vulnerability once one is found in a program. These protections help prevent the mass exploitation of buffer overflow vulnerabilities. However, many other flaws exist:

  • Program logic errors (a mistake during the development phase can cause a program to end up in an undefined/unexpected state)
  • Buffer overread (where an improper bounds check allows an attacker to read program data they are not authorized to access)
  • Format string vulnerabilities (https://www.win.tue.nl/~aeb/linux/hh/formats-teso.html)
  • Heap overflow (an evolution of the buffer overflow in the heap), and many other kinds of vulnerabilities
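As an illustration of one item in this list, the following sketch models a buffer overread in Python. The memory layout, the SECRET value, and the echo_unsafe/echo_safe functions are hypothetical; the point is only that trusting a length supplied by the user lets a read run past the user's own data.

```python
# Hypothetical secret that happens to sit right after the user's data in memory.
SECRET = b"hunter2"


def build_memory(payload: bytes) -> bytes:
    """Toy memory layout: the user's payload followed by unrelated secret data."""
    return payload + SECRET


def echo_unsafe(payload: bytes, claimed_len: int) -> bytes:
    """Echo back claimed_len bytes, trusting the length sent by the user (the bug)."""
    memory = build_memory(payload)
    return memory[:claimed_len]


def echo_safe(payload: bytes, claimed_len: int) -> bytes:
    """Echo back the data only if the claimed length fits the real payload."""
    if claimed_len > len(payload):
        raise ValueError("claimed length exceeds payload size")
    return payload[:claimed_len]


if __name__ == "__main__":
    # The user sends 2 bytes but claims to have sent 9.
    print(echo_unsafe(b"hi", 9))  # prints b'hihunter2'
```

Unlike an overflow, nothing is overwritten here: the attacker simply receives data that was never meant to leave the process.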

Searching for these vulnerabilities manually is hard and tedious because of the time it can take to find one. However, there are techniques that help security researchers automatically discover some types of vulnerabilities, and in this book, we will cover those that involve a tool called a fuzzer. These tools target flaws such as the incorrect handling of user-provided data, searching for an input that makes a program crash. The fuzzer runs the program repeatedly with different inputs and monitors it to detect when it crashes. To improve the success of the fuzzing process, fuzzers take a set of initial inputs and mutate them (for example, by flipping some bits in a structured file) to produce unusual inputs that the program cannot handle and that make it crash. Such a crash may or may not be exploitable (sadly, not all vulnerabilities are exploitable).
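The loop described above (mutate an input, run the program, watch for a crash) can be sketched in a few lines of Python. This is a deliberately naive illustration, not how a production fuzzer is built: parse_record is a hypothetical target with a planted bug, and the only mutation strategy is flipping one bit at a time in a single seed input.

```python
def parse_record(data: bytes) -> int:
    """Hypothetical target: a record is b'R', a length byte, then a payload."""
    if len(data) < 2 or data[0] != ord("R"):
        return 0  # not a record; rejected cleanly
    length = data[1]
    payload = data[2:]
    if length == 0:
        return 0
    # Bug: the length field is trusted instead of being checked against
    # the real payload size, so a large value raises IndexError.
    return payload[length - 1]


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit flipped."""
    mutated = bytearray(data)
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(mutated)


def fuzz(seed: bytes) -> list:
    """Try every single-bit mutation of the seed and collect crashing inputs."""
    crashes = []
    for bit_index in range(len(seed) * 8):
        candidate = flip_bit(seed, bit_index)
        try:
            parse_record(candidate)
        except Exception:  # any unhandled exception counts as a crash here
            crashes.append(candidate)
    return crashes


if __name__ == "__main__":
    for crashing_input in fuzz(b"R\x03abc"):
        print(crashing_input)
```

Here, a Python exception stands in for a crash. With native code, the fuzzer would instead watch for signals such as a segmentation fault.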

The utility belt

We have already briefly mentioned what we’ll see in each part of this book, as well as the tools we will use throughout. In this section, we will go a step further, providing a better overview of these tools and installing them (we will not deep-dive into them here, as they will be covered in later chapters).

Git, Python3, build-essential

Git is a version control system that keeps track of code modifications and allows us to store our code on a remote server. One of the most popular hosting services for Git repositories is GitHub, where anyone can upload their artifacts and share them with other people.

Python was created in 1991 by Guido van Rossum and has exploded as a prototyping language in the last decade thanks to the myriad of libraries written for it. Without any doubt, Python represents a milestone in computer science because it made programming accessible and readable to everyone. The build-essential package is a basic collection of packages that help compile software on Ubuntu/Debian Linux distributions. Often, Python3 comes preinstalled, and git can be installed with a package manager; for example:

  • Arch: pacman -S git python make gcc cmake (on Arch, g++ ships as part of the gcc package)
  • Debian/Ubuntu: apt-get install git python3 build-essential
  • RHEL/CentOS: yum install git python3 make gcc gcc-c++ cmake
    • Also, for build essentials in RHEL/CentOS, you can use dnf group install "C Development Tools and Libraries" "Development Tools"
  • SUSE: zypper install git python3 make gcc gcc-c++ cmake
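Once the packages are installed, a short script can confirm that the tools are actually reachable from the shell. The following is a small convenience sketch, not part of any of the tools above: the REQUIRED_TOOLS list and the missing_tools helper are assumptions, and the list can be adjusted to match your distribution.

```python
import shutil

# Executables we expect to find on PATH after the installation step.
# This list is an assumption; adjust it to your needs.
REQUIRED_TOOLS = ["git", "python3", "make", "gcc", "g++", "cmake"]


def missing_tools(tools=REQUIRED_TOOLS) -> list:
    """Return the names of the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```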

QEMU

QEMU is a piece of software that lets users emulate different systems, as well as some system peripherals. QEMU relies on binary translation: it transforms the instructions of the guest system or binary into an intermediate representation (IR), and then either compiles that IR into instructions supported by the host architecture (just-in-time mode, faster) or executes the IR in its own interpreter (interpreter mode, slower).
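The difference between the two modes can be illustrated with a toy translator written in Python. None of this reflects QEMU's real IR or API: the three-instruction guest ISA, the IR operations, and the translate, interpret, and compile_block functions are invented for this sketch, and generating a Python function stands in for emitting host machine code.

```python
def translate(guest_program):
    """Translate toy guest instructions into a toy intermediate representation."""
    ir = []
    for insn in guest_program:
        opcode = insn[0]
        if opcode == "MOVI":            # MOVI rd, imm
            _, rd, imm = insn
            ir.append(("const", rd, imm))
        elif opcode == "ADD":           # ADD rd, rs1, rs2
            _, rd, rs1, rs2 = insn
            ir.append(("add", rd, rs1, rs2))
        elif opcode == "DOUBLE":        # DOUBLE rd has no IR twin: lower to add
            _, rd = insn
            ir.append(("add", rd, rd, rd))
        else:
            raise ValueError(f"unknown guest instruction: {opcode}")
    return ir


def interpret(ir, regs):
    """Interpreter mode: decode and execute every IR operation on each run."""
    for op in ir:
        if op[0] == "const":
            regs[op[1]] = op[2]
        elif op[0] == "add":
            regs[op[1]] = regs[op[2]] + regs[op[3]]
    return regs


def compile_block(ir):
    """JIT-like mode: turn the IR into host code once, then call it directly."""
    lines = ["def block(regs):"]
    for op in ir:
        if op[0] == "const":
            lines.append(f"    regs[{op[1]!r}] = {op[2]!r}")
        elif op[0] == "add":
            lines.append(f"    regs[{op[1]!r}] = regs[{op[2]!r}] + regs[{op[3]!r}]")
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)   # here the "host code" is Python itself
    return namespace["block"]


GUEST = [("MOVI", "r0", 5), ("MOVI", "r1", 7), ("ADD", "r2", "r0", "r1"), ("DOUBLE", "r2")]

if __name__ == "__main__":
    ir = translate(GUEST)
    print(interpret(ir, {}))            # interpreted result
    print(compile_block(ir)({}))        # compiled result, same register values
```

Both paths produce the same register state. The compiled block pays the translation cost once and skips the per-operation decoding on every later run, which is where the speed advantage of just-in-time mode comes from.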

To use QEMU, we have two options. The first and simplest one is to use a package manager. The command that’s used will depend on the system that we are using. If we look at the QEMU web page, we will see that they provide different sets of commands, depending on the system:

  • Arch: pacman -S qemu
  • Debian/Ubuntu: apt-get install qemu
  • RHEL/CentOS: yum install qemu-kvm
  • SUSE: zypper install qemu

In our case, we will make use of an Ubuntu system, so we will use the commands for Debian/Ubuntu. Therefore, we will run the command as the superuser: sudo apt-get install qemu or sudo apt install qemu.

The other option is to download the QEMU source, either from its download web page or directly from git. In both cases, we will compile and install the tool ourselves. This option can be a better fit if we want to choose what gets built and installed.

If we decide to download from its web page (the latest version at the time of writing is 6.2), we can use the following commands:

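wget https://download.qemu.org/qemu-6.2.0.tar.xz   # download the release tarball
tar xvJf qemu-6.2.0.tar.xz                         # extract it
cd qemu-6.2.0
./configure                                        # configure the build with default options
make                                               # compile QEMU
sudo make install                                  # the default install prefix usually requires root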

Alternatively, if we want to download using git (this will fetch the latest development version from the master branch), we can do the following:

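git clone https://gitlab.com/qemu-project/qemu.git   # fetch the development tree
cd qemu
git submodule init                                   # register the bundled submodules
git submodule update --recursive                     # fetch them
./configure                                          # configure the build with default options
make                                                 # compile QEMU
sudo make install                                    # the default install prefix usually requires root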

AFL/AFL++

American Fuzzy Lop (AFL) (https://lcamtuf.coredump.cx/afl/) has become the de facto standard for program fuzzing and vulnerability research. Michal Zalewski (https://lcamtuf.coredump.cx/silence/), a famous Google security engineer, developed AFL for internal purposes at Google, a company that owns billions of lines of code and, among them, potentially thousands of vulnerabilities. AFL follows a genetic algorithm approach: it mutates the initial program inputs and keeps the mutations that trigger new behavior in the target, so the inputs evolve over time. This is what makes AFL smart. Moreover, it offers a suite of tools for analyzing the crashes generated by the program being fuzzed. AFL has helped users find thousands of vulnerabilities, even in famous software such as MySQL, Adobe Reader, VLC, and IDA Pro, as well as several browsers.
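The evolutionary idea can be sketched with a toy feedback loop in Python. This is not AFL's actual algorithm or code: the target function, the branch names, and the exhaustive single-byte mutation strategy are assumptions chosen to keep the example small and deterministic. What it does share with AFL is the principle of keeping any input that reaches code not seen before. The sketch also assumes that the seed itself does not crash the target.

```python
def target(data: bytes) -> set:
    """Hypothetical program under test; returns the set of branches it executed."""
    covered = {"entry"}
    if data[:1] == b"F":
        covered.add("saw_F")
        if data[1:2] == b"U":
            covered.add("saw_FU")
            if data[2:3] == b"Z":
                covered.add("saw_FUZ")
                if data[3:4] == b"Z":
                    raise RuntimeError("crash: buggy branch reached")
    return covered


def mutations(data: bytes):
    """Yield every input that differs from data in exactly one byte position."""
    for pos in range(len(data)):
        for value in range(256):
            yield data[:pos] + bytes([value]) + data[pos + 1:]


def fuzz(seed: bytes):
    """Coverage-guided loop: inputs that reach new branches join the queue."""
    queue = [seed]
    seen = set(target(seed))      # assumes the seed itself does not crash
    index = 0
    while index < len(queue):
        parent = queue[index]
        index += 1
        for child in mutations(parent):
            try:
                covered = target(child)
            except RuntimeError:
                return child      # first crashing input found
            if not covered <= seen:
                seen |= covered   # new behavior: remember it and
                queue.append(child)  # keep the input for further mutation
    return None                   # queue exhausted without a crash


if __name__ == "__main__":
    print(fuzz(b"AAAA"))
```

Because each partial match (F, then FU, then FUZ) is kept and mutated further, the loop reaches the crashing input after a few thousand executions. Guessing all four bytes blindly would mean searching a space of 256^4 inputs.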

AFL++ has been presented as an evolution of AFL and includes patches to hook into a full system emulator (QEMU) or to instrument a binary (QEMU user mode). In this book, we will start with AFL++ and apply some patches that come from other projects to show how flexible it is to have a fuzzing suite embedded with an emulator to hunt for vulnerabilities in embedded firmware. Throughout this book, we will provide the installation instructions needed for each specific exercise. The following is an example of how to install AFL:

git clone https://github.com/google/AFL.git
cd AFL && make

The Ghidra disassembler

Ghidra is a powerful free alternative to IDA Pro. This software was developed by the NSA, which released it publicly in 2019. It’s extremely portable since its UI and most of the disassembler internals are written in Java, so it is not dependent on any specific architecture. However, some internal components are compiled natively for the different architectures. This marks a huge difference from other disassemblers because the Java UI makes Ghidra very versatile. Also, Ghidra includes a free decompiler for various architectures, which will be useful when analyzing difficult code.

Installing Ghidra

First of all, as stated previously, Ghidra is written in Java, so we will need to install the Java 11 JDK (Java Development Kit).

For Linux, follow these steps:

  1. Download the JDK:
    wget https://corretto.aws/downloads/latest/amazon-corretto-11-x64-linux-jdk.tar.gz
  2. Extract the JDK distribution (the .tar.gz file) to your desired location:
    tar xvf <JDK distribution .tar.gz>
  3. Open ~/.bashrc with an editor of your choice; for example:
    vi ~/.bashrc
  4. At the very end of the file, add the JDK’s bin directory to the PATH variable:
    export PATH=<path of extracted JDK dir>/bin:$PATH
  5. Save the file.
  6. Restart any open Terminal windows for the changes to take effect.

Once the JDK is installed, we will go to https://ghidra-sre.org/ and download the ghidra_10.1.2_PUBLIC_20220125.zip file, or a more recent version if there is one. Unzip the archive and execute ghidraRun to start the application. Ghidra’s commands stay consistent across releases, so newer versions should match what we see in this book. If you are hungry for knowledge about this tool, we recommend reading Ghidra Software Reverse Engineering for Beginners from Packt (https://www.packtpub.com/product/ghidra-software-reverse-engineering-for-beginners/9781800207974).

We will also install the GNU Debugger, gdb, with some plugins and support for different architectures. This tool can help you analyze executables while they’re running, whereas Ghidra is mostly used for static analysis.

GDB Multiarch and GEF/Pwndbg

GDB is the default debugger on Linux systems. It is a command-line debugger, and we can use it to debug binaries built for architectures different from our current one. To do this, we need to install the multiarch version. We will also install a couple of plugins that improve the tool’s interface since gdb without plugins can be tough at the beginning. These plugins display the stack, the registers, and the assembly code at every step. Throughout this book, we will learn how to use gdb for debugging purposes. The installation commands for the different environments are as follows:

  • Arch: pacman -S gdb-multiarch
  • Debian/Ubuntu: apt-get install gdb-multiarch
  • SUSE: zypper install gdb-multiarch

Then, download or clone https://github.com/apogiatzis/gdb-peda-pwndbg-gef and, from its main directory, execute install.sh.

Avatar2

The Eurecom institute in the south of France often hosts very talented students and researchers. This is where Avatar2 was designed by Marius Muench, Dario Nisi, Aurélien Francillon, and Davide Balzarotti. It’s a Python framework that helps orchestrate embedded systems with the help of QEMU. It contains code to patch memory, emulate peripherals, and mock interfaces to bring firmware to a specific state. Some recent Samsung baseband vulnerabilities (disclosed in September 2020) were discovered thanks to Avatar2, AFL, and QEMU. These vulnerabilities were extremely critical and led to remote code execution within the communication processor (CP) of Samsung phones.
