Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Learning Linux Binary Analysis
Learning Linux Binary Analysis

Learning Linux Binary Analysis: Learning Linux Binary Analysis

Arrow left icon
Profile Icon "elfmaster" O'Neill
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (10 Ratings)
Paperback Feb 2016 282 pages 1st Edition
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
Arrow left icon
Profile Icon "elfmaster" O'Neill
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8 (10 Ratings)
Paperback Feb 2016 282 pages 1st Edition
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Learning Linux Binary Analysis

Chapter 1. The Linux Environment and Its Tools

In this chapter, we will be focusing on the Linux environment as it pertains to our focus throughout this book. Since this book is focused about Linux binary analysis, it makes sense to utilize the native environment tools that come with Linux and to which everyone has access. Linux comes with the ubiquitous binutils already installed, but they can be found at http://www.gnu.org/software/binutils/. They contain a huge selection of tools that are handy for binary analysis and hacking. This is not another book on using IDA Pro. IDA is hands-down the best universal software for reverse engineering of binaries, and I would encourage its use as needed, but we will not be using it in this book. Instead, you will acquire the skills to hop onto virtually any Linux system and have an idea on how to begin hacking binaries with an environment that is already accessible. You can therefore learn to appreciate the beauty of Linux as a true hackers' environment for which there are many free tools available. Throughout the book, we will demonstrate the use of various tools and give a recap on how to use them as we progress through each chapter. Meanwhile, however, let this chapter serve as a primer or reference to these tools and tips within the Linux environment. If you are already very familiar with the Linux environment and its tools for disassembling, debugging, and parsing of ELF files, then you may simply skip this chapter.

Linux tools

Throughout this book, we will be using a variety of free tools that are accessible by anyone. This section will give a brief synopsis of some of these tools for you.

GDB

GNU Debugger (GDB) is not only good to debug buggy applications. It can also be used to learn about a program's control flow, change a program's control flow, and modify the code, registers, and data structures. These tasks are common for a hacker who is working to exploit a software vulnerability or is unraveling the inner workings of a sophisticated virus. GDB works on ELF binaries and Linux processes. It is an essential tool for Linux hackers and will be used in various examples throughout this book.

Objdump from GNU binutils

Object dump (objdump) is a simple and clean solution for a quick disassembly of code. It is great for disassembling simple and untampered binaries, but will show its limitations quickly when attempting to use it for any real challenging reverse engineering tasks, especially against hostile software. Its primary weakness is that it relies on the ELF section headers and doesn't perform control flow analysis, which are both limitations that greatly reduce its robustness. This results in not being able to correctly disassemble the code within a binary, or even open the binary at all if there are no section headers. For many conventional tasks, however, it should suffice, such as when disassembling common binaries that are not fortified, stripped, or obfuscated in any way. It can read all common ELF types. Here are some common examples of how to use objdump:

  • View all data/code in every section of an ELF file:
    objdump -D <elf_object>
    
  • View only program code in an ELF file:
    objdump -d <elf_object>
    
  • View all symbols:
    objdump -tT <elf_object>
    

We will be exploring objdump and other tools in great depth during our introduction to the ELF format in Chapter 2, The ELF Binary Format.

Objcopy from GNU binutils

Object copy (Objcopy) is an incredibly powerful little tool that we cannot summarize with a simple synopsis. I recommend that you read the manual pages for a complete description. Objcopy can be used to analyze and modify ELF objects of any kind, although some of its features are specific to certain types of ELF objects. Objcopy is often times used to modify or copy an ELF section to or from an ELF binary.

To copy the .data section from an ELF object to a file, use this line:

objcopy –only-section=.data <infile> <outfile>

The objcopy tool will be demonstrated as needed throughout the rest of this book. Just remember that it exists and can be a very useful tool for the Linux binary hacker.

strace

System call trace (strace) is a tool that is based on the ptrace(2) system call, and it utilizes the PTRACE_SYSCALL request in a loop to show information about the system call (also known as syscalls) activity in a running program as well as signals that are caught during execution. This program can be highly useful for debugging, or just to collect information about what syscalls are being called during runtime.

This is the strace command used to trace a basic program:

strace /bin/ls -o ls.out

The strace command used to attach to an existing process is as follows:

strace -p <pid> -o daemon.out

The initial output will show you the file descriptor number of each system call that takes a file descriptor as an argument, such as this:

SYS_read(3, buf, sizeof(buf));

If you want to see all of the data that was being read into file descriptor 3, you can run the following command:

strace -e read=3 /bin/ls

You may also use -e write=fd to see written data. The strace tool is a great little tool, and you will undoubtedly find many reasons to use it.

ltrace

library trace (ltrace) is another neat little tool, and it is very similar to strace. It works similarly, but it actually parses the shared library-linking information of a program and prints the library functions being used.

Basic ltrace command

You may see system calls in addition to library function calls with the -S flag. The ltrace command is designed to give more granular information, since it parses the dynamic segment of the executable and prints actual symbols/functions from shared and static libraries:

ltrace <program> -o program.out

ftrace

Function trace (ftrace) is a tool designed by me. It is similar to ltrace, but it also shows calls to functions within the binary itself. There was no other tool I could find publicly available that could do this in Linux, so I decided to code one. This tool can be found at https://github.com/elfmaster/ftrace. A demonstration of this tool is given in the next chapter.

readelf

The readelf command is one of the most useful tools around for dissecting ELF binaries. It provides every bit of the data specific to ELF necessary for gathering information about an object before reverse engineering it. This tool will be used often throughout the book to gather information about symbols, segments, sections, relocation entries, dynamic linking of data, and more. The readelf command is the Swiss Army knife of ELF. We will be covering it in depth as needed, during Chapter 2, The ELF Binary Format, but here are a few of its most commonly used flags:

  • To retrieve a section header table:
    readelf -S <object>
    
  • To retrieve a program header table:
    readelf -l <object>
    
  • To retrieve a symbol table:
    readelf -s <object>
    
  • To retrieve the ELF file header data:
    readelf -e <object>
    
  • To retrieve relocation entries:
    readelf -r <object>
    
  • To retrieve a dynamic segment:
    readelf -d <object>
    

ERESI – The ELF reverse engineering system interface

ERESI project (http://www.eresi-project.org) contains a suite of many tools that are a Linux binary hacker's dream. Unfortunately, many of them are not kept up to date and aren't fully compatible with 64-bit Linux. They do exist for a variety of architectures, however, and are undoubtedly the most innovative single collection of tools for the purpose of hacking ELF binaries that exist today. Because I personally am not really familiar with using the ERESI project's tools, and because they are no longer kept up to date, I will not be exploring their capabilities within this book. However, be aware that there are two Phrack articles that demonstrate the innovation and powerful features of the ERESI tools:

Useful devices and files

Linux has many files, devices, and /proc entries that are very helpful for the avid hacker and reverse engineer. Throughout this book, we will be demonstrating the usefulness of many of these files. Here is a description of some of the commonly used ones throughout the book.

/proc/<pid>/maps

/proc/<pid>/maps file contains the layout of a process image by showing each memory mapping. This includes the executable, shared libraries, stack, heap, VDSO, and more. This file is critical for being able to quickly parse the layout of a process address space and is used more than once throughout this book.

/proc/kcore

The /proc/kcore is an entry in the proc filesystem that acts as a dynamic core file of the Linux kernel. That is, it is a raw dump of memory that is presented in the form of an ELF core file that can be used by GDB to debug and analyze the kernel. We will explore /proc/kcore in depth in Chapter 9, Linux /proc/kcore Analysis.

/boot/System.map

This file is available on almost all Linux distributions and is very useful for kernel hackers. It contains every symbol for the entire kernel.

/proc/kallsyms

The kallsyms is very similar to System.map, except that it is a /proc entry that means that it is maintained by the kernel and is dynamically updated. Therefore, if any new LKMs are installed, the symbols will be added to /proc/kallsyms on the fly. The /proc/kallsyms contains at least most of the symbols in the kernel and will contain all of them if specified in the CONFIG_KALLSYMS_ALL kernel config.

/proc/iomem

The iomem is a useful proc entry as it is very similar to /proc/<pid>/maps, but for all of the system memory. If, for instance, you want to know where the kernel's text segment is mapped in the physical memory, you can search for the Kernel string and you will see the code/text segment, the data segment, and the bss segment:

  $ grep Kernel /proc/iomem
  01000000-016d9b27 : Kernel code
  016d9b28-01ceeebf : Kernel data
  01df0000-01f26fff : Kernel bss

ECFS

Extended core file snapshot (ECFS) is a special core dump technology that was specifically designed for advanced forensic analysis of a process image. The code for this software can be found at https://github.com/elfmaster/ecfs. Also, Chapter 8, ECFS – Extended Core File Snapshot Technology, is solely devoted to explaining what ECFS is and how to use it. For those of you who are into advanced memory forensics, you will want to pay close attention to this.

Linker-related environment points

The dynamic loader/linker and linking concepts are inescapable components involved in the process of program linking and execution. Throughout this book, you will learn a lot about these topics. In Linux, there are quite a few ways to alter the dynamic linker's behavior that can serve the binary hacker in many ways. As we move through the book, you will begin to understand the process of linking, relocations, and dynamic loading (program interpreter). Here are a few linker-related attributes that are useful and will be used throughout the book.

The LD_PRELOAD environment variable

The LD_PRELOAD environment variable can be set to specify a library path that should be dynamically linked before any other libraries. This has the effect of allowing functions and symbols from the preloaded library to override the ones from the other libraries that are linked afterwards. This essentially allows you to perform runtime patching by redirecting shared library functions. As we will see in later chapters, this technique can be used to bypass anti-debugging code and for userland rootkits.

The LD_SHOW_AUXV environment variable

This environment variable tells the program loader to display the program's auxiliary vector during runtime. The auxiliary vector is information that is placed on the program's stack (by the kernel's ELF loading routine), with information that is passed to the dynamic linker with certain information about the program. We will examine this much more closely in Chapter 3, Linux Process Tracing, but the information might be useful for reversing and debugging. If, for instance, you want to get the memory address of the VDSO page in the process image (which can also be obtained from the maps file, as shown earlier) you have to look for AT_SYSINFO.

Here is an example of the auxiliary vector with LD_SHOW_AUXV:

$ LD_SHOW_AUXV=1 whoami
AT_SYSINFO: 0xb7779414
AT_SYSINFO_EHDR: 0xb7779000
AT_HWCAP: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR:  0x8048034
AT_PHENT: 32
AT_PHNUM: 9
AT_BASE:  0xb777a000
AT_FLAGS: 0x0
AT_ENTRY: 0x8048eb8
AT_UID:  1000
AT_EUID: 1000
AT_GID:  1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0xbfb4ca2b
AT_EXECFN: /usr/bin/whoami
AT_PLATFORM: i686
elfmaster

The auxiliary vector will be covered in more depth in Chapter 2, The ELF Binary Format.

Linker scripts

Linker scripts are a point of interest to us because they are interpreted by the linker and help shape a program's layout with regard to sections, memory, and symbols. The default linker script can be viewed with ld -verbose.

The ld linker program has a complete language that it interprets when it is taking input files (such as relocatable object files, shared libraries, and header files), and it uses this language to determine how the output file, such as an executable program, will be organized. For instance, if the output is an ELF executable, the linker script will help determine what the layout will be and what sections will exist in which segments. Here is another instance: the .bss section is always at the end of the data segment; this is determined by the linker script. You might be wondering how this is interesting to us. Well! For one, it is important to have some insights into the linking process during compile time. The gcc relies on the linker and other programs to perform this task, and in some instances, it is important to be able to have control over the layout of the executable file. The ld command language is quite an in-depth language and is beyond the scope of this book, but it is worth checking out. And while reverse engineering executables, remember that common segment addresses may sometimes be modified, and so can other portions of the layout. This indicates that a custom linker script is involved. A linker script can be specified with gcc using the -T flag. We will look at a specific example of using a linker script in Chapter 5, Linux Binary Protection.

Summary

We just touched upon some fundamental aspects of the Linux environment and the tools that will be used most commonly in the demonstrations from each chapter. Binary analysis is largely about knowing the tools and resources that are available for you and how they all fit together. We only briefly covered the tools, but we will get an opportunity to emphasize the capabilities of each one as we explore the vast world of Linux binary hacking in the following chapters. In the next chapter, we will delve into the internals of the ELF binary format and cover many interesting topics, such as dynamic linking, relocations, symbols, sections, and more.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Grasp the intricacies of the ELF binary format of UNIX and Linux
  • Design tools for reverse engineering and binary forensic analysis
  • Insights into UNIX and Linux memory infections, ELF viruses, and binary protection schemes

Description

Learning Linux Binary Analysis is packed with knowledge and code that will teach you the inner workings of the ELF format, and the methods used by hackers and security analysts for virus analysis, binary patching, software protection and more. This book will start by taking you through UNIX/Linux object utilities, and will move on to teaching you all about the ELF specimen. You will learn about process tracing, and will explore the different types of Linux and UNIX viruses, and how you can make use of ELF Virus Technology to deal with them. The latter half of the book discusses the usage of Kprobe instrumentation for kernel hacking, code patching, and debugging. You will discover how to detect and disinfect kernel-mode rootkits, and move on to analyze static code. Finally, you will be walked through complex userspace memory infection analysis. This book will lead you into territory that is uncharted even by some experts; right into the world of the computer hacker.

Who is this book for?

If you are a software engineer or reverse engineer and want to learn more about Linux binary analysis, this book will provide you with all you need to implement solutions for binary analysis in areas of security, forensics, and antivirus. This book is great for both security enthusiasts and system level engineers. Some experience with the C programming language and the Linux command line is assumed.

What you will learn

  • Explore the internal workings of the ELF binary format
  • Discover techniques for UNIX Virus infection and analysis
  • Work with binary hardening and software anti-tamper methods
  • Patch executables and process memory
  • Bypass anti-debugging measures used in malware
  • Perform advanced forensic analysis of binaries
  • Design ELF-related tools in the C language
  • Learn to operate on memory with ptrace

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Feb 29, 2016
Length: 282 pages
Edition : 1st
Language : English
ISBN-13 : 9781782167105
Category :
Languages :
Concepts :
Tools :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Feb 29, 2016
Length: 282 pages
Edition : 1st
Language : English
ISBN-13 : 9781782167105
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 152.97
Practical Linux Security Cookbook
$48.99
Learning Linux Shell Scripting
$54.99
Learning Linux Binary Analysis
$48.99
Total $ 152.97 Stars icon

Table of Contents

10 Chapters
1. The Linux Environment and Its Tools Chevron down icon Chevron up icon
2. The ELF Binary Format Chevron down icon Chevron up icon
3. Linux Process Tracing Chevron down icon Chevron up icon
4. ELF Virus Technology – Linux/Unix Viruses Chevron down icon Chevron up icon
5. Linux Binary Protection Chevron down icon Chevron up icon
6. ELF Binary Forensics in Linux Chevron down icon Chevron up icon
7. Process Memory Forensics Chevron down icon Chevron up icon
8. ECFS – Extended Core File Snapshot Technology Chevron down icon Chevron up icon
9. Linux /proc/kcore Analysis Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8
(10 Ratings)
5 star 90%
4 star 0%
3 star 10%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Yuki Dec 11, 2016
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Fantastic book,Can anyone recommend a Windows PE format book covers similar things?Thanks.
Amazon Verified review Amazon
Kyle Huser Jul 22, 2019
Full star icon Full star icon Full star icon Full star icon Full star icon 5
It was gift
Amazon Verified review Amazon
Dan W Aug 01, 2016
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Chapter 2 is worth the price alone. Engaging discussion of a subject that can be difficult to make interesting.
Amazon Verified review Amazon
SonChakr Jul 09, 2017
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Arguably, one of the best books that a pragmatic developer must read, that is, if he is associated with Unix / Linux platforms.Unlike what the name suggests, the book isn't just useful to a compiler / linker developer or someone merely interested in hacking executable binaries, but points at and describes many tools that can prove instrumental, in triaging defects or even understanding and controlling behavior of (native) applications on the aforesaid platforms.
Amazon Verified review Amazon
Amazon Customer Jun 07, 2016
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Great book
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.