Linker-related environment points
The dynamic loader/linker and linking concepts are inescapable parts of program linking and execution, and you will learn a lot about them throughout this book. Linux offers quite a few ways to alter the dynamic linker's behavior, and these can serve the binary hacker in many ways. As we move through the book, you will come to understand linking, relocations, and dynamic loading (the program interpreter). Here are a few linker-related features that are useful and will be used throughout the book.
The LD_PRELOAD environment variable
The LD_PRELOAD environment variable can be set to the path of a shared library that should be loaded before any other libraries. As a result, functions and symbols from the preloaded library override those of the same name in libraries that are loaded afterwards. This essentially allows you to perform runtime patching by redirecting shared library functions. As we will see in later chapters, this technique can be used to bypass anti-debugging code and to build userland rootkits.
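As a rough illustration of the anti-debugging case, here is a minimal sketch of a preload library that replaces ptrace with a stub that always reports success. The file name fake_ptrace.c and the build commands in the comments are illustrative assumptions, and this is not necessarily the approach taken in later chapters. It only affects dynamically linked programs that call ptrace through the C library; a program that issues the system call directly is not affected.

```c
/* fake_ptrace.c: hypothetical preload library (the name is illustrative).
 *
 * Build: gcc -shared -fPIC -o fake_ptrace.so fake_ptrace.c
 * Use:   LD_PRELOAD=./fake_ptrace.so ./target
 *
 * <sys/ptrace.h> is deliberately not included, so this loose prototype
 * does not clash with the C library's own declaration. All that matters
 * is that the symbol name "ptrace" resolves to this definition first.
 */
long ptrace(int request, ...)
{
    (void)request;  /* the request and its arguments are ignored */
    return 0;       /* always report success to the caller */
}
```

A target that guards itself with a check such as "exit if ptrace(PTRACE_TRACEME, 0, 0, 0) fails" would then see the call succeed even while a debugger is attached, because the dynamic linker binds the call to the stub rather than to the C library's version.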
The LD_SHOW_AUXV environment variable
This environment variable tells the program loader to display the program's auxiliary vector at runtime. The auxiliary vector is a set of entries that the kernel's ELF loading routine places on the program's stack in order to pass the dynamic linker certain information about the program. We will examine this much more closely in Chapter 3, Linux Process Tracing, but the information might be useful for reversing and debugging. If, for instance, you want to get the memory address of the VDSO page in the process image (which can also be obtained from the maps file, as shown earlier), you have to look for AT_SYSINFO_EHDR. The related AT_SYSINFO entry holds the address of the system call entry point inside that page.
Here is an example of the auxiliary vector with LD_SHOW_AUXV:
$ LD_SHOW_AUXV=1 whoami
AT_SYSINFO: 0xb7779414
AT_SYSINFO_EHDR: 0xb7779000
AT_HWCAP: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x8048034
AT_PHENT: 32
AT_PHNUM: 9
AT_BASE: 0xb777a000
AT_FLAGS: 0x0
AT_ENTRY: 0x8048eb8
AT_UID: 1000
AT_EUID: 1000
AT_GID: 1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0xbfb4ca2b
AT_EXECFN: /usr/bin/whoami
AT_PLATFORM: i686
elfmaster
The auxiliary vector will be covered in more depth in Chapter 2, The ELF Binary Format.
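The same entries can also be read from inside a running program. The following is a minimal sketch that assumes glibc 2.16 or later, which provides getauxval() in <sys/auxv.h>; the helper names vdso_base and print_auxv_summary are made up for this illustration. The base address of the VDSO image is reported as AT_SYSINFO_EHDR (0xb7779000 in the listing above).

```c
#include <elf.h>        /* AT_* constants */
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), assumed glibc 2.16 or later */

/* Base address of the VDSO ELF image, or 0 if the entry is absent. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Print a few auxiliary vector entries to the given stream. */
void print_auxv_summary(FILE *out)
{
    fprintf(out, "AT_SYSINFO_EHDR: 0x%lx\n", vdso_base());
    fprintf(out, "AT_PAGESZ:       %lu\n", getauxval(AT_PAGESZ));
    fprintf(out, "AT_ENTRY:        0x%lx\n", getauxval(AT_ENTRY));
    fprintf(out, "AT_UID:          %lu\n", getauxval(AT_UID));
}
```

Calling print_auxv_summary(stdout) from main prints a subset of what LD_SHOW_AUXV displays. getauxval() returns 0 for an entry that is not present, so a zero VDSO base should be read as "not supplied" rather than as a real address.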
Linker scripts
Linker scripts are a point of interest to us because they are interpreted by the linker and help shape a program's layout with regard to sections, memory, and symbols. The default linker script can be viewed with ld -verbose.
The ld linker program interprets a complete language when it processes its input files (such as relocatable object files, static archives, and shared libraries), and it uses this language to determine how the output file, such as an executable program, will be organized. For instance, if the output is an ELF executable, the linker script helps determine what the layout will be and which sections will exist in which segments. As another example, the .bss section sits at the end of the data segment; this is determined by the linker script. You might be wondering how this is interesting to us. For one, it is important to have some insight into the linking process at compile time. gcc relies on the linker and other programs to perform this task, and in some instances it is important to have control over the layout of the executable file. The ld command language is quite an in-depth language and is beyond the scope of this book, but it is worth checking out. While reverse engineering executables, remember that the usual segment addresses, as well as other portions of the layout, may have been modified; this suggests that a custom linker script was involved. A linker script can be specified with gcc using the -T flag. We will look at a specific example of using a linker script in Chapter 5, Linux Binary Protection.
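One way to observe the effect of the default linker script from inside a program is through the symbols etext, edata, and end, which GNU ld's default script provides and which are documented in the end(3) manual page. The sketch below assumes a Linux toolchain using that default layout; the helper names in_bss_region and print_layout, and the scratch buffer, are made up for this illustration. A custom linker script may not define these symbols at all, or may place them differently.

```c
#include <stdint.h>
#include <stdio.h>

/* Provided by the default linker script (see end(3)). Only the addresses
 * of these symbols are meaningful; they are not variables to read. */
extern char etext, edata, end;

static char scratch[4096];  /* uninitialized, so it is placed in .bss */

/* Return 1 if p lies in [edata, end), where the default script puts .bss. */
int in_bss_region(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)&edata && a < (uintptr_t)&end;
}

/* Print the layout markers next to the address of a .bss object. */
void print_layout(FILE *out)
{
    fprintf(out, "etext   %p  (end of text)\n", (void *)&etext);
    fprintf(out, "edata   %p  (end of initialized data)\n", (void *)&edata);
    fprintf(out, "scratch %p  (object in .bss)\n", (void *)scratch);
    fprintf(out, "end     %p  (end of .bss)\n", (void *)&end);
}
```

With the default layout, the address printed for scratch is expected to fall between edata and end, which reflects .bss being placed at the end of the data segment. If a binary under analysis breaks this ordering, that is one hint that a non-default script was used.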