Linker
The first big step in building a C project is compiling all the source files to their corresponding relocatable object files. This step is a necessary step in preparing the final products, but alone, it is not enough, and one more step is still needed. Before going through the details of this step, we need to have a quick look at the possible products (sometimes referred to as artifacts) in a C project.
A C/C++ project can lead to the following products:
- A number of executable files. On Unix-like operating systems these need no particular extension, though .out is a common convention (a.out is the default output name). These files usually have the .exe extension in Microsoft Windows.
- A number of static libraries that usually have the .a extension in most Unix-like operating systems. These files have the .lib extension in Microsoft Windows.
- A number of dynamic libraries or shared object files that usually have the .so extension in most Unix-like operating systems. These files have the .dylib extension in macOS, and .dll in Microsoft Windows.
Relocatable object files are not considered as one of these products; hence, you cannot find them in the preceding list. Relocatable object files are temporary products simply because they only take part in the linking step to produce the preceding products, and after that, we don't need them anymore. The linker component has the sole responsibility of producing the preceding products from the given relocatable object files.
One final and important note about the terminology used: all three of these products are called object files. Therefore, it is best to use the term relocatable before the term object file when referring to an object file produced by the assembler as an intermediate product.
We'll now briefly describe each of the final products. The upcoming chapter is dedicated entirely to object files, and it will discuss these final products in greater detail.
An executable object file can be run as a process. This file usually contains a substantial portion of the features provided by a project. It must have an entry point where execution of the machine-level instructions begins. While the main function is the entry point of a C program, the entry point of an executable object file is platform-dependent, and it is not the main function. The main function will eventually be called after some preparations made by a group of platform-specific instructions, which have been added by the linker as the result of the linking step.
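To make the idea of preparations before main a little more tangible, here is a minimal sketch. It assumes GCC or Clang, whose non-standard constructor attribute asks for a function to be run during program startup, before main is called. The names startup_ran, before_main, and startup_has_run are hypothetical and exist only for this illustration. This is not the platform's startup code itself, merely a way to observe that code does run before main.

```c
/* Set by the startup hook below; stays 0 if the hook never runs. */
static int startup_ran = 0;

/* GCC/Clang extension (an assumption of this sketch): functions marked
 * with the constructor attribute are invoked by the startup sequence
 * before main is entered. */
__attribute__((constructor))
static void before_main(void) {
    startup_ran = 1;
}

/* Lets any caller, including main, check whether the hook has run. */
int startup_has_run(void) {
    return startup_ran;
}
```

If main calls startup_has_run(), it should see a nonzero value, because the startup sequence has already invoked before_main by the time main begins.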
A static library is nothing more than an archive file that contains several relocatable object files. Therefore, a static library file is not produced by the linker directly. Instead, it is produced by the default archive program of the system, which on a Unix-like system is the ar program.
Static libraries are usually linked to other executable files, and they then become part of those executable files. They are the simplest and easiest way to encapsulate a piece of logic so that you can use it at a later point. There is an enormous number of static libraries that exist within an operating system, with each of them containing a specific piece of logic that can be used to access a certain functionality within that operating system.
Shared object files, which have a more complicated structure than a simple archive, are created directly by the linker. They are also used differently: before they can be used, they need to be loaded into a running process at runtime.
This is in contrast to static libraries, which are used at link time to become part of the final executable file. In addition, a single shared object file can be loaded and used by multiple different processes at the same time. As part of the next chapter, we demonstrate how shared object files can be loaded and used by a C program at runtime.
In the upcoming section, we explain what happens in the linking step and what elements are involved and used by the linker to produce the final products, especially executable files.
How does the linker work?
In this section, we are going to explain how the linker component works and what we exactly mean by linking. Suppose that you are building a C project that contains five source files, with the final product being an executable. As part of the build process, you have compiled all the source files, and now you have five relocatable object files. What you now need is a linker to complete the last step and produce the final executable file.
Based on what we have said so far, to put it simply, a linker combines all of the relocatable object files, in addition to specified static libraries, in order to create the final executable object file. However, you would be wrong if you thought that this step was straightforward.
There are a few concerns, which come from the contents of the object files, that need to be considered when we are combining the object files in order to produce a working executable object file. In order to see how the linker works, we need to know how it uses the relocatable object files, and for this purpose, we need to find out what is inside an object file.
The simple answer is that an object file contains the equivalent machine-level instructions for a translation unit. However, these instructions are not put into the file in random order. Instead, they are grouped under named entries called symbols.
In fact, there are many things in an object file, but symbols are one component that explains how the linker works and how it ties some object files together to produce a larger one. In order to explain symbols, let's talk about them in the context of an example: example 2.3. Using this example, we want to demonstrate how some functions are compiled and placed in the corresponding relocatable object file. Take a look at the following code, which contains two functions:
int average(int a, int b) {
  return (a + b) / 2;
}

int sum(int* numbers, int count) {
  int sum = 0;
  for (int i = 0; i < count; i++) {
    sum += numbers[i];
  }
  return sum;
}
Code Box 2-8 [ExtremeC_examples_chapter2_3.c]: A code with two function definitions
Firstly, we need to compile the preceding code in order to produce the corresponding object file. The following command produces the object file, target.o. We are compiling the code on our default platform:
$ gcc -c ExtremeC_examples_chapter2_3.c -o target.o
$
Shell Box 2-12: Compiling the source file in example 2.3
Next, we use the nm utility to look into the target.o object file. The nm utility allows us to see the symbols that can be found inside an object file:
$ nm target.o
0000000000000000 T average
000000000000001d T sum
$
Shell Box 2-13: Using the nm utility to see the defined symbols in a relocatable object file
The preceding shell box shows the symbols defined in the object file. As you can see, their names are exactly the same as the functions defined in Code Box 2-8.
If you use the readelf utility, like we have done in the following shell box, you can see the symbol table existing in the object file. A symbol table contains all the symbols defined in an object file and it can give you more information about the symbols:
$ readelf -s target.o

Symbol table '.symtab' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS ExtremeC_examples_chapter
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     8: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 average
     9: 000000000000001d    69 FUNC    GLOBAL DEFAULT    1 sum
$
Shell Box 2-14: Using the readelf utility to see the symbol table of a relocatable object file
As you can see in the output of readelf, there are two function symbols in the symbol table. There are also other symbols in the table that refer to different sections within the object file. We will discuss some of these symbols in this chapter and the next chapter.
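As a rough guide to how C definitions map onto symbol table entries, the following sketch annotates a few definitions with the nm type letter they typically receive in an ELF relocatable object file built by GCC without optimization. The identifiers are hypothetical, and the letters are stated as typical rather than guaranteed, since they depend on the compiler, its flags, and the platform. In nm output, an uppercase letter marks a global symbol and a lowercase letter marks a local one.

```c
int exported_counter = 42;        /* typically 'D': global, initialized data    */
int zeroed_counter;               /* typically 'B'; 'C' if built with -fcommon  */
static int private_counter = 7;   /* typically 'd': local, initialized data     */

/* static gives the function internal linkage, so its symbol is local. */
static int private_helper(int x) {            /* typically 't' */
    return x + private_counter;
}

/* Functions have external linkage by default, so this symbol is global. */
int public_api(int x) {                       /* typically 'T' */
    zeroed_counter++;
    return private_helper(x) + exported_counter;
}
```

Compiling such a file with gcc -c and running nm on the result is a quick way to check these letters on your own platform.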
If you want to see the disassembly of the machine-level instructions, under each function symbol, then you can use the objdump tool:
$ objdump -d target.o

target.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <average>:
   0: 55                      push   %rbp
   1: 48 89 e5                mov    %rsp,%rbp
   4: 89 7d fc                mov    %edi,-0x4(%rbp)
   7: 89 75 f8                mov    %esi,-0x8(%rbp)
   a: 8b 55 fc                mov    -0x4(%rbp),%edx
   d: 8b 45 f8                mov    -0x8(%rbp),%eax
  10: 01 d0                   add    %edx,%eax
  12: 89 c2                   mov    %eax,%edx
  14: c1 ea 1f                shr    $0x1f,%edx
  17: 01 d0                   add    %edx,%eax
  19: d1 f8                   sar    %eax
  1b: 5d                      pop    %rbp
  1c: c3                      retq

000000000000001d <sum>:
  1d: 55                      push   %rbp
  1e: 48 89 e5                mov    %rsp,%rbp
  21: 48 89 7d e8             mov    %rdi,-0x18(%rbp)
  25: 89 75 e4                mov    %esi,-0x1c(%rbp)
  28: c7 45 f8 00 00 00 00    movl   $0x0,-0x8(%rbp)
  2f: c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
  36: eb 1d                   jmp    55 <sum+0x38>
  38: 8b 45 fc                mov    -0x4(%rbp),%eax
  3b: 48 98                   cltq
  3d: 48 8d 14 85 00 00 00    lea    0x0(,%rax,4),%rdx
  44: 00
  45: 48 8b 45 e8             mov    -0x18(%rbp),%rax
  49: 48 01 d0                add    %rdx,%rax
  4c: 8b 00                   mov    (%rax),%eax
  4e: 01 45 f8                add    %eax,-0x8(%rbp)
  51: 83 45 fc 01             addl   $0x1,-0x4(%rbp)
  55: 8b 45 fc                mov    -0x4(%rbp),%eax
  58: 3b 45 e4                cmp    -0x1c(%rbp),%eax
  5b: 7c db                   jl     38 <sum+0x1b>
  5d: 8b 45 f8                mov    -0x8(%rbp),%eax
  60: 5d                      pop    %rbp
  61: c3                      retq
$
Shell Box 2-15: Using the objdump utility to see the instructions of the symbols defined in a relocatable object file
Based on what we see, each function symbol corresponds to a function that has been defined in the source code. It also follows that when you link several relocatable object files to produce an executable object file, each relocatable object file contains only a portion of the function symbols needed to build a complete executable program.
Now, going back to the topic of this section, the linker gathers all the symbols from the various relocatable object files before putting them together in a bigger object file to form a complete executable binary. In order to demonstrate this in a real scenario, we need a different example that has some functions distributed in a number of source files. This way, we can show how the linker looks up the symbols in the given relocatable object files, in order to produce an executable file.
Example 2.4 consists of four C files: three source files and one header file. In the header file, we have declared two functions, with each one defined in its own source file. The third source file contains the main function.
The functions in example 2.4 are very simple, and after compilation, each function will consist of only a few machine-level instructions within its corresponding object file. In addition, example 2.4 will not include any of the standard C header files. We have chosen this in order to have a small translation unit for each source file.
The following code box shows the header file:
#ifndef EXTREMEC_EXAMPLES_CHAPTER_2_4_DECLS_H
#define EXTREMEC_EXAMPLES_CHAPTER_2_4_DECLS_H

int add(int, int);
int multiply(int, int);

#endif
Code Box 2-9 [ExtremeC_examples_chapter2_4_decls.h]: The declaration of the functions in example 2.4
Looking at that code, you can see that we used the header guard statements to prevent double inclusion. In addition, two functions with similar signatures are declared. Each of them receives two integers as input and will return another integer as a result.
As we said before, each of these functions is implemented in a separate source file. The first source file looks as follows:
int add(int a, int b) {
  return a + b;
}
Code Box 2-10 [ExtremeC_examples_chapter2_4_add.c]: The definition of the add function
We can clearly see that the source file has not included any other header files. However, it does define a function that follows the exact same signature that we have declared in the header file.
As we can see next, the second source file is similar to the first one. This one contains the definition of the multiply function:
int multiply(int a, int b) {
  return a * b;
}
Code Box 2-11 [ExtremeC_examples_chapter2_4_multiply.c]: The definition of the multiply function
We can now move on to the third source file, which contains the main function:
#include "ExtremeC_examples_chapter2_4_decls.h"

int main(int argc, char** argv) {
  int x = add(4, 5);
  int y = multiply(9, x);
  return 0;
}
Code Box 2-12 [ExtremeC_examples_chapter2_4_main.c]: The main function of example 2.4
The third source file has to include the header file in order to obtain the declarations of both functions. Otherwise, the source file will not be able to use the add and multiply functions, simply because they are not declared, and this may result in a compilation failure.
In addition, the main function does not know anything about the definitions of either add or multiply. Therefore, we need to ask an important question: how does the main function find these definitions when it does not even know about the other source files? Note that the file shown in Code Box 2-12 has only included one header file, and therefore it has no relationship with the other two source files.
The above question can be resolved by bringing the linker into consideration. The linker will gather the required definitions from various object files and put them together, and this way, the code written in the main function can finally use the code written in another function.
Note:
To compile a source file that uses a function, the declaration is enough. However, to actually run your program, the definition should be provided to the linker in order to be put into the final executable file.
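The distinction between declaration and definition can be sketched in a single file. The layout below is an assumption made purely for illustration: in example 2.4 the three parts live in separate files, and compute is a hypothetical stand-in for the body of main.

```c
/* Declarations: all the compiler needs in order to translate the
 * calls below. In example 2.4 these come from the header file. */
int add(int, int);
int multiply(int, int);

/* The caller compiles with the declarations alone. At this point the
 * compiler knows nothing about how add and multiply are implemented. */
int compute(void) {
    int x = add(4, 5);
    return multiply(9, x);
}

/* Definitions: what the linker must eventually find. If these were
 * removed, the file would still compile to an object file, but linking
 * it into an executable would fail with undefined references. */
int add(int a, int b) { return a + b; }
int multiply(int a, int b) { return a * b; }
```

Here compute returns 81, the same value that main in example 2.4 stores in y.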
Now, it's time to compile example 2.4 and demonstrate what we've said so far. Using the following commands, we create corresponding relocatable object files. You need to remember that we only compile source files:
$ gcc -c ExtremeC_examples_chapter2_4_add.c -o add.o
$ gcc -c ExtremeC_examples_chapter2_4_multiply.c -o multiply.o
$ gcc -c ExtremeC_examples_chapter2_4_main.c -o main.o
$
Shell Box 2-16: Compiling all sources in example 2.4 to their corresponding relocatable object files
For the next step, we are going to look at the symbol table contained in each relocatable object file:
$ nm add.o
0000000000000000 T add
$
Shell Box 2-17: Listing the symbols defined in add.o
As you see, the add symbol has been defined. The next object file:
$ nm multiply.o
0000000000000000 T multiply
$
Shell Box 2-18: Listing the symbols defined in multiply.o
The same happens to the multiply symbol within multiply.o. And the final object file:
$ nm main.o
                 U add
                 U _GLOBAL_OFFSET_TABLE_
0000000000000000 T main
                 U multiply
$
Shell Box 2-19: Listing the symbols defined in main.o
Despite the fact that the third source file, Code Box 2-12, has only the main function, we see two symbols for add and multiply in its corresponding object file. However, they are different from the main symbol, which has an address inside the object file. They are marked as U, which nm uses for undefined symbols; in other words, they are unresolved. This means that while the compiler has seen these symbols in the translation unit, it has not been able to find their actual definitions. And this is exactly what we expected and explained before.
The source file containing the main function, Code Box 2-12, should not know anything about the definitions of other functions if they are not defined in the same translation unit, but the fact that the main definition is dependent on the declarations of add and multiply should be somehow pointed out in the corresponding relocatable object file.
To summarize where we are now, we have three intermediate object files, with one of them having two unresolved symbols. This has now made the job of the linker clear; we need to give the linker the necessary symbols that can be found in other object files. After having found all of the required symbols, the linker can continue to combine them in order to create a final executable binary that works.
If the linker is not able to find the definition of an unresolved symbol, it will fail, and inform us by printing a linkage error.
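The resolution step described above can be modelled with a small toy program. This is only a sketch under simplifying assumptions: the toy_object type and both functions are hypothetical, each object file is reduced to two short lists of names, and nothing here reflects how a real linker stores or searches symbols. It also ignores the reference to main made by the platform's startup code.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SYMS 4

/* Toy model of one relocatable object file: the symbols it defines
 * and the symbols it references without defining. Unused slots are
 * NULL, which also terminates each list. */
typedef struct {
    const char *file_name;
    const char *defined[TOY_MAX_SYMS];
    const char *undefined[TOY_MAX_SYMS];
} toy_object;

/* Returns 1 if any of the given object files defines sym. */
static int is_defined(const toy_object *objs, size_t n, const char *sym) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < TOY_MAX_SYMS && objs[i].defined[j]; j++) {
            if (strcmp(objs[i].defined[j], sym) == 0) {
                return 1;
            }
        }
    }
    return 0;
}

/* Counts the undefined references that no input object file satisfies.
 * A nonzero result corresponds to a failed link. */
size_t count_unresolved(const toy_object *objs, size_t n) {
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < TOY_MAX_SYMS && objs[i].undefined[j]; j++) {
            if (!is_defined(objs, n, objs[i].undefined[j])) {
                missing++;
            }
        }
    }
    return missing;
}
```

Feeding this model the three object files of example 2.4 gives zero unresolved references, while leaving out multiply.o leaves exactly one.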
For the next step, we want to link the preceding object files together. The following command will do that:
$ gcc add.o multiply.o main.o
$
Shell Box 2-20: Linking all object files together
We should note here that running gcc with a list of object files, without passing any option, will result in the linking step trying to create an executable object file out of the input object files. Actually, it calls the linker in the background with the given object files, together with some other static libraries and object files that are required on the platform.
To examine what happens if the linker fails to find proper definitions, we are going to provide the linker with only two intermediate object files, main.o and add.o:
$ gcc add.o main.o
main.o: In function 'main':
ExtremeC_examples_chapter2_4_main.c:(.text+0x2c): undefined reference to 'multiply'
collect2: error: ld returned 1 exit status
$
Shell Box 2-21: Linking only two of the object files: add.o and main.o
As you can see, the linker has failed because it could not find the multiply symbol in the provided object files.
Moving on, let's provide the other two object files, main.o and multiply.o:
$ gcc main.o multiply.o
main.o: In function 'main':
ExtremeC_examples_chapter2_4_main.c:(.text+0x1a): undefined reference to 'add'
collect2: error: ld returned 1 exit status
$
Shell Box 2-22: Linking only two of the object files, multiply.o and main.o
As expected, the same thing occurred. This happened since the add symbol could not be found in the provided object files.
Finally, let's provide the only remaining combination of two object files, add.o and multiply.o. Before we run it, we should expect it to work since neither object file has unresolved symbols in its symbol table. Let's see what happens:
$ gcc add.o multiply.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In function '_start':
(.text+0x20): undefined reference to 'main'
collect2: error: ld returned 1 exit status
$
Shell Box 2-23: Linking only two of the object files, add.o and multiply.o
As you see, the linker has failed again! Looking at the output, we can see the reason was that neither of the object files contains the main symbol that is necessary to create an executable. The linker needs an entry point for the program, which is the main function according to the C standard.
At this point, and I cannot emphasize this enough, pay attention to the place where a reference to the main symbol has been made. It has been made in the _start function in a file located at /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o.
The Scrt1.o file seems to be a relocatable object file that has not been created by us. Scrt1.o is actually a file that is part of a group of default C object files. These default object files have been compiled for Linux as a part of the gcc bundle and are linked to any program in order to make it runnable.
As you have just seen, there are a lot of different things that are happening around your source code that can cause conflicts. Not only that, but there are a number of other object files that need to be linked to your program in order to make it executable.
Linker can be fooled!
To make our current discussion even more interesting, there are rare scenarios in which the linking step performs as we planned, but the final binary does not work as expected. In this section, we are going to look at an example of this occurring.
Example 2.5 is based on an incorrect definition having been gathered by the linker and put into the final executable object file.
This example has two source files, one of which contains the definition of a function with the same name, but a different signature from the declaration used by the main function. The following code boxes are the contents of these two source files. Here's the first source file:
int add(int a, int b, int c, int d) {
  return a + b + c + d;
}
Code Box 2-13 [ExtremeC_examples_chapter2_5_add.c]: Definition of the add function in example 2.5
And the following is the second source file:
#include <stdio.h>

int add(int, int);

int main(int argc, char** argv) {
  int x = add(5, 6);
  printf("Result: %d\n", x);
  return 0;
}
Code Box 2-14 [ExtremeC_examples_chapter2_5_main.c]: The main function in example 2.5
As you can see, the main function is using another version of the add function with a different signature, accepting two integers, but the add function defined in the first source file, Code Box 2-13, is accepting four integers.
In a language that supports function overloading, such as C++, these functions would be called overloads of each other. C has no such feature, so something should go wrong when we compile and link these source files. It's interesting to see whether we can build the example successfully.
The next step is to compile the source files and link the relocatable object files, which we can do by running the following commands:
$ gcc -c ExtremeC_examples_chapter2_5_add.c -o add.o
$ gcc -c ExtremeC_examples_chapter2_5_main.c -o main.o
$ gcc add.o main.o -o ex2_5.out
$
Shell Box 2-24: Building example 2.5
As you can see in the shell output, the linking step went well, and the final executable has been produced! This clearly shows that the symbols can fool the linker. Now let's look at the output after running the executable:
$ ./ex2_5.out
Result: -1885535197
$ ./ex2_5.out
Result: 1679625283
$
Shell Box 2-25: Running example 2.5 twice and the strange results!
As you can see, the output is wrong; it even changes in different runs! This example shows that bad things can happen when the linker picks up the wrong version of a symbol. Regarding the function symbols, they are just names and they don't carry any information regarding the signature of the corresponding function. Function arguments are nothing more than a C concept; in fact, they do not truly exist in either assembly code or machine-level instructions.
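Since the linker cannot catch this kind of mismatch, a common defence in C is to let the compiler catch it instead, by keeping the declaration in one header that both the defining file and the calling file include. The single-file sketch below imitates that arrangement. The header name add.h and the function use_add are hypothetical, and the three parts would normally be three separate files.

```c
/* Part 1: would live in a shared header, for example a hypothetical add.h. */
int add(int a, int b);

/* Part 2: the defining source file, which includes the same header.
 * Because the declaration above is visible here, the compiler checks
 * the definition against it. Writing
 *     int add(int a, int b, int c, int d)
 * at this point would be rejected at compile time as a conflicting
 * declaration, long before the linker is involved. */
int add(int a, int b) {
    return a + b;
}

/* Part 3: the calling source file, which also includes the header and
 * therefore calls add with the signature the definition really has. */
int use_add(void) {
    return add(5, 6);
}
```

With this arrangement, the mistake in example 2.5 would surface as a compilation error in the defining file rather than as garbage output at runtime.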
In order to investigate further, we are going to look at the disassembly of the add functions in a different example. In example 2.6, we have two add functions with the same signatures that we had in example 2.5.
To study this, suppose that we have the following source files in example 2.6:
int add(int a, int b, int c, int d) {
  return a + b + c + d;
}
Code Box 2-15 [ExtremeC_examples_chapter2_6_add_1.c]: The first definition of add in example 2.6
The following code is the other source file:
int add(int a, int b) {
  return a + b;
}
Code Box 2-16 [ExtremeC_examples_chapter2_6_add_2.c]: The second definition of add in example 2.6
The first step, just like before, is to compile both source files:
$ gcc -c ExtremeC_examples_chapter2_6_add_1.c -o add_1.o
$ gcc -c ExtremeC_examples_chapter2_6_add_2.c -o add_2.o
$
Shell Box 2-26: Compiling the source files in example 2.6 to their corresponding object files
We then need to have a look at the disassembly of the add symbol in different object files. Therefore, we start with the add_1.o object file:
$ objdump -d add_1.o

add_1.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <add>:
   0: 55          push   %rbp
   1: 48 89 e5    mov    %rsp,%rbp
   4: 89 7d fc    mov    %edi,-0x4(%rbp)
   7: 89 75 f8    mov    %esi,-0x8(%rbp)
   a: 89 55 f4    mov    %edx,-0xc(%rbp)
   d: 89 4d f0    mov    %ecx,-0x10(%rbp)
  10: 8b 55 fc    mov    -0x4(%rbp),%edx
  13: 8b 45 f8    mov    -0x8(%rbp),%eax
  16: 01 c2       add    %eax,%edx
  18: 8b 45 f4    mov    -0xc(%rbp),%eax
  1b: 01 c2       add    %eax,%edx
  1d: 8b 45 f0    mov    -0x10(%rbp),%eax
  20: 01 d0       add    %edx,%eax
  22: 5d          pop    %rbp
  23: c3          retq
$
Shell Box 2-27: Using objdump to look at the disassembly of the add symbol in add_1.o
The following shell box shows us the disassembly of the add symbol found in the other object file, add_2.o:
$ objdump -d add_2.o

add_2.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <add>:
   0: 55          push   %rbp
   1: 48 89 e5    mov    %rsp,%rbp
   4: 89 7d fc    mov    %edi,-0x4(%rbp)
   7: 89 75 f8    mov    %esi,-0x8(%rbp)
   a: 8b 55 fc    mov    -0x4(%rbp),%edx
   d: 8b 45 f8    mov    -0x8(%rbp),%eax
  10: 01 d0       add    %edx,%eax
  12: 5d          pop    %rbp
  13: c3          retq
$
Shell Box 2-28: Using objdump to look at the disassembly of the add symbol in add_2.o
When a function call takes place, a new stack frame is created on top of the stack. This stack frame holds the return address and the storage the function uses for its local values. On the x86-64 Linux platform used here, the first few integer arguments are not passed through the stack at all; the calling convention places them in registers, using %edi, %esi, %edx, and %ecx for the first four. You will read more about the function call mechanism in Chapter 4, Process Memory Structure, and Chapter 5, Stack and Heap.
In shell boxes 2-27 and 2-28, you can clearly see how the arguments are collected from these registers. In the disassembly of add_1.o, Shell Box 2-27, you can see the following lines:
   4: 89 7d fc    mov    %edi,-0x4(%rbp)
   7: 89 75 f8    mov    %esi,-0x8(%rbp)
   a: 89 55 f4    mov    %edx,-0xc(%rbp)
   d: 89 4d f0    mov    %ecx,-0x10(%rbp)
Code Box 2-17: The assembly instructions to copy the arguments from the registers to the stack frame for the first add function
These instructions copy four values from the argument registers into memory locations inside the current stack frame, which are addressed relative to the %rbp register.
Note:
Registers are locations within a CPU that can be accessed quickly. Therefore, it is highly efficient for the CPU to bring the values from main memory into its registers first, and then perform calculations on them. The register %rbp is the one that points to the base of the current stack frame, where this unoptimized code keeps its copies of the arguments passed to a function.
If you look at the disassembly of the second object file, while it is very similar, it differs by performing the copy operation only twice rather than four times:
   4: 89 7d fc    mov    %edi,-0x4(%rbp)
   7: 89 75 f8    mov    %esi,-0x8(%rbp)
Code Box 2-18: The assembly instructions to copy the arguments from the registers to the stack frame for the second add function
These instructions copy two values simply because the function only expects two arguments. This is why we saw those strange values in the output of example 2.5. The main function only sets two argument registers while calling the add function, but the add definition was actually expecting four arguments. So, the wrong definition goes on to read %edx and %ecx as well, which hold whatever earlier code left in them, and this results in the wrong values for the sum operation.
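The effect can be imitated safely with a toy model in which the argument registers are just slots in a struct. Everything below is hypothetical and only mirrors the register names seen in the disassembly; it does not touch real CPU registers, and it does not reproduce the actual undefined behaviour of example 2.5.

```c
/* Toy stand-ins for the first four integer argument registers. */
enum { REG_EDI, REG_ESI, REG_EDX, REG_ECX, REG_COUNT };

typedef struct {
    int value[REG_COUNT];
} toy_registers;

/* What main does in example 2.5: it believes add takes two arguments,
 * so it loads only the first two slots and leaves the rest untouched. */
void load_two_arguments(toy_registers *regs, int a, int b) {
    regs->value[REG_EDI] = a;
    regs->value[REG_ESI] = b;
}

/* What the four-parameter add does: it reads all four slots,
 * including the two that the caller never set. */
int add_reading_four(const toy_registers *regs) {
    return regs->value[REG_EDI] + regs->value[REG_ESI]
         + regs->value[REG_EDX] + regs->value[REG_ECX];
}
```

If the last two slots happen to hold leftover values such as 1000 and -7, loading 5 and 6 and then calling add_reading_four yields 1004 rather than 11. In the real program the leftovers are not under our control, which is consistent with the result changing from run to run.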
We could prevent this by changing the function symbol names based on the input types. This is usually referred to as name mangling and is mostly used in C++ because of its function overloading feature. We discuss this briefly in the last section of the chapter.
C++ name mangling
To highlight how name mangling works in C++, we are going to compile example 2.6 using a C++ compiler. Therefore, we will use the GNU C++ compiler g++ for this purpose.
Once we have done that, we can use readelf to dump the symbol tables for each generated object file. By doing this, we can see how C++ has changed the name of the function symbols based on the types of input parameters.
As we have noted before, the compilation pipelines of C and C++ are very similar. Therefore, we can expect to have relocatable object files as a result of C++ compilation. Let's look at both of the object files produced as part of compiling example 2.6:
$ g++ -c ExtremeC_examples_chapter2_6_add_1.c
$ g++ -c ExtremeC_examples_chapter2_6_add_2.c
$ readelf -s ExtremeC_examples_chapter2_6_add_1.o

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS ExtremeC_examples_chapter
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     8: 0000000000000000    36 FUNC    GLOBAL DEFAULT    1 _Z3addiiii
$ readelf -s ExtremeC_examples_chapter2_6_add_2.o

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS ExtremeC_examples_chapter
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     8: 0000000000000000    20 FUNC    GLOBAL DEFAULT    1 _Z3addii
$
Shell Box 2-29: Using readelf to see the symbol tables of the object files produced by a C++ compiler
As you can see in the output, we have two different symbol names for different overloads of the add function. The overload that accepts four integers has the symbol name _Z3addiiii, and the other overload, which accepts two integers, has the symbol name _Z3addii.
Every i that follows the function name in the symbol refers to one of the integer input parameters, and the 3 before add is the length of the function name.
From that, you can see the symbol names are different, and if you try to use the wrong one, you will get a linking error as a result of the linker not being able to find the definition of a wrong symbol. Name mangling is the technique that enables C++ to support function overloading and it helps to prevent the problems we encountered in the previous section.
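The pattern visible in these two symbol names can be reproduced with a few lines of C. The function below is a hypothetical helper written for this illustration, and it covers only the narrow case seen above: a free function, outside any namespace, whose parameters are all of type int. Real mangling schemes handle far more cases than this sketch does.

```c
#include <stdio.h>
#include <string.h>

/* Writes a mangled-style name for a free function whose parameters are
 * all int: "_Z", the length of the name, the name itself, then one 'i'
 * per parameter ('v' when there are none). Returns 0 on success, or -1
 * if the arguments are invalid or the buffer is too small. */
int mangle_int_function(const char *name, int param_count,
                        char *out, size_t out_size) {
    if (name == NULL || out == NULL || param_count < 0) {
        return -1;
    }
    int written = snprintf(out, out_size, "_Z%zu%s", strlen(name), name);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    size_t pos = (size_t)written;
    /* An empty parameter list is encoded as a single 'v'. */
    int codes = (param_count == 0) ? 1 : param_count;
    char code = (param_count == 0) ? 'v' : 'i';
    for (int i = 0; i < codes; i++) {
        if (pos + 1 >= out_size) {
            return -1;  /* no room for this code plus the terminator */
        }
        out[pos++] = code;
    }
    out[pos] = '\0';
    return 0;
}
```

Calling it with the name add and four parameters produces _Z3addiiii, and with two parameters it produces _Z3addii, matching the symbols shown in Shell Box 2-29.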