Extreme C

You're reading from   Extreme C Taking you to the limit in Concurrency, OOP, and the most advanced capabilities of C

Product type Paperback
Published in Oct 2019
Publisher Packt
ISBN-13 9781789343625
Length 822 pages
Edition 1st Edition
Author (1):
Kamran Amini


Stack

A process can continue working without the Heap segment but not without the Stack segment. This says a lot. The Stack is the main part of the process metabolism, and it cannot continue execution without it. The reason is hiding behind the mechanism driving the function calls. As briefly explained in the previous chapter, calling a function can only be done by using the Stack segment. Without a Stack segment, no function call can be made, and this means no execution at all.

With that said, the Stack segment and its contents are engineered carefully to result in the healthy execution of the process. Therefore, messing with the Stack content can disrupt the execution and halt the process. Allocation from the Stack segment is fast, and it doesn't need any special function call. More than that, the deallocation and all memory management tasks happen automatically. These facts are very tempting and encourage you to overuse the Stack.

You should be careful about this. Using the Stack segment brings its own complications. The stack is not very big, therefore you cannot store large objects in it. In addition, incorrect use of the Stack content can halt the execution and result in a crash. The following piece of code demonstrates this:

#include <string.h>
int main(int argc, char** argv) {
  char str[10];
  strcpy(str, "akjsdhkhqiueryo34928739r27yeiwuyfiusdciuti7twe79ye");
  return 0;
}

Code Box 5-1: A buffer overflow situation. The strcpy function will overwrite the content of the Stack

When running the preceding code, the program will most likely crash. That's because the strcpy is overwriting the content of the Stack, or as it is commonly termed, smashing the stack. As you see in Code Box 5-1, the str array has 10 characters, but the strcpy is writing way more than 10 characters to the str array. As you will see shortly, this effectively writes on the previously pushed variables and stack frames, and the program jumps to a wrong instruction after returning from the main function. And this eventually makes it impossible to continue the execution.

I hope that the preceding example has helped you to appreciate the delicacy of the Stack segment. In the first half of this chapter, we are going to have a deeper look into the Stack and examine it closely. We first start by probing into the Stack.

Probing the Stack

Before knowing more about the Stack, we need to be able to read and, probably, modify it. As stated in the previous chapter, the Stack segment is a private memory that only the owner process has the right to read and modify. If we are going to read the Stack or change it, we need to become part of the process owning the Stack.

This is where a new set of tools comes in: debuggers. A debugger is a program that attaches to another process in order to debug it. One of the usual tasks while debugging a process is to observe and manipulate the various memory segments. Only when debugging a process are we able to read and modify the private memory blocks. The other thing that can be done as part of debugging is to control the order of the execution of the program instructions. We give examples of how to do these tasks using a debugger shortly, as part of this section.

Let's start with an example. In example 5.1, we show how to compile a program and make it ready for debugging. Then, we demonstrate how to use gdb, the GNU debugger, to run the program and read the Stack memory. This example declares a character array allocated on top of the Stack and populates its elements with some characters, as can be seen in the following code box:

#include <stdio.h>
int main(int argc, char** argv) {
  char arr[4];
  arr[0] = 'A';
  arr[1] = 'B';
  arr[2] = 'C';
  arr[3] = 'D';
  return 0;
}

Code Box 5-2 [ExtremeC_examples_chapter5_1.c]: Declaration of an array allocated on top of the Stack

The program is simple and easy to follow, but the things that are happening inside the memory are interesting. First of all, the memory required for the arr array is allocated from the Stack simply because it is not allocated from the Heap segment and we didn't use the malloc function. Remember, the Stack segment is the default place for allocating variables and arrays.

In order to have some memory allocated from the Heap, one should acquire it by calling malloc or other similar functions, such as calloc. Otherwise, the memory is allocated from the Stack, and more precisely, on top of the Stack.

In order to be able to debug a program, the binary must be built for debugging purposes. This means that we have to tell the compiler that we want a binary that contains debug symbols. These symbols will be used to find the code lines that have been executing or those that caused a crash. Let's compile example 5.1 and create an executable object file that contains debugging symbols.

First, we build the example. We're doing our compilation in a Linux environment:

$ gcc -g ExtremeC_examples_chapter5_1.c -o ex5_1_dbg.out
$

Shell Box 5-1: Compiling the example 5.1 with debug option -g

The -g option tells the compiler that the final executable object file must contain the debugging information. The size of the binary is also different when you compile the source with and without the debug option. Next, you can see the difference between the sizes of the two executable object files, the first one built without the -g option and the second one with the -g option:

$ gcc ExtremeC_examples_chapter5_1.c -o ex5_1.out
$ ls -al ex5_1.out
-rwxrwxr-x 1 kamranamini kamranamini 8640 jul 24 13:55 ex5_1.out
$ gcc -g ExtremeC_examples_chapter5_1.c -o ex5_1_dbg.out
$ ls -al ex5_1_dbg.out
-rwxrwxr-x 1 kamranamini kamranamini 9864 jul 24 13:56 ex5_1_dbg.out
$

Shell Box 5-2: The size of the output executable object file with and without the -g option

Now that we have an executable file containing the debug symbols, we can use the debugger to run the program. In this example, we are going to use gdb for debugging example 5.1. Next, you can find the command to start the debugger:

$ gdb ex5_1_dbg.out

Shell Box 5-3: Starting the debugger for the example 5.1

Note:

On Debian-based Linux systems such as Ubuntu, gdb is available as the gdb package and can be installed with the system package manager, for example: sudo apt install gdb. In macOS systems, it can be installed using the brew package manager like this: brew install gdb.

After running the debugger, the output will be something similar to the following shell box:

$ gdb ex5_1_dbg.out
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
...
Reading symbols from ex5_1_dbg.out...done.
(gdb)

Shell Box 5-4: The output of the debugger after getting started

As you may have noticed, I've run the preceding command on a Linux machine. gdb has a command-line interface that allows you to issue debugging commands. Enter the r (or run) command in order to execute the executable object file, specified as an input to the debugger. The following shell box shows how the run command executes the program:

...
Reading symbols from ex5_1_dbg.out...done.
(gdb) run
Starting program: .../extreme_c/5.1/ex5_1_dbg.out
[Inferior 1 (process 9742) exited normally]
(gdb)

Shell Box 5-5: The output of the debugger after issuing the run command

In the preceding shell box, after issuing the run command, gdb has started the process, attached to it, and let the program execute its instructions and exit. It did not interrupt the program because we have not set a breakpoint. A breakpoint is an indicator that tells gdb to pause the program execution and wait for further instructions. You can have as many breakpoints as you want.

Next, we set a breakpoint on the main function using the b (or break) command. After setting the breakpoint, gdb pauses the execution when the program enters the main function. You can see how to set a breakpoint on the main function in the following shell box:

(gdb) break main
Breakpoint 1 at 0x400555: file ExtremeC_examples_chapter5_1.c, line 4.
(gdb)

Shell Box 5-6: Setting a breakpoint on the main function in gdb

Now, we run the program again. This creates a new process, and gdb attaches to it. Next, you can find the result:

(gdb) r
Starting program: .../extreme_c/5.1/ex5_1_dbg.out
Breakpoint 1, main (argc=1, argv=0x7fffffffcbd8) at ExtremeC_examples_chapter5_1.c:3
3       int main(int argc, char** argv) {
(gdb)

Shell Box 5-7: Running the program again after setting the breakpoint

As you can see, the execution has paused at line 3, which is the line where the main function begins. Then, the debugger waits for the next command. Now, we can ask gdb to run the next line of code and pause again. In other words, we run the program step by step and line by line. This way, you have enough time to look around and check the variables and their values inside the memory. In fact, this is the method we are going to use to probe the Stack and the Heap segments.

In the following shell box, you can see how to use the n (or next) command to run the next line of code:

(gdb) n
5         arr[0] = 'A';
(gdb) n
6         arr[1] = 'B';
(gdb) next
7        arr[2] = 'C';
(gdb) next
8        arr[3] = 'D';
(gdb) next
9        return 0;
(gdb)

Shell Box 5-8: Using the n (or next) command to execute upcoming lines of code

Now, if you enter the print arr command in the debugger, it will show the content of the array as a string:

(gdb) print arr
$1 = "ABCD"
(gdb)

Shell Box 5-9: Printing the content of the arr array using gdb

To get back to the topic, we introduced gdb to be able to see inside the Stack memory. Now, we can do it. We have a process that has a Stack segment, and it is paused, and we have a gdb command line to explore its memory. Let's begin and print the memory allocated for the arr array:

(gdb) x/4b arr
0x7fffffffcae0: 0x41    0x42    0x43    0x44
(gdb) x/8b arr
0x7fffffffcae0: 0x41    0x42    0x43    0x44    0xff    0x7f    0x00    0x00
(gdb)

Shell Box 5-10: Printing bytes of memory starting from the arr array

The first command, x/4b, shows 4 bytes starting from the address of arr. Remember that when an array name such as arr is used in an expression, it is converted to a pointer to the first element of the array, so it can be used to move along the memory.

The second command, x/8b, prints 8 bytes after arr. According to the code written for example 5.1, and found in Code Box 5-2, the values A, B, C, and D are stored in the array, arr. You should know that ASCII values are stored in the array, not the real characters. The ASCII value for A is 65 decimal or 0x41 hexadecimal. For B, it is 66 or 0x42. As you can see, the values printed in the gdb output are the values we just stored in the arr array.

What about the other 4 bytes in the second command? They are part of the Stack, and they probably contain data from the recent Stack frame put on top of the Stack while calling the main function.

Note that the Stack segment is filled in an opposite fashion in comparison to other segments.

Other memory regions are filled starting from the smaller addresses and they move forward to bigger addresses, but this is not the case with the Stack segment.

The Stack segment is filled from the bigger addresses and moves backward to the smaller addresses. Some of the reasons behind this design lie in the development history of modern computers, and some in the functionality of the Stack segment, which behaves like a stack data structure.

With all that said, if you read the Stack segment from an address toward the bigger addresses, just like we did in Shell Box 5-10, you are effectively reading the already pushed content of the Stack segment. If you try to change those bytes, you are altering the Stack, and this is not good. We will demonstrate why this is dangerous and how this can be done in future paragraphs.

Why are we able to see more than the size of the arr array? Because gdb goes through the number of bytes in the memory that we have requested. The x command doesn't care about the array's boundary. It just needs a starting address and the number of bytes to print the range.

If you want to change the values inside the Stack, you have to use the set command. This allows you to modify an existing memory cell. In this case, the memory cell refers to an individual byte in the arr array:

(gdb) x/4b arr
0x7fffffffcae0: 0x41    0x42    0x43    0x44
(gdb) set arr[1] = 'F'
(gdb) x/4b arr
0x7fffffffcae0: 0x41    0x46    0x43    0x44
(gdb) print arr
$2 = "AFCD"
(gdb)

Shell Box 5-11: Changing an individual byte in the array using the set command

As you can see, using the set command, we have set the second element of the arr array to F. If you are going to change an address that is not in the boundaries of your arrays, it is still possible through gdb.

Please observe the following modification carefully. Now, we want to modify a byte located in a far bigger address than arr, and as we explained before, we will be altering the already pushed content of the Stack. Remember, the Stack memory is filled in an opposite manner compared to other segments:

(gdb) x/20x arr
0x7fffffffcae0: 0x41    0x42    0x43    0x44    0xff    0x7f    0x00    0x00
0x7fffffffcae8: 0x00    0x96    0xea    0x5d    0xf0    0x31    0xea    0x73
0x7fffffffcaf0: 0x90    0x05    0x40    0x00
(gdb) set *(0x7fffffffcaed) = 0xff
(gdb) x/20x arr
0x7fffffffcae0: 0x41    0x42    0x43    0x44    0xff    0x7f    0x00    0x00
0x7fffffffcae8: 0x00    0x96    0xea    0x5d    0xf0    0xff    0x00    0x00
0x7fffffffcaf0: 0x00    0x05    0x40    0x00
(gdb)

Shell Box 5-12: Changing an individual byte outside of the array's boundary

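That is all. We just wrote the value 0xff at the 0x7fffffffcaed address, which is outside the boundary of the arr array and probably within the stack frame pushed before entering the main function. Note that gdb treats the dereferenced address as an int, so the assignment wrote 4 bytes rather than 1 (0xff followed by three zero bytes on this little-endian machine), which is why the bytes up to 0x7fffffffcaf0 have changed as well.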

What will happen if we continue the execution? If we have modified a critical byte in the Stack, we expect to see a crash or at least have this modification detected by some mechanism and have the execution of the program halted. The command c (or continue) will continue the execution of the process in gdb, as you can see next:

(gdb) c
Continuing.
*** stack smashing detected ***: .../extreme_c/5.1/ex5_1_dbg.out terminated
Program received signal SIGABRT, Aborted.
0x00007ffff7a42428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb)

Shell Box 5-13: Having a critical byte changed in the Stack terminates the process

As you can see in the preceding shell box, we've just smashed the Stack! Modifying the content of the Stack in addresses that are not allocated by you, even by 1 byte, can be very dangerous and it usually leads to a crash or a sudden termination.

As we have said before, most of the vital procedures regarding the execution of a program are done within the Stack memory. So, you should be very careful when writing to Stack variables. You should not write any values outside of the boundaries defined for variables and arrays simply because the addresses grow backward in the Stack memory, which makes it likely to overwrite the already written bytes.

When you're done with debugging and you're ready to leave gdb, you can simply use the command q (or quit). Now, you should be out of the debugger and back in the terminal.

As another note, writing unchecked values into a buffer (another name for a byte or character array) allocated on top of the Stack (not from the Heap) is considered a vulnerability. An attacker can carefully design a byte array and feed it to the program in order to take control of it. This is usually called a buffer overflow attack, and the crafted input is called an exploit.

The following program shows this vulnerability:

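#include <stdio.h>
#include <string.h>
int main(int argc, char** argv) {
  char str[10];
  // Vulnerable on purpose: neither argc nor the length of argv[1] is checked
  strcpy(str, argv[1]);
  printf("Hello %s!\n", str);
  return 0;
}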

Code Box 5-3: A program showing the buffer overflow vulnerability

The preceding code does not check the argv[1] input for its content and its size and copies it directly into the str array, which is allocated on top of the Stack.

If you're lucky, this can lead to a crash, but in some rare but dangerous cases, this can lead to an exploit attack.

Points on using the Stack memory

Now that you have a better understanding of the Stack segment and how it works, we can talk about the best practices and the points you should be careful about. You should be familiar with the scope concept. Each Stack variable has its own scope, and the scope determines the lifetime of the variable. This means that a Stack variable starts its lifetime in one scope and dies when that scope is gone.

We also have automatic memory allocation and deallocation for Stack variables, and it is only applicable to the Stack variables. This feature, automatic memory management, comes from the nature of the Stack segment.

Whenever you declare a Stack variable, it will be allocated on top of the Stack segment. Allocation happens automatically, and this can be marked as the start of its lifetime. After this point, many more variables and other stack frames are put on top of it inside the Stack. As long as the variable exists in the Stack and there are other variables on top of it, it survives and continues living.

Eventually, however, this stuff will get popped out of the Stack because at some point in the future the program has to be finished, and the stack should be empty at that moment. So, there should be a point in the future when this variable is popped out of the stack. So, the deallocation, or getting popped out, happens automatically, and that can be marked as the end of the variable's lifetime. This is basically the reason why we say that we have automatic memory management for the Stack variables that is not controlled by the programmer.

Suppose that you have defined a variable in the main function, as we see in the following code box:

int main(int argc, char** argv) {
  int a;
  ...
  return 0;
}

Code Box 5-4: Declaring a variable on top of the Stack

This variable will stay in the Stack until the main function returns. In other words, the variable exists as long as its scope (the main function) is valid. Since the main function is the function in which the whole program runs, the lifetime of the variable is almost like that of a global variable, which exists throughout the runtime of the program.

It is like a global variable, but not exactly one, because there will be a time that the variable is popped out from the Stack, whereas a global variable always has its memory even when the main function is finished and the program is being finalized. Note that there are two pieces of code that are run before and after the main function, bootstrapping and finalizing the program respectively. As another note, global variables are allocated from a different segment, Data or BSS, that does not behave like the Stack segment.

Let's now look at an example of a very common mistake. It usually happens to an amateur programmer while writing their first C programs. It is about returning an address to a local variable inside a function.

The following code box shows example 5.2:

int* get_integer() {
  int var = 10;
  return &var;
}
int main(int argc, char** argv) {
  int* ptr = get_integer();
  *ptr = 5;
  return 0;
}

Code Box 5-5 [ExtremeC_examples_chapter5_2.c]: Returning the address of a variable declared on top of the Stack

The get_integer function returns the address of the local variable, var, which has been declared in the scope of the get_integer function. Then, the main function tries to dereference the received pointer and access the memory region behind it. The following is the output of the gcc compiler while compiling the preceding code on a Linux system:

$ gcc ExtremeC_examples_chapter5_2.c -o ex5_2.out
ExtremeC_examples_chapter5_2.c: In function 'get_integer':
ExtremeC_examples_chapter5_2.c:3:11: warning: function returns address of local variable [-Wreturn-local-addr]
   return &var;
          ^~~~
$

Shell Box 5-14: Compiling the example 5.2 in Linux

As you can see, we have received a warning message. Since returning the address of a local variable is a common mistake, compilers already know about it, and they show a clear warning message like warning: function returns address of local variable.

And this is what happens when we execute the program:

$ ./ex5_2.out
Segmentation fault (core dumped)
$

Shell Box 5-15: Executing the example 5.2 in Linux

As you can see in Shell Box 5-15, a segmentation fault has happened. It can be translated as a crash. It is usually because of invalid access to a region of memory that had been allocated at some point, but now it is deallocated.

Note:

Some warnings should be treated as errors. For example, the preceding warning should be an error because it usually leads to a crash. If you want all warnings to be treated as errors, it is enough to pass the -Werror option to the gcc compiler. If you want to treat only one specific warning as an error, for example, the preceding warning, it is enough to pass the -Werror=return-local-addr option.

If you run the program with gdb, you will see more details regarding the crash. But remember, you need to compile the program with the -g option; otherwise, gdb won't be that helpful.

You should always compile the sources with the -g option if you are about to debug the program using gdb or other debugging tools such as valgrind. The following shell box demonstrates how to compile and run example 5.2 in the debugger:

$ gcc -g ExtremeC_examples_chapter5_2.c -o ex5_2_dbg.out
ExtremeC_examples_chapter5_2.c: In function 'get_integer':
ExtremeC_examples_chapter5_2.c:3:11: warning: function returns address of local variable [-Wreturn-local-addr]
   return &var;
          ^~~~
$ gdb ex5_2_dbg.out
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
...
Reading symbols from ex5_2_dbg.out...done.
(gdb) run
Starting program: .../extreme_c/5.2/ex5_2_dbg.out
Program received signal SIGSEGV, Segmentation fault.
0x00005555555546c4 in main (argc=1, argv=0x7fffffffdf88) at ExtremeC_examples_chapter5_2.c:8
8    *ptr = 5;
(gdb) quit
$

Shell Box 5-16: Running the example 5.2 in the debugger

As is clear from the gdb output, the source of the crash is located at line 8 in the main function, exactly where the program tries to write to the returned address by dereferencing the returned pointer. But the var variable was a local variable of the get_integer function and it doesn't exist anymore, simply because at line 8 we have already returned from the get_integer function and its scope, together with all its variables, has vanished. Therefore, the returned pointer is a dangling pointer.

It is usually a common practice to pass the pointers addressing the variables in the current scope to other functions but not the other way around, because as long as the current scope is valid, the variables are there. Further function calls only put more stuff on top of the Stack segment, and the current scope won't be finished before them.

Note that the above practice is not safe in concurrent programs because, if another concurrent task later uses the received pointer addressing a variable inside the current scope, the current scope might have vanished already.

To end this section and have a conclusion about the Stack segment, the following points can be extracted from what we have explained so far:

  • Stack memory has a limited size; therefore, it is not a good place to store big objects.
  • The addresses in the Stack segment grow backward; therefore, reading forward in the Stack memory means reading already pushed bytes.
  • The Stack has automatic memory management, both for allocation and deallocation.
  • Every Stack variable has a scope and it determines its lifetime. You should design your logic based on this lifetime. You have no control over it.
  • Pointers should only point to those Stack variables that are still in scope.
  • Memory deallocation of Stack variables is done automatically when the scope is about to finish, and you have no control over it.
  • Pointers to variables that exist in the current scope can be passed to other functions as arguments only when we are sure that the current scope will be still in place when the code in the called functions is about to use that pointer. This condition might break in situations when we have concurrent logic.

In the next section, we will talk about the Heap segment and its various features.
