Fundamentals of memory management and allocation
Memory management is one of the more challenging aspects of application development. One of the main challenges is knowing when to retain or discard data. While the concept sounds simple, the fact that it is an entire field of study speaks volumes. Ideally, programmers wouldn’t need to worry about the underlying details, but knowing the different techniques and how they can be used to maximize efficiency is essential.
Contiguous allocation is the oldest and most straightforward allocation method. When a process is about to execute and requests memory, the required memory is compared to the available memory. If sufficient contiguous memory can be found, then allocation occurs, and the process can execute successfully. If enough contiguous memory blocks cannot be found, then the process will remain in limbo until sufficient memory becomes available.
Figure 1.4 shows how memory blocks are aligned and how the assignment is attempted in contiguous allocation. Conceptually, memory blocks are sequentially laid out, and when allocation is required, it is best to place the data being allocated in blocks beside each other or contiguously. This makes read/write operations in applications that rely on the allocated data more efficient.
Figure 1.4 – How contiguous allocation works
The preference for contiguously allocated blocks is evident when we consider that contiguous memory blocks are easier to read and manipulate than non-contiguous blocks. One drawback, however, is that memory might not be used effectively, since the entire request must fit into a single run of adjacent blocks or the allocation will fail. As a result, smaller free blocks scattered through memory can go unused even when together they would be large enough.
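The search for a contiguous run can be sketched as a simple first-fit scan. This is a toy model, not how any particular OS implements it: memory is assumed to be a list of equal-sized blocks, `True` marks a block in use, and the names `find_contiguous` and `allocate_contiguous` are hypothetical.

```python
def find_contiguous(blocks, size):
    """Return the start index of the first run of `size` free blocks, or None.

    `blocks` is a list of booleans: True means the block is in use.
    """
    run_start, run_length = 0, 0
    for index, in_use in enumerate(blocks):
        if in_use:
            run_length = 0          # run broken; start counting again
            continue
        if run_length == 0:
            run_start = index       # first free block of a new run
        run_length += 1
        if run_length == size:
            return run_start
    return None


def allocate_contiguous(blocks, size):
    """Mark `size` adjacent free blocks as used; return their start or None."""
    start = find_contiguous(blocks, size)
    if start is None:
        return None                 # the request has to wait
    for index in range(start, start + size):
        blocks[index] = True
    return start


memory = [True, False, False, True, False, False, False, True]
print(allocate_contiguous(memory, 3))   # 4
print(allocate_contiguous(memory, 3))   # None: only a run of 2 remains
```

The second request fails even though two blocks are still free, which is the drawback described above.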
As developers, we can use the following tips as guidelines to ensure that contiguous allocation occurs in our applications:
- Static allocation – We can use variables and data structures whose size is fixed and known before the application runs, so the memory they need can be set aside in one piece. For instance, arrays are allocated contiguously in memory.
- Dynamic allocation – We can manually manage memory blocks of fixed sizes. Some languages, such as C and C++, allow you to allocate memory on the fly using functions such as malloc() and calloc(). Similarly, you can limit fragmentation by deallocating memory when it is no longer in use. This ensures that memory is being used and freed as efficiently as possible.
- Memory pooling – You can reserve a fixed memory space when the application starts. This fixed space in memory will be used exclusively by the application for any resource requirements during the runtime of the application. The allocation and deallocation of memory blocks will also be handled manually, as seen previously.
These techniques can help developers write applications that ensure contiguous memory allocation as and when necessary for certain systems and performance-critical applications.
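As an illustration of memory pooling, here is a minimal sketch that reserves one buffer up front and hands it out in fixed-size blocks. The `BlockPool` class and its method names are hypothetical, and a Python `bytearray` stands in for the reserved memory region.

```python
class BlockPool:
    """A toy pool: one buffer reserved up front, handed out in fixed-size blocks."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self._buffer = bytearray(block_size * block_count)  # reserved once
        self._free = list(range(block_count - 1, -1, -1))   # free block indices

    def acquire(self):
        """Return the index of a free block, or raise if the pool is exhausted."""
        if not self._free:
            raise MemoryError("pool exhausted")
        return self._free.pop()

    def view(self, index):
        """Return a writable window onto one block of the shared buffer."""
        start = index * self.block_size
        return memoryview(self._buffer)[start:start + self.block_size]

    def release(self, index):
        """Zero the block and return it to the pool for reuse."""
        if index in self._free:
            raise ValueError(f"block {index} is already free")
        self.view(index)[:] = bytes(self.block_size)
        self._free.append(index)


pool = BlockPool(block_size=16, block_count=4)
block = pool.acquire()
pool.view(block)[:5] = b"hello"
pool.release(block)          # the same block is available again
```

Because every block has the same size and lives in one buffer, releasing and reacquiring blocks never fragments the pool.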
With contiguous memory, we have the options of stack and heap allocation. Stack allocation is laid out at compile time, since the size of each allocation is known in advance, while heap allocation is requested at runtime. Stack allocation is the more natural match for contiguous allocation because memory is handed out in predetermined blocks. Heap allocation is a bit more difficult since the system must find enough free memory at runtime, which might not be possible in a single run of blocks. For this reason, heap allocation suits non-contiguous allocation.
Non-contiguous allocation, in contrast to contiguous allocation, allows memory to be allocated across several memory blocks that might not be beside each other. This means that if two blocks are needed for allocation and are not beside each other, then allocation will still be successful.
Figure 1.5 displays memory blocks to be assigned to a process, but the available slots are at opposite ends of the contiguous block. In this model, the process will still receive its allocation request, and the memory blocks will be used efficiently as the empty spaces are used as needed.
Figure 1.5 – How non-contiguous allocation works, where empty memory blocks are used even when they are separated
This method, of course, comes at the expense of optimal read/write performance, but it does help an application move forward with its processes since it might not need to wait too long before memory can be found to fulfill its requests. This also leads to a common problem called fragmentation, which we will review later.
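A minimal sketch of the non-contiguous case follows, using the same toy model of memory as a list of equal-sized blocks where `True` means in use. The function name `allocate_scattered` is hypothetical.

```python
def allocate_scattered(blocks, size):
    """Claim any `size` free blocks and return their indices, or None."""
    free = [index for index, in_use in enumerate(blocks) if not in_use]
    if len(free) < size:
        return None                 # not enough free memory anywhere
    chosen = free[:size]
    for index in chosen:
        blocks[index] = True
    return chosen


# The only free blocks sit at opposite ends of memory.
memory = [False, True, True, True, True, False]
print(allocate_scattered(memory, 2))   # [0, 5]
```

The request succeeds, but the process now has to track and visit two separate locations, which is where the read/write cost comes from.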
Even with these techniques and recommendations, there are many scenarios where a poor implementation of memory management can affect the robustness and speed of programs. Typical problems include the following:
- Premature frees: When a program gives up memory but attempts to access it later, causing a crash or unexpected behavior.
- Dangling pointers: When a program keeps a reference to a memory block after that block has been freed, so the reference no longer points to valid data.
- Memory leak: When a program continually allocates and never releases memory. This will lead to memory exhaustion on the device.
- Fragmentation: Fragmentation is when free memory gets split into many small pieces. Programs operate best when memory is allocated linearly. When memory is allocated and freed in too many small blocks, the free space becomes poorly distributed. Eventually, despite having enough spare memory in total, the memory manager can no longer give out big enough blocks.
- Poor locality of reference: Programs operate best when successive memory accesses are nearer to each other. Like the fragmentation problem, if the memory manager places the blocks a program will use far apart, this will cause performance problems.
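The first three problems can be illustrated with a toy, handle-based heap. This is purely a teaching sketch: the `ToyHeap` class is hypothetical, and a real unmanaged program would typically crash or corrupt data rather than raise a tidy error.

```python
class ToyHeap:
    """Hypothetical handle-based heap used only to illustrate the failure modes."""

    def __init__(self):
        self._live = {}      # handle -> stored value
        self._next = 0

    def allocate(self, value):
        """Store a value and return the handle that refers to it."""
        handle = self._next
        self._next += 1
        self._live[handle] = value
        return handle

    def free(self, handle):
        """Give the memory back; the handle itself may still be held by callers."""
        del self._live[handle]

    def read(self, handle):
        """Read through a handle, failing loudly if it is dangling."""
        if handle not in self._live:
            raise RuntimeError(f"use after free: handle {handle}")
        return self._live[handle]

    def live_count(self):
        """Number of allocations that have not been freed."""
        return len(self._live)


heap = ToyHeap()
config = heap.allocate("config")
heap.free(config)            # premature free: `config` is now a dangling handle
# heap.read(config) would raise RuntimeError here

for _ in range(1000):
    heap.allocate("temp")    # allocated and never freed: a leak
print(heap.live_count())     # 1000
```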
As we have seen, memory must be handled delicately and has limitations we must be aware of. One of the most significant limitations is the amount of space available to an application. In the next section, we review how memory space is measured.
Units of memory storage
It is essential to know the different units of measurement and overall sizes that specific keywords represent in memory management. This will give us a good foundation for discussing memory and memory usage.
- Bit: The smallest unit of information. A bit can have one of two possible numerical values (1 and 0), representing logical values (true and false). Multiple bits combine to form a binary number.
- Binary number: A numerical (usually integer) value formed from a sequence of bits, or ones and zeros. Each bit position in the sequence represents a power of 2, and each 1 contributes that power to the total. To convert a binary number to decimal, multiply each digit by the power of 2 for its position and add the results. The rightmost digit gets the lowest power (2^0 = 1), and the power increases by one for each position to the left.
For example, the binary number 1101 represents 1 * 8 + 1 * 4 + 0 * 2 + 1 * 1 to give a total of 13. Figure 1.6 shows a simple table with the binary positions relative to the power of 2.
| Power | 2^7 | 2^6 | 2^5 | 2^4 | 2^3 | 2^2 | 2^1 | 2^0 |
|---|---|---|---|---|---|---|---|---|
| Base 10 Values | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
| 13 | | | | | 1 | 1 | 0 | 1 |
| 37 | | | 1 | 0 | 0 | 1 | 0 | 1 |
| 132 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
Figure 1.6 – A simple binary table with example values
- Binary code: Binary sequences representing alphanumerical and special characters. Each bit sequence is assigned to specific data. The most popular code is ASCII code, which uses 7-bit binary code to represent text, numbers, and other characters.
- Byte: A byte is a sequence of 8 bits that can encode a single character using a specified binary code. Since bit and byte both begin with the letter b, an uppercase B denotes bytes and a lowercase b denotes bits. The byte also serves as the base unit of measurement, where increments are usually in the thousands (kilo = 1,000, mega = 1,000,000, and so on).
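The binary-to-decimal conversion described above can be written out as a short function. The name `binary_to_decimal` is hypothetical; Python's built-in `int(bits, 2)` does the same job and is used here only as a cross-check.

```python
def binary_to_decimal(bits):
    """Convert a string of 0s and 1s to its decimal value."""
    total = 0
    # Walk from the rightmost digit, which carries the lowest power (2^0).
    for position, digit in enumerate(reversed(bits)):
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("1101"))       # 13  = 8 + 4 + 0 + 1
print(binary_to_decimal("100101"))     # 37  = 32 + 4 + 1
print(binary_to_decimal("10000100"))   # 132 = 128 + 4
print(int("1101", 2))                  # 13, built-in cross-check
print(format(ord("A"), "07b"))         # 1000001, the 7-bit ASCII code for "A"
```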
Now that we understand memory management and its importance, let’s look closely at memory and how it works to ensure that our applications run smoothly.
The fundamentals of how memory works
When considering how applications work and how memory is used and allocated, it is good to have at least a high-level understanding of how computers see memory, the states that memory can exist in, and how algorithms decide how to allocate it.
For starters, each process has its own virtual address space but will share the same physical memory. When developing applications, you will work only with the virtual address space, and the garbage collector allocates and frees virtual memory for you on the managed heap. At the OS level, you can use native functions to interact with the virtual address space to allocate and free virtual memory for you on native heaps.
Virtual memory can be in one of three states:
- Reserved: The memory block is set aside for your use and can’t be used for any other allocation request, but you can’t store data in it until it’s committed
- Free: The memory block has no references and is available for allocation
- Committed: The block of memory is assigned to physical storage
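The three states can be modeled as a small state machine. This is a simplified sketch under stated assumptions: the `State` and `Block` names are hypothetical, and the model only allows the free, reserved, committed sequence plus a release back to free, which is narrower than what a real OS permits.

```python
from enum import Enum


class State(Enum):
    FREE = "free"
    RESERVED = "reserved"
    COMMITTED = "committed"


class Block:
    """A toy memory block that enforces the free -> reserved -> committed order."""

    def __init__(self):
        self.state = State.FREE
        self.data = None

    def reserve(self):
        if self.state is not State.FREE:
            raise RuntimeError("only a free block can be reserved")
        self.state = State.RESERVED

    def commit(self):
        if self.state is not State.RESERVED:
            raise RuntimeError("only a reserved block can be committed")
        self.state = State.COMMITTED

    def write(self, data):
        # Data can be stored only once the block is backed by physical storage.
        if self.state is not State.COMMITTED:
            raise RuntimeError("cannot store data until the block is committed")
        self.data = data

    def release(self):
        self.state = State.FREE
        self.data = None


block = Block()
block.reserve()
block.commit()
block.write(b"payload")
print(block.state.value)   # committed
```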
Memory can become fragmented as memory gets allocated and more processes are spun up. This means that, as mentioned earlier, the memory is split across several memory blocks that are not contiguous. This leads to holes in the address space. The more fragmented memory becomes, the more difficult it becomes for the virtual memory manager to find a single free block large enough to satisfy the allocation request. Even if you need a space of a specific size and have that amount of space available cumulatively, the allocation attempt might fail if it cannot happen over a single address block. Generally, you will run into a memory exception (like an OutOfMemoryException in C#) if there isn’t enough virtual address space to reserve or physical space to commit. See Figure 1.5 for a visual example of how fragmentation might look. The process that has been allocated memory has to check in two non-contiguous slots for relevant information. There is a free space in memory during the process runtime, but it cannot be used until another process requests it. This is an example of fragmented memory.
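The gap between cumulative free space and the largest single free block can be shown with a few lines of code. This is a toy model in which memory is a list of equal-sized blocks and `True` means in use; `free_stats` and `can_allocate` are hypothetical names.

```python
def free_stats(blocks):
    """Return (total free blocks, longest run of adjacent free blocks)."""
    total = longest = current = 0
    for in_use in blocks:
        if in_use:
            current = 0                 # a used block ends the current run
        else:
            total += 1
            current += 1
            longest = max(longest, current)
    return total, longest


def can_allocate(blocks, size):
    """A single-block request succeeds only if one free run is long enough."""
    _, longest = free_stats(blocks)
    return size <= longest


# Every other block is in use, so the free space is full of holes.
memory = [False, True, False, True, False, True, False, True]
print(free_stats(memory))        # (4, 1)
print(can_allocate(memory, 2))   # False, even though 4 blocks are free
```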
We need to be careful when allocating memory in terms of ordering the blocks to be allocated relative to each new object or process. This can be a tedious task, but thankfully, the .NET runtime provides mechanisms to handle this for us. Let’s review how .NET handles memory allocation for us.
Automatic memory allocation in .NET
Each OS boasts unique and contextually efficient memory allocation techniques. The OS ultimately governs how memory and other resources are allocated to each new process to ensure efficient resource utilization relative to the hardware and available resources.
When writing applications using the .NET runtime, we rely on its ability to allocate resources automatically. Because .NET allows you to write in several languages (C#, F#, Visual Basic, etc.), it provides a common language runtime (CLR). Each language compiler produces a common intermediate language, and the CLR runs the result as managed code in a managed execution environment.
With managed code, we benefit from cross-language integration and enhanced security, versioning, and deployment support. We can, for example, write a class in one language and then use a different language to derive a class from the original class. We can also pass objects of that class between the languages. This is possible because the runtime defines rules for creating, using, persisting, and binding to different reference types.
The CLR gives us the benefit of automatic memory management during managed execution. As a developer, you do not need to write code to perform memory management tasks. This eliminates most of the negative allocation scenarios that we previously explored. The runtime reserves a contiguous region of address space for each new process that it initializes. This reserved address space is the managed heap. The managed heap maintains a pointer to the address where the next object will be allocated, and this pointer is initially set to the base address of the managed heap.
All reference types, as defined by the CLR, are allocated on the managed heap. When the application creates its first reference type instance, memory is allocated at the managed heap’s base address. For every object created after that, memory is allocated in the contiguous memory space immediately following the previously allocated object. This allocation method continues with each new object, as long as address space is available.
This process is faster than unmanaged memory allocation because the runtime allocates memory for an object simply by adding a value to a pointer, which makes it almost as fast as allocating memory from the stack. Because new objects allocated consecutively are stored contiguously in the managed heap, an application can access the objects quickly.
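The pointer-based allocation described above is often called bump allocation, and its core idea fits in a few lines. This is a conceptual sketch only, not the CLR's actual implementation: the `BumpHeap` class, the addresses, and the object sizes are all made up for illustration.

```python
class BumpHeap:
    """Toy heap that allocates by advancing a single next-object pointer."""

    def __init__(self, base_address, size):
        self.base_address = base_address
        self.limit = base_address + size
        self.next_address = base_address     # starts at the heap's base address

    def allocate(self, size):
        """Return the address for a new object and bump the pointer past it."""
        if self.next_address + size > self.limit:
            # A real runtime would try to reclaim memory at this point.
            raise MemoryError("no address space left in the reserved region")
        address = self.next_address
        self.next_address += size            # the whole cost of an allocation
        return address


heap = BumpHeap(base_address=0x1000, size=64)
print(hex(heap.allocate(16)))    # 0x1000, the first object sits at the base
print(hex(heap.allocate(24)))    # 0x1010, placed right after the first
print(hex(heap.next_address))    # 0x1028, where the next object would go
```

Each allocation is a bounds check and an addition, which is why consecutive objects end up next to each other in memory.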
The allocated memory space must continuously be reclaimed to ensure an effective and efficient operation. You can rely on a built-in mechanism called a garbage collector to orchestrate this process, and we will discuss this at a high level next.