The role of the garbage collector
Garbage collection is the process that governs how programs release memory space that is no longer being used for their operations. This process serves as an automatic memory manager by managing the allocation and release of memory for an application.
Programming languages that support automatic garbage collection free developers from writing code dedicated to memory management. They let us build applications without guarding against common problems such as memory leaks caused by forgetting to free an object, or attempts to access the memory of an object that has already been freed.
Each language handles garbage collection differently, and it is crucial to appreciate how it works in your context. As mentioned, the CLR in .NET implements it automatically, but additional libraries may be required in low-level programming languages such as C. For instance, C developers must handle allocation and deallocation themselves using the malloc() and free() functions. In contrast, a C# developer is not advised to manage memory by hand, since the runtime already takes care of it.
Recall that in C#, allocation happens through a managed heap, and objects are placed in contiguous spaces in memory. In contrast, in C, objects are placed wherever there is free memory, and the free locations are tracked through a linked list. Memory allocation is generally faster in CLR-supported languages because it is done linearly: the runtime places each new object directly after the previous one. In C, the allocator typically has to traverse its list of free blocks to find a slot that is large enough, which adds time to the allocation process. We will review the details of the allocation process of the CLR in the next chapter.
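The difference between the two allocation strategies can be sketched with a toy model. The BumpAllocator and FreeListAllocator classes below are hypothetical names invented for this illustration; they work with plain integer offsets and do not reflect how the CLR or any C runtime is actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: offsets stand in for memory addresses.
public sealed class BumpAllocator
{
    private readonly int _capacity;
    private int _next; // offset of the first free byte

    public BumpAllocator(int capacity) => _capacity = capacity;

    // Linear allocation: one bounds check and one addition.
    public int Allocate(int size)
    {
        if (size <= 0 || _next + size > _capacity) return -1; // invalid size or no room left
        int address = _next;
        _next += size;
        return address;
    }
}

public sealed class FreeListAllocator
{
    // Each entry describes one free block: where it starts and how long it is.
    private readonly LinkedList<(int Start, int Length)> _free =
        new LinkedList<(int Start, int Length)>();

    public FreeListAllocator(IEnumerable<(int Start, int Length)> freeBlocks)
    {
        foreach (var block in freeBlocks) _free.AddLast(block);
    }

    // Number of free blocks inspected by the most recent Allocate call.
    public int StepsTaken { get; private set; }

    // First-fit search: walk the list until a block is large enough.
    public int Allocate(int size)
    {
        StepsTaken = 0;
        for (var node = _free.First; node != null; node = node.Next)
        {
            StepsTaken++;
            var (start, length) = node.Value;
            if (length < size) continue;              // too small, keep searching
            if (length == size) _free.Remove(node);   // block is used up entirely
            else node.Value = (start + size, length - size); // shrink the block
            return start;
        }
        return -1; // no block is large enough
    }
}
```

With free blocks of 8, 8, and 64 units, a request for 32 units makes the free-list allocator inspect three blocks before it succeeds, while the bump allocator always answers after a single addition.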
Here are some additional benefits of the garbage collector:
- Allocates objects on the managed heap efficiently
- Reclaims memory from objects no longer being used so that memory is available for future allocations
- Provides memory safety by ensuring an object can’t claim memory allocated for another object
The garbage collector has an optimizing engine that determines the best time to perform a collection based on the allocations being made. To decide which objects are no longer in use, it examines the application's roots: static fields, local variables on a thread's stack, CPU registers, GC handles, and the finalize queue. Each root either refers to an object on the managed heap or is null. The garbage collector can ask the rest of the runtime for these roots and uses the list to build a graph of all the objects reachable from them. Any object that is not in the graph is unreachable and is classified as garbage, and the memory that it is using is released.
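The idea of building a graph from the roots can be illustrated with a small simulation. The Node and ReachabilitySketch types below are hypothetical and exist only for this sketch; the real collector walks actual object references inside the runtime rather than a list like this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an object on the managed heap.
public sealed class Node
{
    public string Name { get; }
    public List<Node> References { get; } = new List<Node>();
    public Node(string name) => Name = name;
}

public static class ReachabilitySketch
{
    // Returns every node reachable from the given roots.
    // Anything not in the returned set would be treated as garbage.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>();

        foreach (var root in roots)
        {
            if (root != null) pending.Push(root); // a root may hold null
        }

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            if (!reachable.Add(current)) continue; // already visited (handles cycles)
            foreach (var child in current.References) pending.Push(child);
        }

        return reachable;
    }
}
```

Note that two objects referring only to each other are still garbage if no root leads to them, which is why the traversal starts from the roots rather than counting references.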
Garbage collection happens in one of these situations:
- The operating system or host has signaled that the system is low on memory.
- Memory being used by the allocated objects on the managed heap exceeds an acceptable threshold.
- The developer called the GC.Collect() method, which forces a collection event. This is not generally required since the GC operates automatically.
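The last two triggers can be observed from code with GC.CollectionCount(), which reports how many times a given generation has been collected so far. The CollectionTriggerSketch class and its method names are made up for this sketch, and the number of automatic collections it reports depends on the runtime, its configuration, and the machine, so it may well be zero.

```csharp
using System;

public static class CollectionTriggerSketch
{
    // Forces a collection and returns how many generation 0 collections
    // were counted while doing so (at least one is expected).
    public static int ForceAndCount()
    {
        int before = GC.CollectionCount(0);
        GC.Collect(); // blocking collection of all generations
        return GC.CollectionCount(0) - before;
    }

    // Allocates many short-lived arrays and returns how many generation 0
    // collections the runtime started on its own. The result is not
    // deterministic and can be zero for small iteration counts.
    public static int AllocateAndCount(int iterations)
    {
        int before = GC.CollectionCount(0);
        long totalBytes = 0;
        for (int i = 0; i < iterations; i++)
        {
            var temp = new byte[1024]; // unreachable after this iteration
            totalBytes += temp.Length;
        }
        Console.WriteLine($"Allocated {totalBytes} bytes in temporary arrays.");
        return GC.CollectionCount(0) - before;
    }
}
```

Calling AllocateAndCount with a large iteration count usually reports several collections without any call to GC.Collect(), which is the threshold-based trigger at work.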
The managed heap the GC uses to manage allocation is divided into three sections called generations. Let’s take a closer look at how these generations work and the pros and cons of this mechanism.
Garbage collection in .NET
The GC in .NET has three generations labeled 0, 1, and 2. Each generation is dedicated to tracking objects based on their expected lifetime: Generation 0 holds the shortest-lived objects, and Generation 2 holds the longest-lived.
- Generation 0: This generation stores short-lived objects such as temporary variables. When this generation is full and new objects are to be created, the GC will free up space by examining the objects in generation 0 rather than all objects in the managed heap.
- Generation 1: This generation sits between generations 0 and 2. After a GC event in generation 0, the objects that survive are compacted and promoted to this generation, where they will enjoy a longer lifetime. When a GC operation is run on this generation, objects that survive get promoted to Generation 2.
- Generation 2: Long-lived objects such as static data and singleton objects are stored in this generation. Anything that survives a collection event on this level stays until it becomes unreachable in a future collection. Collections at this level are also called full garbage collections since they reclaim all generations in the heap.
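Promotion between generations can be watched with GC.GetGeneration(), which returns the generation an object currently belongs to. The GenerationSketch class below is a hypothetical helper written for this illustration. It forces collections with GC.Collect() purely for demonstration, and the exact numbers it returns can vary with the runtime and its settings.

```csharp
using System;

public static class GenerationSketch
{
    // Returns the generation of one object before any collection,
    // after one forced collection, and after a second forced collection.
    public static int[] TrackPromotion()
    {
        var survivor = new object();

        int initial = GC.GetGeneration(survivor);
        GC.Collect();
        int afterFirst = GC.GetGeneration(survivor);
        GC.Collect();
        int afterSecond = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return new[] { initial, afterFirst, afterSecond };
    }
}
```

On a typical desktop runtime the values climb from generation 0 toward GC.MaxGeneration as the object survives each collection, and an object never moves back to a younger generation.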
The garbage collector has an additional heap for large objects, called the Large Object Heap (LOH). This heap is used for objects that are 85,000 bytes or more. Collection events on the LOH and Generation 2 generally take a long time, given the size and lifetime of the cleaned objects.
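One way to see the Large Object Heap from code is that GC.GetGeneration() reports large objects as belonging to the oldest generation from the moment they are created. The LargeObjectSketch class is a hypothetical helper for this sketch, and it assumes the default 85,000-byte threshold has not been raised through runtime configuration.

```csharp
using System;

public static class LargeObjectSketch
{
    // Allocates a byte array of the given length and returns the
    // generation the runtime reports for it immediately afterwards.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until this point
        return generation;
    }
}
```

Comparing GenerationOfNewArray(1000) with GenerationOfNewArray(85000) shows the small array starting in a young generation, while the large array is reported in GC.MaxGeneration right away.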
Garbage collection starts with a marking phase, where the collector finds and creates a list of all live objects. It then enters a relocating phase, where the references to the objects that will be moved are updated. Finally, there is a compacting phase, where the space occupied by dead objects is reclaimed and the surviving objects are compacted. Compaction is simply the process of moving memory blocks beside each other, which, as mentioned before, is a significant factor in the CLR’s efficient memory allocation method.
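The relocating and compacting phases can be sketched on a toy heap in which each slot holds one named object. The CompactionSketch class and its parameters are invented for this illustration: the set of live names stands in for the result of the marking phase, and the forwarding table stands in for the reference updates made during relocation. The real collector moves raw memory blocks rather than list entries.

```csharp
using System;
using System.Collections.Generic;

public static class CompactionSketch
{
    // heap:       object names in address order
    // live:       names that the marking phase found reachable
    // forwarding: filled with old slot index -> new slot index for survivors
    // Returns the compacted heap containing only the survivors.
    public static List<string> Compact(
        IReadOnlyList<string> heap,
        ISet<string> live,
        IDictionary<int, int> forwarding)
    {
        var compacted = new List<string>();
        for (int oldIndex = 0; oldIndex < heap.Count; oldIndex++)
        {
            if (!live.Contains(heap[oldIndex])) continue; // dead: its slot is reclaimed

            forwarding[oldIndex] = compacted.Count; // relocation: record the new address
            compacted.Add(heap[oldIndex]);          // compaction: place beside the previous survivor
        }
        return compacted;
    }
}
```

After compaction, all free space sits in one block at the end of the heap, which is what allows the next allocation to be placed linearly.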
An application runs in one or more processes, and each process runs one or more threads. A thread is the basic unit to which the OS allocates processor time. The .NET runtime and CLR manage threads, and when a garbage collection operation begins, all managed threads are suspended except for the thread that triggered the collection event.
It is generally ill-advised to call the GC.Collect() method manually, for several reasons. The call pauses your application while the collector runs, which may cause your application to become unresponsive and degrade its performance. In addition, the call is not guaranteed to free all unused objects from memory, and objects still in use by your application will not be collected. Reserve this method for specific points where the application has just stopped using a large number of objects, so that a collection is likely to reclaim a significant amount of memory.
The drawback of garbage collection lies in its effect on performance. The garbage collector must periodically traverse the application's object graph, inspecting object references and reclaiming memory. This process consumes system resources and frequently requires the program to pause.
It is easy to see why garbage collection is a fantastic tool that spares us from carrying out manual memory management and space reclamation. Now, let’s review some of memory management’s impacts on overall application performance.