However, the big changes proposed by .NET started from a totally different component model approach. Up until 2002, when .NET officially appeared, the prevailing component model was COM (Component Object Model), introduced by Microsoft in 1993. COM is the basis for several other Microsoft technologies and frameworks, including OLE, OLE Automation, ActiveX, COM+, DCOM, the Windows shell, DirectX, UMDF (User-Mode Driver Framework), and the Windows Runtime.
At the time of writing this, COM competes with another specification named CORBA (Common Object Request Broker Architecture): a standard defined by the Object Management Group (OMG), designed to facilitate the communication of systems deployed on diverse platforms. CORBA enables collaboration between systems on different operating systems, programming languages, and computing hardware. Over its life cycle, it has received a lot of criticism, mainly because of poor implementations of the standard.
.NET as a reaction to the Java World
In 1995, a new model was conceived to supersede COM and its unwanted side effects, especially versioning problems and the dependence on the Windows Registry, which COM uses to define accessible interfaces or contracts; a corrupt or modified fragment of the registry could mean that a component was not accessible at runtime. Also, installing applications required elevated permissions, since the Windows Registry is a sensitive part of the system.
A year later, various quarters of Microsoft started making contacts with some of the most distinguished software engineers, and these contacts remained active over the years. These included architects such as Anders Hejlsberg (who became the main author of C# and the principal architect of the .NET framework), Jean Paoli (one of the signatories of the XML standard and one of the ideologists behind AJAX technologies), Don Box (who participated in the creation of SOAP and XML Schemas), Stan Lippman (one of the fathers of C++, who was working at Disney at the time), Don Syme (the architect of generics and the principal author of the F# language), and so on.
The purpose of this project was to create a new execution platform, free from the caveats of COM, and one able to host a set of languages executing in a secure and extensible manner. The new platform had to be able to program and integrate the new world of web services, which had just appeared and were based on XML, along with other technologies. The initial name of the new proposal was Next Generation Windows Services (NGWS).
By late 2000, the first betas of the .NET framework were released, and the first version appeared on February 13, 2002. Since then, .NET has always been aligned with new versions of the IDE (Visual Studio). The current version of the classic .NET framework at the time of writing this is 4.6.1, but we will get into more detail on this later in the chapter.
An alternative .NET appeared for the first time in 2015. At the //BUILD/ event, Microsoft announced the creation and availability of another version of .NET, called .NET Core.
The open source movement and .NET Core
Part of the idea behind the open source movement and .NET Core comes from a deep change in the way software creation and availability is conceived in Redmond nowadays. When Satya Nadella took over as the CEO of Microsoft, the company clearly shifted to a new mantra: mobile-first, cloud-first. They also redefined themselves as a company of software and services.
This meant embracing the open source idea with all its consequences. As a result, a lot of the .NET Framework has already been opened to the community, and this movement will continue until the whole platform is opened, some say. Besides, a second purpose (clearly stated several times at the //BUILD/ event) was to create a programming ecosystem powerful enough to allow anyone to program any type of application for any platform or device. So, they started to support Mac OS X and Linux, as well as several tools to build applications for Android and iOS.
However, the implications run deeper. If you want to build applications for Mac OS and Linux, you need a different Common Language Runtime (CLR) that is able to execute in these platforms without losing out on performance. This is where .NET Core comes into play.
At the time of writing this, Microsoft has published several (ambitious) improvements to the .NET ecosystem, mainly based on two different flavors of .NET:
The first one is the version that was last available—.NET (.NET framework 4.6.x)—and the second one is the new version, intended to allow compilations that are valid not only for Windows platforms, but also for Linux and Mac OSes.
.NET Core is the generic name for a new open source version of the CLR made available in 2015 (updated last November to version 1.1), intended to support multiple, flexible .NET implementations. In addition, the team is working on something called
.NET Native, which compiles to native code in every destination platform.
However, let's keep on going with the main concepts behind the CLR, from a version-independent point of view.
To address some of the problems of COM and introduce the bunch of new capabilities that were requested as part of the new platform, a team at Microsoft started to evolve prior ideas (and the names associated with the platform as well). So, the framework was soon renamed Component Object Runtime (COR); prior to the first public beta, it was finally given the name Common Language Runtime in order to underline the fact that the new platform was not associated with a single language.
Actually, there are dozens of compilers available for use with the .NET framework, and all of them generate a type of intermediate code, which, in turn, is converted into native code at execution time, as shown in the following figure:
The CLR, as well as COM, focuses on contracts between components, and these contracts are based on types, but that's where the similarities end. Unlike COM, the CLR establishes a well-defined form to specify contracts, which is generally known as metadata.
Also, the CLR includes the possibility of reading metadata without any knowledge of the underlying file format. Furthermore, such metadata is extensible by means of custom attributes, which are strongly typed themselves. Other interesting data in the metadata includes the version information (remember, there should be no dependencies on the Registry) and component dependencies.
Besides, for any component (called an assembly), the presence of metadata is mandatory, which means that it's not possible to access a component without reading its metadata. In the initial versions, implementations of security were mainly based on evidence included in the metadata. Furthermore, such metadata is available to any other program inside or outside the CLR through a process called Reflection.
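Since metadata and Reflection are central to everything that follows, here's a minimal, hedged sketch (the attribute and class names are invented purely for illustration) of how a strongly typed custom attribute ends up in an assembly's metadata and how it can be read back at runtime:

```csharp
using System;
using System.Reflection;

// A strongly typed custom attribute: it extends the assembly's metadata.
[AttributeUsage(AttributeTargets.Class)]
public class AuthorAttribute : Attribute
{
    public string Name { get; private set; }
    public AuthorAttribute(string name) { Name = name; }
}

[Author("Jane Doe")]
public class SampleComponent { }

class MetadataDemo
{
    static void Main()
    {
        Type type = typeof(SampleComponent);

        // Read the custom attribute back from the metadata at runtime.
        foreach (AuthorAttribute attr in
                 type.GetCustomAttributes(typeof(AuthorAttribute), false))
        {
            Console.WriteLine("Author: " + attr.Name);
        }

        // Assembly-level metadata: name and version (no Registry involved).
        AssemblyName asmName = type.Assembly.GetName();
        Console.WriteLine("Assembly: {0}, Version: {1}", asmName.Name, asmName.Version);
    }
}
```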
Another important difference is that .NET contracts, above all, describe the logical structure of types. They say nothing about in-memory representations, reading order sequences, alignment, or parameter conventions, among other things, as Don Box explains in detail in his magnificent Essential .NET (http://www.amazon.com/Essential-NET-Volume-Language-Runtime/dp/0201734117).
Common Intermediate Language
The way these conventions and protocols are resolved in the CLR is by means of a technique called contract virtualization. This implies that most of the code (if not all) written for the CLR doesn't contain machine code but an intermediate language called Common Intermediate Language (CIL), or just Intermediate Language (IL).
The CLR never executes CIL directly. Instead, CIL is always translated into native machine code prior to its execution by means of a technique called JIT (Just-In-Time) compilation. This is to say that the JIT process always adapts the resulting executable code to the destination machine (independently of the developer). There are several modes of performing the JIT process, and we'll look at them in more detail later in this chapter.
Thus, CLR is what we might call a type-centered framework. For CLR, everything is a type, an object, or a value.
Another critical factor in the behavior of CLR is the fact that programmers are encouraged to forget about the explicit management of memory and the manual management of threads (especially associated with languages such as C and C++) to adopt the new way of execution that the CLR proposes: managed execution.
Under managed execution, CLR has complete knowledge of everything that happens in its execution context. This includes every variable, method, type, event, and so on. This encourages and fosters productivity and eases the path to debugging in many ways.
Additionally, the CLR supports the creation of runtime code (or generative programming) by means of a utility called CodeDOM. With this feature, you can emit code in different languages and compile it (and execute it) directly in memory.
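As a quick sketch of the idea (the generated class, method, and strings here are arbitrary examples, not part of the original text), CodeDOM's compiler services can compile a source string in memory and invoke the result via Reflection:

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

class CodeDomDemo
{
    static void Main()
    {
        // Source code assembled at runtime as a plain string.
        string source = @"
            public static class Greeter
            {
                public static string Hello() { return ""Hello from generated code""; }
            }";

        using (var provider = new CSharpCodeProvider())
        {
            // GenerateInMemory: no .exe or .dll is written to disk.
            var options = new CompilerParameters { GenerateInMemory = true };
            CompilerResults results = provider.CompileAssemblyFromSource(options, source);

            if (!results.Errors.HasErrors)
            {
                Type type = results.CompiledAssembly.GetType("Greeter");
                object output = type.GetMethod("Hello").Invoke(null, null);
                Console.WriteLine(output); // Hello from generated code
            }
        }
    }
}
```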
All this leads us to the next logical questions: which languages are available for use with this infrastructure? What are the common points among them? How is the resulting code assembled and prepared for execution? What are the units of stored information (as I said, they're called assemblies)? And finally, how is all this information organized and structured into one of these assemblies?
Every execution environment has a notion of software components. For the CLR, such components must be written in a CLI-compliant language and compiled accordingly. You can read a list of CLI languages on Wikipedia. But the question is: what is a CLI-compliant language?
CLI stands for
Common Language Infrastructure, and it's a software specification standardized by ISO and ECMA describing the executable code and a runtime environment that allows multiple high-level languages to be used on different computer platforms without being rewritten for specific architectures. The .NET framework and the free and open source Mono are implementations of CLI.
The most relevant points in the CLI would be as follows (according to Wikipedia):
- First, to substitute COM, metadata is key and provides information on the architecture of assemblies, such as a menu or an index of what you can find inside. Since it doesn't depend on the language, any program can read this information.
- That established, there should be a common set of rules to comply with in terms of data types and operations. This is the Common Type System (CTS). All languages that adhere to the CTS can work under the same set of rules.
- For minimal interoperation between languages, there is another, smaller set of rules (the Common Language Specification, or CLS) that should be common to all programming languages in this group, so that, for example, a DLL compiled from one language can be used by a DLL compiled from a different CTS language.
- Finally, we have a Virtual Execution System, which is responsible for running this application and many other tasks, such as managing the memory requested by the program, organizing execution blocks, and so on.
With all this in mind, when we use a .NET compiler (from now on, compiler), we generate a byte stream, usually stored as a file in the local filesystem or on a web server.
Structure of an assembly file
Files generated by a compilation process are called assemblies, and any assembly follows the basic rules of any other executable file in Windows, adding a few extensions and pieces of information that are required for execution in a managed environment.
In short, we understand that an assembly is just a set of modules containing the IL code and metadata, which serve as the primary unit of a software component in CLI. Security, versioning, type resolution, processes (application domains), and so on, all work on a per-assembly basis.
The significance of this implies changes in the structure of executable files. This leads to a new file architecture represented in the following figure:
Note that a PE file is one that conforms to the Portable Executable format: a file format for executables, object code, DLLs, FON (font) files, and others used in 32-bit and 64-bit versions of Windows operating systems. It was first introduced by Microsoft in Windows NT 3.1, and all later versions of Windows support this file structure.
This is why we find a PE/COFF header in the format, which contains compatible information required by the system. However, from the point of view of a .NET programmer, what really matters is that an assembly holds three main areas: the CLR header, the IL code, and a section with resources (Native Image Section in the figure).
Among the libraries linked with the CLR, we find a few responsible for loading assemblies into memory and starting and initializing the execution context. They're generally referred to as the CLR Loader. Together with some other utilities, they provide the following:
- Automatic memory management
- Use of garbage collector
- Metadata access to find information on types
- Loading modules
- Analyzing managed libraries and programs
- A robust exception management subsystem to enable programs to communicate and respond to failures in structured ways
- Native and legacy code interoperability
- A JIT compilation of managed code into native code
- A sophisticated security infrastructure
This loader uses OS services to facilitate the loading, compilation, and execution of an assembly. As we've mentioned previously, CLR serves as an execution abstraction for .NET languages. To achieve this, it uses a set of DLLs, which acts as a middle layer between the OS and the application program. Remember that CLR itself is a collection of DLLs, and these DLLs work together to define the virtual execution environment. The most relevant ones are as follows:
- mscoree.dll (sometimes called the shim because it is simply a facade in front of the actual DLLs that the CLR comprises)
- clr.dll
- mscorsvr.dll (multiprocessor) or mscorwks.dll (uniprocessor)
In practice, one of the main roles of mscoree.dll
is to select the appropriate build (uniprocessor or multiprocessor) based on any number of factors, including (but not limited to) the underlying hardware.
The clr.dll is the real manager, and the rest are utilities for different purposes. This library is the only one of the CLR's DLLs located at $System.Root$, as we can find through a simple search:
My system is showing two versions (there are some more), each one ready to launch programs compiled for 32-bit or 64-bit versions. The rest of the DLLs are located at another place: a secure set of directories generally called
Global Assembly Cache (GAC).
Actually, the latest edition of Windows 10 installs files for several versions of the GAC, corresponding to versions 1.0, 1.1, 2.0, 3.0, 3.5, and 4.0, although several are just placeholders with minimal information, and we only find complete versions of .NET 2.0, .NET 3.5 (only partially), and .NET 4.0.
Also, note that these placeholders (for the versions not fully installed) admit further installations if some old software requires them. This is to say that the execution of a .NET program relies on the version indicated in its metadata and nothing else.
You can check which versions of .NET are installed in a system using the CLRver.exe
utility, as shown in the following figure:
Internally, several operations take place before execution. When we launch a .NET program, we'll proceed just as usual, as if it were just another standard executable of Windows.
Behind the scenes, the system will read the header, where it will be instructed to launch mscoree.dll, which in turn will start the whole running process in a managed environment. Here, we'll omit all the intricacies inherent to this process, since they go far beyond the scope of this book.
We've mentioned that the key aspect of the new programming model is the heavy reliance on metadata. Furthermore, the ability to reflect against metadata enables programming techniques in which programs are generated by other programs, not humans, and this is where CodeDOM comes into play.
We'll cover some aspects of CodeDOM and its usages when dealing with the language, and we'll look at how the IDE itself uses this feature frequently every time it creates source code from a template.
In order to help the CLR find the various pieces of an assembly, every assembly has exactly one module whose metadata contains the assembly manifest: an additional piece of CLR metadata that acts as a directory of adjunct files that contain additional type definitions and code. Furthermore, CLR can directly load modules that contain an assembly manifest.
So, what does a manifest look like in a real program, and how can we examine its content? Fortunately, we have a bunch of .NET utilities (which, technically, don't belong to the CLR but to the .NET framework ecosystem) that allow us to visualize this information easily.
Introducing metadata with a basic Hello World
Let's build a typical Hello World program and analyze its contents once it is compiled so that we can inspect how it's converted into Intermediate Language (IL) and where the meta-information that we're talking about is.
Along the course of this book, I'll use Visual Studio 2015 Community Edition Update 1 (or higher if an updated version appears) for reasons that I'll explain later. You can install it for free; it's a fully capable version with tons of project types, utilities, and so on.
The only requirement is to register for free in order to get a developer's license that Microsoft uses for statistical purposes—that's all.
After launching Visual Studio, in the main menu, select New Project and go to the Visual C# templates, where the IDE offers several project types, and select a console application, as shown in the following screenshot:
Visual Studio will create a basic code structure composed of several references to libraries (more about that later) as well as a namespace block that includes the Program class. Inside that class, we will find an application entry point in a fashion similar to what we would find in the C++ or Java languages.
To produce some kind of output, we're going to use two static methods of the Console class: WriteLine, which outputs a string followed by a carriage return, and ReadLine, which forces the program to stop until the user enters a character and presses the return key, so that we can see the output produced.
After cleaning these references that we're not going to use, and including the couple of sentences mentioned previously, the code will look like this:
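In case you're not following along in the IDE, the code should look approximately like this (the namespace matches the default project name):

```csharp
using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World");
            Console.ReadLine(); // wait for the user so we can see the output
        }
    }
}
```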
To test it, we just have to press F5 or the Start button and we'll see the corresponding output (nothing amazing, so we're not including the capture).
While editing the code, you will have noticed several useful characteristics of the IDE's editor: the colorizing of sentences (distinguishing the different purposes: classes, methods, arguments, literals, and so on); IntelliSense, which offers what makes sense to write for every class member; and tooltips, indicating the return type of every method, the value type of literals or constants, and the number of references made to every member of the program found in your code.
Technically, there are hundreds of other useful features, but that's something we will have the chance to test starting from the next chapter, when we get into the C# aspects and put them to the test.
As for this little program, it's a bit more interesting to check what produced such output, which we'll find in the Bin/Debug
folder of our project. (Remember to press the Show all files button at the head of Solution Explorer, by the way):
As we can see, two executables are generated. The first one is the standalone executable that you can launch directly from its folder. The other, with the .vshost
prefix before the extension, is the one Visual Studio uses at debug time and that contains some extra information required by the IDE. Both produce the same results.
Once we have an executable, it's time to link to Visual Studio the .NET tool that will let us view the metadata we're talking about.
To do this, we go to the Tools | External Tools option in the main menu, and we'll see a configuration dialog window, presenting several (and already tuned) external tools available; press the New button and change the title to IL Disassembler
, as shown in the following screenshot:
Next, we need to configure the arguments that we're going to pass to the new entry: the name of the tool and the required parameters.
You'll notice that there are several versions of this tool. These depend on your machine.
For our purposes, it will suffice to include the following information:
- The root of the tool (named ILDASM.exe, and located on my machine at C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.6.1 Tools)
- The path of the generated executable, for which I'm using the predefined macro $(TargetPath)
Given that our program is already compiled, we can go back to the Tools menu and find a new entry for IL Disassembler
. Once launched, a window will appear, showing the IL code of our program, plus a reference called Manifest
(which shows the metadata), and we can also double-click to show another window with this information, as shown in the following screenshot:
Note
Note that I've modified ILDASM's font size for clarity.
The information included in the manifest comes from two sources: the IDE itself, configured to prepare the assembly for execution (we can view most of the lines if we take a more detailed look at the window's content), and customizable information that we can embed in the executable's manifest, such as descriptions, the assembly title, the company information, trademark, culture, and so on. We'll explore how to configure that information in the next chapter.
In the same manner, we can keep on analyzing the contents of every single node shown in the main ILDASM window. For instance, if we want to see the IL code linked to our Main entry point, the tool will show us another window where we can examine the IL code itself (note the presence of the text cil managed next to the declaration of Main):
As I pointed out in the screenshot, entries with the IL_ prefix will be converted to machine code at execution time. Note the resemblance of these instructions to assembly language.
Also, keep in mind that this concept has not changed since the first version of .NET: main concepts and procedures to generate CIL and machine code are, basically, the same as they used to be.
PreJIT, JIT, EconoJIT, and RyuJIT
I have already mentioned that the process of converting this IL code into machine code is undertaken by another piece of the .NET framework, generically known as
Just-In-Time Compiler (JIT). However, since the very beginning of .NET, this process can be executed in at least three different ways, which is why there are three JIT-suffixed names.
To simplify the details of these processes, we'll say that the default method of compilation (and the preferred one in general terms) is the JIT compilation (let's call it Normal JIT):
- In the Normal JIT mode, the code is compiled as required (on demand) and not thrown away but cached for a later use. In this fashion, as the application keeps on running, any code required for execution at a later time that is already compiled is just retrieved from the cached area. The process is highly optimized and the performance penalty is negligible.
- In the PreJIT mode, .NET operates in a different manner. To operate using PreJIT, you need a utility called ngen.exe (short for native image generation) to produce native machine code prior to the first execution; a sample invocation appears after this list. The code is then converted, and the .exe is rewritten as machine code, which gives some optimization, especially at startup time.
- As for the EconoJIT mode, it's used mainly in applications deployed for low-memory devices, such as mobiles, and it's pretty similar to Normal JIT, with the difference that the compiled code is not cached, in order to save memory.
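For reference, pre-generating a native image with ngen.exe is done from a developer command prompt; a typical invocation looks like the following (the assembly name is just a placeholder):

```
ngen install MyApplication.exe
```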
In 2015, Microsoft continued to develop a special project called Roslyn, which is a set of tools and services that provide extra functionality for code management, compilation, and deployment, among other processes. In connection with this project (which will be treated in depth in Chapter 4, Comparing Approaches for Programming), another JIT appeared, called RyuJIT, which has been available from the beginning as an open source project and is now included by default in the latest version of Visual Studio (remember, Visual Studio 2015 Update 1).
Now, let me quote what the .NET team says about their new compiler:
"RyuJIT is a new, next-generation x64 compiler twice as fast as the one before, meaning apps compiled with RyuJIT start up to 30% faster (Time spent in the JIT compiler is only one component of startup time, so the app doesn't start twice as fast just because the JIT is twice as fast.) Moreover, the new JIT still produces great code that runs efficiently throughout the long run of a server process.
This graph compares the compile time ("throughput") ratio of JIT64 to RyuJIT on a variety of code samples. Each line shows the multiple of how much faster RyuJIT is than JIT64, so higher numbers are better."
They finish by saying that RyuJIT will be the basis for all their JIT compilers in the future: x86, ARM, MDIL, and whatever else comes along.
In the .NET framework, the Common Type System (CTS) is the set of rules and specifications established to define, use, and manage the data types used by any .NET application in a language-independent manner.
We must understand that types are the building blocks of any CLR program. Programming languages such as C#, F#, and VB.NET have several constructs for expressing types (for example, classes, structs, enums, and so on), but ultimately, all of these constructs map down to a CLR type definition.
Also, note that a type can declare private and non-private members. The latter form, sometimes known as the contract of the type (since it exposes the usable part of that type), is what we can access by programming techniques. This is the reason why we highlighted the importance of metadata in the CLR.
The common type system is much broader than what most programming languages can handle. In addition to the CTS, the CLI defines a subset of the CTS that all CLI-compatible languages must support. This subset is called the Common Language Specification (CLS), and component writers are recommended to make their components' functionality accessible through CLS-compliant types and members.
Naming conventions, rules, and type access modes
As for the naming rules for a type, this is what applies: any CLR type name has three parts: the assembly name, an optional namespace prefix, and a local name. In the previous example, ConsoleApplication1 was the assembly name, and it was the same as the namespace (but we could have changed it without problems). Program was the name of the only type available, which happened to be a class in this case. So, the whole name of this class was ConsoleApplication1.ConsoleApplication1.Program.
Namespaces are optional prefixes that help us define logical divisions in our code. Their purpose is to avoid confusion and the eventual overriding of members as well as allowing a more organized distribution of the application's code.
For example, in a typical application (not the demo shown earlier), a namespace would describe the whole solution, which might be separated into domains (different areas in which the application is divided, and they sometimes correspond to individual projects in the solution), and every domain would most likely contain several classes, and each class would contain several members. When you're dealing with solutions that hold, for instance, 50 projects, such logical divisions are very helpful in order to keep things under control.
As for the way a member of a type can be accessed, each member controls how it can be used as well as how it works. So, each member has its own access modifier (for example, private, public, or protected) that controls how it can be reached and whether it is visible to other members. If we don't specify any access modifier, it is assumed to be private.
Besides, you can establish whether an instance of the type is required to reference a member, or you can just reference such a member by its whole name without having to call the constructor and get an instance of the type. In such cases, we prefix the declaration of these members with the static
keyword.
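To summarize both ideas, here is a minimal sketch (the Counter class is invented purely for illustration) showing the common access modifiers and a static member:

```csharp
public class Counter
{
    private int count;              // private (the default): visible only inside Counter

    public void Increment()         // public: part of the type's exposed contract
    {
        count++;
    }

    protected int Peek()            // protected: visible to Counter and derived types
    {
        return count;
    }

    // static: referenced through the type name; no instance (constructor call) needed.
    public static Counter CreateEmpty()
    {
        return new Counter();
    }
}
```

With this, Counter.CreateEmpty() can be referenced through the type name alone, while Increment() requires an instance obtained through a constructor.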
Basically, a type admits three kinds of members: fields, methods, and nested types. By nested type, we understand just another type that is included as part of the implementation of the declaring type. All other type members (for example, properties and events) are simply methods that have been extended with additional metadata.
I know, you might be thinking: so, properties are methods? Well, yes; once compiled, the resulting code turns them into methods. They convert into get_PropertyName() and set_PropertyName(value) methods, in charge of reading or assigning the values linked to the property's name.
Let's review this with a very simple class that defines a couple of properties:
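Since the original listing appears as a screenshot, here is an approximate reconstruction using the property names (data and num) that show up in the IL view described next:

```csharp
public class SimpleClass
{
    // Auto-implemented properties: the compiler generates hidden backing fields
    // plus get_/set_ accessor methods for each of them.
    public string data { get; set; }
    public int num { get; set; }
}
```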
Well, once compiled, we can check out the resulting IL code using IL Disassembler as we did earlier, obtaining the following view:
As we can see, the compiler declares data and num as instances of the string and int classes, respectively, and it defines the corresponding methods to access these properties.
How does the CLR manage the memory space occupied by a type at runtime? If you remember, we highlighted the importance of the concept of state at the beginning of this chapter. The significance is clear here: the kind of members defined in the type will determine the memory allocation required.
Also, the CLR will guarantee that these members are initialized to their default values if we don't indicate otherwise in the declaring sentences: for numeric types, the default value is zero; for Boolean types, it's false; and for object references, the value is null.
We can also categorize types depending on their memory allocation: value types are stored in the stack, while reference types will use the heap. A deeper explanation of this will be provided in the next chapter, since the new abilities of Visual Studio 2015 allow us to analyze everything that happens at runtime in great detail with our code under a bunch of different points of view.
A quick tip on the execution and memory analysis of an assembly in Visual Studio 2015
All the concepts reviewed up to this point are directly visible in the new debugging tools, as shown in the following screenshot, which displays the execution threads of the previous program stopped at a breakpoint:
Note the different icons and columns of the information provided by the tool. We can distinguish known and unknown threads, if they are named (or not), their location, and even ThreadID
, which we can use in conjunction with SysInternals tools if we need some extra information that's not included here:
The same features are available for memory analysis. This even goes beyond runtime inspection, since the IDE is able to capture and categorize the memory required by the runtime during the application's execution and keep it ready for us if we take a snapshot of the managed memory.
In this way, we can review it further and check out the possible bottlenecks and memory leaks. The preceding screenshot shows the managed memory used by the previous application at runtime.
A review of the capabilities of debugging found in Visual Studio 2015 will be covered in depth along the different chapters in this book, since there are many different scenarios in which an aid like this will be helpful and clear.
A quick reminder of these two concepts (the stack and the heap) might be helpful, since they transcend the .NET framework and are common to many languages and platforms.
To start with, let's remember a few concepts related to processes that we saw at the beginning of this chapter: when a program starts execution, it initializes resources based on the metadata that the CLR reads from the assembly's manifest (as shown in the figure in the Structure of an assembly file section). These resources will be shared with all the threads that such a process launches.
When we declare a variable, a space in the stack is allocated. So, let's start with the following code:
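The original listing is also a screenshot; a close reconstruction (the Book members are hypothetical, and exact line numbers may differ slightly) would be the following:

```csharp
using System;

namespace StackAndHeap
{
    class Book
    {
        // Hypothetical fields, for illustration only.
        public string Title;
        public int Pages;
    }

    class Program
    {
        static void Main(string[] args)
        {
            Book b;                     // declared, but never instantiated
            Console.WriteLine(b.Title); // compile error: use of unassigned local variable 'b'
            Console.ReadLine();
        }
    }
}
```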
If we try to compile this, we'll obtain a compilation error message indicating the use of the unassigned variable b. The reason is that, in memory, we just have a declared variable that points to nothing: no Book instance exists, since we never instantiated b.
However, if we use the constructor of the class (the default one, since the class has no explicit constructor), changing the line to Book b = new Book(), our code compiles and executes properly.
Therefore, the role of the new operator is crucial here. It indicates to the compiler that it has to allocate space for a new instance of the Book object, call the constructor, and, as we'll discover soon, initialize the object's fields to the default values of their types.
So, what's in the stack memory at the moment? Well, we just have a declaration called b, whose value is a memory address: exactly the address where the StackAndHeap.Book instance is allocated in the heap (which, I anticipate, will be 0x2525910).
However, how in the world would I know this address and what's going on inside the execution context? Let's take a look at the inner workings of this small application using the different debugging windows available in this version of the IDE. To do this, we'll set a breakpoint on line 14, Console.ReadLine();, and relaunch the application so that it hits the breakpoint.
Once here, there's plenty of information available. In the Diagnostics Tools window (also new in this version of the IDE), we can watch the memory in use, the events, and the CPU usage. In the Memory Usage tab, we can take a snapshot of what's going on (actually, we can take several snapshots at different moments of execution and compare them).
Once the snapshot is ready, we'll look at the time elapsed, the size of objects, and the Heap size (along with some other options to improve the experience):
Note that we can choose to view the Heap sorted by the object size or the heap size. Also, if we choose one of these, a new window appears, showing every component actually in the execution context.
If we want to check exactly what our code is doing, we can filter by the name of the desired class (Book, in this case) in order to get an exclusive look at this object, its instances, the references to the object alive at the moment of execution, and a bunch of other details.
Of course, if we take a look at the Autos or Locals windows, we'll discover the actual values of these members as well:
As we can see in the
Autos window, the object has initialized the remaining values (those not established by code) using the default value for that type (0 for integer values). This level of detail in the analysis of executables really helps in cases where bugs are fuzzy or only happen occasionally.
We can even see the actual memory location of every member by clicking on the StackAndHeap.Book entry:
Perhaps you're wondering, can we even see further? (I mean the actual assembly code produced by the execution context). The answer, again, is yes; we can right-click on the instance, select Add Watch, and we'll be adding an inspection point directly to that memory position, as shown in the following figure:
Of course, the assembly code is available as well, as long as we have enabled it by navigating to Tools | Options | Debugger in the IDE. Also, in this case, you should enable Enable Address Level Debugging in the same dialog box. After this, just go to Debug | Windows | Disassembly, and you will be shown the window with the lowest-level (executable) code, marking the breakpoint, line numbers, and the translation of such code into the original C# statements:
What happens when the reference to the Book object is reassigned or nulled (and the program keeps going)? The memory allocated for Book remains in the heap as an orphan, and that's when the garbage collector comes into play.
Basically, garbage collection is the process of reclaiming memory from the system. Of course, this memory shouldn't be in use; that is, the space occupied by objects allocated in the heap must have no variable pointing to it in order to be cleared.
Among the numerous classes included in .NET framework, there's one that's specially dedicated to this process. This means that the garbage collection of objects is not just an automatic process undertaken by CLR but a true, executable object that can even be used in our code (GC is the name, by the way, and we will deal with it in some cases when we try to optimize execution in the other chapters).
Actually, we can see this in action in a number of ways. For example, let's say that we create a method that concatenates strings in a loop and doesn't do anything else with them; it just notifies the user when the process is finished:
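A minimal sketch of such a method (the name and iteration count are arbitrary choices) could be:

```csharp
using System;

class Program
{
    static void ConcatStrings(int iterations)
    {
        string result = string.Empty;
        for (int i = 0; i < iterations; i++)
        {
            // Strings are immutable: every concatenation allocates a brand-new
            // string, leaving the previous one as garbage to be collected.
            result += i.ToString();
        }
        Console.WriteLine("Process finished.");
    }

    static void Main()
    {
        ConcatStrings(100000);
        Console.ReadLine();
    }
}
```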
There's something to remember here. Since strings are immutable (which means that they cannot be changed, of course), the process has to create new strings in every loop. This means a lot of memory that the process will use and that can be reclaimed since every new string has to be created anew, and the previous one is useless.
We can use CLR Profiler to see what happens in the CLR when running this application. You can download CLR Profiler from http://clrprofiler.codeplex.com/, and once unzipped, you'll see two versions (32- and 64-bit) of the tool. This tool shows us a more detailed set of statistics, which include GC interventions. Once launched, you'll see a window like this:
Ensure that you check the allocations and calls checkboxes before launching the application using Start Desktop App. After launching, if the application runs straight through without breaks, you'll be shown a new statistical window pointing to various summaries of the execution.
Each of these summaries leads to a different window in which you can analyze (even with statistical graphics) what happened at runtime in more detail, as well as how the garbage collector intervened when required.
The following figure shows the main statistical window (note the two sections dedicated to GC statistics and garbage collection handle statistics):
The screenshot shows two GC-related areas. The first one indicates three kinds of collections, named Gen 0, Gen 1, and Gen 2. These names are simply short for generations.
This is because the GC marks objects depending on their references. Initially, when the GC starts working, objects with no references are cleaned up, and those still referenced are marked as Gen 1. The second pass of the GC is initially similar, but if it discovers that there are objects marked Gen 1 that still hold references, they're marked as Gen 2, and those from Gen 0 holding any references are promoted to Gen 1. The process goes on while the application is under execution.
This is the reason we can often read that the following principles apply to objects that are subject to recollection:
- Newest objects are usually collected soon (they're normally created in a function call and are out of the scope when the function finishes)
- The oldest objects commonly last longer (often because they hold references from global or static classes)
The second area shows the number of handles created, destroyed, and surviving (surviving due to garbage collector, of course).
The first one (Time Line) will, in turn, show statistics including the precise execution times at which the GC operated, as well as the .NET types involved:
As you can see, the figure shows a bunch of objects collected and/or promoted to other generations as the program goes on.
This is, of course, much more complex than that. The GC has rules to operate with different frequencies depending on the generation: Gen 0 is visited more frequently than Gen 1, which in turn is collected more often than Gen 2.
Furthermore, in the second window, we see all the mechanisms implicit in the execution, allowing us different levels of details so that we can have the whole picture with distinct points of view:
This demonstrates some of the characteristics of the GC. First of all, a de-referenced object is not immediately collected, since the process happens periodically, and there are many factors that influence this frequency. On the other hand, not all orphans are collected at the same time.
One of the reasons for this is that the collection mechanism itself is computationally expensive, and it affects performance, so the recommendation, for most cases, is to just let GC do its work the way it is optimized to do.
Are there exceptions to this rule? Yes; the exceptions are those cases where you have reserved a lot of resources and you want to make sure that you clean them up before you exit the method or sequence in which your program operates. This doesn't mean that you should call the GC on every iteration of a loop (due to the performance reasons we mentioned).
One of the possible solutions in these cases is implementing the IDisposable interface. Let's remember that you can inspect any member of the CLR by pressing Ctrl + Alt + J or selecting Object Browser in the main menu. We'll be presented with a window containing a search box in order to filter our member, and we'll see all the places where such a member appears:
Note
Note that this interface is not available for .NET Core Runtime.
So, we would redefine our class to implement IDisposable (which means that we should write a Dispose() method in which we release our resources). Or, even better, we can follow the recommendations of the IDE and implement the Dispose pattern, which is offered to us as an option as soon as we indicate that our class implements this interface, as shown in the following screenshot:
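For reference, the skeleton the IDE offers follows the standard dispose pattern; a simplified sketch (not the exact generated code) looks like this:

```csharp
using System;

public class ResourceHolder : IDisposable
{
    private bool disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // resources already released; skip the finalizer
    }

    protected virtual void Dispose(bool disposing)
    {
        if (!disposed)
        {
            if (disposing)
            {
                // Free managed resources here.
            }
            // Free unmanaged resources here.
            disposed = true;
        }
    }

    ~ResourceHolder() // finalizer: the safety net if Dispose is never called
    {
        Dispose(false);
    }
}
```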
Also, remember that in cases where we have to explicitly dispose of a resource, another common and more recommended approach is the using block within the context of a method. A typical scenario is when you open a file using some of the classes in the System.IO namespace, such as File. Let's quickly look at it as a reminder.
Imagine that you have a simple text file named Data.txt and you want to open it, read its content, and present it in the console. A quick way to do this would be with the following code:
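A sketch of that quick-and-dirty approach (assuming Data.txt sits next to the executable) might be:

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Works, but the StreamReader returned by OpenText is never closed.
        StreamReader reader = File.OpenText("Data.txt");
        Console.WriteLine(reader.ReadToEnd());
        Console.ReadLine();
    }
}
```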
What's the problem with this code? It works, but it's using an external resource: the OpenText method returns a StreamReader object, which we later use to read the contents, and it's never explicitly closed. We should always remember to close objects that we open and that take some time to process.
One of the possible side effects consists of preventing other processes from accessing the file we opened.
So, the best and suggested solution for these cases is to include the declaration of the conflicting object within a using
block, as follows:
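A hedged version of the same reader, now wrapped in a using block, would be:

```csharp
using System;
using System.IO;

class Program
{
    static void Main()
    {
        using (StreamReader reader = File.OpenText("Data.txt"))
        {
            Console.WriteLine(reader.ReadToEnd());
        } // Dispose() is called here automatically, even if an exception occurs
        Console.ReadLine();
    }
}
```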
In this way, Dispose is automatically invoked at the end of the block to release the resources managed by the StreamReader, and there's no need to close it explicitly.
Finally, there's always another way of forcing an object to die: using the corresponding finalizer (a method preceded by the ~ sign, the counterpart of a constructor). It's not a recommended way to destroy objects, but it has been there since the very beginning (let's remember that Hejlsberg based many features of the language on C++). And, by the way, the advanced pattern of implementing IDisposable includes this option for more advanced collectable scenarios.
Implementing algorithms with the CLR
So far, we've seen some of the most important concepts, techniques, and tools available in relation to the CLR. In other words, we've seen how the engine works and how the IDE and other tools give us support to control and monitor what's going on behind the scenes.
Let's dig into some of the more typical structures and algorithms that we'll find in everyday programming so that we can understand the resources that .NET framework puts in our hands to solve common problems a bit better.
We've mentioned that .NET framework installs a repository of DLLs that offer a large number of functionalities. These DLLs are organized by namespaces, so they can be used individually or in conjunction with others.
As it happens with other frameworks such as J2EE, in .NET, we will use the object-oriented programming paradigm as a suitable approach to programming problems.
Data structures, algorithms, and complexity
In the initial versions of .NET (1.0, 1.1), we could use several types of constructions to deal with collections of elements. All modern languages include these constructs as typical resources, and some of these you should know for sure: arrays, stacks, and queues are typical examples.
Of course, the evolution of .NET has produced many novelties, starting with generics in version 2.0, and other types of similar constructions, such as dictionaries, ObservableCollections, and others in a long list.
But the question is, are we using these algorithms properly? What happens when you have to use one of these constructions and push it to its limits? And do we have a way to measure these implementations so that we can use the most appropriate one in every situation?
These questions take us to the measure of complexity. The most common approach to the problem nowadays relies on a technique called Big O Notation or Asymptotic Analysis.
Big O Notation (Big Omicron Notation) comes from a branch of mathematics that describes the limiting behavior of a function when its argument tends toward a particular value or toward infinity. When applied to computer science, it's used to classify algorithms by how they respond to changes in the input size.
We understand "how they respond" in two ways: response in time (often the most important) as well as response in space, which could lead to memory leaks and other types of problems (eventually including DoS attacks and other threats).
Tip
One of the most exhaustive lists of links to explanations of the thousands of algorithms cataloged up to date is published by NIST (National Institute of Standards and Technology) at https://xlinux.nist.gov/dads/.
The way to express the response in relation to the input (the O notation) is a formula such as O([formula]), where formula is a mathematical expression that indicates the growth: that is, the number of operations the algorithm executes as the input grows. Many algorithms are of type O(n), and they are called linear, because the growth is proportional to the number of inputs; such growth would be represented by a straight line (although it is never exact).
A typical example is the analysis of sorting algorithms, and NIST mentions a canonical case: quicksort is O(n log n) on average, and bubble offers O(n²). This means that on a desktop computer, a quicksort implementation can beat a bubble one, which is running on a supercomputer when the numbers to be sorted grow beyond a certain point.
Note
As an example, in order to sort 1,000,000 numbers, the quicksort takes 20,000,000 steps on average, while the bubble sort takes 1,000,000,000,000 steps!
The following graphic shows the growth in time of four classical sorting algorithms (bubble, insertion, selection, and shell). As you can see in the graph, the behavior is quite linear until the number of elements passes 25,000, at which point the curves differ noticeably. The shell algorithm wins, with a worst-case complexity of O(n^1.5). Note that quicksort has a smaller factor (n log n).
Unfortunately, there's no mechanical procedure to calculate the Big-O of an arbitrary algorithm, and the procedures that exist take a more or less empirical approach.
However, we can use some well-defined tables that categorize the algorithms and give us the O(formula) to get an idea of what we can obtain out of its usage, such as the one published by Wikipedia, which is accessible at http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions:
From the point of view of the .NET framework, we can use all the collections in the System.Collections.Generic namespace, which guarantee optimized performance for the vast majority of situations.
An approach to performance in the most common sorting algorithms
You will find in DEMO01-04 a .NET program that compares three classical algorithms (bubble, merge, and heap) to the one implemented in the List<T> collection, using integers. Of course, this approach is a practical, everyday one and not scientific, for which the generated numbers should be uniformly randomly generated (refer to Rasmus Faber's answer to this question at http://stackoverflow.com/questions/609501/generating-a-random-decimal-in-c/610228#610228).
Besides that, another consideration should be made about the generators themselves. For practical purposes, such as testing these algorithms, the generators included in the .NET framework do their job pretty well. However, if you need or are curious about a serious approach, perhaps the most documented and tested one is Donald Knuth's Spectral Test, published in the second volume of his world-famous The Art of Computer Programming, Volume 2: Seminumerical Algorithms (2nd Edition), Addison-Wesley.
That said, the random generator class included in .NET can give us good enough results for our purposes. As for the sorting methods targeted here, I've chosen the most commonly recommended ones, compared with two extremes: the slowest one (bubble, with O(n²) performance) and the one included in the System.Collections.Generic namespace for the List<T> class (which is, internally, a quicksort). In the middle, a comparison is made between the heap and merge methods, all of them considered O(n log n) in performance.
The previously mentioned demo follows recommended implementations with some updates and improvements for the user interface, which is a simple Windows Forms application, so you can test these algorithms thoroughly.
Also, note that you should execute these tests several times with different amounts of input to get a real glimpse of these methods' performance. Keep in mind that the .NET framework is built with optimized sorting methods for integers, strings, and other built-in types, avoiding the cost of calling delegates for comparisons, and so on; so, against built-in types, hand-written sorting algorithms will normally be much slower.
For example, for 30,000 integers, we obtain the following results:
As you can see, the results of bubble (even as an optimized bubble method) are far worse when the total numbers go beyond 10,000. Of course, for smaller numbers, the difference decreases, and if the input does not exceed 1,000, it's negligible for most practical purposes.
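While DEMO01-04 itself is a Windows Forms application, a minimal console harness along these lines (an O(n²) bubble sort versus the built-in List<T>.Sort, timed with Stopwatch) reproduces the essence of the comparison:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class SortComparison
{
    // A classic O(n^2) bubble sort, included purely for comparison purposes.
    static void BubbleSort(int[] a)
    {
        for (int i = 0; i < a.Length - 1; i++)
            for (int j = 0; j < a.Length - 1 - i; j++)
                if (a[j] > a[j + 1])
                {
                    int tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
                }
    }

    static void Main()
    {
        const int n = 30000;
        var rnd = new Random();
        var original = new int[n];
        for (int i = 0; i < n; i++) original[i] = rnd.Next();

        // Sort copies of the same data so both methods face identical input.
        var forBubble = (int[])original.Clone();
        var forList = new List<int>(original);

        var sw = Stopwatch.StartNew();
        BubbleSort(forBubble);
        Console.WriteLine("Bubble sort:  {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        forList.Sort(); // the built-in, highly optimized sort
        Console.WriteLine("List<T>.Sort: {0} ms", sw.ElapsedMilliseconds);
    }
}
```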
As an optional exercise for you, we leave the implementation of these algorithms for string sorting.
Remember that, for such situations, you should use generic versions of the merge and heap algorithms so that an invocation can be made to the same algorithm independently of the input values.