Looking at the Node.js architecture in detail
The principal foundations of Node.js have been inspired by a few things:
- The single worker thread featured in browsers was already quite successful in the server space. Here, the popular nginx web server showed that the event loop pattern (explained later in this chapter) was actually a blessing for performance – eliminating the need to use a dedicated thread pool for handling requests.
- The idea of packaging everything in a file-centric structure called modules. This allowed Node.js to avoid many of the pitfalls of other languages and frameworks – including JavaScript in the browser.
- The idea of avoiding a huge framework and instead keeping everything extensible and easy to obtain via package managers.
Threads
Modern computers offer a lot of computing power. However, for an application to really use the available computing power, we need to have multiple things working in parallel. Modern operating systems know about different independently running tasks via so-called threads. A thread is a group of operations running sequentially, which means in a given order. The operating system then schedules when threads run and where (i.e., on which CPU core) they are placed.
Together, these three principles form a platform that seems easy to create but is hard to replicate. After all, there are plenty of JavaScript engines and useful libraries available. For Ryan Dahl, the original creator and maintainer of Node.js, the basis of the project had to be rock solid.
Ryan Dahl selected an existing JavaScript engine (V8) to take over the responsibility of parsing and running the code written in JavaScript. The V8 engine was chosen for two good reasons. On the one hand, the engine was available as an open source project under a permissive license – usable by projects such as Node.js. On the other hand, V8 was also the engine used by Google for its web browser Chrome. It is very fast, very reliable, and under active development.
One drawback of using V8 is that it is written in C++ and was built with custom tooling called GYP. While V8 replaced GYP years later, the transition was not so easy for Node.js, which still relies on GYP as its build system today. That V8 is written in C++ may seem like a side note at first, but it becomes important if you ever intend to write so-called native modules.
Native modules allow you to go beyond JavaScript and Node.js, making full use of the available hardware and system capabilities. One drawback of native modules is that they must be built for each platform, which runs counter to the cross-platform nature of Node.js.
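To make the per-platform constraint more tangible, the following minimal sketch prints the values a compiled native module is generally tied to. The variable name buildTarget is made up for this illustration; process.platform, process.arch, and process.versions.modules are standard Node.js properties.

```javascript
// Values that a compiled native module generally depends on.
// `buildTarget` is a hypothetical name used only in this sketch.
const buildTarget = {
  platform: process.platform,    // operating system, e.g. 'linux', 'darwin', 'win32'
  arch: process.arch,            // CPU architecture, e.g. 'x64', 'arm64'
  abi: process.versions.modules, // ABI version that native addons are compiled against
};

console.log(buildTarget);
```

A binary compiled for one combination of these values usually cannot be loaded under a different one, which is why native modules tend to ship prebuilt binaries per platform or compile during installation.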
Let’s take a step back to arrange the parts mentioned so far in an architecture diagram. Figure 1.1 shows how Node.js is composed internally:
Figure 1.1 – Internal composition of Node.js
The most important component in Node.js’s architecture – besides the JavaScript engine – is the libuv library. libuv is a multi-platform, low-level library that provides support for asynchronous input/output (I/O) based on an event loop. I/O happens in multiple forms, such as writing files or handling HTTP requests. In general, I/O refers to anything that is handled in a dedicated area of the operating system.
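As a rough illustration of asynchronous I/O, the sketch below writes a small file and records the order in which things happen. The file name and the order array are assumptions made for this example only; the write goes to the operating system's temporary directory and the file is removed afterward.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical file name, chosen only for this demonstration.
const target = path.join(os.tmpdir(), `libuv-demo-${process.pid}.txt`);
const order = [];

// The write is handed off to be performed in the background;
// writeFile itself returns right away.
fs.writeFile(target, 'hello from Node.js\n', (err) => {
  if (err) throw err;
  order.push('callback: file written');
  console.log(order.join(' -> '));
  fs.unlink(target, () => {}); // clean up; errors are ignored in this sketch
});

// This line runs before the callback above, because the callback is
// only invoked later by the event loop, once the write has completed.
order.push('main script finished');
```

The script reaches its last line before the callback fires, so the main script entry is recorded first. The JavaScript code never waits for the disk; it is simply notified once the operation is done.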
Any application running Node.js is written in JavaScript or some flavor of it. When Node.js starts running the application, the JavaScript is parsed and evaluated by V8. All the standard objects, such as console, expose some bindings that are part of the Node.js API. These low-level functions (such as console.log or fetch) make use of libuv. Therefore, some simple script that only works against language features such as primitive calculations (2 + 3) does not require anything from the Node API and will remain independent of libuv. In contrast, once a low-level function (for example, a function to access the network) is used, libuv can be the workforce behind it.
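One way to see the split between language features and the Node.js API is to evaluate code in a fresh context that contains only what the JavaScript engine itself provides. The sketch below uses the built-in node:vm module for this; using vm.runInNewContext here is an assumption made for illustration, not something the chapter relies on.

```javascript
const vm = require('node:vm');

// A fresh context only contains the globals defined by the language itself.
const sum = vm.runInNewContext('2 + 3');

// Globals added by the Node.js API are not part of such a context.
const hasTimers = vm.runInNewContext('typeof setTimeout');
const hasProcess = vm.runInNewContext('typeof process');

console.log(sum);        // 5
console.log(hasTimers);  // 'undefined'
console.log(hasProcess); // 'undefined'
```

The calculation works without any help from Node.js, while timers and the process object are missing because they are supplied by the runtime rather than by the language.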
In Figure 1.2, a block diagram illustrating the various API layers is shown. The beauty of this diagram is that it reveals what Node.js actually is: a JavaScript runtime allowing access to low-level functionality from state-of-the-art C/C++ libraries. The Node.js API consists of the included Node.js bindings and some C/C++ addons:
Figure 1.2 – Composition of Node.js in terms of building blocks
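The building blocks can also be inspected from a running program. As a small sketch, the standard process.versions object reports the versions of the bundled components, including V8 and libuv:

```javascript
// process.versions lists the versions of Node.js and its bundled components.
const { node, v8, uv } = process.versions;

console.log(`Node.js ${node}`);
console.log(`V8      ${v8}`);
console.log(`libuv   ${uv}`);
```

The exact numbers depend on the installed Node.js release, so no particular output is assumed here.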
One thing that needs explanation in the preceding diagram is how the event loop relates to all the blocks. When talking about Node.js’s internal architecture, a broader discussion of what an event loop is and why it matters for Node.js is definitely required. So let’s get into these details.