What is Node.js?
Node.js consists of a JavaScript engine together with low-level APIs for core server-side functionality. The execution engine is the same V8 engine developed for the Chrome web browser. Node.js takes this engine and embeds it in a standalone application that can run JavaScript outside the browser.
In Node.js, the standard APIs found in browsers to support client-side web development, such as the Document Object Model (DOM) and XMLHttpRequest, are not present. Instead, there are APIs to support general-purpose application development. These core APIs cover low-level functionality such as the following:
- Networking and security
- Accessing the file system
- Defining and requiring modules
- Raising and consuming events
- Handling binary data streams
- Compression
- UTF-8 support
- Retrieving basic information about the OS
- Managing child processes
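To give a flavour of these core APIs, here is a minimal sketch that touches three of them: the os module for basic system information, the zlib module for compression, and the events module for raising and consuming events. The event name 'greet' and the sample text are made up for illustration.

```javascript
const os = require('os');
const zlib = require('zlib');
const { EventEmitter } = require('events');

// Retrieving basic information about the OS.
console.log('Platform:', os.platform());

// Compression: gzip a repetitive string and restore it.
// The synchronous variants are used here only to keep the sketch short.
const original = 'hello '.repeat(100);
const compressed = zlib.gzipSync(original);
const restored = zlib.gunzipSync(compressed).toString('utf8');
console.log('Compressed', original.length, 'characters into', compressed.length, 'bytes');

// Raising and consuming events.
const emitter = new EventEmitter();
const greetings = [];
emitter.on('greet', function (name) {
  greetings.push('Hello, ' + name);
});
emitter.emit('greet', 'world');
console.log(greetings[0]);
```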
Some of these APIs may already be familiar from developing client-side JavaScript. For example, the Timers API exposes the familiar setTimeout and setInterval functions.
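As a quick illustration, the timer functions can be used in Node.js just as they are in the browser. In this sketch, the variable names and the delays are arbitrary choices.

```javascript
// Collect three ticks, roughly 10 ms apart, then stop the repeating timer.
const ticks = [];
const timer = setInterval(function () {
  ticks.push(ticks.length + 1);
  if (ticks.length === 3) {
    clearInterval(timer);
    console.log('Ticks received:', ticks.join(', '));
  }
}, 10);

// A one-off timer.
let oneOffFired = false;
setTimeout(function () {
  oneOffFired = true;
  console.log('One-off timer fired');
}, 5);
```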
Node.js also provides several tools to help with the development process. These include console logging, debugging, a Read-Eval-Print Loop (REPL), or interactive console, and basic assertions for testing.
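For example, console logging and the built-in assert module might be combined as in the following sketch. The parsePort function is a hypothetical function under test, invented here purely for illustration.

```javascript
const { strictEqual, throws } = require('assert');

// Hypothetical function under test: converts a string to a valid TCP port.
function parsePort(value) {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError('Invalid port: ' + value);
  }
  return port;
}

// Each assertion throws an AssertionError if the expectation is not met.
strictEqual(parsePort('8080'), 8080);
throws(function () {
  parsePort('not-a-port');
}, RangeError);

console.log('All assertions passed');
```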
Understanding the Node.js execution model
The execution model of Node.js follows that of JavaScript in the browser. It is quite different from that of most general-purpose programming platforms.
Stated formally, Node.js has a single-threaded, non-blocking, event-driven execution model. We will define each of these terms in this section.
Non-blocking
Put simply, Node.js recognizes that many programs spend most of their time waiting for other things to happen, for example, slow I/O operations such as disk access and network requests.
Node.js addresses this by making these operations non-blocking. This means that program execution can continue while they happen. For example, the filesystem API's stat function for retrieving statistics about a file may be called as follows:
const fs = require('fs');

fs.stat('/hello/world', function (error, stats) {
    // stats is undefined if the call fails
    if (error) throw error;
    console.log('File last updated at: ' + stats.mtime);
});
Two arguments are passed to the fs.stat function: the name of the file that we are interested in, and a callback function. The fs.stat call returns immediately, without returning a value, and control passes back to the calling code. If there are further statements following the fs.stat call, these are then executed. Otherwise, the thread is released to perform other work. The callback function is invoked (that is, 'called back') only after the runtime has finished communicating with the filesystem. The result of the filesystem operation is passed into the callback function.
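The ordering described above can be observed with a small sketch. It assumes that process.execPath (the path of the running Node.js executable) is a convenient file that is known to exist. The order array is only there to record what ran when.

```javascript
const fs = require('fs');

const order = [];

// process.execPath is used simply because that file is known to exist.
fs.stat(process.execPath, function (error, stats) {
  order.push('callback');
  if (error) {
    console.error('Could not stat file: ' + error.message);
    return;
  }
  console.log('File last updated at: ' + stats.mtime);
});

// This runs before the callback, because fs.stat returns immediately.
order.push('after call');
console.log('fs.stat has returned; the callback has not run yet');
```

Running this prints the 'fs.stat has returned' message first and the file's modification time afterwards.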
This non-blocking approach is also called asynchronous programming. Other platforms support this (for example, C#'s async/await keywords and .NET's Task Parallel Library). However, it is built into Node.js in a way that makes it simple and natural to use. Asynchronous API methods that take a callback are all called in the same way as fs.stat: the callback function is the final argument, and it is passed an error argument followed by any result.
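Application code can follow the same callback convention. The following is a minimal sketch in which divideLater and report are hypothetical functions written for this example. They are not part of the Node.js API.

```javascript
// Hypothetical asynchronous function following the callback convention:
// the callback is the last argument and is passed (error, result).
function divideLater(a, b, callback) {
  // setImmediate defers the work, so the callback always runs asynchronously.
  setImmediate(function () {
    if (b === 0) {
      callback(new Error('Cannot divide by zero'));
      return;
    }
    callback(null, a / b);
  });
}

// A callback checks the error argument first, then uses the result.
function report(error, result) {
  if (error) {
    console.error('Failed: ' + error.message);
    return;
  }
  console.log('Result: ' + result);
}

divideLater(10, 2, report);
divideLater(1, 0, report);
```

Passing null as the first argument on success lets every caller use the same 'if (error)' check, whichever function it is calling.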
Event-driven
The event-driven nature of Node.js describes how operations are scheduled. In typical procedural environments, a program has an entry point that executes a set of commands until completion, or enters a loop and performs some processing on each iteration.
Node.js has a built-in event loop, which isn't exposed to the developer. It is the job of the event loop to decide which piece of code to execute next. Typically, this will be a callback function that is ready to run in response to some other event. For example, a filesystem operation may have completed, a timeout may have expired, or a new network request may have arrived.
This built-in event loop simplifies asynchronous programming by providing a consistent approach and avoiding the need for applications to manage their own scheduling.
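As a small sketch of this scheduling, the following script registers two timers and then finishes. The event loop then runs each callback when its event (here, a timer expiring) occurs, regardless of the order in which the callbacks were registered. The sequence array and the delays are illustrative.

```javascript
const sequence = [];

// Registered first, but ready last.
setTimeout(function () {
  sequence.push('timer 20ms');
  console.log(sequence.join(' -> '));
}, 20);

// Registered second, but ready first.
setTimeout(function () {
  sequence.push('timer 10ms');
}, 10);

// The main script itself runs to the end before any callback is invoked.
sequence.push('main script');
```

This prints 'main script -> timer 10ms -> timer 20ms'. Note that the application contains no loop or scheduler of its own.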
Single-threaded
The single-threaded nature of Node.js simply means that there is only one thread of execution in each process. Also, each piece of code is guaranteed to run to completion without being interrupted by other operations. This greatly simplifies development and makes programs easier to reason about. It removes the possibility of a range of concurrency issues. For example, it is not necessary to synchronize or lock access to shared in-process state as it is in multithreaded Java or .NET code. A process can't deadlock itself on a lock, and two pieces of its code can never modify the same in-process state at the same moment. Single-threaded programming is only feasible if the thread never gets blocked waiting for long-running work to complete. Thus, this simplified programming model is made possible by the non-blocking nature of Node.js.
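The run-to-completion guarantee, and the cost of blocking the thread, can both be seen in a short sketch. The 50 ms busy loop stands in for long-running synchronous work and is an arbitrary choice for illustration.

```javascript
const events = [];

// Ready to run almost immediately, but it must wait for the current code.
setTimeout(function () {
  events.push('timer callback');
  console.log(events.join(' -> '));
}, 0);

// Occupy the single thread for about 50 ms.
const start = Date.now();
while (Date.now() - start < 50) {
  // Busy-wait: nothing else in this process can run in the meantime.
}
events.push('loop finished');
```

This prints 'loop finished -> timer callback'. The timer callback cannot interrupt the loop, which is why no locking is needed. The same behaviour means that the loop delays every other callback in the process.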
Introducing the Node.js ecosystem
The built-in Node.js APIs provide a low-level core for creating applications. Applications typically only use a small number of these APIs directly. They often use third-party library modules that provide higher-level abstractions for application development.
Node.js has its own package manager, npm. This is similar to .NET's NuGet or the package management aspects of Java's Maven. Applications specify their dependencies in a simple JSON file, package.json.
The npm registry provides a central repository for packages. This registry has grown rapidly and is already much larger (in terms of number of available packages) than the corresponding repositories for other platforms (see http://www.modulecounts.com/). There are hundreds of thousands of packages available, providing a vast array of functionality.
The npm command-line tool can be used to download and install packages. Library dependencies are installed locally to each application. Some packages provide command-line tools, which may be installed globally rather than under a specific project.
Many frameworks available on npm are split into a small extensible core and a number of composable modules. This approach makes it easy to understand the libraries on which your application depends, avoiding the need to reason about complex heavyweight frameworks.
The consistency of calling non-blocking (asynchronous) API methods in Node.js carries through to its third-party libraries. This consistency makes it easy to build applications that are asynchronous throughout.