Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Mastering  Node.js

You're reading from   Mastering Node.js Build robust and scalable real-time server-side web applications efficiently

Arrow left icon
Product type Paperback
Published in Dec 2017
Publisher Packt
ISBN-13 9781785888960
Length 498 pages
Edition 2nd Edition
Languages
Tools
Arrow right icon
Authors (2):
Arrow left icon
Sandro Pasquali Sandro Pasquali
Author Profile Icon Sandro Pasquali
Sandro Pasquali
Kevin Faaborg Kevin Faaborg
Author Profile Icon Kevin Faaborg
Kevin Faaborg
Arrow right icon
View More author details
Toc

Table of Contents (13) Chapters Close

Preface 1. Understanding the Node Environment FREE CHAPTER 2. Understanding Asynchronous Event-Driven Programming 3. Streaming Data Across Nodes and Clients 4. Using Node to Access the Filesystem 5. Managing Many Simultaneous Client Connections 6. Creating Real-Time Applications 7. Using Multiple Processes 8. Scaling Your Application 9. Microservices 10. Testing Your Application 11. Organizing Your Work into Modules 12. Creating Your Own C++ Add-ons

Extending JavaScript

When he designed Node, JavaScript was not Ryan Dahl's original language choice. Yet, upon exploration, he found a good modern language without opinions on streams, the filesystem, handling binary objects, processes, networking, and other capabilities one would expect to exist in a systems language. JavaScript, strictly limited to the browser, had no use for, and had not implemented, these features.

Guided by the Unix philosophy, Dahl was guided by a few rigid principles:

  • A Node program/process runs on a single thread, ordering execution through an event loop
  • Web applications are I/O intensive, so the focus should be on making I/O fast
  • Program flow is always directed through asynchronous callbacks
  • Expensive CPU operations should be split off into separate parallel processes, emitting events as results arrive
  • Complex programs should be assembled from simpler programs

The general principle is, operations must never block. Node's desire for speed (high concurrency) and efficiency (minimal resource usage) demands the reduction of waste. A waiting process is a wasteful process, especially when waiting for I/O.

JavaScript's asynchronous, event-driven design fits neatly into this model. Applications express interest in some future event, and are notified when that event occurs. This common JavaScript pattern should be familiar to you:

Window.onload = function() {
// When all requested document resources are loaded,
// do something with the resulting environment
}
element.onclick = function() {
// Do something when the user clicks on this element
}

The time it will take for an I/O action to complete is unknown, so the pattern is to ask for notification when an I/O event is emitted, whenever that may be, allowing other operations to be completed in the meantime.

Node adds an enormous amount of new functionality to JavaScript. Primarily, the additions provide evented I/O libraries offering the developer system access not available to browser-based JavaScript, such as writing to the filesystem or opening another system process. Additionally, the environment is designed to be modular, allowing complex programs to be assembled out of smaller and simpler components.

Let's look at how Node imported JavaScript's event model, extended it, and used it in the creation of interfaces to powerful system commands.

Events

Many functions in the Node API emit events. These events are instances of events.EventEmitter. Any object can extend EventEmitter, providing Node developers with a simple and uniform way to build tight, asynchronous interfaces to object methods.

The following code sets Node's EventEmitter object as the prototype of a function constructor we define. Each constructed instance has the EventEmitter object exposed to its prototype chain, providing a natural reference to the event API. The counter instance methods emit events, and code after that listens for them. After making a Counter, we listen for the incremented event, specifying a callback Node will call when the event happens. Then, we call the increment twice. Each time, our Counter increments the internal count it holds, and then emits the incremented event. This calls our callback, giving it the current count, which our callback logs:

// File counter.js
// Load Node's 'events' module, and point directly to EventEmitter there
const EventEmitter = require('events').EventEmitter;
// Define our Counter function
const Counter = function(i) { // Takes a starting number
this.increment = function() { // The counter's increment method
i++; // Increment the count we hold
this.emit('incremented', i); // Emit an event named incremented
}
}
// Base our Counter on Node's EventEmitter
Counter.prototype = new EventEmitter(); // We did this afterwards, not before!
// Now that we've defined our objects, let's see them in action
// Make a new Counter starting at 10
const counter = new Counter(10);
// Define a callback function which logs the number n you give it
const callback = function(n) {
console.log(n);
}
// Counter is an EventEmitter, so it comes with addListener
counter.addListener('incremented', callback);
counter.increment(); // 11
counter.increment(); // 12

The following is the output for counter.js:

$ node counter.js
11
12

To remove the event listeners bound to counter, use this:

counter.removeListener('incremented', callback).

For consistency with browser-based JavaScript, counter.on and counter.addListener are interchangeable.

Node brought EventEmitter to JavaScript and made it an object your objects can extend. This greatly increases the possibilities available to developers. With EventEmitter, Node can handle I/O data streams in an event-oriented manner, performing long-running tasks while keeping true to Node's principles of asynchronous, non-blocking programming:

// File stream.js
// Use Node's stream module, and get Readable inside
let Readable = require('stream').Readable;
// Make our own readable stream, named r
let r = new Readable;
// Start the count at 0
let count = 0;
// Downstream code will call r's _read function when it wants some data from r
r._read = function() {
count++;
if (count > 10) { // After our count has grown beyond 10
return r.push(null); // Push null downstream to signal we've got no more data
}
setTimeout(() => r.push(count + '\n'), 500); // A half second from now, push our count on a line
};
// Have our readable send the data it produces to standard out
r.pipe(process.stdout);

The following is the output for stream.js:

$ node stream.js
1
2
3
4
5
6
7
8
9
10

This example creates a readable stream r, and pipes its output to the standard out. Every 500 milliseconds, code increments a counter and pushes a line of text with the current count downstream. Try running the program yourself, and you'll see the series of numbers appear on your terminal.

On what would be the 11th count, r pushes null downstream, indicating that it has no more data to send. This closes the stream, and with nothing more to do, Node exits the process.

Subsequent chapters will explain streams in more detail. Here, just note how pushing data onto a stream causes an event to fire, how you can assign a custom callback to handle this event, and how the data flows downstream.

Node consistently implements I/O operations as asynchronous, evented data streams. This design choice enables Node's excellent performance. Instead of creating a thread (or spinning up an entire process) for a long-running task like a file upload that a stream may represent, Node only needs to commit the resources to handle callbacks. Additionally, in the long stretches of time in between the short moments when the stream is pushing data, Node's event loop is free to process other instructions.

As an exercise, re-implement stream.js to send the data r produces to a file instead of the terminal. You'll need to make a new writable stream w, using Node's fs.createWriteStream:

// File stream2file.js
// Bring in Node's file system module
const fs = require('fs');
// Make the file counter.txt we can fill by writing data to writeable stream w
const w = fs.createWriteStream('./counter.txt', { flags: 'w', mode: 0666 });
...
// Put w beneath r instead
r.pipe(w);

Modularity

In his book, The Art of Unix Programming, Eric Raymond proposed the Rule of Modularity:

"Developers should build a program out of simple parts connected by well-defined interfaces, so problems are local, and parts of the program can be replaced in the future versions to support new features. This rule aims to save time on debugging complex code that is complex, long, and unreadable."

Large systems are hard to reason about, especially when the boundaries of internal components are fuzzy, and the interactions between them are complex. This principle of building large systems out of small, simple, and loosely-coupled pieces is a good idea for software and beyond. Physical manufacturing, management theory, education, and government, all have benefited from this design philosophy.

When developers began employing JavaScript for larger and more complex software challenges, they encountered this challenge. There was not yet a good way (and later, no common standard way) to assemble a JavaScript program from smaller ones. For example, you've probably seen HTML pages with tags like these at the top:

<head>
<script src="fileA.js"></script>
<script src="fileB.js"></script>
<script src="fileC.js"></script>
<script src="fileD.js"></script>
...
</head>

This works, but leads to a number of problems:

  • The page must declare all potential dependencies before any are needed or used. If, while running, your program encounters a situation where it needs an additional dependency, dynamically loading another module is possible, but a separate hack.
  • The scripts are not encapsulated. Code in every file writes to the same global object. Adding a new dependency may break an earlier one because of a name collision.
  • fileA cannot address fileB as a collection. An addressable context like fileB.function1 isn't available.

The <script> tag would be a nice place for useful module services such as dependency awareness and version control, but it doesn't have these features.

These difficulties and dangers made creating and using JavaScript modules feel more treacherous than effortless. A good module system with features like encapsulation and versioning can reverse this, encouraging code organization and sharing, and leading to a robust ecosystem of high-quality open source software components.

JavaScript needed a standard way to load and share discreet program modules, and found one in 2009 with the CommonJS Modules specification. Node follows this specification, making it easy to define and share bits of reusable code called modules or packages.

Choosing a delightfully simple design, a package is just a directory of JavaScript files. Metadata about the package, such as its name, version, and software license, lives in an additional file named package.json. The JSON contents of this file are easily both human and machine-readable. Let's take a look:

{
"name": "mypackage1",
"version": "0.1.2",
"dependencies": {
"jquery": "^3.1.0",
"bluebird": "^3.4.1",
},
"license": "MIT"
}

This package.json defines a package named mypackage1, which depends on two other packages: jQuery and Bluebird. Alongside the package names is a version number. Version numbers follow the Semantic Versioning (SemVer) rules, with a pattern like Major.Minor.Patch. Looking at the incremented version numbers of a package your code has been using, here's what that means:

  • Major: There's a change in the purpose or outcome of the API. If your code calls an updated function, it may break or produce an unintended result. Figure out what's changed, and determine if it affects your code.
  • Minor: The package has added functionality, but remains compatible. Run all your tests, and you're good to go. Check out the documentation if you're curious, as there might be new, more advanced parts of the API alongside the functions and objects you're familiar with.
  • Patch: The package fixed a bug, improved performance, or refactored a little. Run all your tests, and you're good to go.

Packages enable the construction of large systems from many small, interdependent systems. Perhaps even more importantly, packages encourage sharing. More detailed information about SemVer is available in Appendix A, Organizing Your Work Into Modules, where npm and packages are discussed in more depth.

"What I'm describing here is not a technical problem. It's a matter of people getting together and making a decision to step forward and start building up something bigger and cooler together."
– Kevin Dangoor, creator of CommonJS
Not just about modules, CommonJS is actually a whole collection of standards founded with the goal of removing everything that was holding JavaScript back from world domination, open source developer Kris Kowal explained in a 2009 post evangelizing the initiative. He names the first of these impediments as the absence of a good module system. The second? The absence of a standard library, including such systems-level fundamentals as access to the filesystem, manipulation of I/O streams, and types for bytes and blocks of binary data. Today, CommonJS is known for giving JavaScript a module system, while Node is what gave JavaScript systems-level access:

https://arstechnica.com/information-technology/2009/12/commonjs-effort-sets-javascript-on-path-for-world-domination/

CommonJS gave JavaScript packages. With packages, the next thing JavaScript needed was a package manager. Node provided one with npm.

A registry of packages, npm is accessible in two ways. First, at the website www.npmjs.com, you can link to and search for packages, essentially shopping for the right one. Stats that count how many times a package has been downloaded in the last day, week, and month show popularity and usage. Most packages link to a developer profile page and open source code on GitHub, so you can see the code, visualize recent development activity, and judge the reputations of the authors and contributors.

The second way to access npm is through the command-line tool npm, which is installed with Node. Using npm as a traditional package manager for your workstation, you can install packages globally, creating new command-line tools on your shell's path. npm also knows how to create, read, and edit package.json files, and can start you out with a new, empty Node package, add the dependencies it needs, download all the code, and keep everything up to date.

Along with Git and GitHub, npm is now achieving a dream of software development identified in the 1970s: that code could be reused more often, and software projects would be written entirely from scratch less frequently.
Earlier attempts at reaching this goal through version control systems like CVS and Subversion, and open source code sharing websites like SourceForge.net, focused on bigger units of both code and people, and didn't achieve as much.

GitHub and npm took a different approach in two important ways:
  • Favoring individual developers working alone over communities meeting and discussing, developers could focus more on code and less on conversation
  • Favoring small, atomic software components over complete applications, encapsulated composition started happening not just at a micro-level of subroutines and objects, but at the more important macroscale of application design
Even documentation is better with the new approach: in a monolithic software application, documentation was too often the afterthought that may or may not have happened after the product shipped.
With components, great documentation is necessary to sell your package to the world, getting it a larger public daily download count, and the social media accounts you keep as a developer of more followers.

In no small part, Node's success is due to the number and quality of packages available to you as a Node developer.

More extensive information on creating and managing Node packages can be found in Appendix A, Organizing Your Work into Modules.

The key design philosophy to follow is this: build programs out of packages where possible, and share those packages when possible. The shape of your applications will be clearer and easier to maintain. Importantly, the efforts of thousands of other developers can be linked into applications via npm, directly by inclusion, and indirectly as shared packages are tested, improved, refactored, and repurposed by members of the Node community.

Contrary to popular belief, npm is not an abbreviation for Node Package Manager, and should never be used or explained as an acronym:
https://docs.npmjs.com/policies/trademark

The network

I/O in the browser is mercilessly hobbled, for very good reasons - if the JavaScript on any given website could access your filesystem, for instance, users could only click links to new sites they trusted, rather than ones they simply wanted to try out. Keeping pages in a limited sandbox, the design of the web made navigating from thing1.com to thing2.com not have the consequences of double-clicking thing1.exe and thing2.exe.

Node, of course, recasts JavaScript in the role of a systems language, giving it direct and unfettered access to operating system kernel objects such as files, sockets, and processes. This lets Node create scalable systems with high I/O requirements. It's likely the first thing you coded in Node was a HTTP server.

Node supports standard network protocols in addition to HTTP, such as TLS/SSL, and UDP. With these tools we can easily build scalable network programs, moving far beyond the comparatively limited AJAX solutions JavaScript developers know from the browser.

Let's write a simple program that sends a UDP packet to another node:

const dgram = require('dgram');
let client = dgram.createSocket("udp4");
let server = dgram.createSocket("udp4");
let message = process.argv[2] || "message";
message = Buffer.from(message);
server
.on('message', msg => {
process.stdout.write(`Got message: ${msg}\n`);
process.exit();
})
.bind(41234);
client.send(message, 0, message.length, 41234, "localhost");

Go ahead and open two terminal windows and navigate each to your code bundle for Chapter 8, Scaling Your Application, under the /udp folder. We're now going to run a UDP server in one window, and a UDP client in another.

In the right window, run receive.js with a command like the following:

$ node receive.js

On the left, run send.js with a command, as follows:

$ node send.js

Executing that command will cause the message to appear on the right:

$ node receive.js
Message received!

A UDP server is an instance of EventEmitter, emitting a message event when messages are received on the bound port. With Node, you can use JavaScript to write your application at the I/O level, moving packets and streams of binary data with ease.

Let's continue to explore I/O, the process object, and events. First, let's dig into the machine powering Node's core, V8.

You have been reading a chapter from
Mastering Node.js - Second Edition
Published in: Dec 2017
Publisher: Packt
ISBN-13: 9781785888960
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image