Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Node Cookbook: Second Edition

You're reading from   Node Cookbook: Second Edition Transferring your JavaScript skills to server-side programming is simplified with this comprehensive cookbook. Each chapter focuses on a different aspect of Node, featuring recipes supported with lots of illustrations, tips, and hints.

Arrow left icon
Product type Paperback
Published in Apr 2014
Publisher Packt
ISBN-13 9781783280438
Length 378 pages
Edition Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
David Mark Clements David Mark Clements
Author Profile Icon David Mark Clements
David Mark Clements
Arrow right icon
View More author details
Toc

Table of Contents (18) Chapters Close

Node Cookbook Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
1. Making a Web Server 2. Exploring the HTTP Object FREE CHAPTER 3. Working with Data Serialization 4. Interfacing with Databases 5. Employing Streams 6. Going Real Time 7. Accelerating Development with Express 8. Implementing Security, Encryption, and Authentication 9. Integrating Network Paradigms 10. Writing Your Own Node Modules 11. Taking It Live Index

Optimizing performance with streaming


Caching content certainly improves upon reading a file from disk for every request. However, with fs.readFile, we are reading the whole file into memory before sending it out in a response object. For better performance, we can stream a file from disk and pipe it directly to the response object, sending data straight to the network socket a piece at a time.

Getting ready

We are building on our code from the last example, so let's get server.js, index.html, styles.css, and script.js ready.

How to do it...

We will be using fs.createReadStream to initialize a stream, which can be piped to the response object.

Tip

If streaming and piping are new concepts, don't worry! We'll be covering streams in depth in Chapter 5, Employing Streams.

In this case, implementing fs.createReadStream within our cacheAndDeliver function isn't ideal because the event listeners of fs.createReadStream will need to interface with the request and response objects, which for the sake of simplicity would preferably be dealt with in the http.createServer callback. For brevity's sake, we will discard our cacheAndDeliver function and implement basic caching within the server callback as follows:

//...snip... requires, mime types, createServer, lookup and f vars...

fs.exists(f, function (exists) {
  if (exists) {
    var headers = {'Content-type': mimeTypes[path.extname(f)]};
    if (cache[f]) {
      response.writeHead(200, headers);
      response.end(cache[f].content);
      return;
   } //...snip... rest of server code...

Later on, we will fill cache[f].content while we are interfacing with the readStream object. The following code shows how we use fs.createReadStream:

var s = fs.createReadStream(f);

The preceding code will return a readStream object that streams the file, which is pointed at by variable f. The readStream object emits events that we need to listen to. We can listen with addEventListener or use the shorthand on method as follows:

var s = fs.createReadStream(f).on('open', function () {
  //do stuff when the readStream opens
});

Because createReadStream returns the readStream object, we can latch our event listener straight onto it using method chaining with dot notation. Each stream is only going to open once; we don't need to keep listening to it. Therefore, we can use the once method instead of on to automatically stop listening after the first event occurrence as follows:

var s = fs.createReadStream(f).once('open', function () {
  //do stuff when the readStream opens
});

Before we fill out the open event callback, let's implement some error handling as follows:

var s = fs.createReadStream(f).once('open', function () {
  //do stuff when the readStream opens
}).once('error', function (e) {
  console.log(e);
  response.writeHead(500);
  response.end('Server Error!');
});

The key to this whole endeavor is the stream.pipe method. This is what enables us to take our file straight from disk and stream it directly to the network socket via our response object as follows:

var s = fs.createReadStream(f).once('open', function () {
  response.writeHead(200, headers);
  this.pipe(response);
}).once('error', function (e) {
  console.log(e);
  response.writeHead(500);
  response.end('Server Error!');
});

But what about ending the response? Conveniently, stream.pipe detects when the stream has ended and calls response.end for us. There's one other event we need to listen to, for caching purposes. Within our fs.exists callback, underneath the createReadStream code block, we write the following code:

fs.stat(f, function(err, stats) {
  var bufferOffset = 0;
  cache[f] = {content: new Buffer(stats.size)};
  s.on('data', function (chunk) {
    chunk.copy(cache[f].content, bufferOffset);
    bufferOffset += chunk.length;
  });
}); //end of createReadStream

We've used the data event to capture the buffer as it's being streamed, and copied it into a buffer that we supplied to cache[f].content, using fs.stat to obtain the file size for the file's cache buffer.

Note

For this case, we're using the classic mode data event instead of the readable event coupled with stream.read() (see http://nodejs.org/api/stream.html#stream_readable_read_size_1) because it best suits our aim, which is to grab data from the stream as soon as possible. In Chapter 5, Employing Streams, we'll learn how to use the stream.read method.

How it works...

Instead of the client waiting for the server to load the entire file from disk prior to sending it to the client, we use a stream to load the file in small ordered pieces and promptly send them to the client. With larger files, this is especially useful as there is minimal delay between the file being requested and the client starting to receive the file.

We did this by using fs.createReadStream to start streaming our file from disk. The fs.createReadStream method creates a readStream object, which inherits from the EventEmitter class.

The EventEmitter class accomplishes the evented part of the Node Cookbook Second Edition tagline: Evented I/O for V8 JavaScript. Due to this, we'll be using listeners instead of callbacks to control the flow of stream logic.

We then added an open event listener using the once method since we want to stop listening to the open event once it is triggered. We respond to the open event by writing the headers and using the stream.pipe method to shuffle the incoming data straight to the client. If the client becomes overwhelmed with processing, stream.pipe applies backpressure, which means that the incoming stream is paused until the backlog of data is handled (we'll find out more about this in Chapter 5, Employing Streams).

While the response is being piped to the client, the content cache is simultaneously being filled. To achieve this, we had to create an instance of the Buffer class for our cache[f].content property.

A Buffer class must be supplied with a size (or array or string), which in our case is the size of the file. To get the size, we used the asynchronous fs.stat method and captured the size property in the callback. The data event returns a Buffer variable as its only callback parameter.

The default value of bufferSize for a stream is 64 KB; any file whose size is less than the value of the bufferSize property will only trigger one data event because the whole file will fit into the first chunk of data. But for files that are greater than the value of the bufferSize property, we have to fill our cache[f].content property one piece at a time.

Tip

Changing the default readStream buffer size

We can change the buffer size of our readStream object by passing an options object with a bufferSize property as the second parameter of fs.createReadStream.

For instance, to double the buffer, you could use fs.createReadStream(f,{bufferSize: 128 * 1024});.

We cannot simply concatenate each chunk with cache[f].content because this will coerce binary data into string format, which, though no longer in binary format, will later be interpreted as binary. Instead, we have to copy all the little binary buffer chunks into our binary cache[f].content buffer.

We created a bufferOffset variable to assist us with this. Each time we add another chunk to our cache[f].content buffer, we update our new bufferOffset property by adding the length of the chunk buffer to it. When we call the Buffer.copy method on the chunk buffer, we pass bufferOffset as the second parameter, so our cache[f].content buffer is filled correctly.

Moreover, operating with the Buffer class renders performance enhancements with larger files because it bypasses the V8 garbage-collection methods, which tend to fragment a large amount of data, thus slowing down Node's ability to process them.

There's more...

While streaming has solved the problem of waiting for files to be loaded into memory before delivering them, we are nevertheless still loading files into memory via our cache object. With larger files or a large number of files, this could have potential ramifications.

Protecting against process memory overruns

Streaming allows for intelligent and minimal use of memory for processing large memory items. But even with well-written code, some apps may require significant memory.

There is a limited amount of heap memory. By default, V8's memory is set to 1400 MB on 64-bit systems and 700 MB on 32-bit systems. This can be altered by running node with --max-old-space-size=N, where N is the amount of megabytes (the actual maximum amount that it can be set to depends upon the OS, whether we're running on a 32-bit or 64-bit architecture—a 32-bit may peak out around 2 GB and of course the amount of physical RAM available).

Note

The --max-old-space-size method doesn't apply to buffers, since it applies to the v8 heap (memory allocated for JavaScript objects and primitives) and buffers are allocated outside of the v8 heap.

If we absolutely had to be memory intensive, we could run our server on a large cloud platform, divide up the logic, and start new instances of node using the child_process class, or better still the higher level cluster module.

Tip

There are other more advanced ways to increase the usable memory, including editing and recompiling the v8 code base. The http://blog.caustik.com/2012/04/11/escape-the-1-4gb-v8-heap-limit-in-node-js link has some tips along these lines.

In this case, high memory usage isn't necessarily required and we can optimize our code to significantly reduce the potential for memory overruns. There is less benefit to caching larger files because the slight speed improvement relative to the total download time is negligible, while the cost of caching them is quite significant in ratio to our available process memory. We can also improve cache efficiency by implementing an expiration time on cache objects, which can then be used to clean the cache, consequently removing files in low demand and prioritizing high demand files for faster delivery. Let's rearrange our cache object slightly as follows:

var cache = {
  store: {},
  maxSize : 26214400, //(bytes) 25mb
}

For a clearer mental model, we're making a distinction between the cache object as a functioning entity and the cache object as a store (which is a part of the broader cache entity). Our first goal is to only cache files under a certain size; we've defined cache.maxSize for this purpose. All we have to do now is insert an if condition within the fs.stat callback as follows:

fs.stat(f, function (err, stats) {
  if (stats.size<cache.maxSize) {
    var bufferOffset = 0;
    cache.store[f] = {content: new Buffer(stats.size),      timestamp: Date.now() };
    s.on('data', function (data) {
      data.copy(cache.store[f].content, bufferOffset);
      bufferOffset += data.length;
    });
  }
});

Notice that we also slipped in a new timestamp property into our cache.store[f] method. This is for our second goal—cleaning the cache. Let's extend cache as follows:

var cache = {
  store: {},
  maxSize: 26214400, //(bytes) 25mb
  maxAge: 5400 * 1000, //(ms) 1 and a half hours
  clean: function(now) {
    var that = this;
    Object.keys(this.store).forEach(function (file) {
      if (now > that.store[file].timestamp + that.maxAge) {
        delete that.store[file];
      }
    });
  }
};

So in addition to maxSize, we've created a maxAge property and added a clean method. We call cache.clean at the bottom of the server with the help of the following code:

//all of our code prior
  cache.clean(Date.now());
}).listen(8080); //end of the http.createServer

The cache.clean method loops through the cache.store function and checks to see if it has exceeded its specified lifetime. If it has, we remove it from the store. One further improvement and then we're done. The cache.clean method is called on each request. This means the cache.store function is going to be looped through on every server hit, which is neither necessary nor efficient. It would be better if we clean the cache, say, every two hours or so. We'll add two more properties to cache—cleanAfter to specify the time between cache cleans, and cleanedAt to determine how long it has been since the cache was last cleaned, as follows:

var cache = {
  store: {},
  maxSize: 26214400, //(bytes) 25mb
  maxAge : 5400 * 1000, //(ms) 1 and a half hours
  cleanAfter: 7200 * 1000,//(ms) two hours
  cleanedAt: 0, //to be set dynamically
  clean: function (now) {
    if (now - this.cleanAfter>this.cleanedAt) {
      this.cleanedAt = now;
      that = this;
      Object.keys(this.store).forEach(function (file) {
        if (now > that.store[file].timestamp + that.maxAge) {
          delete that.store[file]; 
        }
      });
    }
  }
};

So we wrap our cache.clean method in an if statement, which will allow a loop through cache.store only if it has been longer than two hours (or whatever cleanAfter is set to) since the last clean.

See also

  • The Handling file uploads recipe discussed in Chapter 2, Exploring the HTTP Object

  • Chapter 2, Exploring the HTTP Object

  • The Securing against filesystem hacking exploits recipe

  • Chapter 5, Employing Streams

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image