Cloning and load balancing
Traditional, multithreaded web servers are usually scaled only when the resources assigned to a machine cannot be upgraded any further, or when doing so would cost more than simply launching another machine. By using multiple threads, traditional web servers can take advantage of all the processing power of a server, using all the available processors and memory. A single Node.js process, however, cannot do that: it is single-threaded and has, by default, a memory limit of 1 GB on 64-bit machines (which can be increased to a maximum of 1.7 GB). This means that Node.js applications are usually scaled much sooner than traditional web servers, even in the context of a single machine, so that they can take advantage of all its resources.
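To make the idea of scaling within a single machine more concrete, the following is a minimal sketch of one possible approach: using the core `cluster` module to start one worker process per logical CPU. The names `workerCount` and `startCluster`, the port `8080`, and the `--cluster` command-line switch are assumptions made for this illustration only.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: one worker per logical CPU, never fewer than one.
function workerCount(cpus = os.cpus().length) {
  return Math.max(1, cpus);
}

// Hypothetical entry point; the port is an arbitrary choice for this sketch.
function startCluster(port = 8080) {
  // `isPrimary` is available from Node.js 16; older releases expose `isMaster`.
  const isPrimary = cluster.isPrimary ?? cluster.isMaster;

  if (isPrimary) {
    // The primary process only forks the workers.
    const count = workerCount();
    for (let i = 0; i < count; i++) {
      cluster.fork();
    }
    console.log(`Primary ${process.pid} forked ${count} workers`);
  } else {
    // Each worker is a separate process with its own event loop and heap;
    // the cluster module lets all of them accept connections on the same port.
    http.createServer((req, res) => {
      res.end(`Handled by process ${process.pid}\n`);
    }).listen(port);
  }
}

// Start only when explicitly requested, for example: node app.js --cluster
if (process.argv.includes('--cluster')) {
  startCluster();
}
```

Because every worker is an independent process, each one gets its own memory limit and can be scheduled on a different processor, which is how the application as a whole can use more of the machine than a single Node.js process could.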
Note
In Node.js, vertical scaling (adding more resources to a single machine) and horizontal scaling (adding more machines to the infrastructure) are almost equivalent concepts; both in fact involve similar techniques to leverage...