The theory around application scaling is a complex and interesting topic that continues to be refined and expanded. A comprehensive discussion would require several books, each curated for different environments and needs. For our purposes, we will simply learn how to recognize when scaling up (or even scaling down) is necessary.
A flexible architecture that can add and subtract resources as needed is essential to a resilient scaling strategy. Vertical scaling does not always suffice: simply adding memory or CPUs to a single machine may not deliver the necessary improvements. When should horizontal scaling be considered?
To answer that question, you must be able to monitor your servers. One simple but useful way to check the CPU and memory usage of the Node processes running on a server is the Unix ps (process status) command, for example, ps aux | grep...
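The ps command inspects processes from the outside. As a complementary sketch, a Node process can also report its own resource usage through the built-in process.memoryUsage() and process.cpuUsage() calls. The helper name usageSnapshot and the shape of the object it returns are assumptions made for illustration, not part of any library:

```javascript
// Hypothetical helper: the name and the returned fields are illustrative only.
// It relies solely on Node's built-in `process` global.
function usageSnapshot() {
  const mem = process.memoryUsage(); // values are in bytes
  const cpu = process.cpuUsage();    // microseconds of CPU time since process start

  // Convert bytes to megabytes, rounded to two decimal places.
  const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 100) / 100;

  return {
    pid: process.pid,
    rssMB: toMB(mem.rss),                   // resident set size
    heapUsedMB: toMB(mem.heapUsed),         // V8 heap currently in use
    cpuUserMs: Math.round(cpu.user / 1000),     // CPU time spent in user code
    cpuSystemMs: Math.round(cpu.system / 1000), // CPU time spent in system calls
  };
}

console.log(usageSnapshot());
```

Logging such a snapshot at a regular interval gives a rough trend of memory growth and CPU time per process, which can be compared with what ps reports for the same pid.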