Monitoring & Vertically Scaling Node.js Applications

Updated Jan 4, 2023 • 11 min read

The term "horizontal scaling" has lately become so popular that I now consider it a meme. Everyone talks about it and tries to implement it, but in fact most companies and projects don’t need it.

It adds unnecessary complexity to your infrastructure, makes troubleshooting harder, forces you to look after more machines, requires you to provision and maintain a load balancer, and may change the way you deploy your application. In most cases, scaling your application by simply giving it more CPU power and memory is entirely sufficient.

This approach is known as vertical scaling.

Vertical scaling

Vertical scaling is also very useful when running Node.js applications. Why? Because by default, Node limits its heap to roughly 1.76GB of memory on 64-bit machines (the exact default varies between Node versions). This means that even if you spin up a machine with 32GB of RAM, the Node process will only use a fraction of it. The same goes for the CPU. Node.js runs your JavaScript on a single thread, so how can we use multiple cores? How do we solve that and unlock the full potential of our machines?
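
To check these limits on your own machine, a minimal sketch like the one below may help. It only uses the built-in os and v8 modules; the helper name describeCapacity is an illustrative choice, not something Node provides, and the numbers it prints depend on your Node version and hardware.

```javascript
const os = require('os');
const v8 = require('v8');

// Hypothetical helper: reports what a single Node process sees by default.
function describeCapacity() {
  // heap_size_limit is the maximum heap size V8 allows for this process, in bytes.
  const heapLimitBytes = v8.getHeapStatistics().heap_size_limit;
  return {
    heapLimitMb: Math.round(heapLimitBytes / 1024 / 1024),
    totalMemoryMb: Math.round(os.totalmem() / 1024 / 1024),
    cpuCount: os.cpus().length,
  };
}

console.log(describeCapacity());
```

If the heap limit turns out to be the bottleneck, it can be raised with the V8 flag --max-old-space-size (value in megabytes), for example node --max-old-space-size=4096 app.js. That still leaves the process on a single core, though.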

Multiple processes to the rescue

The first thing that may come to your mind is to run multiple instances of your application.

Guess what? It just won’t work. Each instance tries to listen on the same port, and because of how sockets work on UNIX, every instance after the first fails with an EADDRINUSE error. It means that the port is already bound, and by default only one process can listen on a given port.

If you cannot share a single port, maybe assign a random port to each Node process? That might work, but there are several issues with that solution:
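
The error is easy to reproduce. The sketch below is a simplification: instead of starting two separate instances, it binds two servers to the same port inside one process, which triggers the same EADDRINUSE error. The function name bindSamePortTwice is made up for this example.

```javascript
const net = require('net');

// Hypothetical helper: binds two servers to one port and resolves with
// the error code reported by the second one.
function bindSamePortTwice() {
  return new Promise((resolve, reject) => {
    const first = net.createServer();
    first.once('error', reject);
    // Port 0 lets the OS pick a free port for the first server.
    first.listen(0, '127.0.0.1', () => {
      const { port } = first.address();
      const second = net.createServer();
      // The second bind to the same address and port is expected to fail.
      second.once('error', (err) => {
        first.close(() => resolve(err.code));
      });
      second.listen(port, '127.0.0.1', () => {
        // Not expected: clean up and report that no error occurred.
        second.close(() => first.close(() => resolve(null)));
      });
    });
  });
}

bindSamePortTwice().then((code) => {
  console.log('second listen failed with:', code);
});
```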

You have to somehow discover these ports.
You have to use a load balancer, which will route your traffic from port 80 or 443 to Node processes.
If one process dies, you have to write some logic to restart the failed Node process or otherwise handle the failure.
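
To make the first point concrete, here is a rough sketch of the random-port approach. It assumes each instance asks the operating system for a free port by listening on port 0; the helper name startOnRandomPort is hypothetical. Note that the port is only known after the server has started, so something else still has to collect these ports and feed them to the load balancer.

```javascript
const http = require('http');

// Hypothetical helper: starts an HTTP server on an OS-assigned port.
function startOnRandomPort() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.writeHead(200);
      res.end('hello world\n');
    });
    server.once('error', reject);
    // Port 0 asks the operating system for any free port.
    server.listen(0, () => resolve(server));
  });
}

startOnRandomPort().then((server) => {
  // In a real setup this port would have to be published somewhere
  // so that the load balancer can discover it.
  console.log(`process ${process.pid} listening on port ${server.address().port}`);
  server.close();
});
```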

There must be an easier way. Fortunately, there is.

Enter the cluster module

To take advantage of multi-core systems, you have to launch a “cluster” of Node.js processes to handle the load. Luckily, Node.js ships with a built-in module called cluster. This module makes it easy to create child processes that all share server ports. You can find its documentation in the official Node.js API reference.

Take a look at the simplest possible example of a working cluster, straight from the Node.js documentation:

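const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

// Note: since Node.js 16, cluster.isMaster is a deprecated alias of cluster.isPrimary.
if (cluster.isMaster) {
  // Master process: fork one worker per CPU core.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Worker process: every worker listens on the same shared port.
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('hello world\n');
  }).listen(8000);
}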

It’s a great starter, but there’s one fundamental problem with this bit of code. If one of your child processes dies, it is never respawned. That’s not a major issue, though, and we can solve it with a simple trick. Let’s add the following piece of code inside the if (cluster.isMaster) block:

cluster.on('exit', (worker, code, signal) => {
  cluster.fork();
});

Great! Now, each time one of your workers dies, it simply gets respawned by the master, regardless of whether it was because of an uncaught exception, an out-of-memory error, or because someone explicitly killed that process.

Unfortunately, there’s a problem with this small improvement. Imagine there’s a small bug or just a typo in your process initialization code, something like this, for instance: consoel.log("There's a typo here!");

What happens next is something like a controlled semi-forkbomb. Your processes keep starting and dying immediately. Because they disappear, the master tries to spawn new processes, but those die immediately as well. Welcome to an infinite loop of forks and unexpected process exits.
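
To get a feel for what handling even this one case involves, here is a minimal sketch of a restart limiter. It is not taken from the cluster module or from any library: the helper createRestartLimiter and its parameters are hypothetical, and the thresholds are arbitrary. It allows a limited number of respawns within a time window and refuses further ones until the window has passed.

```javascript
// Hypothetical helper: allows at most `maxRestarts` respawns per `windowMs`.
function createRestartLimiter(maxRestarts, windowMs) {
  let recentRestarts = [];
  return function shouldRestart(now = Date.now()) {
    // Forget restarts that happened outside the current window.
    recentRestarts = recentRestarts.filter((time) => now - time < windowMs);
    if (recentRestarts.length >= maxRestarts) {
      return false; // Too many recent crashes: stop respawning for now.
    }
    recentRestarts.push(now);
    return true;
  };
}

// Possible use inside the master process (assumed thresholds):
// const shouldRestart = createRestartLimiter(5, 60000);
// cluster.on('exit', () => {
//   if (shouldRestart()) cluster.fork();
// });
```

This only covers crash loops. It says nothing about logging, graceful shutdown, or zero-downtime restarts, which is exactly the kind of work that piles up.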

Actually, there are many cases like the one above. If you try to handle them all, you’ll spend a non-trivial amount of time writing your own process manager, and at the end of the day, it isn’t going to be perfect anyway. Congrats! You’ve just wasted a day of work, while you could have closed 3 tickets instead and delivered some business value.

What’s the solution?

DO NOT reinvent the wheel

Right now, Node.js is one of the most popular backend platforms on the market. It’s safe to assume that someone before you has already dealt with that problem and done it better than you would.

The module that has gained the most popularity and is now considered an enterprise standard is called pm2. pm2 is trusted by the community (almost 20k stars on github.com) and also by my company, Netguru. It’s in active development, has a stable API, and is covered by more than 1000 tests.

Using pm2 is dead simple. To clusterify your existing server, simply run:

pm2 start app.js -i 4 --name="api"

This command will create a cluster of 4 processes named “api”.

If you’d like to see how your cluster performs, just type pm2 ls. You’ll see a nice table showing how well (or poorly) each process is working:

[Screenshot: pm2 ls output listing the processes in the cluster]

Of course, pm2 has many more features, such as scaling, staged restarts, or even deployment. The feature I find particularly interesting is pm2’s integration with keymetrics.io. With one command, you can establish a link between your local cluster and the web console, which provides all the metrics you need to monitor your app effectively. Its web interface is very convenient and looks like this:

[Screenshot: Keymetrics web console]

As you can see, writing your own solution for process management is hard. Instead, you should use battle-tested existing solutions like pm2 with Keymetrics. You can try out Keymetrics for free here.