
Nginx Tutorial #2: Performance


Hello! Sharing is caring, so we'd love to share another piece of knowledge with you. We have prepared a three-part Nginx tutorial. If you already know something about Nginx, or you'd just like to expand your experience and understanding, this is the perfect place for you!

We will tell you how Nginx works, what concepts are behind it, how you can optimize it to boost your app's performance, and how to set it up and have it up and running.

This tutorial will have three parts:

  • Basic concepts - where you get to know the difference between a directive and a context, the inheritance model, and the order in which Nginx picks server blocks and locations.
  • Performance - tips and tricks to improve speed. We will discuss gzip, caching, buffers, and timeouts.
  • SSL setup - set up configuration to serve content through HTTPS.

We aimed to create a series in which you can easily find the proper configuration for a particular topic (like gzip, SSL, etc.), or simply read it all through. For the best learning experience, we suggest setting up Nginx on your own machine and experimenting along the way.

tcp_nodelay, tcp_nopush, and sendfile


In the early days of TCP, engineers were facing the danger of congestion collapse. Quite a few solutions emerged as prevention, and one of them was an algorithm proposed by John Nagle.

Nagle’s algorithm aims to prevent the connection from being overwhelmed by a great number of small packets. It does not interfere with full-size TCP packets (Maximum Segment Size, or MSS for short), only with packets smaller than the MSS. Those packets will be transmitted only once the receiver has sent back the acknowledgments (ACKs) for all previous packets. While waiting, the sender can buffer more data.

if packet.size >= MSS.size
  # send immediately
elsif acks.all_received?
  # send immediately
else
  # accumulate data
end

During that time, another proposal emerged: the Delayed ACK.

In TCP communication we send data and receive acknowledgments (ACKs), which tell us that the data was delivered successfully.
Delayed ACK tries to resolve the issue of the wire being flooded by a massive number of ACK packets. To cut this number down, the receiver waits for some data to be sent back to the sender and piggybacks the ACK on that data. If there is no data to send back, we have to send an ACK at least every 2 * MSS of received data, or every 200 - 500 ms (in case we are no longer receiving packets).

if data_to_send_back.any?
  # piggyback the ACK on that data
elsif received_2_MSS_since_last_ack? || timer_200ms.finished?
  # send the ACK now
else
  # wait
end

As you may have started noticing, this can lead to temporary deadlocks on a persistent connection. Let's reproduce one!


  • the initial congestion window equals 2. The congestion window is part of another TCP mechanism, called Slow-Start. The details are not important right now; just keep in mind that it restricts how many packets can be sent at once. In the first round trip, we are allowed to send 2 MSS packets; in the second: 4 MSS packets; in the third: 8 MSS, and so on.
  • 4 buffered packets, waiting to be sent: A, B, C, D
  • A, B, C are MSS-sized packets
  • D is a small packet


  • Due to the initial congestion window, the sender is allowed to transmit two packets: A and B.
  • Upon successfully receiving both packets, the receiver sends an ACK.
  • The sender transmits packet C. However, Nagle holds it back from sending D (the packet is too small; wait for the ACK for C).
  • On the receiver's side, Delayed ACK holds it back from sending the ACK (which is sent every 2 packets or every 200 ms).
  • After 200 ms, the receiver sends the ACK for packet C.
  • The sender receives the ACK and sends packet D.




During this exchange, a 200 ms lag was introduced due to the deadlock between Nagle's algorithm and Delayed ACK.
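To make the timeline concrete, here is a toy Ruby sketch of the exchange above. The 10 ms one-way trip time and the 200 ms Delayed-ACK timer are illustrative assumptions, not measured values:

```ruby
# Toy timeline of the Nagle / Delayed-ACK deadlock described above.
# Assumptions (illustrative only): 10 ms one-way trip, 200 ms Delayed-ACK timer.
trip_ms        = 10
delayed_ack_ms = 200
clock_ms       = 0

clock_ms += trip_ms         # A and B are sent together (congestion window = 2)
clock_ms += trip_ms         # receiver got 2 packets -> ACKs immediately; ACK arrives

clock_ms += trip_ms         # C is sent; Nagle holds the small packet D back
clock_ms += delayed_ack_ms  # receiver has only 1 unACKed packet -> waits out the timer
clock_ms += trip_ms         # ACK for C finally arrives

clock_ms += trip_ms         # only now is D sent, and it arrives
puts "total: #{clock_ms} ms (#{delayed_ack_ms} ms of it is pure waiting)"
```

With these numbers, 200 of the 250 milliseconds are spent doing nothing but waiting for the Delayed-ACK timer.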

Nagle's algorithm was a true savior in its time and still provides great value. However, in most cases we won’t need it for our website, so it can be safely turned off by setting the TCP_NODELAY flag.

tcp_nodelay on;     # sets TCP_NODELAY flag, used on keep-alive connections

Enjoy your 200 ms gain!

To get some nitpicky details, I encourage you to read this great paper.


Normally, when a file needs to be sent, the following steps are required:

  • malloc(3) - allocate a local buffer, for storing the object data
  • read(2) - retrieve and copy the object into the local buffer
  • write(2) - copy object from the local buffer into the socket buffer

This involves two context switches (read, write) and makes an unnecessary second copy of the same object. As you can see, it is not the optimal way. Thankfully, there is another system call that improves sending files, and it's called (surprise, surprise): sendfile(2). This call retrieves an object into the file cache and passes the pointers (without copying the whole object) straight to the socket descriptor. Netflix states that using sendfile(2) increased their network throughput from 6 Gbps to 30 Gbps.
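You can see the two styles side by side even from Ruby: a manual read/write loop performs the extra userspace copy, while `IO.copy_stream` is documented to delegate to zero-copy system calls such as sendfile(2) where the platform supports them. A small sketch (file names are throwaway temp files):

```ruby
require "tempfile"

src = Tempfile.new("src")
src.write("x" * 1_000_000)         # 1 MB of data to "send"
src.rewind

# read(2) + write(2): each chunk is copied into a Ruby string (userspace
# buffer) and then copied again into the destination descriptor.
dst_copy = Tempfile.new("dst_copy")
while (chunk = src.read(64 * 1024))
  dst_copy.write(chunk)
end

# IO.copy_stream avoids the userspace round trip; on supporting platforms
# it uses sendfile(2) or copy_file_range(2) under the hood.
src.rewind
dst_stream = Tempfile.new("dst_stream")
IO.copy_stream(src, dst_stream)

puts dst_copy.size == dst_stream.size   # both received the same 1 MB
```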

However, sendfile(2) has some caveats:
  • it does not work with UNIX sockets (e.g. when serving static files through your upstream server)
  • it can perform differently depending on the operating system (more here)

To turn this on in Nginx:

sendfile on;


tcp_nopush is the opposite of tcp_nodelay. Instead of pushing packets out as fast as possible, it aims to optimize the amount of data sent at once.
It forces packets to wait until they reach their maximum size (MSS) before being sent to the client. This directive only works when sendfile is on.

sendfile on;
tcp_nopush on;

It may appear that tcp_nopush and tcp_nodelay are mutually exclusive. But if all 3 directives are turned on, Nginx will:
  • ensure packets are full before sending them to the client
  • for the last packet, remove tcp_nopush - allowing TCP to send it immediately, without the 200 ms delay

How many processes should I have?

Worker processes

The worker_processes directive defines how many worker processes should be run. By default, this value is set to 1. The safest setting is to use the number of cores, by passing the auto option.
Still, due to Nginx's architecture, which handles requests blazingly fast, we probably won’t use more than 2 - 4 processes at a time (unless you are hosting Facebook, or doing some CPU-intensive stuff inside Nginx).

worker_processes auto;

Worker connections

The directive directly tied to worker_processes is worker_connections. It specifies how many connections can be opened at once by a worker process. This number includes all connections (e.g. connections with proxied servers), not only connections with clients. Also, it is worth keeping in mind that one client can open multiple connections to fetch other resources simultaneously.

worker_connections 1024;

Open files limit

“Everything is a file” in Unix-based systems. It means that documents, directories, pipes, and even sockets are files. The system limits how many files can be opened at once by a process. To check the limits:

ulimit -Sn      # soft limit
ulimit -Hn      # hard limit

This system limit must be tweaked according to worker_connections. Any incoming connection opens at least one file (usually two: the connection socket plus either a backend connection socket or a static file on disk). So it is safe to set this value to worker_connections * 2. Nginx, fortunately, provides an option for increasing this system value within the Nginx config. To do so, add the worker_rlimit_nofile directive with the proper number and reload Nginx.

worker_rlimit_nofile 2048;


worker_processes auto;
worker_rlimit_nofile 2048; # Changes the limit on the maximum number of open files (RLIMIT_NOFILE) for worker processes.
worker_connections 1024;   # Sets the maximum number of simultaneous connections that can be opened by a worker process.

Maximum number of connections

Given the above, we can calculate how many concurrent connections we can handle at once:

max no of connections =

       worker_processes * worker_connections
  ----------------------------------------------
  (keep_alive_timeout + avg_response_time) * 2

keep_alive_timeout (more on that later) + avg_response_time tells us how long a single connection stays open. We also divide by 2 because, usually, one client will have 2 connections opened: one between the client and Nginx, and a second between Nginx and your upstream server.
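Plugging some illustrative numbers into the formula (these are made-up values, not recommendations):

```ruby
# Illustrative capacity estimate using the formula above.
worker_processes   = 4
worker_connections = 1024
keep_alive_timeout = 65     # seconds
avg_response_time  = 0.2    # seconds

max_connections =
  (worker_processes * worker_connections).to_f /
  ((keep_alive_timeout + avg_response_time) * 2)

puts max_connections.round(1)   # roughly 31 connections per second at steady state
```

Note how dominant keep_alive_timeout is in this estimate: with long-lived idle connections, most worker_connections slots are occupied by clients that are not actively transferring anything.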


Gzip

Enabling gzip should significantly reduce the weight of your response, so it will appear faster on the client side.

Compression level

Gzip has different compression levels, ranging from 1 to 9. Increasing the level reduces the size of the file but also increases resource consumption. As a standard, we keep this number between 3 and 5, as going higher yields only small savings at a much larger CPU cost.

Here is an example of compressing a file with gzip at different levels. 0 stands for the uncompressed file.

❯ du -sh ./*
 64K    ./0_gzip
 16K    ./1_gzip
 12K    ./2_gzip
 12K    ./3_gzip
 12K    ./4_gzip
 12K    ./5_gzip
 12K    ./6_gzip
 12K    ./7_gzip
 12K    ./8_gzip
 12K    ./9_gzip

❯ ls -al
-rw-r--r--   1 matDobek  staff  61711  3 Nov 08:46 0_gzip
-rw-r--r--   1 matDobek  staff  12331  3 Nov 08:48 1_gzip
-rw-r--r--   1 matDobek  staff  12123  3 Nov 08:48 2_gzip
-rw-r--r--   1 matDobek  staff  12003  3 Nov 08:48 3_gzip
-rw-r--r--   1 matDobek  staff  11264  3 Nov 08:49 4_gzip
-rw-r--r--   1 matDobek  staff  11111  3 Nov 08:50 5_gzip
-rw-r--r--   1 matDobek  staff  11097  3 Nov 08:50 6_gzip
-rw-r--r--   1 matDobek  staff  11080  3 Nov 08:50 7_gzip
-rw-r--r--   1 matDobek  staff  11071  3 Nov 08:51 8_gzip
-rw-r--r--   1 matDobek  staff  11005  3 Nov 08:51 9_gzip
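You can reproduce a similar experiment with Ruby's Zlib bindings (the sample CSS-like data below is made up; real savings depend on your content):

```ruby
require "zlib"

# Repetitive sample data, similar in spirit to typical HTML/CSS payloads.
data = "body { margin: 0; padding: 0; }\n" * 2_000

sizes = (1..9).map { |level| [level, Zlib::Deflate.deflate(data, level).bytesize] }

puts "raw: #{data.bytesize} bytes"
sizes.each { |level, size| puts "level #{level}: #{size} bytes" }
# The big win happens at level 1; higher levels shave off only a little more,
# at a growing CPU cost -- which is why 3-5 is a sensible default range.
```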

gzip_http_version 1.1;

This directive tells Nginx to use gzip only for HTTP 1.1 and above. We don’t include HTTP 1.0 here, as for version 1.0 it is impossible to use both keep-alive and gzip. So you have to decide which one you prefer: HTTP 1.0 clients missing out on gzip, or HTTP 1.0 clients missing out on keep-alive.


gzip on;               # enable gzip
gzip_http_version 1.1; # turn on gzip for http 1.1 and above
gzip_disable "msie6";  # IE 6 had issues with gzip
gzip_comp_level 5;     # compression level (higher = smaller files, more CPU usage)
gzip_min_length 100;   # minimum response length to compress
gzip_proxied any;      # enable gzip for proxied requests (e.g. CDN)
gzip_buffers 16 8k;    # compression buffers (if we exceed this value, disk will be used instead of RAM)
gzip_vary on;          # add header Vary Accept-Encoding (more on that in Caching section)

# define files which should be compressed
gzip_types text/plain;
gzip_types text/css;
gzip_types application/javascript;
gzip_types application/json;
gzip_types application/;
gzip_types application/x-font-ttf;
gzip_types font/opentype;
gzip_types image/svg+xml;
gzip_types image/x-icon;


Caching

Caching is another thing that can nicely speed up requests for returning users.

The cache can be controlled with just 2 headers:

  • Cache-Control for managing cache in HTTP/1.1
  • Pragma for backward compatibility with HTTP/1.0 clients

Caches themselves can be grouped into 2 categories: public and private. A public cache stores responses for reuse by more than one user. A private cache is dedicated to a single user. We can easily define which cache should be used:

add_header Cache-Control public;
add_header Pragma public;

For standard assets, we would also like to keep them for 1 month:

location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
  expires 1M;
  add_header Cache-Control public;
  add_header Pragma public;
}
The configuration above seems to be sufficient. However, there is one caveat when using the public cache.

Let’s see what happens if we store our asset in a public cache (e.g. a CDN) with the URI as the only identifier. In this scenario, we also assume that gzip is on.

We got 2 browsers:

  • the old one, which does not support gzip
  • the new one, which does support gzip

The old browser sends a request for our stylesheet to the CDN. As the CDN does not have this resource yet, it requests it from our server and gets back an uncompressed response. The CDN stores the file under a key (for later usage):

  (URI) => FILE("/sites/netguru/style.css")

And returns it to the client.

Now, the new browser sends the same request to the CDN, asking for the same resource but expecting a gzipped response. As the CDN identifies resources only by the URI, it will return the same uncompressed resource to the new browser. The new browser will try to decompress a non-gzipped file and will get garbage.

If we could somehow tell the public cache to identify resources based on both URI and encoding, we could avoid this issue:

  (URI, gzip) => FILE("/sites/netguru/style.css.gzip")
  (URI, text/css) => FILE("/sites/netguru/style.css")

And this is exactly what Vary: Accept-Encoding does. It tells the public cache that the resource can be distinguished by the URI and the Accept-Encoding header.
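A minimal sketch of the two keying schemes (the cache here is a plain Ruby hash standing in for a CDN, and the bodies are placeholder strings):

```ruby
# Naive cache keyed by URI only -- the bug described above.
naive_cache = {}
naive_cache["/style.css"] = "plain css"
# A gzip-capable browser gets the uncompressed body by mistake:
old_hit = naive_cache["/style.css"]   # same entry for every Accept-Encoding

# Cache keyed by (URI, Accept-Encoding) -- what Vary enables.
vary_cache = {}
vary_cache[["/style.css", "identity"]] = "plain css"
vary_cache[["/style.css", "gzip"]]     = "gzipped css"

puts vary_cache[["/style.css", "gzip"]]      # each encoding now has its own entry
puts vary_cache[["/style.css", "identity"]]
```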

So our final configuration will look like:

location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
  expires 1M;
  add_header Cache-Control public;
  add_header Pragma public;
  add_header Vary Accept-Encoding;
}


Timeouts

The client_body_timeout and client_header_timeout directives define how long Nginx should wait for a client to transmit the body or header before returning a 408 (Request Time-out) error.

send_timeout sets a timeout for transmitting a response to the client. The timeout is set only between two successive write operations, not for the transmission of the whole response. If the client does not receive anything within this time, the connection is closed.

Be careful when setting these values: too long a waiting time can make you vulnerable to attackers, while too short a one will cut off slow clients.

# Configure timeouts
client_body_timeout   12;
client_header_timeout 12;
send_timeout          10;



Buffers

client_body_buffer_size

Sets the buffer size for reading the client request body. If the request body is larger than the buffer, the whole body or only a part of it is written to a temporary file. For client_body_buffer_size, a setting of 16k is enough in most cases.

This is yet another setting that can have a massive impact, but it has to be used with care. Set it too low, and Nginx will constantly use I/O to write the remaining parts to a file. Set it too high, and you may be vulnerable to DoS attacks, where an attacker opens many connections and you cannot allocate enough buffer space on your system to handle them.

client_header_buffer_size and large_client_header_buffers

If headers don’t fit into client_header_buffer_size, then large_client_header_buffers will be used. If the request still doesn’t fit into that buffer, an error is returned to the client. For most requests, a buffer of 1K bytes is enough. However, if a request includes long cookies, it may not fit into 1K.

If the size of a request line is exceeded, the 414 (Request-URI Too Large) error is returned to the client.
If the size of a request header is exceeded, the 400 (Bad Request) error is thrown.


client_max_body_size

Sets the maximum allowed size of the client request body, as specified in the “Content-Length” request header field. Depending on whether you want to allow users to upload files, tweak this setting to your needs.


client_body_buffer_size       16K;
client_header_buffer_size     1k;
large_client_header_buffers   2 1k;
client_max_body_size          8m;


Keep-alive

The TCP protocol, on which HTTP is based, requires performing a three-way handshake to initiate a connection. It means that before the server can send you any data (e.g. images), three packets have to travel between the client and the server.
Assuming that you are requesting /image.jpg from Warsaw, and connecting to the nearest server in Berlin:

Open connection

TCP Handshake:
Warsaw  ->------------------ synchronize packet (SYN) ----------------->- Berlin
Warsaw  -<--------- synchronise-acknowledgement packet (SYN-ACK) ------<- Berlin
Warsaw  ->------------------- acknowledgement (ACK) ------------------->- Berlin

Data transfer:
Warsaw  ->---------------------- /image.jpg --------------------------->- Berlin
Warsaw  -<--------------------- (image data) --------------------------<- Berlin

Close connection

For another request, you have to perform this whole initialization once again. If you are sending multiple requests within a short period of time, this can add up fast. And this is where keep-alive comes in handy: after a successful response, it keeps the connection idle for a given period of time (e.g. 10 s). If another request is made during this time, the existing connection is reused and the idle time refreshed.
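Some back-of-the-envelope arithmetic for the saving (the RTT and request count below are made-up numbers): without keep-alive every request pays for a fresh handshake, with keep-alive only the first one does.

```ruby
# Illustrative cost model: ~1 RTT for the handshake, ~1 RTT per request/response.
rtt      = 0.05   # seconds, round trip Warsaw <-> Berlin (assumed)
requests = 10

# Without keep-alive: every request pays handshake + transfer.
without_keepalive = requests * (rtt + rtt)

# With keep-alive: one handshake, then the connection is reused.
with_keepalive = rtt + requests * rtt

puts "without keep-alive: #{without_keepalive.round(2)} s"
puts "with keep-alive:    #{with_keepalive.round(2)} s"
```

Under these assumptions keep-alive nearly halves the total latency, and the gap grows with the number of requests per connection.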

Nginx provides a few directives to tweak keep-alive settings. They can be grouped into two categories:

  • keep-alive between client and nginx
keepalive_disable msie6;        # disable selected browsers.

# The number of requests a client can make over a single keepalive connection. The default is 100, but a much higher value can be especially useful for testing with a load‑generation tool, which generally sends a large number of requests from a single client.
keepalive_requests 100000;

# How long an idle keepalive connection remains open.
keepalive_timeout 60;
  • keep-alive between nginx and upstream
upstream http_backend {
    # The number of idle keepalive connections to an upstream server that remain open for each worker process
    keepalive 16;
}

server {
  location /http/ {
    proxy_pass http://http_backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
  }
}

Aaand that’s it.


Thank you for reading! This series would not have been possible without the great number of resources found in the depths of the internet. Here are a few websites we found especially useful while writing this series:

We would be grateful for your feedback and comments, so feel free to discuss! Did you like this series? Do you have some suggestions what topic should be tackled next? Or maybe did you spot a bug? Let us know and see you next time!
