All Ruby on Rails Node JS Android iOS React Native Frontend

ActiveRecord - Part 1: N+1 Queries Problem

Today, something about well-know problem to experienced developers. We will discuss N+1 queries problem and how to deal with it using ActiveRecord.

UPDATE 14.06.2018: Updated information about #includes and #preload methods.

N+1 queries problem

The typical and old problem well known to all experienced coders. What is it about? Let's assume that we have Clients table. Each record in that table is represented by Client model in our simple app. We also have table Addresses and corresponding Address model. These entities are related - one client can have many addresses and address has exactly one owner (Client). Simple and typical one-to-many association. 

 

The problem occurs when we try to fetch 1-to-many relationship for more than one parent object (in our case, it's Client). 

(calling "to_a" forces our model to fetch data from database)

This code will trigger N+1 number of queries where N is the number of clients. E.g. we have 1000 clients - we trigger 1001 queries. One to fetch all clients and one for each of those clients to fetch his addresses. 

I do not want to say that is something wrong with that approach. Not at all. It is called "lazy loading" and could be really useful, but it really depends onthe specific situation.

Solution #1 - preload

We will use #preload method offered by Rails' ActiveRecord. 

What we achieved by that? Code above will trigger only two queries! One to fetch all the clients and one to fetch associated addresses for all clients at once. Then 1001, now 2. Sounds good. 

No magic or complicated logic here :) Simple SELECT for clients and SELECT (using IN clause) for addresses. 

Solution #2 - eager_load

#eager_load also preloads association but it does it in one query! Ok... do not get too excited :) Less not always means better.

Any magic here? No, no really. It just creates SQL query using LEFT OUTER JOIN and this type of join is very important. LEFT OUTER JOIN guarantees that we will receive all clients - even if they do not have any associated addresses (in opposite to INNER JOIN - see below).

Solution #3 - includes

There is some magic behind #includes method but documentation is not clear about this. How many queries will be called depends on what additional instructions we use (where / order etc). If we add additional condition related to preloading associations, this method will behave like eager_load method. Otherwise, like preload method.

Code above will behave like we would use preload method (two queries will be triggered).

This code will trigger one query (using LEFT OUTER JOIN) so just like in eager_load method.

joins

Not marked as a solution because #joins method does not preload assocations! It is just an interface to create SQL query using INNER JOIN. Because of how SQL's INNER JOIN works, the following code snippet will not return all of our clients:

Why? It will skip records for which join condition is satisfied - in our case: it will skip clients without any addresses so we will receive only client with at least one address. 

No preloading any associations right here. Calling:

will trigger 1001 queries to our database.

Another important thing: Assume that 600 of our clients have associated at least one address. Will we receive the result with 600 client objects? Answer is: 600 is the minimal amount. Again because how SQL's INNER JOIN works. Such query will return record for all matching pairs so each client with each address separately. We can easily get rid of duplicates using #distinct method.

Benchmarks

Let's compare the performance of each solution. In my database, I have 1000 clients and each client has 3 addresses associated. We will measure the real time of calling "to_a" on each client's addresses association (just to force loading data from database). I put an average of 10 measurements (of course, I did not use any caching feature).

 

Time

[s]

Queries 

[-] 

Memory allocated

(Memory retained)

[bytes]

lazy-loading 1.201 1001  297139 (1104)
includes / preload 0.142 2  128105 (1070)
eager_load 0.162 1  125954 (1143)
joins 2.875 1001 873977 (1191)

 

While comparing lazy-loading to includes, we observe 88% time reduction! We can also see benefits in the amount of allocated memory. Brilliant! Does it mean that we should always preload all associations? 

No, absolutely not! We do not always need to preload related data. Sometimes lazy-loading is very useful and more efficient. It really depends on the situation.

I just want you to know that "N+1 queries" problem exists, how to deal with it and whether fighting with it can be beneficial to our app.

 

Let's make the world better for everyone - join us
New Call-to-action
READ ALSO FROM Ruby/Ruby on Rails
Read also
Need a successful project?
Estimate project or contact us