
Tracing Patterns that Might Hinder Performance

There is a pretty good chance you will encounter at least one unresponsive app or a slowly loading web page today. It’s 2017 already, and we want to do everything more quickly, yet we still experience annoying delays. How is that possible? Doesn’t our Internet connection improve every year? Doesn’t our browser perform better day by day? In this article, we will cover the latter.

Indeed, browsers and their engines are getting faster, new features are added all the time, and some legacy features are becoming obsolete. The same happens to websites and apps: they also become heavier and larger. Therefore, even though browsers and hardware keep improving, we still need to take care of performance – at least to some extent. You will find out shortly how to avoid a few popular pitfalls and improve the overall performance of apps and websites, but before that, let’s have a bit of an overview.

Optimisation

I could probably write an entire book explaining the pipeline, but in this article, I want to focus on the key aspects that will help you optimise the process. I will describe the common mistakes that can substantially hurt performance. For the sake of brevity, I will not talk about parsing, AST, machine code generation, GC (Garbage Collector), feedback collection or OSR (on-stack replacement), but fear not – we’ll give those issues more space in future articles.

The Old World

Let’s start with the old world (baseline compiler + Crankshaft), which became obsolete as of Chrome M59.

The baseline compiler doesn’t perform any magical optimisations. It just compiles the code quickly and lets it execute. You must be aware that generating efficiently optimised code relies heavily on speculative optimisations, which in turn require type feedback to speculate on, so you need to run the baseline first.

In case your function becomes “hot” (a common term for a function that the engine finds worth optimising), Crankshaft kicks in and does its magic. The performance of such code is very, very decent – comparable to Java. This approach was one of the first in the industry, and it brought about a massive performance boost. As a result, JS could finally be executed smoothly, and frontend developers were able to create complex web applications.

The New World

As the web evolved, new frameworks arrived and specifications changed, so extending Crankshaft’s capabilities became troublesome. Some patterns never got much love from Crankshaft – for instance, certain accesses to the arguments object (the safe uses were unmonkey-patched Function.prototype.apply, length access and in-bounds indices) or using a try-catch statement. There were lots of other patterns too. Luckily, Ignition and TurboFan solve a few of those performance bottlenecks, and some patterns can now be optimised in a more sophisticated way. As stated earlier, optimisation is expensive and takes some resources (which might be scarce on low-end mobile devices). In most cases, however, you would still like your function to be optimised.

When it comes to TurboFan, there were a few reasons it was introduced:

  • Providing a uniform code generation architecture

  • Reducing porting / maintenance overhead of V8 (currently 10 ports!)

  • Removing performance cliffs due to slow builtins

  • Making experimenting with new features easier (e.g. changes to load/store ICs, bootstrapping an interpreter)

Of course, this had to be done without sacrificing performance. Generating bytecode is relatively cheap, but interpreting it can be up to 100x slower than executing optimised code – depending, obviously, on the complexity of the compiler. The baseline compiler was never meant to produce very fast code, but considering execution alone, its output is still faster than Ignition’s (not by much, although in some cases fullcodegen is 3x-4x faster). TurboFan was designed to replace Crankshaft – the previous optimising compiler.

Do We Need the Optimisations?

Yes and no.

If we run our function once or twice, optimisation may not be worth it. However, if it’s likely to be executed multiple times, and the types of the values and the shapes of the objects are stable, then you should probably consider optimising your code. We might not be aware of some quirks that are present in the specification. The steps the engine needs to take are often difficult to understand. For instance, when accessing a property, the engine has to take care of edge cases that are very unlikely to happen in the real world. Why is that? Sometimes due to backwards compatibility, sometimes for another reason – each case is different. However, if an operation turns out to be redundant, we might not actually need to perform it! The process of optimising spots such situations and removes the redundant operations.

Since JS is a dynamically typed language, we always have to make plenty of assumptions. It is best to keep a property access site monomorphic or, in other words, to have only one known path. If our assumptions turn out to be wrong, we encounter a deopt and our optimised function is no longer valid. We definitely want to avoid that whenever possible, since every optimisation pass is more or less expensive. Once we optimise again, we need to take all the previous circumstances into account to prevent further deopts, so our property access site will no longer be monomorphic. It becomes polymorphic and stays polymorphic as long as there are no more than four paths. Beyond four paths, it’s megamorphic.
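To make these states concrete, here is a sketch (the function and object shapes below are hypothetical, purely for illustration):

```javascript
// A hypothetical access site and the object shapes it sees.
function getX(obj) {
  return obj.x; // this property access is the "site" the engine tracks
}

// Monomorphic: every call sees the same shape {x}.
getX({ x: 1 });
getX({ x: 2 });

// Polymorphic: a second shape {x, y} shows up at the same site.
getX({ x: 3, y: 4 });

// Megamorphic: once more than four distinct shapes have been seen here,
// the engine gives up on fast, shape-specific paths for this site.
getX({ x: 5, a: 0 });
getX({ x: 6, b: 0 });
getX({ x: 7, c: 0 });
```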

Before you start

All functions with the percent sign as a prefix are available only if you pass --allow-natives-syntax.

Normally, you should not access them. If you want to find their definitions, go to src/runtime in the V8 source code. All bailout (deopt) reasons are available here: https://cs.chromium.org/chromium/src/v8/src/bailout-reason.h

If you want to see whether your function is optimised or not, pass the --trace-opt flag. If you want to be notified once your optimised function gets deoptimised, pass the --trace-deopt flag.

Examples

Example 1

We will start with a very straightforward example.

We will declare a very simple add function that takes two arguments and returns the sum of them. Quite simple, right? Let's see then.

https://gist.github.com/P0lip/13fb3cf7c7cba5a383245bb2891c5f71
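For reference, a minimal sketch of such a function might look like this (the gist above contains the exact code used in the article; the %OptimizeFunctionOnNextCall call only works in d8 with --allow-natives-syntax, so it is commented out here):

```javascript
// A minimal sketch of the function under test.
function add(a, b) {
  return a + b;
}

// Warm up with Smi arguments so the engine collects type feedback...
add(1, 2);
add(3, 4);
// ...then, under d8 with --allow-natives-syntax, force optimisation:
// %OptimizeFunctionOnNextCall(add);
add(5, 6);
```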

d8 --trace-deopt --print-opt-code --allow-natives-syntax --code-comments --turbo add.js

If you run V8 older than 5.9, you must pass the --turbo flag explicitly to make sure your function goes through TurboFan.

If you run the above, you will get something like this:

https://gist.github.com/P0lip/4de9323fd4d4cc73a3a31d6f47036dd5

As you can see, there are at least three different situations in which our function may be eagerly deopted.

If we took lazy deopts into account, we would find even more, but let’s focus on eager deopts.

By the way, at the moment there are three types of deopts: eager, lazy and soft.

It may look a bit awkward and scary, but don't worry! You will get it soon.

Let's start with the first likely deopt.

// ;; debug: deopt index 0

A deopt reason 'not a Smi'. If you have already heard about Smis, you can skip the next few sentences.

Basically, Smi is shorthand for small integer. It differs quite a lot from the other objects represented in V8.

If you dig into V8 source code, you will find a file objects.h there (https://chromium.googlesource.com/v8/v8.git/+/master/src/objects.h).

As you can see, a Smi is not a HeapObject.

A HeapObject is a "superclass for everything allocated in the heap". Basically, what we (as frontend developers) have access to are subclasses of JSReceiver.

For example, a plain array (JSArray) or function (JSFunction) inherits that class.

So, as you can see, a Smi is something different. You can find more information about this if you look up JavaScript tagging schemes.

A Smi is a 32-bit signed int on 64-bit architectures and a 31-bit signed int on 32-bit architectures.
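A hypothetical helper (the names SMI_MIN, SMI_MAX and fitsInSmi are mine, not V8’s) that mirrors the Smi range on 64-bit architectures:

```javascript
// The 64-bit Smi range: 32-bit signed integers, -(2 ** 31) .. 2 ** 31 - 1.
const SMI_MIN = -(2 ** 31);
const SMI_MAX = 2 ** 31 - 1;

function fitsInSmi(value) {
  return Number.isInteger(value) && value >= SMI_MIN && value <= SMI_MAX;
}

fitsInSmi(42);      // true
fitsInSmi(2 ** 31); // false: one past SMI_MAX
fitsInSmi(1.5);     // false: not an integer
// Caveat: -0 passes this check, yet V8 stores it as a heap number
// (see the 'minus zero' deopt reason later in this article).
```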

If you pass anything other than such a number, your function will be deopted.

For example:

add(2 ** 31, 0)

will be deopted because 2 ** 31 is higher than 2 ** 31 - 1.

Of course, if you don't pass a number but a string, array or anything else, you will get a deopt as well, for example:

add([], 0);

add({ foo: 'bar' }, 2);

Let's move on to the second deopt index.

;; debug: deopt index 1

The same flow applies here. The only difference is that now it's a check for the second argument called ‘b’.

add(0, 2 ** 31) // would cause a deopt as well.

Okay, let's move to the last deopt index.

;; debug: deopt index 2

'Overflow'

Since you know what a Smi is, it's quite easy to understand what happens here.

Basically, that reason will be triggered once the previous checks pass, but the function doesn't return a Smi. For instance,

add(1, 2 ** 31 - 1); // returned value higher than 2 ** 31 - 1

Example 2

Let's move forward then and declare a function that looks identical.

https://gist.github.com/P0lip/16b4573ab73bab4e7c03660f6f9780be
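A sketch of what such a function might look like (the gist above is authoritative) – the body is character-for-character the same as add, only the call sites differ:

```javascript
// Identical body to add, but the call sites feed it strings, so the
// collected type feedback (and the resulting optimised code) is different.
function concat(a, b) {
  return a + b;
}

concat('net', 'guru');
concat('foo', 'bar');
```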

A similar function, but a very different result. Why?! Don't the same checks apply to all identically looking functions?

Nope! These checks are type-dependent, meaning that the engine doesn’t make assumptions in advance. It just adjusts its behaviour and optimisations at runtime, as the function is executed. Therefore, even though the function looks the same, you get a different path.

In this case, our function is optimised by Crankshaft.

d8 --trace-deopt --code-comments --print-opt-code --allow-natives-syntax concat.js

https://gist.github.com/P0lip/fe9747a97b8f53c125b243803064dba5

Okay, so let's discuss this case.

;; debug: deopt index 1

A deopt occurs once you pass a HeapObject instead of a Smi. In fact, it's the opposite of 'not a Smi', so I will skip explaining it. I can only add that this check applies to the first argument, called ‘a’.

;; debug: deopt index 2

'wrong instance type' – this is more interesting. We haven't seen it yet!

Quite easy to guess. This check fails if you don't pass a string or when you pass nothing.

concat([], 'd');

concat(new String('d'), 'xx');

The last two reasons are exactly the same as above, but apply to the second argument ('b').

Example 3

Okay, let’s move on and have a go at a slightly different example.

https://gist.github.com/P0lip/6ced42e58c57e34d34dd821e243b25a9
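A minimal sketch of such a function (again, the gist above has the exact code): an indexed load from an array.

```javascript
// Returns the element at the given index of an array.
function elemAt(arr, index) {
  return arr[index];
}

// Warmed up with Smi-only arrays and in-bounds Smi indices.
elemAt([2, 3, 5], 1);
elemAt([7, 8, 9], 0);
```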

d8 --trace-deopt --code-comments --print-opt-code --allow-natives-syntax --turbo elem-at.js

https://gist.github.com/P0lip/a349a577c61f7e5196584ccc4e11aba0 

Before we start explaining the new reasons, we have to make sure we know what a (hidden) map (aka a hidden class) is. As we have already mentioned, the engine makes assumptions in order to spend less time on redundant operations. To do that, it must also know the elements well: each element has a kind, and V8 tracks type information in a TypeFeedbackVector. I encourage you to read this article if you want to get some more information. The known kinds are available here.

We also have a few native functions that help us check whether our element fits into a given kind. Their definitions are located in the file above, but their native names are available here.

So let’s get back to deopts.

;; debug: deopt reason 'Smi' 

;; debug: deopt index 0

Trivial. It happens when you pass a Smi as the function’s first argument called ‘arr’.

;; debug: deopt reason 'wrong map'
;; debug: deopt index 1

Unfortunately, this tends to happen very often.

Our map is: <Map(FAST_SMI_ELEMENTS)>

So any time our array ‘arr’ contains something different than a Smi element, the map will no longer match. Of course, this also happens when we don’t pass a plain array but something else, for instance:

elemAt(['netguru'], 0);

elemAt({ 0: 'netguru' }, 0);

If you want to check whether our array consists of Smi elements, you can run a native method I mentioned before.

print(%HasFastSmiElements([2, 4, 5])); // prints true

print(%HasFastSmiElements([2, 4, 'd'])); // prints false

print(%HasFastSmiElements([2.1])); // prints false

print(%HasFastSmiElements({})); // prints false
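One practical consequence worth sketching (assuming standard V8 behaviour): elements-kind transitions are one-way, so an array never returns to a faster kind once it has left it.

```javascript
// Elements-kind transitions are one-way. The kind names in the comments
// are V8 internals, not observable from plain JS without natives syntax.
const arr = [1, 2, 3]; // PACKED_SMI_ELEMENTS
arr.push(4.5);         // transitions to PACKED_DOUBLE_ELEMENTS
arr.push('netguru');   // transitions to PACKED_ELEMENTS

// Popping the offending values does NOT transition back:
arr.pop();
arr.pop();
// in d8: print(%HasFastSmiElements(arr)); would still print false
```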

Okay, now we are performing checks on the second argument ('index'). As you will quickly notice, its deopt reasons rely on the first argument.

;; debug: deopt reason 'out of bounds'
;; debug: deopt index 2

'Out of bounds'. Literally: when your index is equal to or higher than the length of the array, or lower than 0, this will cause a deopt.

In other words, you are trying to access the element whose index doesn't belong to the array.

Examples:

elemAt([2,3,5], 4);
;; debug: deopt reason 'not a heap number'
;; debug: deopt index 4

'not a heap number' – not a number (not to be confused with a Smi, as it doesn't mean the same thing). Examples:

elemAt([2,3,5], '2');

elemAt([2,3,5], new Number(5));
;; debug: deopt reason 'lost precision or NaN'
;; debug: deopt index 5

If you encounter this check, it means you have passed a number, but... is it a valid number?

Lost precision – not an integer, for example 1.1; NaN speaks for itself.

elemAt([0, 1], 1.1);

elemAt([0], NaN);
;; debug: deopt reason 'minus zero'
;; debug: deopt index 6

Easy peasy.

elemAt([0], -0); // weird, I know

Yet another example – a combination of the previous ones. We won’t explain it in detail – think of it as more of a task for you :)

https://gist.github.com/P0lip/9290aabe6cc4db0feffa89fdd05b826b

In case you don’t have d8:

https://gist.github.com/P0lip/039ffe3dc1ddf6a4321156141c3ccc31

That’s that.

We went through two very simple examples, but hopefully you’ve got the general idea.

If you want to be notified once your function is deopted, just pass --trace-deopt.

To sum up – don’t over-optimise, because it may hurt code readability in some cases (see our third example, the elemAt function). There is really nothing wrong with passing arrays of strings and the like; just don’t optimise if you don’t really need to. As far as the first example is concerned, even though the functions are pretty much the same, it’s better in my opinion to have two separate functions with different names, because when another developer sees something like concat or sum, they can quickly tell what the function does.

In the future, you can add a string-specific case to concat – for instance, (a + b).toUpperCase() – and you don’t have to add any special cases to the ‘sum’ function.

Last but not least, you should always keep in mind that over-optimising might hurt readability and you may end up with unmaintainable code. Just try not to use any weird patterns you wouldn’t use in a compiled language.

Finally, I would like to thank Benedikt Meurer, a Software Engineer at Google and Tech Lead of the V8 team in Munich, who reviewed this article. Check out his blog as well.
