Let’s get those JavaScript arrays to work fast

Hi,

So now, let us see how to handle JavaScript arrays with performance in mind.

The array object is indeed very powerful, but you need to handle it with care, since some quite innocent-looking features can have a high time and/or garbage-creation cost.

I’ll show here a few simple techniques that avoid those costs.
The performance benefits can be very high (more than 10X) for some features, so all this is worth a look.

( I also wrote some tips for writing faster JavaScript : check the blog post here : Writing efficient Javascript : A few tips )

1. Stating the obvious : [] is better than {}

Any time you can, avoid using an {} object to store data or perform computations.
It is quite likely that you can use an integer instead of a string to identify your data.
You might have to use an extra integer -> string array to do so, but it’s worth the performance gain.
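As a quick sketch of the idea (all the names here are made up for illustration) : keep the data in a plain array indexed by integers, and keep a single integer -> string table on the side for when you actually need the names.

```javascript
// Hypothetical example : integer ids instead of string keys on a {} object.
var FIELD_NAMES = ['hp', 'mp', 'speed'];   // integer id -> string name
var HP = 0, MP = 1, SPEED = 2;             // integer ids

var stats = [100, 50, 7];                  // data indexed by integer id

// fast integer-indexed access instead of someObject['hp'] :
var hp = stats[HP];
// the string is still available when you need it (display, debugging...) :
var label = FIELD_NAMES[SPEED] + ' = ' + stats[SPEED];
```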

2. [ integer ] is better than [ something else ]

In the same manner, prefer arrays of integers : arrays where *all* ‘slots’ of the array are filled with integers. In such a situation, most JavaScript engines will recognize that they are dealing with an array of integers and perform much faster.
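A minimal illustration of what keeps an array “all integers” and what typically breaks it (the variable names are just for illustration) :

```javascript
// stays a dense integer array : every slot filled, integers only
var good = [0, 0, 0, 0];
good[2] = 7;

// these patterns typically force a slower internal representation :
var holey = [];
holey[3] = 7;            // slots 0..2 become holes (undefined)

var mixed = [1, 2, 3];
mixed[0] = 'one';        // mixing strings and integers in the same array
```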

3. Do not loop backward within an array.

You might know this way to iterate through an array :

  var i = myArray.length;
  while ( i-- ) {
     ... process myArray[i] ...
  }

This saves you one extra test (compared to the standard for loop).
But you know what ? This will be much slower than iterating in the natural order.
Because every CPU cache out there expects the processing to move ‘straight’ through memory, you will get cache misses again and again, and a 2X slowdown is what you’ll get when you are lucky.

So do not loop backward unless you have very good reasons to do so.
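For the record, the forward version of the same loop looks like this (the array contents are just an example) :

```javascript
// forward iteration : the cache-friendly order
var myArray = [3, 1, 4, 1, 5];
var sum = 0;
for (var i = 0, len = myArray.length; i !== len; i++) {
    sum += myArray[i];   // ... process myArray[i] ...
}
```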

Notice that even lastIndexOf can be faster when walking forward, like this :

  function lastIndexOf ( arr, value ) {
     var i=0, len=arr.length, ind = -1;
     while (i != len) {
         if (arr[i] == value) { ind=i ; }
         i++;
     }
     return ind;
  }

Depending on the array size and the value distribution (will we find few or many matches ?), this version might in fact be way faster.

Surprising.

4. Do not change the array’s size again and again

One important thing, hidden by the simplicity of push(), pop() and friends, is that any time you change the size of an array, it might lead to a memory allocation and/or memory fragmentation and/or a memory copy.
Example :

// let us allocate 3 arrays : 
var A = [5,6], B = [1,2,3], C = [ 1, 2, 4 ];   
// let's push a value on B :
B.push (1);  
// let's pop a value on C : 
C.pop();

 

those 3 lines of code will be handled like this :

[Diagram : memory layout of A, B, C before and after the push and the pop.]

Just from this very simple example (1 push, 1 pop), we are doing much more memory activity than we should : 3 memory allocations, 3 memory de-allocations, 8 value copies (!). Now imagine the case where you are using arrays of arrays, together with a lot of smaller objects : such issues become very important.

==> push and pop are not constant-time operations : they take a time proportional to the array’s length ( O(n) ), plus the cost of memory fragmentation.

( Maybe the term ‘list’, used in other languages, is less misleading in that regard. )

Note : changing the length of the array directly :

myArray.length = 12 ;

might have exactly the same performance impact.

==>> Operate on fixed-size arrays whenever possible.

==>> Using an over-sized array is way better than changing the size often.

To use an over-sized array, just track its used length separately :

var B = [ 0, 0, 0, 0, 0 ] ;    // allocated array of length 5.
var BLength = 0 ;
// to do a 'push', do 
B[ BLength++ ] = 1;
// to do a 'pop' :
var last = B[ --BLength  ] ;

Those two operations require 1 read, 1 write, and 2 additions.
No memory allocation, no copy : even if you add boundary checks for the pop operation, the performance just can’t compare to the not-so-simple scenario exposed above.

5. Remarks on performance measures

If you happen to test performance for various cases, take great care.
Some remarks about jsperf specifically :
– In JavaScript, no-one knows when a garbage collection will happen, so it might even happen ‘later’, when the test that created the garbage is no longer running. Still, with the high number of loops performed, this effect seems not too important.
– But worse : no memory is allocated before the test starts, so if we test with just one array, the memory-management task is much, much simpler than in real-life applications, and proves nothing. Let us see an example with push/pop :

// jsperf Case : only one array in memory :
var A = [ 0,0,0,0 ];
//    memory = AAAA 
A.push(XX) ;
//   memory = AAAAA : the memory management just had 
//                  to reclaim one more item.
var val = A.pop();
//    memory = AAAAA  (just reduce the size)

No de-allocation / re-allocation / copy will ever happen with a single-array test, and all operations are simpler : far too simple.
I noticed that when testing object pooling within an array with only one type of object, even IE was faster at handling memory than Chrome (!), and pooling was useless in every browser. Then I used 3 arrays of 3 differently sized objects, which is a not-so-complex situation : the performance of IE sank into the deep sea, and all browsers had a hard time keeping their memory straight. By avoiding memory allocation/fragmentation, pooling showed impressive improvements. The single-object test case was just too simple.
Real-life cases are way harder on the garbage collector, as you can see in paragraph 4.

Conclusion : When it comes to memory issues, expect the real performance improvements to be in fact higher than the jsperf result.

6. Pre-allocate your arrays whenever you can

In case you want to fill an array with some values, allocate the needed memory at once by using the Array constructor :

 var n = 1000;
 var myArray = new Array(n); 
 var i=0;
 while (i<n) { myArray[i] =0; i++; }

This way of proceeding is ***10 times*** faster than a push() loop.
(http://jsperf.com/push-allocated-vs-dynamic)

I was surprised by the fact that pre-allocating the array like this :

 var myArray   = [];
 myArray[n-1]  = 0 ;

was not a good idea for performance. My guess is that an array filled with both undefined
(all the first n-1 items) and an integer cannot be treated internally as an integer array.

On the other hand, when using the Array constructor (new Array(length)), at least Firefox and Chrome seem able to optimize the array internally without problems, so keep this in mind whenever you already know the size your array will have.

7. Write your own push, pop, unshift, splice (among others)

All those nice features are to be avoided if you need performance. First because they are functions : you pay the price of a call, so if you can inline, do it.
And also because you often know something about your array that the function doesn’t (e.g. : all items are integers > 0).
splice is especially to be avoided, since it returns an array of the removed items : slower, useless, and garbage-creating.
So replace those functions with your own code, and even inline the smallest ones : depending on the function, browser and device, the performance gain can be from 2X to 10X.

push :      myArray[myArray.length] = XX ;
pop  :      var last = myArray[myArray.length - 1] ; myArray.length-- ;
unshift :
var unshiftArray = function (arr, item) {
                     var len=arr.length;
                     while (len) { arr[len] = arr[len-1]; len--}
                     arr[0] = item; };
         // this unshift is around 4 to 10 times faster 
         // than the original
         // (http://jsperf.com/array-unshift-vs-prepend/8)
shift : I couldn’t beat the original function by hand (FF/Ch).
Splice :
// if you plan on just removing one item, this one 
//    will be way faster :
var spliceOne = function(arr, index) {
                         var len=arr.length;
                         if (!len) { return }
                         while (index<len) { 
                               arr[index] = arr[index+1]; index++ }
                         arr.length--;
                };
indexOf :
var indexOf = function(arr, item) {
                  for (var i=0, len=arr.length; i!=len ; i++) {
                       if (arr[i] === item) { return i }
                   }
                   return -1;
              };
lastIndexOf :
var lastIndexOf = function(arr, item) {
                     var i = arr.length;
                     while (i--) {
                         if (arr[i] === item) { break }
                     }
                     return i;
                  }; // ... or the paragraph 3 version...

There are many functions that deal with arrays, as you can see here :
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array

There are even more functions that can use arrays, since any function that takes an arbitrary number of arguments can be used with an array :

Example with min :

  // to use Math.min on an array, write :
  var min = Math.min.apply(Math, myArray);
  // since min is context-less you can even write :
  var min = Math.min.apply(null, myArray);
  // caching Math.min wins some time on slower devices/browsers :
  var minfunc = Math.min;
  // ... but anyway min is faster when done with
  //                       a properly coded loop...
  // (http://jsperf.com/math-min-apply-vs-loop/3)

Be careful about what the array functions (or array-friendly functions) you use are actually doing, and if you happen to use one very frequently in a critical section of your code, ask yourself whether you can’t beat the JavaScript engine.
In quite a few not-so-special cases, you can.

Note about shift/unshift : beware, those are always O(n) operations (meaning : each operation takes a time proportional to the array’s length). Unless you really need to, you shouldn’t use them. Rather build your own rotating array if you need such a feature.

8. Could I never de-allocate my arrays ? Yes, you can !

A nice way to avoid allocation/de-allocation issues is to keep the array at a fixed size
and handle its pseudo-length separately. Let us call this the ‘sLength’ (stackLength)
of the array.
The code could look like :

  var myStack = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];
  var myStackLength = 0;
  // to do a push :
  myStack[myStackLength++] = value;
  // to do a pop :
  var val = myStack[--myStackLength]; // you might want to check
                                      // myStackLength before...

So here, if you choose the initial size well, you have a stack with a
zero memory cost for its whole life. If by any bad chance you run out of space,
no special code is needed : the array will grow (with a memory cost), then
stay at its new size. You might want to check, at the end of one game, the
highest stack length reached, to avoid any such re-allocation during the game.

Note for shift/unshift users : apply the same principle, with two indexes, to
avoid copies/re-allocations. One index on the left, one on the right, both starting
at the middle of the array. Then you’ll be back to O(1) time. Better.
Don’t forget to re-center the indexes when they become equal.
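A minimal sketch of that two-index idea (the function names, the fixed size, and the starting point are mine, not a standard API ; bounds checks and the re-centering step are omitted for brevity) :

```javascript
// a fixed-size buffer with two indexes growing away from the middle
var buffer = new Array(16);
var left  = 8;   // index of the first used slot
var right = 8;   // index just past the last used slot
                 // (the structure is empty when left === right)

function pushRight(v) { buffer[right++] = v; }    // like push()
function pushLeft(v)  { buffer[--left]  = v; }    // like unshift(), but O(1)
function popRight()   { return buffer[--right]; } // like pop()
function popLeft()    { return buffer[left++]; }  // like shift(), but O(1)

pushLeft(2); pushLeft(1); pushRight(3);
// buffer now holds 1, 2, 3 between left and right
var first = popLeft();     // 1
var count = right - left;  // 2 items remain
```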

I think you’ll easily figure out how to write min/max/indexOf/… for such arrays : just use
the stackLength instead of the array length in the loops.

Note about sLength :
you cannot add sLength as a property of your array (myArray.sLength = 3;),
because by doing so you’ll turn your Array ([]) into an Object ({}) : it is the same as writing myArray['sLength'] = 3. Expect the performance to drop dramatically if
you do so.

9. Some last words about Typed Arrays

In JavaScript, one thing that makes arrays slower is the fact that they are, like everything in JavaScript, untyped.
That’s why HTML5 introduced typed arrays, which are super fast since they are optimized for a given numeric type. I was very excited about that, but when benchmarking, or watching others’ benchmarks, I quickly saw that the performance gain was nil : at least Firefox and Chrome already recognize arrays that hold only integers, so there is no point in the extra coding for common cases.

A very interesting article is here :
http://www.html5rocks.com/en/tutorials/webgl/typed_arrays/

But those arrays are worth a look : maybe you have special needs that they can fulfill.

If you copy often, the set() method allows very fast copies, close to ten times faster than by hand.
( http://jsperf.com/typedarray-set-aliased-arrays ).
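A quick sketch of set() in action (the sizes and values here are just an example) :

```javascript
var src = new Uint8Array([10, 20, 30, 40]);
var dst = new Uint8Array(8);

// copy all of src into dst, starting at index 2,
// in one native (very fast) operation :
dst.set(src, 2);
// dst is now [0, 0, 10, 20, 30, 40, 0, 0]
```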

Another point of interest is the ‘clamped’ array, which saves you from testing every value you write. Again a special case, but the gain might be interesting.
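For instance, a Uint8ClampedArray silently clamps out-of-range writes to [0, 255], so the usual boundary tests disappear :

```javascript
var px = new Uint8ClampedArray(3);
px[0] = 300;    // clamped to 255 : no 'if (v > 255)' needed
px[1] = -20;    // clamped to 0
px[2] = 128.6;  // non-integers are rounded (here, to 129)
```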

Last example : since you can set different views on the same buffer, you can test some properties using one view, and do other computations with another view.

Example : imagine you want to apply a ‘tint’ effect to your canvas : if a pixel has R, G, B above a given value, make the color brighter. Imagine also that most pixels are black (0,0,0,0) : you can be much faster by testing for that first, using a 32-bit view on the image to detect the black pixels at once.

Here’s how you’ll do it :

var tintMe = function (ctx, sizeX, sizeY) {
    // let's retrieve the image :
    var myGetImageData = ctx.getImageData(0, 0, sizeX, sizeY);
    // get the image buffer
    var buff = myGetImageData.data.buffer;
    // now here's a clamped byte view on it
    var sourceBuffer8 = new Uint8ClampedArray(buff);
    // now an int32 view on it : one entry per pixel
    var sourceBuffer32 = new Int32Array(buff);

    var sum = 0;
    // iterate through pixels :
    for (var i = 0, j = 0, len = sourceBuffer32.length; i != len; i++, j += 4) {
        if (sourceBuffer32[i]) {  // process non-null pixels only
            // sum R + G + B
            sum = sourceBuffer8[j] + sourceBuffer8[j + 1] + sourceBuffer8[j + 2];
            if (sum > 300) { // bright enough ?
                // -> make it brighter.
                sourceBuffer8[j]     = sourceBuffer8[j]     + 10;
                sourceBuffer8[j + 1] = sourceBuffer8[j + 1] + 10;
                sourceBuffer8[j + 2] = sourceBuffer8[j + 2] + 10;
                // notice there's no need to test for values <= 255
                // since the view is clamped
            }
        }
    }
    // write the updated image back
    ctx.putImageData(myGetImageData, 0, 0);
};

you can test it here : http://jsbin.com/ugeloz/2

I didn’t take the time to benchmark it, but no doubt it will be incredibly faster.
Yet another great Javascript feature.

So that’s it : we’ve seen quite a few pitfalls to avoid, and now your arrays won’t stand in the way.

Don’t you feel better ?


19 Responses to Let’s get those JavaScript arrays to work fast

  1. I’ll Keep this in mind for my next project 😉

  2. thx, interesting tips here and nice writing style ^^

  3. Nick says:

    Good material, thanks for sharing the knowledge.

  4. Mark says:

    If you use arrays such as [x,y] or [[x1, y1], [x2, y2]] as keys of nodes in a BST which have to be inserted and removed frequently enough, then can you improve speed performance (and of course storage) by somehow creating these arrays without all their associated properties and/or methods?

    • First thing : do not worry at all about the numerous methods of an array : they are all stored once on the Array prototype, and do not affect performance / memory.
      For a BST, you’ll get the best performance by using, behind the scenes, a flat array, and replacing the pointers with indexes into that array. Use a growing array (meaning : handle the length yourself, and never reduce the size).
      Now if, within this array, you need small arrays to store 2D or 3D coordinates, be sure to use a pool of such objects, and allocate from / release to the pool every time you use them. Performance improvements are dramatic with a pool (your insertion/removal/… algorithms become plain basic algorithms that don’t touch the memory system).

  5. Mark says:

    If you want to used fixed array sizes and need to stack objects (i.e, pointers to objects) then would you set this up by creating an object like:

    var obj = new MyObject();

    Afterwards, you set up the array something like:
    var stack = [obj,obj,obj,obj,obj,obj];

    Is there something one should be aware of in this type of situation with JavaScript?

    • I really wish I could answer properly, but I’m not sure I understand.
      To answer anyway : if I need an array of objects, knowing its maximum size, I use a ‘growing array’, meaning an array + an int var storing the used length :
      var maxObjectCount = 50;
      var objects = new Array(maxObjectCount);
      var objectCount = 0;
      adding an object : objects[objectCount++] = someValue ;
      popping an object : if (objectCount > 0) someVar = objects[--objectCount];

      Obviously, a Class would allow implementing such a growing array in a neater way.
      Do not hesitate to tell me if you’d like to see the ‘clean’ OOP version.

      ( In fact I code in Google Dart now, so quite a few JavaScript issues seem like oddities to me, but I still have a warm feeling for JavaScript, so I wouldn’t mind writing that ‘Class’
      😉 ).

      • Mark says:

        Sorry I wasn’t very clear. LOL.

        I was curious about allocating space for an array of objects. In your examples you do it for integers:

        var myStack = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ];

        But I wanted to do it for objects and was wondering if there are specific Javascript issues one needs to be aware of when coding for speed.

        Thanks for the tip on Dart.

    • Ok, now I get it, and wonder why I didn’t 🙂
      The way you suggest is the best way to go : create a dummy item, and fill the array with this item. This way, the JS engine just can’t miss that it’s an ‘Array of SomeClass’, and it will optimize.
      (So obviously you need to handle a separate ‘actualLength’ var/property.)
      Don’t forget, when popping (pop()) or shortening the array, to reset the related array slots, to allow garbage collection to happen a.s.a.p.
      Use the dummy item to reset : that way, the type of the array stays as obvious as possible to the JS engine.

  6. Pingback: Real-World JavaScript Performance Tips

  7. Pingback: Real-World JavaScript Performance Tips | Voxxed

  8. sosh101 says:

    Looks like #1 no longer applies (at least in chrome) : http://jsperf.com/typed-array-vs-js-array-vs-object V informative article though, thanks.

  9. Pingback: JavaScript faster shift, unshift and splice implementation | news-rss feed form stackoverflow

  10. Thanks for the article, could you please show an example of improving the `splice` method too? I mean how can I do so that the array is not re-indexed every time if I e.g. add an element between two existent at index `i`, e.g. doing `myArray.splice(i, 0, element)`?

  11. Thanks for the article, could you please show an example of improving the `splice` method too? I mean how can I do so that the array is not re-indexed every time if I e.g. add an element between two existent at index `i`, e.g. doing `myArray.splice(i, 0, element)`?

    (I am sorry, maybe I am posting twice but I didn’t see the post after I clicked on “Post Comment”).

    • Thanks for your interest. The splice method seems difficult to improve, since it can be used in so many ways. For the special case of inserting one element at place k, no easy solution comes to mind that avoids (length - k) element copies, the disposal of one array and the allocation of a length+1 array. Reminder : be sure it is a real performance issue (== profile, easy in the Chrome developer tools for instance). If you have 500 such objects used in a video game’s A.I., you might have to optimize, but if you have a few objects in a simple mouse-driven app, …
      Also check that there’s no other way to store/retrieve your data that would avoid those inserts.

      An idea though, if you decide to optimize : instead of inserting, you could ‘push’ into a buffered array, and use a sort function that handles the array’s last unused null elements nicely, to put the item in the right place. This way you have a zero memory cost for your operation, and since sort is native, it might still be quite fast, although slower than splice, I fear. But you’d avoid waking the evil garbage collector …

      • TB says:

        If you want to insert an item into an array at an index, I think this would be a good way? There’s a lot of copying and it breaks the “don’t iterate backwards” rule, but it doesn’t allocate any temporary arrays.

        function insert(item, index, array) {
            var i = array.length;
            while (i-- >= index) array[i + 1] = array[i];
            array[index] = item;
            return array;
        }

      • I think you’re right. I didn’t benchmark it, but the other solutions I can think of (splice, slice+push+concat, new Array) would surely be more expensive.
