High Performance JavaScript techniques

High Performance JavaScript is a great introduction to performance considerations, but its coding guidelines will become outdated as JS implementations advance.

Variable scope

Each JS function has an internal scope chain that is fixed when the function is created. When the function executes, the JS engine creates an execution context that defines its execution environment. The context's scope chain starts as a copy of the function's scope chain, and an activation object holding the locals and arguments for this execution is pushed onto the front of it.

Identifiers near the front of the scope chain, such as local variables, resolve faster during execution. If an out-of-scope value or object property is read more than once, storing a reference in a local variable mitigates the depth penalty. The speedup is significant in older browsers and smaller in modern ones.
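
A minimal sketch of caching a deep lookup in a local variable. The names app, config and sumScaled are illustrative assumptions, not taken from the book.

```javascript
// Hypothetical nested object standing in for a value far from local scope.
const app = { config: { display: { scale: 2 } } };

function sumScaled(values) {
  // Resolve the property chain once and keep the result in a local.
  const scale = app.config.display.scale;
  let total = 0;
  for (let i = 0, len = values.length; i < len; i++) {
    total += values[i] * scale;
  }
  return total;
}
```

Inside the loop only locals are touched, so each iteration avoids walking the scope chain and the property chain again.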

DOM manipulation is costly

The DOM and JS implementations in a browser are often separate modules, so every call between them requires an expensive ‘bridge crossing’.

To improve performance, stay within ECMAScript as much as possible and consider converting HTMLCollections to arrays if they are going to be iterated over (always cache the length). Filter elements using native methods, and use CSS selector queries (querySelectorAll) rather than manual tree traversal with verbose checks. Old IE versions suffer significant performance degradation with :hover CSS selectors.
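
A sketch of copying an array-like collection into a real array, with the length read once. The helper name toArray is an assumption for illustration.

```javascript
// Copy an array-like collection (for example the result of
// getElementsByTagName) into a plain array so that later iteration
// stays inside the JS engine.
function toArray(collection) {
  const len = collection.length; // read once; a live collection may re-query on each access
  const result = new Array(len);
  for (let i = 0; i < len; i++) {
    result[i] = collection[i];
  }
  return result;
}
```

In a browser this might be called as toArray(document.getElementsByTagName("div")).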

Event delegation improves performance by reducing the number of event handlers. A single handler is added to the parent element and handles the events of its children as they bubble up. Propagation can be stopped if the event should not bubble further (stopPropagation, or cancelBubble in old IE), and the default action can be cancelled separately (preventDefault, or returnValue = false in old IE). The focus event does not bubble, so workarounds are required to delegate it in all browsers.
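
A minimal sketch of event delegation using the standard addEventListener API. The delegate helper and its matches predicate are hypothetical names, and old IE's attachEvent is not covered.

```javascript
// Attach one listener to `parent` and run `handler` only for events whose
// target satisfies `matches` (a predicate supplied by the caller).
function delegate(parent, type, matches, handler) {
  function listener(event) {
    if (matches(event.target)) {
      handler(event);
    }
  }
  parent.addEventListener(type, listener, false);
  return listener; // returned so the caller can remove it later
}
```

A call such as delegate(list, "click", function (el) { return el.tagName === "LI"; }, onItemClick) would replace one handler per list item with a single handler on the list.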

Causes of DOM reflows

  • Visible DOM elements are added or removed
  • Elements change position/size
  • Content changes
  • Initial page renders
  • Window resize

Reflow prevention

Avoid flushing the render queue to improve performance. Browsers batch pending layout changes, but the queue is flushed when offset*/scroll*/client* properties are read (or getComputedStyle is called) after the DOM has been manipulated.

When performing multiple changes it is more efficient to work outside the live document: build the new nodes in a DocumentFragment (or hide or detach the element first), apply all the changes, then append the result to the main document in one operation.
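
A sketch of batching insertions through a DocumentFragment. The function name appendItems and its parameters are assumptions; doc is expected to be the document and list an element already in the page.

```javascript
// Build all new <li> nodes in a fragment, then touch the live list once.
function appendItems(doc, list, labels) {
  const fragment = doc.createDocumentFragment();
  for (let i = 0, len = labels.length; i < len; i++) {
    const item = doc.createElement("li");
    item.appendChild(doc.createTextNode(labels[i]));
    fragment.appendChild(item);
  }
  list.appendChild(fragment); // single insertion into the live DOM
  return list;
}
```

However many labels are passed, the live document sees one appendChild call.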

If performing animation, take the animated element out of the flow of the page (for example with absolute positioning) so that the rest of the document is only reflowed once.

Code performance

Algorithms and loops

Lookup tables implemented with JS objects are extremely fast. Of the four loop constructs (for, while, do-while and for-in), for-in is the only one that is significantly slower. Caching array.length can provide performance improvements.
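
A small sketch combining an object lookup table with a cached array length. The table contents and the name describeAll are illustrative assumptions.

```javascript
// Lookup table: map a key straight to its result instead of walking
// an if/else or switch chain.
const STATUS_TEXT = { 200: "OK", 204: "No Content", 404: "Not Found" };

function describeAll(codes) {
  const out = [];
  for (let i = 0, len = codes.length; i < len; i++) { // length read once
    out.push(STATUS_TEXT[codes[i]] || "Unknown");
  }
  return out;
}
```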

Duff’s Device decreases the number of loop iterations and increases performance (significantly past 1000 iterations) at the expense of making the function harder to read. On the other hand, functional iteration using arr.forEach is more readable but can take up to 8x the time to execute. It is a trade-off between readability/maintainability and performance.
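
A sketch of a Duff's Device style loop that unrolls eight calls per iteration. This is one possible formulation written for these notes, not the book's listing; duffForEach is a hypothetical name.

```javascript
// Process every item in order, handling the remainder first and then
// running the rest in unrolled blocks of eight.
function duffForEach(items, process) {
  const len = items.length;
  let i = 0;
  let remainder = len % 8;
  while (remainder--) {
    process(items[i++]);
  }
  let blocks = Math.floor(len / 8);
  while (blocks--) {
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
    process(items[i++]);
  }
}
```

The loop condition is checked once per eight items rather than once per item, which is where the saving comes from.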

The call stack limit is only 500 in Safari 3.2. Avoid recursive functions (including tail-recursive ones) for high iteration counts and opt for loops instead. As in most languages, memoization is a useful technique to reduce execution time by calculating each result only once.
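
A minimal memoization sketch for single-argument functions, paired with a loop-based factorial. The names memoize, factorial and fastFactorial are illustrative.

```javascript
// Wrap `fn` so that each distinct argument is computed only once.
function memoize(fn) {
  const cache = new Map();
  return function (n) {
    if (!cache.has(n)) {
      cache.set(n, fn(n));
    }
    return cache.get(n);
  };
}

// Loop-based factorial: no recursion, so no call stack limit to hit.
function factorial(n) {
  let result = 1;
  for (let i = 2; i <= n; i++) {
    result *= i;
  }
  return result;
}

const fastFactorial = memoize(factorial);
```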

Strings

Array.join is the only efficient way to concatenate a large number of strings in IE7 and earlier, where native string concatenation has quadratic running time; in most other browsers the + and += operators are at least as fast. Use native string functions to search for literal strings instead of regular expressions. The YUI compressor performs string folding at build time, improving performance and reducing code size.
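
A small sketch of two of the string techniques mentioned above: building a string with Array.join and searching for a literal with indexOf rather than a regular expression. The function names are assumptions.

```javascript
// Collect the pieces in an array and join once at the end.
function buildList(items) {
  const parts = [];
  for (let i = 0, len = items.length; i < len; i++) {
    parts.push("<li>" + items[i] + "</li>");
  }
  return "<ul>" + parts.join("") + "</ul>";
}

// Literal search with a native string method instead of a regex.
function containsLiteral(text, needle) {
  return text.indexOf(needle) !== -1;
}
```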

With regular expressions, runaway backtracking can destroy performance (consider (A+A+)+B), especially with nested or non-greedy quantifiers. Atomic groups prevent backtracking; some regex flavors support them natively as (?>regex), while in JavaScript they have to be emulated with (?=(regex))\1. Many performance problems can be prevented by ensuring that two parts of a regex cannot match the same parts of a string (use an emulated atomic group as a last resort). Make the regex fail faster and reduce alternation (use [cb]at instead of cat|bat).
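
A sketch of the emulated atomic group and of removing nested quantifiers. The digit patterns are illustrative, and anchors have been added to the (A+A+)+B example so the two forms can be compared on whole strings.

```javascript
// Backtracking version: \d+ gives back a digit so the trailing \d can match.
const greedy = /\d+\d/;

// Emulated atomic group: the lookahead captures the digits, \1 consumes
// them, and the engine cannot backtrack into the completed lookahead.
const atomic = /(?=(\d+))\1\d/;

// (A+A+)+B accepts two or more A's followed by B. AA+B accepts the same
// strings without nested quantifiers, so a failed match cannot explode.
const risky = /^(A+A+)+B$/;
const safe = /^AA+B$/;
```

Here greedy.test("123") succeeds while atomic.test("123") fails, because the atomic version refuses to give back the last digit.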

HPJ goes through a number of String.trim alternatives and proposes a hybrid solution. Browser performance characteristics differ so wildly that I recommend browser detection to pick between the solutions. Lazy loading replaces the method on first use so that subsequent calls only use the specialized version; conditional advance loading does the detection and replacement up front, before first use.
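
A sketch of lazy loading and conditional advance loading applied to trim. It uses feature detection on String.prototype.trim as a stand-in for the browser detection mentioned above, and regexTrim is a simple fallback written for this example rather than the book's hybrid.

```javascript
// Fallback used only when String.prototype.trim is missing.
function regexTrim(text) {
  return text.replace(/^\s+/, "").replace(/\s+$/, "");
}

function nativeTrim(text) {
  return text.trim();
}

// Lazy loading: the first call picks an implementation and overwrites
// `trim`, so later calls skip the detection entirely.
let trim = function (text) {
  trim = typeof String.prototype.trim === "function" ? nativeTrim : regexTrim;
  return trim(text);
};

// Conditional advance loading: the same check, done once up front.
const trimEager =
  typeof String.prototype.trim === "function" ? nativeTrim : regexTrim;
```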

Browser threading

Finish JavaScript tasks quickly, because most browsers stop queuing tasks for the UI thread while JavaScript is executing. A single JS call should therefore take no longer than 100ms (a limit taken from Miller’s usability research). If the work cannot be completed within that time, yield control to the UI thread using timers with a delay of at least 25ms, which allows for most UI updates.
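
A sketch of splitting a long task into timed slices that yield between runs. The function processInChunks, the 50ms default budget and the schedule hook are assumptions made for this example; the default hook uses a 25ms timer.

```javascript
// Process `items` in slices of at most `budgetMs`, handing control back
// between slices. `schedule` decides how the next slice is queued.
function processInChunks(items, process, done, budgetMs = 50,
                         schedule = (fn) => setTimeout(fn, 25)) {
  const queue = items.slice(); // work on a copy
  function runSlice() {
    const start = Date.now();
    do {
      process(queue.shift());
    } while (queue.length > 0 && Date.now() - start < budgetMs);
    if (queue.length > 0) {
      schedule(runSlice); // yield to the UI thread, then continue
    } else if (done) {
      done();
    }
  }
  if (queue.length > 0) {
    schedule(runSlice);
  } else if (done) {
    done();
  }
}
```

Each slice stays well under the 100ms limit, and the timer gap gives the browser a chance to repaint and handle input.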

Web workers are separate processing threads for JS with no ties to the browser UI.

Data transmission

To cache responses, use GET and set an Expires header (RFC 2616 recommends a date no more than one year in the future). In Apache this can be set with the ExpiresDefault directive. It is also possible to keep a local cache in JS that maps each URL to its response object.
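
A minimal sketch of a local cache keyed by URL. The names responseCache and cachedGet are assumptions, and load stands for whatever function actually performs the request.

```javascript
// URL -> response object. Entries live until the page is unloaded.
const responseCache = new Map();

function cachedGet(url, load) {
  if (!responseCache.has(url)) {
    responseCache.set(url, load(url)); // only the first request reaches the loader
  }
  return responseCache.get(url);
}
```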

AJAX requests

XMLHttpRequest allows asynchronous sending/receiving of data. The server response can be read while it is still being transferred by listening for the readystatechange event and checking for readyState 3. Use GET for idempotent requests that can be cached. POST requires at least two packets (headers first, then the body) but is necessary when the URL and parameters exceed roughly 2048 characters, beyond which some browsers truncate the URL.

Dynamic script tag insertion can be used to request data from a server on a different domain.

Multipart XHR uses base64 encoding, in conjunction with a delimiter not present in base64, to provide multiple resources from one HTTP request. readyState can be used to process completed sections in the XHR response before the entire response is downloaded. None of the resources can be cached by the browser.
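
A sketch of incrementally splitting a multipart response. The wire format is an assumption: base64 payloads separated by the control character \u0001. The name createMultipartReader is hypothetical.

```javascript
// \u0001 never appears in base64 output, so it is safe as a separator.
const DELIMITER = String.fromCharCode(1);

function createMultipartReader(onPart) {
  let consumed = 0; // how much of responseText has already been handed out
  return function read(responseText, isComplete) {
    let end;
    while ((end = responseText.indexOf(DELIMITER, consumed)) !== -1) {
      onPart(responseText.slice(consumed, end));
      consumed = end + 1;
    }
    if (isComplete && consumed < responseText.length) {
      onPart(responseText.slice(consumed)); // final part has no trailing delimiter
      consumed = responseText.length;
    }
  };
}
```

In a readystatechange handler this could be called as read(xhr.responseText, xhr.readyState === 4) whenever readyState is 3 or 4, so finished parts are processed before the download completes.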

Beacons are Image objects used to send GET data to the server. They have very little overhead, but response data is limited to the image's width/height. Usually there is no response body at all; the server replies with a 204 No Content status instead.
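
A sketch of an image beacon. The function name sendImageBeacon is an assumption, and the image constructor is passed in as a parameter so the example does not depend on a browser; in a page the caller would pass Image.

```javascript
// Encode `params` into a query string and request it through an image.
function sendImageBeacon(url, params, ImageCtor) {
  const pairs = [];
  for (const key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(params[key]));
    }
  }
  const beacon = new ImageCtor();
  beacon.src = url + "?" + pairs.join("&"); // in a browser, assigning src triggers the GET
  return beacon;
}
```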

Data formats

Favor lightweight formats.

XML is extremely verbose and slow to parse. It has no place in high performance AJAX. Parsing XML can be made faster by using XPath instead of DOM methods like getElementsByTagName.

JSON is the cornerstone of high-performance AJAX. HPJ presents different JSON formats: verbose, simple and array. Of these, array is the fastest but least readable. Custom (non-JSON) formats are faster still but require custom parsing for each format.
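
A sketch of the same record in the three JSON shapes. The field names and the helper fromArrayForm are illustrative assumptions rather than the book's sample data.

```javascript
// Verbose: full property names on every record.
const verbose = [{ id: 1, username: "alice", realname: "Alice Smith" }];

// Simple: the same structure with abbreviated keys.
const simple = [{ i: 1, u: "alice", n: "Alice Smith" }];

// Array: no keys at all; position carries the meaning.
const arrayForm = [[1, "alice", "Alice Smith"]];

// Rebuild readable objects from the array form on the client.
function fromArrayForm(rows) {
  const users = [];
  for (let i = 0, len = rows.length; i < len; i++) {
    users.push({ id: rows[i][0], username: rows[i][1], realname: rows[i][2] });
  }
  return users;
}
```

The array form is the smallest on the wire, at the cost of a decoding step that must agree with the server on column order.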

Loading JS into the page

Place <script> elements at the bottom of the <body>, because browsers block page rendering while a script is downloaded and executed, and nothing is displayed until the head has been completely loaded. JS also requires up-to-date style information, so preceding CSS imports can prevent asynchronous downloading. The defer attribute prevents blocking: deferred scripts are downloaded asynchronously and executed once the document has been parsed, before the onload event.

Dynamically created <script> elements begin downloading when added to the DOM, without blocking other page processes. Add them to the head to prevent IE throwing exceptions. When the code is ready the load event is fired (IE uses readystatechange and the readyState property instead); see the book for details about these events.
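
A sketch of dynamic script loading with both completion paths. The function name loadScript and the doc parameter (normally the document) are assumptions for this example.

```javascript
// Create a <script> element, wire up a completion callback, and start
// the download by adding the element to the head.
function loadScript(doc, url, callback) {
  const script = doc.createElement("script");
  script.type = "text/javascript";
  if (script.readyState) {
    // Old IE: no load event, so watch readyState instead.
    script.onreadystatechange = function () {
      if (script.readyState === "loaded" || script.readyState === "complete") {
        script.onreadystatechange = null; // avoid firing twice
        callback();
      }
    };
  } else {
    script.onload = function () {
      callback();
    };
  }
  script.src = url;
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}
```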

The recommended nonblocking pattern is to first download a small piece of code that can load JS dynamically, then use it to download and execute the rest of the JavaScript. This reduces the perceived loading time of the page. Loaders in YUI 3, LazyLoad and LABjs take different approaches and are worth learning.

Building and deploying

Common portable build tools are Apache Ant and make. Apache Ant can be used to concatenate all the JavaScript code into a single file. The C preprocessor can be used with JavaScript to remove debug statements before release. YUI compressor can be used for JavaScript minification. The Closure Compiler from Google (2009) has advanced optimizations.

Packer provides decompression at runtime but this incurs a fixed cost of 300ms. smasher preprocesses JS files in PHP.

(read October 2011)

