synchronous vs. asynchronous tags - what’s the big deal?

Sep 21, 2012 - Krux Engineering

Today's guest blog is from the Krux engineering team.  Read on...

If the web is a human body, JavaScript tags – the bits of code governing the execution of web pages – are like its nerve endings.  They are the means by which websites sense, respond, execute, measure, and remember.  They issue commands to load most page content and deliver nearly every advertising message.  Further, they serve as the gateways through which information is shared: across web pages; between different websites; and among the countless media companies, marketers, and service providers involved in creating a user’s web experience.
But all that activity comes with a cost.  Often, tag activity slows down pages, which can have a detrimental effect on user experience and the web operator’s bottom line.  Don’t believe me?  Ask these guys:

  • Amazon has reported that a 0.1 second delay in page load time can translate into a 1% drop in customer activity.
  • AOL observed a ~30% variance in page views/visit based on load time alone – slow pages drive users away. 
  • Perhaps most importantly, page load time is an increasingly important factor in the Google and Bing search rankings, a critical consideration for any web-enabled business.   

In this post, I aim to provide some valuable context and perspective to help you understand tag serving challenges, the dynamics at play, and potential solutions.  And if we’re lucky, you may even come away with some small talk fodder so you can look smart at your next tech or media industry cocktail party. 

About The Problem

As any front-end performance guru will attest, JavaScript is the #1 cause of slow web pages.  Why?  Because it's blocking.  That's geek speak for “preventing everything else after it from happening.”  The impact of that blocking is illustrated in the waterfall graph below, showing the behind-the-scenes requests for all the assets needed on a given page. 



The key takeaway is that while the JavaScript is loading, nothing else is happening.  This means that users are stuck waiting for that JavaScript to load, and other content won't load until it’s downloaded and parsed.  Browsers render page elements, including tags, synchronously, meaning one page element can’t load until the one before it has.  In the case above, one JavaScript file held up the rendering of the page for nearly 1.5 seconds.  And as noted earlier, 1.5 seconds may not sound like much, but it can have an enormous impact on the bottom line.

You've probably heard some chatter about the potential for asynchronous tags in mitigating the risk of JavaScript-related page slow down.  That’s basically a process by which JavaScript calls are parallelized, eliminating the blocking dynamic illustrated above.  There is potential there, particularly for JavaScript tags that call non-visible page elements, such as measurement or analytics tags.  For visible page elements, such as images, content, or ads, things get more complicated.  Let’s explore what synchronous and asynchronous really mean within the context of web browsers.

Pro tip: You can analyze your own pages at webpagetest.org, and you will be able to generate snazzy waterfall graphs like this one, as well as gain other insight on what is slowing down your pages.
But wait, there’s more… The problem of synchronous calls creates challenges on the server as well.  For an introduction to the benefits of non-blocking, event-driven programming concepts, view Ryan Dahl’s introduction to node.js.


So, why do browsers rely on synchronous loading?  Shouldn't they parallelize the calls?  If they load them asynchronously – making multiple requests simultaneously – won't they be able to prevent these kinds of roadblocks? 

Oh, they'd love to.  They can't because of an arcane JavaScript construct called document.write.  It’s a method for inserting something into a web page, such as a string of text or a tag that calls for some page element to be loaded (like an ad creative or an image).  It’s a handy tool and widely used, but there’s a catch:  document.write expects to alter the page inline, inserting content as the page is rendering.

This means that the use of document.write requires, almost without exception, that all page elements be loaded synchronously. 

To illustrate, consider the following example of instructions given to the browser.  Here, we’re inserting the word “Nick” between two blocks of content on the page. 

<html> content that comes before, <script>document.write('Nick')</script>, content that comes after</html>

We’d expect the resulting output to look like:

[content that comes before], Nick, [content that comes after]

That’s precisely what document.write was designed for:  the inline, ad hoc insertion of some content or page element.  But what if the document.write is in a remote script, one that calls an external source to provide an element that is being rendered?  Example:

<html> content that comes before, <script src="remote.js"></script>, content that comes after</html>

And remote.js contains:

document.write("Remote Nick");

Still not a problem, as long as the browser blocks and waits for remote.js to come back.  If executed synchronously (or, sequentially), these instructions would result in the following:

[content that comes before], Remote Nick, [content that comes after]

However, if we execute asynchronously (or, non-sequentially), it will take a while to complete that remote.js round trip, and we might end up with something that looks like this:

[content that comes before], [content that comes after], Remote Nick

This isn’t what we wanted at all.  The browser loaded elements as they were returned, and given the nature of the document.write construct, when remote.js responded, it inserted ‘Remote Nick’ in the wrong place on the page. 
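
The two outcomes above can be reproduced with a toy model that runs outside a browser.  This is only a sketch under stated assumptions: renderPage, the page array, and the rule that a non-blocking script "arrives" after everything else has been processed are all invented for illustration, and a real HTML parser is far more involved.  Text parts are appended in order, and a script part calls a write function that appends at whatever point the page has reached when the script finally runs.

```javascript
// Toy model of a page: an ordered list of text parts and script parts.
// All names here (page, renderPage, run) are hypothetical.
const page = [
  { type: 'text', value: '[content that comes before]' },
  { type: 'script', src: 'remote.js', run: (write) => write('Remote Nick') },
  { type: 'text', value: '[content that comes after]' },
];

// blocking = true:  the parser stops at each script until it has run.
// blocking = false: the parser keeps going; in this model a script only
//                   runs after all remaining parts have been processed.
function renderPage(parts, blocking) {
  const output = [];
  const pending = [];
  // Like document.write, this appends at the current position.
  const write = (text) => output.push(text);

  for (const part of parts) {
    if (part.type === 'text') {
      output.push(part.value);
    } else if (blocking) {
      part.run(write);      // wait for the script, then continue
    } else {
      pending.push(part);   // do not wait; run it when it comes back
    }
  }
  for (const script of pending) {
    script.run(write);
  }
  return output.join(', ');
}

console.log(renderPage(page, true));
// [content that comes before], Remote Nick, [content that comes after]
console.log(renderPage(page, false));
// [content that comes before], [content that comes after], Remote Nick
```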

Now, imagine that ‘Remote Nick,’ instead of being a couple of random words, was actually an advertisement that got mangled as the page loaded.  That’s something with real dollars-and-cents implications for web publishers and the advertisers that help keep the lights on.  Ultimately, the risks web operators face are too great, and the use of document.write too pervasive, for browsers to load scripts asynchronously by default.  Unless the browser is guaranteed that the remote script does not contain document.write, it must block and wait for the return to ensure that the web page renders properly.

So, if document.write is so terrible, why don't we just get rid of it altogether?  A lot of people would love to.  It slows down script execution, it complicates asynchronous loading, and it can even completely blank the page in older browsers.  While document.write is considered bad practice, and there are viable alternatives out there (e.g., element creation, innerHTML, and libraries), it’s deeply rooted in our Internet’s infrastructure, especially in the advertising technology ecosystem.  Until the entire ad stack – from ad servers, to RTB platforms, all the way to each and every ad creative – can guarantee that document.write won’t be used, browsers are stuck blocking.  In short, document.write is not going away anytime soon. 
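
For simple cases, the element-creation alternative mentioned above might look roughly like the sketch below.  The function name insertText and the nick-slot id are made up for this example; getElementById, createTextNode, and appendChild are standard DOM methods.  Because the script names the exact node it wants to change, the text lands in the right spot regardless of when the script runs, as long as the placeholder is already in the page.

```javascript
// Writes text into a named placeholder instead of "wherever the parser is".
// insertText and the slot id are hypothetical; the DOM calls are standard.
function insertText(doc, slotId, text) {
  const slot = doc.getElementById(slotId);
  if (!slot) {
    return false;   // placeholder missing: do nothing
  }
  slot.appendChild(doc.createTextNode(text));
  return true;
}

// In a browser, with <span id="nick-slot"></span> somewhere in the page:
if (typeof document !== 'undefined') {
  insertText(document, 'nick-slot', 'Remote Nick');
}
```

Real ad creatives are rarely this simple, which is part of why this alternative has not displaced document.write in the ad stack.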

Pro tip:  When you load JavaScript synchronously from a third party, what happens to your site when that third party is down, or simply very, very slow?  As you have probably guessed by now, it blocks the rest of the content from loading.  If you're running a third party’s tag synchronously, you are well advised to ask questions about the quality and speed of their infrastructure. 

When it comes to synchronous loading, their performance is your performance.


Potential Solutions

A lot of neurons are being pointed at this challenge, and while progress is being made, most of the solutions that are emerging aren’t quite there yet. 

Browser manufacturers have tried to solve this problem.  With HTML5, browsers have added extra attributes to <script> tags that allow the developer to specify that the script is safe to be downloaded asynchronously.  The catch?  This only works if the call is guaranteed to not use document.write (which, as previously mentioned, is not feasible given its broad use, particularly in the ad tech space).
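
The best known of these attributes is async.  The same effect is commonly achieved from JavaScript by creating the script element dynamically, as in the sketch below.  loadScriptAsync is a name invented for this example and the URL in the usage comment is a placeholder; the DOM calls themselves are standard.  A script loaded this way must not call document.write, for the reasons described earlier.

```javascript
// Adds a <script> element that downloads without blocking the parser.
// loadScriptAsync is a hypothetical helper; the DOM calls are standard.
function loadScriptAsync(doc, src) {
  const script = doc.createElement('script');
  script.async = true;   // explicit: do not block while this downloads
  script.src = src;

  // Insert next to the first existing script, or fall back to <head>.
  const first = doc.getElementsByTagName('script')[0];
  if (first && first.parentNode) {
    first.parentNode.insertBefore(script, first);
  } else {
    doc.head.appendChild(script);
  }
  return script;
}

// Usage in a browser (placeholder URL):
// loadScriptAsync(document, 'https://example.com/analytics.js');
```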

Tag writing libraries have also sprung up to try to address this problem.  They promise to convert scripts to load asynchronously safely.  Some of them have made valiant attempts, but there always seem to be pretty big tradeoffs. 

  • Head.js showed promise for a while, but their solution mandates that content is cached (because they use what is called a pre-loading hack); that just won’t do for the advertising ecosystem, with decisions happening in real-time about which ad appears where. 
  • More recently an open source project called writeCapture has garnered some attention, but at Krux, we’ve found some edge cases where content can be malformed given the current approach. 

These types of tag-writing approaches, which override document.write, need to be handled very carefully.  It’s difficult – almost impossible – to process asynchronous streams of data flowing in, manage the original intended placement, keep the HTML intact and well formed, and do so across all browsers.
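
To make the idea concrete, here is a deliberately naive sketch of the override technique.  It is not how writeCapture or any other library is implemented, and captureWrites is a hypothetical name.  It swaps in a replacement for write while a script runs, collects the output in a buffer, and restores the original afterwards.  The stand-in document object at the bottom exists only so the example runs outside a browser.

```javascript
// Runs a script with write() redirected into a buffer, then restores it.
// captureWrites is a hypothetical helper, not a real library API.
function captureWrites(doc, runScript) {
  const originalWrite = doc.write;
  const buffer = [];
  // document.write accepts several strings and concatenates them.
  doc.write = (...parts) => {
    buffer.push(parts.join(''));
  };
  try {
    runScript();
  } finally {
    doc.write = originalWrite;   // always restore, even if the script throws
  }
  return buffer.join('');
}

// Stand-in for the browser's document, for illustration only.
const standInDocument = {
  write() {
    throw new Error('the real write should not be called during capture');
  },
};

const captured = captureWrites(standInDocument, () => {
  standInDocument.write('<b>Remote ', 'Nick</b>');
});
console.log(captured);   // <b>Remote Nick</b>
```

The sketch only sees writes made synchronously inside the callback.  Scripts that write partial tags, write further script tags, or write after a delay are not handled, which hints at why the edge cases are so hard to get right.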

How about iframes?  If the ad serving stack is part of the problem, what about just using iframes for the delivery of all the ads?  Inline frames are a potential solution, because they create a parallel execution path for browsers.  In fact, there are a lot of benefits with iframes when delivering ads and other third-party content.

But, every seasoned Ad Operations professional will tell you about the woes of iframes.  The simple truth is that many ad formats require direct access to the top level page, such as expandable ads, rollovers, roadblocks, and ads with advanced contextual or dynamic targeting.  Today, many website operators are faced with a terrible dilemma – accepting the user experience implications of slow pages or not providing paying customers with a guarantee that ad campaigns will deliver correctly.

Sidebar:  Google’s new magic Asynchronous Implementation in the Google Publisher Tag?  All it’s doing behind the scenes is using iframes.  Want to deliver creatives that can’t be inside iframes?  Those won’t work asynchronously, and you will have to use their synchronous implementation.  This was surprisingly hard to find details on, but their help system explains it.
Net:  Before choosing Google’s asynchronous implementation, make sure that all of your creatives will work inside an iframe.

Synchronous Tag Writing, Redefined

Krux’s approach to writing tags in SuperTag solves these problems.  We’ve been at this for a while now, and we’ve come up with an approach that takes the learnings and best practices from work to date and adds a fair amount of our own brand of Krux magic.  We use something we call Dom Proxies to buffer, stream, and ultimately write the tags, leveraging the native behavior of the browser.  The benefits? 

  • It works with five 9s of reliability, delivering page elements correctly.
  • We automagically convert synchronous scripts to asynchronous, allowing the rest of your content to load, unblocked.
  • We don’t use iframes, so almost any ad type or content element is supported. 

We think SuperTag is the best tag management solution on the market, and we’re planning a follow-up post to go a level deeper on why that’s so.  So, stay tuned…

 
