Web: Robust Click-through Tracking

I have a web service that provides recommendations. I want to know when people click on the links. The site showing the links (imagine a book store) is separate from my web service.

Let's imagine a situation. My server generates some recommendations. The site shows those recommendations. After 10 minutes, my server goes down because both of my datacenters go down. I want to know if the user clicks on a link, but if my server is down, that must not block the user from surfing to that link.

I see how Google does click-through tracking. It's simple, non-obtrusive, and effective. However, as far as I can tell, it requires the server to be up. Well, they're Google ;) It's different when you're a simple web service that must never ever cause the customer's site to stop working.

I came up with the following:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<html>
    <head>
        <title>Click-through Tracking</title>
        <script type="text/javascript">
            function click(elem) {
                (new Image()).src = 'http://localhost:5000/api/beacon';
                return true;
            }
        </script>
    </head>

    <body>
        <p>
            <a href="http://www.google.com"
               onclick="return click(this);">click me!</a>
        </p>
    </body>
</html>
Note a few things. It doesn't mess with the href. It works whether or not the third-party server (localhost in this example) is up. It does talk to a third-party server, but it does so using an image request, so the same-origin policy that restricts cross-site JavaScript requests doesn't apply. It has all the qualities I want, and I actually think it's a pretty clever trick. However, I'm worried.
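As a sketch of how the same image trick might carry more information, the beacon URL could include the clicked link plus a per-click value so that a cached response doesn't swallow repeat clicks. The names `buildBeaconUrl` and `trackClick`, and the query parameters `url` and `t`, are made up for illustration; the post itself only requests the bare endpoint.

```javascript
// Hypothetical helper: builds the beacon URL for one click.
// `base` is the tracking endpoint, `href` is the link the user clicked, and
// `nonce` is any value that differs per click (for example a timestamp).
// The parameter names `url` and `t` are assumptions, not part of the post.
function buildBeaconUrl(base, href, nonce) {
    return base +
        '?url=' + encodeURIComponent(href) +
        '&t=' + encodeURIComponent(String(nonce));
}

// Fire-and-forget: returns true, so the browser follows the link whether or
// not the beacon request ever completes. `ImageCtor` is normally `Image`; it
// is a parameter only so the function can be exercised outside a browser.
// It is not named `click` because, inside an inline onclick attribute, a bare
// `click` can resolve to the element's own click() method in current browsers.
function trackClick(elem, ImageCtor, nonce) {
    var img = new ImageCtor();
    img.src = buildBeaconUrl('http://localhost:5000/api/beacon', elem.href, nonce);
    return true;
}
```

In the page, this would be wired up as onclick="return trackClick(this, Image, new Date().getTime());". The race discussed next still applies to it.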

I like the fact that loading an image is asynchronous. I'm depending on that. However, what if it takes the browser 1 second to connect to my server, but only 0.1 seconds to move on to Google (because that's where the link points)? It's a race condition. As long as the browser makes the request at all, I'm fine. However, if it gives up on the request because DNS takes too long, I'm hosed.

Does anyone have any idea how the browsers will behave? Do my requirements make sense? Is there an easier way?

Comments

Jeff said…
You can use the image's onload property, but that does mean cancelling the link's default navigation and setting the browser's location yourself from the handler.
jjinux said…
What if the image never loads because the server is down?
Brett Hoerner said…
Is it OK to lose a few clicks here and there?

Basically, the way Reddit does it is very neat and built to be super-fast for the client (no change in user experience). But you will "lose" a click if a user clicks-through to another site and never comes back to yours.

http://www.reddit.com/r/programming/comments/66jmp/dear_reddit_devs_you_guys_are_brilliant_thanks/c02zsjw
Brett Hoerner said…
Hmm, no autolink on Blogger, lame.

How reddit tracks clicks.
jjinux said…
That's a great tip, thanks.

Unfortunately, it won't work in my particular case. Remember, we're a third-party recommendation server, so we don't see the user's cookies. We'd have to instrument code on all of our customers' servers to get access to those cookies, which would be a nightmare.

I definitely think this trick is worth keeping in mind, though.
Max Ischenko said…
Here on developers.org.ua we're using Google Analytics to track the clicks for us.

See http://www.developers.org.ua/static/js/ga-ext.js
jjinux said…
Thanks for the tip.
jjinux said…
Hmm, I think this problem suffers from the Heisenberg uncertainty principle ;)

I can have really reliable tracking or the ability for my tracking server to go down, but it's really hard to have both. I wish I could tell the Web browser, "Hey, I just set img.src. I know you're trying to go to another page. That's cool and all, but can you finish loading img.src too?"
jjinux said…
(Which is to say, I think my solution actually does suffer from the race condition that I hypothesized. It works sometimes, but sometimes the browser doesn't actually bother downloading the image.)
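Browsers later gained an API aimed at roughly this wish: `navigator.sendBeacon` asks the browser to queue a small request and deliver it even if the page is unloading. The following is only a sketch under some assumptions: `sendClickBeacon` is a hypothetical name, `nav` and `ImageCtor` stand in for `navigator` and `Image`, and the beacon endpoint would have to accept an HTTP POST, which the post does not say it does.

```javascript
// Sketch only: `nav` is the browser's `navigator` object and `ImageCtor` is
// `Image`; both are passed in so the function can be exercised outside a browser.
function sendClickBeacon(nav, ImageCtor, url) {
    // navigator.sendBeacon queues a POST that the browser tries to deliver
    // even while the page unloads; it returns false if it could not queue it.
    if (nav && typeof nav.sendBeacon === 'function' && nav.sendBeacon(url)) {
        return 'beacon';
    }
    // Fallback for browsers without sendBeacon: the original image trick,
    // which is still subject to the race condition discussed in the post.
    new ImageCtor().src = url;
    return 'image';
}
```

A click handler could call sendClickBeacon(navigator, Image, 'http://localhost:5000/api/beacon') and then return true, leaving the href and the normal navigation untouched.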
jjinux said…
Heh, prior art: http://www.webmasterworld.com/forum91/2420.htm
jjinux said…
Heh, I came up with a solution :-D

Here's a version of the click function that will ping the server if the server is up, but functions correctly if the server is not up:

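function click(elem) {
    var href = elem.href; // Avoid memory leaks.
    // Runs on either load or error, so the user still navigates when the server is down.
    function go(event) {
        location.href = href;
    }
    var img = new Image();
    img.addEventListener('load', go, true);
    img.addEventListener('error', go, true);
    img.src = 'http://localhost:5000/api/beacon';
    return false; // Cancel the default navigation; go() performs it instead.
}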
jjinux said…
This version assigns onload and onerror directly, which works on IE (older versions of IE don't support addEventListener), and it hopefully avoids memory leaks. I can't use a JavaScript framework because I'm a third-party service that must keep a really small footprint.

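function click(elem) {
    var href = elem.href;
    // Runs on either load or error, so the user still navigates when the server is down.
    function go(event) {
        location.href = href;
    }
    var img = new Image();
    img.onload = go;
    img.onerror = go;
    img.src = 'http://localhost:5000/api/beacon';
    img = null; // Avoid memory leaks.
    elem = null;
    return false; // Cancel the default navigation; go() performs it instead.
}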
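One case the version above doesn't cover is a beacon server that accepts the connection but never answers: neither load nor error fires until the browser gives up, and the user waits the whole time. A possible guard, sketched here with hypothetical names (`pingThenGo`, `clickWithTimeout`) and an arbitrary 500 ms budget that is not from the post, is to navigate on whichever comes first: load, error, or a timer.

```javascript
// Sketch only. `img` is an Image-like object, `go` performs the navigation,
// and `schedule` is an optional stand-in for setTimeout (useful for testing).
function pingThenGo(img, beaconUrl, go, timeoutMs, schedule) {
    var later = schedule || function (fn, ms) { return setTimeout(fn, ms); };
    var done = false;
    function finish() {
        if (done) { return; } // Navigate at most once.
        done = true;
        img.onload = img.onerror = null;
        go();
    }
    img.onload = finish;
    img.onerror = finish;
    later(finish, timeoutMs); // Give up waiting after timeoutMs.
    img.src = beaconUrl;
}

// Browser-side usage (hypothetical wiring; 500 ms is an assumed budget).
function clickWithTimeout(elem) {
    var href = elem.href;
    pingThenGo(new Image(), 'http://localhost:5000/api/beacon',
               function () { location.href = href; }, 500);
    return false;
}
```

The trade-off is the same one raised in the comments: a shorter timeout keeps the link snappy but loses more clicks when the tracking server is slow.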
jjinux said…
It works :-D
Unknown said…
I know you aren't doing it here, but be careful with new Image() on IE and appending that image into the DOM.

http://support.microsoft.com/default.aspx/kb/927917

http://clientside.cnet.com/code-snippets/manipulating-the-dom/ie-and-operation-aborted/
jjinux said…
Jon, good tip. Thanks!
