
Web: Robust Click-through Tracking

I have a web service that provides recommendations. I want to know when people click on the links. The site showing the links (imagine a book store) is separate from my web service.

Let's imagine a situation. My server generates some recommendations. The site shows those recommendations. After 10 minutes, my server goes down because both of my datacenters go down. I want to know if the user clicks on a link, but if my server is down, that must not block the user from surfing to that link.

I see how Google does click-through tracking. It's simple, non-obtrusive, and effective. However, as far as I can tell, it requires the server to be up. Well, they're Google ;) It's different when you're a simple web service that must never ever cause the customer's site to stop working.

I came up with the following:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<html>
  <head>
    <title>Click-through Tracking</title>
    <script type="text/javascript">
      function click(elem) {
        (new Image()).src = 'http://localhost:5000/api/beacon';
        return true;
      }
    </script>
  </head>

  <body>
    <p>
      <a href="http://www.google.com"
         onclick="return click(this);">click me!</a>
    </p>
  </body>
</html>
Note a few things. It doesn't mess with the href. It works whether or not the third-party server (localhost in this example) is up. It does talk to a third-party server, but it does so with an image request, so the same-origin restrictions on cross-site JavaScript requests don't apply. It has all the qualities I want, and I actually think it's a pretty clever trick. However, I'm worried.
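
One gap in the page above: the beacon always requests the same URL and never says which link was clicked, and a browser may answer a repeated image URL from its cache without contacting the server. Here is a minimal sketch of one way to address both. The helper `buildBeaconUrl` and the query parameter names `href` and `t` are assumptions for illustration, not part of the original endpoint. The handler is named `trackClick` rather than `click` because, in current browsers, a bare `click` inside an inline `onclick` attribute can resolve to the element's own `click()` method.

```javascript
// Hypothetical helper: the parameter names "href" and "t" are assumptions,
// not something the original beacon endpoint is known to accept.
function buildBeaconUrl(base, href, now) {
  // "now" is injectable so the function can be checked deterministically.
  var t = (now === undefined) ? new Date().getTime() : now;
  return base +
    '?href=' + encodeURIComponent(href) + // Which link was clicked.
    '&t=' + t;                            // Makes each request URL unique.
}

// Same fire-and-forget idea as the page above; only the URL changes.
function trackClick(elem) {
  (new Image()).src =
    buildBeaconUrl('http://localhost:5000/api/beacon', elem.href);
  return true; // Let the browser follow the link as usual.
}
```

It would be wired up the same way as before: `onclick="return trackClick(this);"`.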

I like the fact that loading an image is asynchronous. I'm depending on that. However, what if it takes the browser 1 second to connect to my server, and only 0.1 seconds to move on to Google (the link's target)? It's a race condition. As long as the browser makes the request at all, I'm fine. However, if it gives up on the request because DNS takes too long, I'm hosed.

Does anyone have any idea how the browsers will behave? Do my requirements make sense? Is there an easier way?
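
For the race described above, browsers that implement `navigator.sendBeacon` offer a possible way out: the call queues a small POST that the browser tries to deliver even while the page is unloading, and it returns `false` if the request could not be queued. The sketch below tries that first and falls back to the image trick. It assumes the beacon endpoint also accepts POST, which the original post does not say. The `nav` and `makeImage` parameters are hypothetical hooks that exist only so the function can be exercised outside a browser.

```javascript
// A sketch, assuming a browser that may or may not implement sendBeacon.
// Returns which transport was used: 'beacon' or 'image'.
function sendClickBeacon(url, nav, makeImage) {
  nav = nav || (typeof navigator !== 'undefined' ? navigator : null);
  if (nav && typeof nav.sendBeacon === 'function') {
    // sendBeacon returns true when the request was queued for delivery.
    if (nav.sendBeacon(url)) {
      return 'beacon';
    }
  }
  // Fallback: the fire-and-forget image request from the original page.
  var img = makeImage ? makeImage() : new Image();
  img.src = url;
  return 'image';
}

function trackClickWithBeacon(elem) {
  sendClickBeacon('http://localhost:5000/api/beacon');
  return true; // Never block the navigation.
}
```

Either way the handler returns `true`, so a tracking server that is down cannot stop the user from following the link.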

Comments

Jeff said…
You can use the image's onload property, but that does mean setting the browser's location.
jjinux said…
What if the image never loads because the server is down?
Brett Hoerner said…
Is it OK to lose a few clicks here and there?

Basically, the way Reddit does it is very neat and built to be super-fast for the client (no change in user experience). But you will "lose" a click if a user clicks-through to another site and never comes back to yours.

http://www.reddit.com/r/programming/comments/66jmp/dear_reddit_devs_you_guys_are_brilliant_thanks/c02zsjw
Brett Hoerner said…
Hmm, no autolink on Blogger, lame.

How reddit tracks clicks.
jjinux said…
That's a great tip, thanks.

Unfortunately, it won't work in my particular case. Remember, we're a third-party recommendation server, so we don't see the user's cookies. We'd have to instrument code on all of our customers' servers to get access to those cookies, which would be a nightmare.

I definitely think this trick is worth keeping in mind, though.
Max Ischenko said…
Here on developers.org.ua we're using Google Analytics to track the clicks for us.

See http://www.developers.org.ua/static/js/ga-ext.js
jjinux said…
Thanks for the tip.
jjinux said…
Hmm, I think this problem suffers from the Heisenberg uncertainty principle ;)

I can have really reliable tracking or the ability for my tracking server to go down, but it's really hard to have both. I wish I could tell the Web browser, "Hey, I just set img.src. I know you're trying to go to another page. That's cool and all, but can you finish loading img.src too?"
jjinux said…
(Which is to say, I think my solution actually does suffer from the race condition that I hypothesized. It works sometimes, but sometimes the browser doesn't actually bother downloading the image.)
jjinux said…
Heh, prior art: http://www.webmasterworld.com/forum91/2420.htm
jjinux said…
Heh, I came up with a solution :-D

Here's a version of the click function that will ping the server if the server is up, but functions correctly if the server is not up:

function click(elem) {
  var href = elem.href;  // Avoid memory leaks.
  function go(event) {
    location.href = href;
  }
  var img = new Image();
  img.addEventListener('load', go, true);
  img.addEventListener('error', go, true);
  img.src = 'http://localhost:5000/api/beacon';
  return false;
}
jjinux said…
This uses syntax that works on IE and hopefully avoids memory leaks. I can't use a JavaScript framework because I'm a third-party service that must keep a really small footprint.

function click(elem) {
  var href = elem.href;
  function go(event) {
    location.href = href;
  }
  var img = new Image();
  img.onload = go;
  img.onerror = go;
  img.src = 'http://localhost:5000/api/beacon';
  img = null;  // Avoid memory leaks.
  elem = null;
  return false;
}
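
One case the version above may not cover: if the tracking server is reachable but slow to answer, neither `onload` nor `onerror` fires for a while, and the user waits. Below is a minimal sketch that adds a deadline, so navigation happens on load, on error, or on timeout, whichever comes first. The `timeoutMs` parameter and the `deps.makeImage` and `deps.navigate` hooks are hypothetical; the hooks exist only so the function can be exercised outside a browser.

```javascript
// Sketch: follow the link once the beacon loads, fails, or a deadline
// passes, whichever happens first.
function trackClickWithTimeout(elem, timeoutMs, deps) {
  deps = deps || {};
  var href = elem.href;
  var navigate = deps.navigate || function (url) { location.href = url; };
  var img = deps.makeImage ? deps.makeImage() : new Image();
  var done = false;
  var timer;

  function go() {
    if (done) { return; }            // Navigate only once.
    done = true;
    clearTimeout(timer);
    img.onload = img.onerror = null; // Drop references to the handlers.
    navigate(href);
  }

  img.onload = go;
  img.onerror = go;
  timer = setTimeout(go, timeoutMs); // Deadline for a slow server.
  img.src = 'http://localhost:5000/api/beacon';
  return false; // Cancel the default navigation; go() performs it instead.
}
```

Usage would look like `onclick="return trackClickWithTimeout(this, 300);"`, where 300 ms is an arbitrary value chosen for illustration. A shorter deadline loses more clicks; a longer one makes the link feel slower when the server is struggling.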
jjinux said…
It works :-D
Unknown said…
I know you aren't doing it here, but be careful with new Image() on IE and appending that image into the DOM.

http://support.microsoft.com/default.aspx/kb/927917

http://clientside.cnet.com/code-snippets/manipulating-the-dom/ie-and-operation-aborted/
jjinux said…
Jon, good tip. Thanks!
