Async: To Be or Not To Be

Just because I have to use a callback-oriented style on the client doesn't mean I want to use a callback-oriented style on the server. Now, before anyone gets all upset and tells me that I don't know the difference between async and a kitchen sink, let me explain :)

The client is necessarily an event-oriented place. If I don't know which button the user is going to press, it makes a lot of sense to use a different callback for each button. The server is different. If I'm waiting for the result of a database query before I can continue processing a request, it sure is convenient to just block and wait.
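To make the two styles concrete, here's a minimal sketch of the same request handler written both ways. Everything in it is hypothetical and only there for illustration: `FAKE_DB`, `query_async`, `query_blocking`, and the handler names aren't from any real library, and the "async" query calls back immediately so the example stays runnable.

```python
# Hypothetical stand-in for a database: user id -> name.
FAKE_DB = {1: "alice", 2: "bob"}


# --- Callback-oriented style ---
def query_async(user_id, callback):
    # A real async driver would return right away and call back later.
    # This sketch calls back immediately to stay self-contained.
    callback(FAKE_DB.get(user_id))


def handle_request_callbacks(user_id, respond):
    # The "rest of the request" has to live inside a nested callback.
    def on_user(name):
        if name is None:
            respond("404 not found")
        else:
            respond(f"200 hello, {name}")

    query_async(user_id, on_user)


# --- Blocking style ---
def query_blocking(user_id):
    # Pretend this waits for the database before returning.
    return FAKE_DB.get(user_id)


def handle_request_blocking(user_id):
    # Reads top to bottom: query, then continue with the result.
    name = query_blocking(user_id)
    if name is None:
        return "404 not found"
    return f"200 hello, {name}"


if __name__ == "__main__":
    handle_request_callbacks(1, print)
    print(handle_request_blocking(1))
```

Both handlers do the same work. The only difference is where the code after the query lives: in a nested function, or on the next line.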

My key point is that it's important to separate what style you want to code with and what performance and scalability characteristics you want. You shouldn't necessarily pick a callback-oriented style just because you want the performance and scalability characteristics of asynchronous networking APIs.

My two favorite examples are gevent and Erlang, but Go is similar. When you code using gevent or Erlang, your code looks like synchronous, blocking code. However, under the covers, they use asynchronous networking APIs. Now, before anyone tells me that it's impossible, buggy, or that it'll never work, let me point out that these tricks have been in production for decades at Ericsson, Yahoo Groups, and IronPort (now part of Cisco).

Furthermore, I should point out that asynchronous networking APIs aren't a perfect fit for every problem. For instance, if your goal is to send 10 gigabytes of information to another server, it turns out that synchronous networking APIs will actually outperform asynchronous networking APIs. Asynchronous networking APIs are popular because they can handle a larger number of clients than synchronous networking APIs can, and because they use less memory than a large number of threads, each of which needs its own stack. gevent and Erlang can handle a large number of clients, don't use up much memory, and don't require a real OS-level stack per client.
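As a rough illustration of "lots of clients without a thread per client", here's a small sketch using Python's standard `asyncio` module. That's an assumption on my part for the sake of a dependency-free example: `asyncio` is not gevent, and unlike gevent it marks every switch point with `await`. The names `handle_client`, `serve_all`, and `run_clients` are made up, and `asyncio.sleep` stands in for waiting on the network.

```python
import asyncio
import threading
import time


async def handle_client(client_id, delay, seen_threads):
    # Record which OS thread this "client" runs on.
    seen_threads.add(threading.get_ident())
    # Stand-in for waiting on the network; other clients run meanwhile.
    await asyncio.sleep(delay)
    return client_id


async def serve_all(n_clients, delay, seen_threads):
    # Start every client concurrently and wait for all of them.
    tasks = (handle_client(i, delay, seen_threads) for i in range(n_clients))
    return await asyncio.gather(*tasks)


def run_clients(n_clients=1000, delay=0.01):
    seen_threads = set()
    start = time.perf_counter()
    results = asyncio.run(serve_all(n_clients, delay, seen_threads))
    elapsed = time.perf_counter() - start
    return results, seen_threads, elapsed


if __name__ == "__main__":
    results, seen_threads, elapsed = run_clients()
    print(f"{len(results)} clients on {len(seen_threads)} thread(s) "
          f"in {elapsed:.2f}s")
```

All of the clients share one OS thread, and the waits overlap instead of adding up, so the total time stays far below what handling the clients one after another would take.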

So what's my problem with the callback-oriented style? I find it a lot harder to read. I've coded projects in Twisted, Node.js, etc., and I prefer the gevent approach. You get roughly the same performance and scalability characteristics, but with code that's much easier to read. Of course, what's readable to me may not be readable to other people. I've met people who are perfectly happy using Twisted Web 1 and don't think that callback-oriented code poses any real challenge.

If you're interested in hearing more about my thoughts on async and concurrency, check out my other blog posts, which include a link to my Dr. Dobb's Journal article on Python concurrency.


wleslie said…
I know that a lot of arguments about async these days are around performance, but the primary motivation for it is actually conceptual. I'm sure you've read "The Problem with Threads", since it gets thrown around on #python quite a bit, but a better one-page explanation of the conceptual simplicity of async is the distributed computing example in E.

I imagine the sync/async divide is similar to the functional/stateful divide, where there are implementation details that drive performance issues, but the more interesting aspect is a matter of how we think about problems, and what problems become significantly simpler to understand when posed asynchronously. Something you may have considered is what it would take to design a sensible memory model for Python or JavaScript in a threaded vs. an async world (if you imagine they are mutually exclusive).
Sam Rushing said…
Event-driven stuff doesn't scale. 8^)

You might be able to write an event-driven HTTP server. But once you try to combine it with an event-driven DNS resolver and an event-driven database client, the state space has exploded.

You might then write a set of tools - help from the compiler or runtime - to essentially reinvent a cooperative threading package. And that works great. But at some point the difference between your threading package and the next one comes down to semantics.

The true Holy Grail of [server] scalability would be to route around the barrier presented by the operating system [which is itself a coroutine system] and have an OS with no kernel/user wall and without artificial limits on scalability. Depending on your politics, you could call it an "in-kernel server" or a "user-space tcp stack".
Peter Zsoldos said…
I don't have enough experience with C# 5's await keyword to judge it properly, but it sure makes for an easier read. I wonder if something like that would be possible with some helper function in Python... E.g.: with await(asyncMethodInvocation): process result

And the point of async not always being the right solution is certainly a valid one!
jjinux said…
verte, sorry I haven't read anything more than the abstract for "the problem with threads", and I haven't read "the distributed computing example in E." It's going to take me a while to get to those. In the meantime, do you care to summarize?

I actually wasn't arguing for threads. I don't actually like threads. Erlang has something it calls processes, which I think is ideal. gevent has greenlets which are actually a lot more deterministic than threads.
jjinux said…
Peter, C#'s await keyword reminds me of EventMachine in Ruby. It looks like it's a way to use blocks as callbacks. Is that right?

I do think having blocks makes callback-oriented programming easier to read, but it's still not as easy to read as, say, gevent's approach.
wleslie said…
The distributed computing example is a summary, so I won't try to summarise it here. The introduction is about a page.

"The problem with threads" deals with the explosive nondeterminism resulting from shared-state concurrency. To be clear, I would also consider gevent to be shared state concurrency*, in that you can't look at a function and be able to tell from its body if it contains a context switch or not - that could be hidden away inside some function that we call.

This is a significant burden on the programmer. As someone who maintains a moderately-sized Swing application rife with concurrency bugs, I think I can say that until programmers are forced to think about concurrency from the outset, maintenance is a battle between introducing new code and trying to figure out the way it interacts with existing locks and tasks.

But don't take my word for it: the article mentions a concrete example of an application written by concurrency experts that mysteriously deadlocked once they bought a machine with more cores.

In general, I'm all for runtimes and compilers that figure out details so you don't have to. I like dynamic types, I like garbage collection. But concurrency is a more complicated subject, I think, and it deserves very explicit language from the programmer. (Concurrent Haskell is an interesting example - the language is functional, the concurrency features serve only to give greater control over communication to the programmer).

* Important side note: finalisers actually introduce shared state concurrency in many languages, Python included. See, e.g., unexpected concurrency.
jjinux said…
verte, thank you for the excellent comment! In general, I agree with you.

gevent is non-deterministic, but it's not as bad as threads which can context switch at any time. Since it can only context switch when doing IO, the problem isn't nearly as heinous. Sure, that's not perfect, but it's a lot easier for me to wrap my brain around than threads.
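Here's a minimal sketch of what I mean, using Python's standard `asyncio` as a stand-in for gevent (an assumption to keep the example dependency-free; in gevent the switch would be hidden inside an IO call rather than spelled `await`). The names `safe_increment`, `racy_increment`, `run_many`, and `final_count` are made up for illustration.

```python
import asyncio


async def safe_increment(state):
    # No switch point between the read and the write,
    # so no other task can run in between.
    current = state["count"]
    state["count"] = current + 1


async def racy_increment(state):
    current = state["count"]
    # Explicit switch point (stand-in for IO): other tasks may run now,
    # so `current` can be stale by the time we write it back.
    await asyncio.sleep(0)
    state["count"] = current + 1


async def run_many(worker, n):
    state = {"count": 0}
    await asyncio.gather(*(worker(state) for _ in range(n)))
    return state["count"]


def final_count(worker, n=100):
    return asyncio.run(run_many(worker, n))


if __name__ == "__main__":
    print("no switch point:  ", final_count(safe_increment))
    print("with switch point:", final_count(racy_increment))
```

Without a switch point between the read and the write, every increment lands. Put one in the middle and updates get lost, but at least I only have to audit the places where a switch can happen.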

As for multi-threading in Swing, I wrote up some quick tricks. Basically, I avoid mutable, shared state like the plague.
gus said…
Since you mentioned it, I'm curious why IronPort didn't use Erlang instead of developing a new concurrency framework for a slow interpreted language like Python?

If you were starting today (2013), do you think Erlang is the right tool for network appliances like IronPort's?

BTW, great blog!


jjinux said…
Thanks, Gus. Erlang wasn't as popular back then as it is now. My guess is that none of the early IronPort people even knew about it. In contrast, Sam Rushing already knew how to solve the async problem in Python.

Python will never be as good as Erlang at what Erlang does. Hence, for certain network servers, it makes a lot of sense to use Erlang. However, Python has so many other advantages that it probably makes sense to use Python (and gevent) as the "main" language for a company.
