Wednesday, November 03, 2010

JavaScript: A Second Impression of NodeJS

When I first heard about NodeJS, my reaction was, "Why would I use JavaScript on the server when there are already similar continuation-passing-style, asynchronous network servers such as Twisted and Tornado in Python? Python is a nicer language. Furthermore, I prefer coroutine-based solutions such as gevent and Concurrence." However, in this video, Ryan Dahl, the author of NodeJS, convinced me that NodeJS is worthy of attention.

First of all, NodeJS is crazy fast. Dahl showed one benchmark that had it beating out Nginx. (However, as he admitted, it was an unfair comparison since he was comparing NodeJS serving something out of memory with Nginx serving something from disk.) It's faster than Twisted and Jetty. That last one surprised me.

Dahl argued against green thread systems and coroutine-based systems due to the infrastructural overhead and magic involved. He said he doesn't like Eventlet, both because it's too magical at the implementation level and because he doesn't like multiple stacks. As I said, I'm not at all convinced by his arguments, but it reassures me that he was at least able to talk about such approaches. When I brought them up during Douglas Crockford's talk on concurrency, Crockford just gave me a blank, dismissive stare.

Dahl argued that by using callbacks for all blocking calls, it's really obvious which functions can block. As much as I dislike "continuation-passing-style", he makes a good point.
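To make that point concrete, here is a minimal sketch of the callback style. The function `slowLookup` and its 10 ms timer are made up for illustration; the `(err, value)` argument order follows the usual Node convention. The thing to notice is that the trailing `callback` parameter tells you, right in the signature, that the result shows up later rather than being returned.

```javascript
const events = [];

// Hypothetical slow operation. The trailing callback parameter is the visible
// signal that the result arrives later instead of being returned.
function slowLookup(key, callback) {
  setTimeout(() => {
    callback(null, key.toUpperCase()); // error-first convention: (err, value)
  }, 10);
}

events.push('before call');
slowLookup('node', (err, value) => {
  if (err) throw err;
  events.push('callback: ' + value);
  console.log(events.join(' -> '));
  // prints "before call -> after call -> callback: NODE"
});
events.push('after call'); // runs before the callback above
```

The line after the call runs first, which is exactly the behavior the callback in the signature warns you about.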

Dahl argued that NodeJS has an advantage over EventMachine (in Ruby) and Twisted (in Python) because JavaScript programmers are inherently comfortable with event-based programming. Dahl also argued that JavaScript is well suited to event-based programming because it has anonymous functions and closures. Ruby and Python have those things too, but Dahl further argued that it's very easy to accidentally call something in Ruby or Python that is blocking since it's not easy to know if something blocks or not. In contrast, NodeJS is built from the ground up so that pretty much everything is non-blocking.
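Here is a small sketch of what anonymous functions and closures buy you in event-based code. It uses Node's `EventEmitter` as a stand-in for a socket; `countChunks` and `fakeSocket` are names I made up for the example, not anything from the talk.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attaches an anonymous handler that counts 'data' events.
function countChunks(source) {
  let chunks = 0; // private state captured by the two closures below
  source.on('data', () => {
    chunks += 1;
  });
  return () => chunks;
}

const fakeSocket = new EventEmitter(); // stand-in for a real connection
const seen = countChunks(fakeSocket);

fakeSocket.emit('data', 'a');
fakeSocket.emit('data', 'b');

console.log(seen()); // prints 2
```

The per-connection state lives in the closure, so there is no need for a class or a global table just to remember how many chunks have arrived.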

NodeJS has built-in support for doing DNS asynchronously, and it supports TLS (i.e. SSL). It also supports advanced HTTP features such as pipelining, chunked encoding, etc.

NodeJS has an internal thread pool for making things like file I/O non-blocking. That was a bit of a surprise since I wasn't even sure it could do file I/O. The thread pool is only accessible from C since Dahl doesn't feel most programmers can be trusted with threads. He feels most programmers should only be trusted with the asynchronous JavaScript layer, which is harder to screw up.
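From the JavaScript side, the thread pool is invisible; all you see is a call that returns immediately and a callback that fires later. The following sketch uses the `fs` module as it exists in current Node releases, and the scratch file name is a made-up one that exists only so the example has something to read.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only for this example.
const file = path.join(os.tmpdir(), 'node-async-read-demo-' + process.pid + '.txt');
fs.writeFileSync(file, 'hello from disk');

const order = [];

// readFile returns right away; the actual read happens off the main thread,
// and the callback runs once the data is ready.
fs.readFile(file, 'utf8', (err, text) => {
  if (err) throw err;
  order.push('read finished: ' + text);
  fs.unlinkSync(file); // clean up the scratch file
  console.log(order.join(' -> '));
});

order.push('readFile returned');
```

The `'readFile returned'` entry is always recorded first, because the callback cannot run until the current code has finished.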

Dahl still feels that it's important to put NodeJS behind a stable web server such as Nginx. He admits that NodeJS has lots of bugs and that it's not stable. He's not at all certain that it's free of security vulnerabilities.

In general, Dahl believes you'll only need one process running NodeJS per machine since it is so good at not blocking. However, it makes sense to use one process per core. It also makes sense to use multiple processes when you need to do heavy CPU crunching. At some point in the future, he hopes to add web workers a la HTML5.

NodeJS has a REPL.

Dealing with binary in NodeJS is non-optimal because dealing with binary in JavaScript sucks. NodeJS has a buffer class that sits outside V8. V8's memory management makes it impossible to expose pointers since memory may be moved around the heap by the garbage collector. Dahl would prefer to deal with binary in a string, but that's not currently possible. Furthermore, pushing a big string to a socket is currently slow.
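For a feel of what working with the buffer class looks like, here is a short sketch. Note the assumption: it uses `Buffer.from` and `Buffer.alloc` from the current Node API, which is newer than this post, so treat it as an illustration of the idea and not of the API as it stood when Dahl gave the talk.

```javascript
// Buffer is a global in Node; the bytes live outside the V8 heap.
const buf = Buffer.from([0x48, 0x69, 0x00, 0xff]);

console.log(buf.length);                             // 4
console.log(buf.readUInt8(3));                       // 255
console.log(buf.toString('hex'));                    // 486900ff
console.log(buf.subarray(0, 2).toString('ascii'));   // Hi

// Writing a 32-bit big-endian integer into a fresh, zero-filled buffer.
const packet = Buffer.alloc(4);
packet.writeUInt32BE(0xdeadbeef, 0);
console.log(packet.toString('hex'));                 // deadbeef
```

You read and write by offset instead of by pointer, which fits the constraint that the garbage collector is free to move things around.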

Dahl works for Joyent.

Although I still feel Erlang has a real edge in the realm of asynchronous network servers, Erlang is difficult for most programmers to adapt to. I think NodeJS is interesting because it opens up asynchronous network programming to a much wider audience of programmers. It's also interesting because it allows you to use the same programming language and in some cases the same libraries on both the client and the server. I'm currently looking at NodeJS because I want to use socket.io for Comet aka "real time" programming. Only time will tell how NodeJS works out in practice.

12 comments:

Shannon -jj Behrens said...

NodeJS scales better than Twisted and Tornado according to http://news.ycombinator.com/item?id=1088699.

Will Moffat said...

Hi JJ, very interested to hear how your NodeJS adventures go. Keep us posted.

Ben Ford said...

Hi JJ,

The latest release of eventlet has something interesting if you're looking for async network programming options in python: non-blocking zeromq support.

Here's a little example:
http://bitbucket.org/which_linden/eventlet/src/tip/examples/distributed_websocket_chat.py

Which is a websocket based multiserver chat application in less than 140 lines of python.

I would disagree that eventlet is too magical - the internals are actually very sane and well put together. It only took me a couple of weeks part time to put together the zeromq hub. If anything it's greenlet where the magic lies :-)

Cheers,
Ben

Robin Berjon said...

Note that support for binary types is being added to JS through the WebGL work (which really needs it). I fully expect that to trickle over to NodeJS sooner rather than later.

Samori Gorse said...

You should probably check out Faye ( http://faye.jcoglan.com/ ), which provides a Bayeux implementation for both Rack and Nodejs.

Anonymous said...

Small detail, but Python doesn't really have anonymous functions... it has lambdas, which are restricted to a single expression with an implicit return.

This might seem like a detail, but it means that in Python, you have to declare the callback before the call that uses it. In Node.js/ JavaScript, you declare them in the order they are used.

IMO it has a huge effect on readability.

CptPicard said...

I would take some issue with calling asynchronous callback programming style "continuation-passing-style"... I guess you could say that the callback is the function that receives the result of the computation and is in that sense a sort of continuation, but CPS really means that all calls are tail calls, and there are no "returns", only passing of values forward to a runtime-selected continuation...

Shannon -jj Behrens said...

Great comments guys. Thanks!

Brantley Harris said...

Once you get over the "magic" in Eventlet, it's absolutely marvelous to work with. One might not like "multiple stacks", but being able to code something linearly, without a network of callbacks, while getting all of the advantages and none of the disadvantages of threads is simply divine and everything that concurrent Python programming should be.

The only reason it hasn't taken off is because people can't get their damned heads out of the box that has defined concurrent programming in Python for so long. Eventlet explodes that box into a thousand pieces.

And the Javascript callback routines are all fine and dandy, but they can become a real tangle after a certain amount of complexity.

Shannon -jj Behrens said...

> Once you get over the "magic" in Eventlet, it's absolutely marvelous to work with. One might not like "multiple stacks", but being able to code something linearly, without a network of callbacks, while getting all of the advantages and none of the disadvantages of threads is simply divine and everything that concurrent Python programming should be.

Yeah, you're preaching to the choir ;) Eventlet is how we did things at IronPort for many years. In fact, there's a common lineage of programmers ;)

Anonymous said...

NodeJS appears a little hype-driven by tons of web-based/client JavaScript developers who now feel they can just jump into being server-side engineers, IMHO.

Remember from Comp Sci 101, raw performance is not everything when developing Professional Software. Also, just because a language is popular doesn't mean it's suited to be a hammer for every nail (McDonald's is popular too).

Take a look at some of the NodeJS code or the plethora of modules. Looks like a pasta of callbacks, closures, and nested complication.

Seems Ryan's only complaint about Eventlet is its "magic"? But NodeJS "magic" is really Ryan's/V8's, so it's just swapping magicians.

NodeJS seems to be really only something to appeal to people who love Javascript, when other solutions (in different languages) are far superior.

Shannon -jj Behrens said...

Pretty decent comment. Thanks.