Posts

Showing posts from July, 2008

Computer Science: How to Implement Compilers and Interpreters

I'm stuck.

I just finished watching the SICP lectures. They were really good. (Thanks Mike Cheponis!) I've written partial implementations of Lisp in Python (thanks to PLY and some creativity) and Haskell (thanks to Write Yourself a Scheme in 48 Hours). I'd like to get a lot more serious about implementing languages, but I'm stuck.

I know that EOPL is a really good book. So is the dragon book. I've started both of them at separate times and gotten discouraged. One main problem is that I can only work on this stuff between 12-3AM because of work and kids, so I don't have enough time or enough mental energy. Currently, I think my options are:Give up until I magically have more time one day.Start working on EOPL and plan on it taking several years.Is there a more approachable book that will still teach me what I need?

One thing I'm really afraid of is that I'm always at risk for burnout, and I'm really afraid EOPL will send me over the edge. On …

Computer Science: Arbitrarily Fine Locking

This is a relatively simple idea concerning mutex usage. I imagine someone else has probably thought of it before. However, since I just thought of it, I figured I'd blog it. I have no clue why I was thinking about mutexes. I usually prefer share-nothing approaches like Erlang. Note, I am specifically not trying to comment on Python's GIL.

Imagine you have a bunch of resources (like objects or structs) that you wish to protect. One way to protect them is to use a single mutex. This is called coarse-grained locking. At the opposite end of the spectrum, you can create a new mutex for every resource. This is called fine-grained locking. However, what if you want something in the middle?

Having a single lock is unfortunate because it forces a lot of your code to be effectively single-threaded, even if you have multiple processors. However, perhaps creating a new mutex for every single resource might be overkill. (Personally, I'm unsure of why that might be the case.)

SICP: not not but rather unless

What happens if you search for "not foo" on Google? Surprisingly enough, it works, but it doesn't lead to what you might expect. It leads to pages that contain the phrase "not foo". If you want pages that don't contain the word "foo", the syntax is "-foo". Of course, if you type in "-foo" by itself, you don't get any interesting results. Rather you have to write something like "spam -foo", which leads to pages that contain the word spam, but not foo.

While I was at Foxmarks, I had to implement the logical operators for our own search engine, which is where I originally encountered this problem. If the user asks you for "not foo", you really can't efficiently give him any sort of answer. It only works if the user tries something like "spam and not foo", which is what I implemented.

The SICP lecture I was watching last night was about implementing logic languages like Prolog in Lisp. Th…

Software Engineering: Agile Programming and IDEs Considered Harmful

Put away your IDE, code using a pencil, and do a big design up front. Get it right the first time. It can be done. Naturally, I'm trolling a bit, but that link is a very short, very fun read.

It reminds me of an interview with Donald Knuth that I've been thinking about a lot lately. Knuth said:With the caveat that there’s no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I’ve ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other’s code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.Somehow AndrĂ© Bensoussan and Donald Knuth are both capable of designing everything up front, and just getting it right. Since I've always been an agile coder, I find such…

SICP: Broken Math

Here's a great little math trick from SICP:v = 1 + 2 + 4 + 8 + ...Let's play around with that equation a bit:v - 1 = 2 + 4 + 8 + ...
(v - 1) / 2 = 1 + 2 + 4 + 8 + ...But that's the same as the first line. Hence, we see that:(v - 1) / 2 = vSolving for v, we get:v = -1
-1 = 1 + 2 + 4 + 8 + ...Crazy, eh? Sussman said:Arguments that are based on convergence are flaky, if you don't know the convergence beforehand. You can make wrong arguments. You can make deductions as if you know the answer and not be stopped somewhere by some obvious contradiction.I think the bad assumption in this case is thinking that some v could exist and be a normal integer. Clearly, there is no such normal integer v that equals 1 + 2 + 4 + 8 + ...

SICP: Lisp Inspired Python

I just finished the lecture on streams in SICP. Python has generators, which are quite lovely. However, it's fun to imagine reimplementing generators in various ways from scratch using some of the approaches in SICP.

First of all, there are promises. A promise says that you'll do something later. In Scheme, you create a promise with "delay" and you invoke that promise with "force":> (delay (/ 1 0))
#<promise>
> (force (delay (/ 1 0)))
division by zeroHere's one way to translate that to Python:class Promise(object):

"""This is a promise to execute something on demand."""

def __init__(self, f):
"""Wrap your expression in a lambda and pass it as f."""
self.f = f

def __call__(self):
"""Call the instance to get the value of the promise.

This is memoized so that it doesn't have to compute the value
more than once since promise…

Lisp: List Comprehensions

Whenever I explain Python's list comprehensions, I always say that they were taken from Haskell. Apparently, Lisp had them too using a syntactic sugar construct called collect. Here's the Python:[(i, j, i + j)
for i in range(1, n + 1)
for j in range(1, i - 1)
if is_prime(i + j)]And here's the Lisp:(collect
(list i j (+ i j))
((i (enum-interval 1 n))
(j (enum-interval 1 (-1+ i))))
(prime? (+ i j)))I found this in SICP, but I'm a little confused about exactly which dialects of Lisp and Scheme it's present in. I doesn't appear to be in R5RS.

Software Engineering: Facebook Thrift

I read Facebook's Thrift Whitepaper. Thrift is, basically, Facebook's open source replacement for CORBA. In short, it looks fine.

The whitepaper mentions CORBA only very briefly: "Relatively comprehensive, debatably overdesigned and heavyweight. Comparably cumbersome software installation."

Thrift feels very C++-ish. For instance, they use Boost. The Thrift IDL compiler is written in C++ with Lex and Yacc. They also implemented their own C++ thread primatives, including Mutex, Condition, and Monitor. The whitepaper also discusses their threading and memory management issues in C++.

They have an interesting scheme for versioning APIs such as struct fields and function parameters. They put numbers on the fields so that if a client and server aren't currently running the same version of the IDL, things can continue working. This is important for rolling upgrades. Here's an example:struct Example {
1:i32 number=10,
2:i64 bigNumber,
3:double decimals,…

Python: A Few Decorator Tips

Here are a few tips for working with Python decorators.

Here's a simple Python decorator, along with some code using it:def logged(f):

def wrapped(*args, **kargs):
print "I'm calling %s." % f.__name__
try:
return f(*args, **kargs)
finally:
print "Done."

return wrapped

@logged
def sum(a, b):
return a + b

sum(1, 2)
help(sum)If you run this, you get:I'm calling sum.
Done.
...Notice the way I used try, return, and finally. I'm either very cool or very naughty. It's hard to know.

The decorator package can make simple decorators (i.e. ones that don't have any of their own arguments) like this even easier:from decorator import decorator

@decorator
def logged(f, *args, **kargs):
print "I'm calling %s." % f.__name__
try:
return f(*args, **kargs)
finally:
print "Done."

@logged
def sum(a, b):
return a + b

sum(1, 2)
help(sum)Notice, I didn't need to write a &qu…

Lisp: Pattern Matching

As I mentioned earlier, I'm making my way through the SICP lectures.

Today, I'm excited about pattern matching. For instance, consider this:(define deriv-rules
'(
...
( (dd (+ (? x1) (? x2)) (? v))
(+ (dd (: x1) (: v))
(dd (: x2) (: v))) )
...))It says that the derivative of x1 + x2 with respect to v is the derivative of x1 with respect to v + the derivative of x2 with respect to v. Notice, it's all data. "No 'if' statements were harmed in the making of this film" ;)

I've played around with Haskell, Ocaml, and Erlang, so I'm familiar with match
statements. They make sense. However, I had never seen someone implement such a system, especially in such an elegant manner.

I'm so used to coding like "if this, then do this". Pattern recognition is all about "if the data matches this pattern, fill in this skeleton and then start over." I feel like I've just been "meta'd".

I've bee…

Linux: Running Paster under runit on Ubuntu

I needed to setup Paster, Pylon's default Web server, on Ubuntu. I had a heck of a hard time deciding whether I wanted to run it under Supervisor, daemontools, monit, runit, or a handmade rc script.

Supervisor has been getting a lot of press lately, but I couldn't find a standard way of starting it on Ubuntu. This seemed like a chicken and egg problem. Furthermore, I knew my buddy Mike Orr was having a hard time getting log rotation working with it.

I wanted something that could restart my process as necessary, so I decided against a standard rc script, even though I knew Paster has a --monitor-restart option. I eventually settled on runit. I had used it in the past, the directions were clear, and it just "felt right". My buddy Allan Bailey even gave me a quick cheat sheet.

If anyone can show me a nice comparison of all of these options, I'd love to read it. I wasn't able to find much on Google that compared them all.

Anyway, here's what I did in outli…

Lisp: Hackable by Default

Image
I'm watching the MIT lectures for the Structure and Interpretation of Computer Programs which my buddy Mike Cheponis was nice enough to give to me. I noticed a few interesting things.

In C, it's common to write "x->a". x is a pointer to a struct with a member a. In Lisp, you'd write (a x). a is a function that operates on some object x in order to return a value. That means that, by default, "a" has some "wiggle room" to do something special because it's a function. That's why in Java, the norm is to use getters and setters for everything. It provides "wiggle room" which I also call "hackability". In languages like Python, Ruby, and C#, "x.a" might or might not be an actual function call because those languages support properties. Depending on the details, in languages with inheritance, you might even say there's an extra dimension of hackability. For instance, in Java, x.getA() might be ex…