Posts

Showing posts from April, 2006

Python: Protecting UTF-8 Strings from Naive Code

"""Temporarily convert a UTF-8 string to Unicode to prevent breakage.

BASIC IDEA: protect_utf8 is a function decorator that can prevent naive
functions from breaking UTF-8.

"""


def protect_utf8(wrapped_function, encoding='UTF-8'):

    """Temporarily convert a UTF-8 string to Unicode to prevent breakage.

    protect_utf8 is a function decorator that can prevent naive
    functions from breaking UTF-8.

    If the wrapped function takes a string, and that string happens to be valid
    UTF-8, convert it to a unicode object and call the wrapped function. If a
    conversion was done and if a unicode object was returned, convert it back
    to a UTF-8 string.

    The wrapped function should take a string as its first parameter and it may
    return an object of the same type. Anything else is optional. For
    example:

        def truncate(s):
            return s[:1]

    Pass "encoding" if you want to protect something other than UTF-8.
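The excerpt above stops before the body of the decorator, so here is a rough sketch of how the behavior described in the docstring might look. This is not the original implementation. It assumes Python 3, where the "string" of the docstring corresponds to bytes and the "unicode object" corresponds to str; the inner name wrapper and the converted flag are made up for illustration.

```python
import functools


def protect_utf8(wrapped_function, encoding="UTF-8"):
    """Sketch only: decode a bytes argument, call the function, re-encode."""

    @functools.wraps(wrapped_function)
    def wrapper(s, *args, **kwargs):
        converted = False
        if isinstance(s, bytes):
            try:
                s = s.decode(encoding)
                converted = True
            except UnicodeDecodeError:
                # Not valid in this encoding: pass the bytes through untouched.
                pass
        result = wrapped_function(s, *args, **kwargs)
        # Only convert back if we converted on the way in and got text back.
        if converted and isinstance(result, str):
            result = result.encode(encoding)
        return result

    return wrapper


@protect_utf8
def truncate(s):
    return s[:1]


# A two-byte UTF-8 character is kept whole instead of being cut in half.
print(truncate("é".encode("UTF-8")))
```

To protect another encoding under this sketch, call the decorator directly, for example protect_utf8(some_function, encoding="latin-1").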

Humor: bash-3.00$ jj < coffee.cup > code.py

This is my second all-nighter this week (well, technically, I slept for a couple hours the other day). I had two double espressos yesterday, a cup of coffee last night at 10:30 PM, and I got one this morning at 5:00 AM. I think I realized I had a problem when I discovered that I was irritated that Starbucks wasn't open between 11:00 PM and 4:30 AM.

Python: Django Meeting at Google

I've organized a BayPIGies meeting to take place at Google tonight at 7:30 PM. Jacob Kaplan-Moss, one of the lead developers of Django, will be giving a talk on Django. There's more information on the BayPIGies Web site.

UNIX: ssh + tar + gzip -q = goodness

To retrieve a hierarchy of files from a remote server (or to copy it back to a remote server), I often do something like:

ssh servername "tar cvzf - dirname" | tar xvfz -

However, I usually get the following error message:

gzip: stdin: decompression OK, trailing garbage ignored
tar: Child returned status 2
tar: Error exit delayed from previous errors

Strangely enough, as I write this, I get the error message copying something from one FreeBSD system to another FreeBSD system, but I don't get it when copying something from one FreeBSD system to my Ubuntu system. Weird.

I put up with this problem for years. However, I recently needed to use it in a Makefile. Having an error like that is fine when you're a human, but a non-zero return code is a deal-breaker in a Makefile. I needed to clean up my act.

One easy way to make the problem go away is to drop the "z" flag from both instances of tar. This is somewhat icky, because it really would be nice to have the con…

Software Engineering: Professional Software Development

I've written a comprehensive summary of the book "Professional Software Development". You can find the slides here. I highly encourage everyone to take the time to read the slides, as the signal-to-noise ratio is extremely high.

Hardware: Smarter Memory

I'm not a hardware guy, but it seems to me that it would be really nice if RAM could be a little smarter and implement a few simple commands:

memcpy - Copy one area of memory to another.
memcmp - Are the given two strings of memory equal?
memzero - Zero out an area of memory.

Now naturally, these commands can't directly be used by applications because of the difficulties of virtual memory. I also wouldn't expect them to be as smart as their C counterparts. However, given some support in the standard library and the kernel, these commands could be very useful optimizations.
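To make the idea a bit more concrete, here is a toy software model of what such commands might mean, written in Python. It is purely illustrative: the ToyRAM class, its method signatures, and the flat physical addressing are all assumptions, and it says nothing about how real hardware would implement or expose these operations.

```python
class ToyRAM:
    """Toy model of a memory module that accepts three block commands."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def _check(self, addr, n):
        # Reject ranges that fall outside the module.
        if addr < 0 or n < 0 or addr + n > len(self.cells):
            raise IndexError("range outside of memory")

    def memcpy(self, dst, src, n):
        """Copy n bytes from src to dst without involving the CPU."""
        self._check(dst, n)
        self._check(src, n)
        self.cells[dst:dst + n] = self.cells[src:src + n]

    def memcmp(self, a, b, n):
        """Report only whether the two n-byte areas are equal."""
        self._check(a, n)
        self._check(b, n)
        return self.cells[a:a + n] == self.cells[b:b + n]

    def memzero(self, addr, n):
        """Zero out n bytes starting at addr."""
        self._check(addr, n)
        self.cells[addr:addr + n] = bytes(n)


ram = ToyRAM(16)
ram.cells[0:4] = b"abcd"
ram.memcpy(8, 0, 4)
print(ram.memcmp(0, 8, 4))
```

Note that memcmp here returns only equal or not equal, in line with the remark that these commands need not be as smart as their C counterparts.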

Emacs: Syntax Highlighting

It's that time again! Whether it's because I'm drinking coffee and that's causing my obsessive-compulsive nature to do crazy things, or because I'm inspired by other smart programmers who use Emacs, I'm getting "must use Emacs" cravings again. However, as soon as I started it up, the syntax highlighting irritations hit me like a stop sign over the head. This time, it's just a normal Python file. Notice how Emacs doesn't understand that double quotes can be embedded in triple double quotes. By the way, this isn't my code, so don't send me hate mail because there's HTML embedded in Python ;)