Showing posts from April, 2006

Python: Protecting UTF-8 Strings from Naive Code

"""Temporarily convert a UTF-8 string to Unicode to prevent breakage.

BASIC IDEA: protect_utf8 is a function decorator that can prevent naive
functions from breaking UTF-8.

"""

def protect_utf8(wrapped_function, encoding='UTF-8'):

    """Temporarily convert a UTF-8 string to Unicode to prevent breakage.

    protect_utf8 is a function decorator that can prevent naive functions
    from breaking UTF-8.

    If the wrapped function takes a string, and that string happens to be
    valid UTF-8, convert it to a unicode object and call the wrapped
    function. If a conversion was done and if a unicode object was
    returned, convert it back to a UTF-8 string.

    The wrapped function should take a string as its first parameter and
    it may return an object of the same type. Anything else is optional.
    For example:

        def truncate(s):
            return s[:1]

    Pass "encoding" if you want to protect somet
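Since the excerpt above is cut off before the body of the decorator, here is a rough sketch of how the idea in the docstring might look in Python 3, where the "string" is a bytes object and the "unicode object" is a str. This is not the original code from the post; the wrapper name and the choice to pass invalid input through untouched are my assumptions.

```python
import functools


def protect_utf8(wrapped_function, encoding="UTF-8"):
    """Sketch only: decode bytes before the call, re-encode a str result."""

    @functools.wraps(wrapped_function)
    def wrapper(s, *args, **kwargs):
        converted = False
        if isinstance(s, bytes):
            try:
                s = s.decode(encoding)
                converted = True
            except UnicodeDecodeError:
                # Assumption: input that is not valid in the given
                # encoding is handed to the function unchanged.
                pass
        result = wrapped_function(s, *args, **kwargs)
        # Only convert back if we converted on the way in and the
        # function returned text.
        if converted and isinstance(result, str):
            result = result.encode(encoding)
        return result

    return wrapper


@protect_utf8
def truncate(s):
    # A "naive" function: slices without knowing about encodings.
    return s[:1]
```

Without the decorator, slicing the encoded bytes of "é" to one byte leaves half of a two-byte sequence; with it, truncate returns the whole character, still encoded.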

Humor: bash-3.00$ jj < coffee.cup >

This is my second all-nighter this week (well, technically, I slept for a couple hours the other day). I had two double espressos yesterday, a cup of coffee last night at 10:30 PM, and I got one this morning at 5:00 AM. I think I realized I had a problem when I discovered that I was irritated that Starbucks wasn't open between 11:00 PM and 4:30 AM.

UNIX: ssh + tar + gzip -q = goodness

To retrieve a hierarchy of files from a remote server (or to copy it back to a remote server), I often do something like:

    ssh servername "tar cvzf - dirname" | tar xvfz -

However, I usually get the following error message:

    gzip: stdin: decompression OK, trailing garbage ignored
    tar: Child returned status 2
    tar: Error exit delayed from previous errors

Strangely enough, as I write this, I get the error message copying something from one FreeBSD system to another FreeBSD system, but I don't get it when copying something from one FreeBSD system to my Ubuntu system. Weird.

I put up with this problem for years. However, I recently needed to use it in a Makefile. Having an error like that is fine when you're a human, but a non-zero return code is a deal-breaker in a Makefile. I needed to clean up my act.

One easy way to make the problem go away is to not use the "z" flag for both instances of tar. This is somewhat icky, because it really would be nice to hav
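For scripted use where the exit status matters, the variant without the "z" flag could also be driven from Python, which makes it easy to look at the exit status of both ends of the pipe. This is only a sketch: build_commands and fetch_tree are names I made up, and "servername" and "dirname" are the same placeholders used in the command above.

```python
import shlex
import subprocess


def build_commands(server, dirname):
    """Return the remote and local halves of the pipeline (no "z" flag)."""
    remote = ["ssh", server, "tar cvf - " + shlex.quote(dirname)]
    local = ["tar", "xvf", "-"]
    return remote, local


def fetch_tree(server, dirname):
    """Run ssh | tar and return both exit codes (hypothetical helper)."""
    remote, local = build_commands(server, dirname)
    producer = subprocess.Popen(remote, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(local, stdin=producer.stdout)
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the local tar exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc
```

A caller would treat the transfer as successful only if both returned codes are zero.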

Hardware: Smarter Memory

I'm not a hardware guy, but it seems to me that it would be really nice if RAM could be a little smarter and implement a few simple commands:

memcpy - Copy one area of memory to another.
memcmp - Are the given two strings of memory equal?
memzero - Zero out an area of memory.

Now naturally, these commands can't directly be used by applications because of the difficulties of virtual memory. I also wouldn't expect them to be as smart as their C counterparts. However, given some support in the standard library and the kernel, these commands could be very useful optimizations.
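To make the three commands concrete, here is a toy software model of what such a memory might expose. It says nothing about how real hardware would do this; the SmartRAM class, its method signatures, and the bounds checking are all invented for illustration.

```python
class SmartRAM:
    """Toy model: a flat block of bytes with three built-in commands."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def _check(self, start, length):
        # Reject ranges that fall outside the memory.
        if start < 0 or length < 0 or start + length > len(self.cells):
            raise IndexError("range outside of memory")

    def memcpy(self, dst, src, length):
        """Copy one area of memory to another."""
        self._check(dst, length)
        self._check(src, length)
        # The slice on the right is a copy, so overlapping ranges are safe.
        self.cells[dst:dst + length] = self.cells[src:src + length]

    def memcmp(self, a, b, length):
        """Return True if the two areas hold the same bytes."""
        self._check(a, length)
        self._check(b, length)
        return self.cells[a:a + length] == self.cells[b:b + length]

    def memzero(self, start, length):
        """Zero out an area of memory."""
        self._check(start, length)
        self.cells[start:start + length] = bytes(length)
```

Note that memcmp here only answers "equal or not", which matches the description above rather than the ordering result of the C function.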

Emacs: Syntax Highlighting

It's that time again! Whether it's because I'm drinking coffee and that's causing my obsessive-compulsive nature to do crazy things, or because I'm inspired by other smart programmers who use Emacs, I'm getting "must use Emacs" cravings again. However, as soon as I started it up, the syntax highlighting irritations hit me like a stop sign over the head. This time, it's just a normal Python file. Notice how Emacs doesn't understand that double quotes can be embedded in triple double quotes. By the way, this isn't my code, so don't send me hate mail because there's HTML embedded in Python ;)
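For anyone who hasn't run into this, here is a made-up example of the kind of construct involved (it is not the file from the post): a triple-double-quoted string that contains ordinary double quotes, which is perfectly legal Python.

```python
# The double quotes around the href value do not end the string;
# only the closing triple quote does.
snippet = """<a href="index.html">Home</a>"""
print(snippet)
```

A highlighter that treats the first inner double quote as the end of the string will color everything after it incorrectly.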