Showing posts from January, 2007

CSS: Hacking Copy-and-Paste

If you copy-and-paste the contents of an HTML table into a text editor or Excel, it "does the right thing". This is a useful feature. What happens, though, if you want a column to appear in the copy-and-pasted copy, but not actually take up space on the screen? For instance, sometimes you might want to output the URL for a link in addition to the anchor text, and you want the URL for the link in a separate column. Sure, you can generate a report in CSV format, but the following trick can be bolted onto existing tables. Here's the HTML and CSS: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ""> <html> <head> <title>Hacking Cut-and-Paste</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></meta> <style type="text/css"> /* * The "cutandpaste" class makes

HTML: Browser Bug?

If you put an h1 inside a div, the spacing above the h1 caused by the h1 will go outside the div. However, subtle variations will make the spacing go inside the div. I'm confused. I would call this a browser bug, but it seems to be somewhat consistent among browsers. Here's a simple test case: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ""> <html> <head> <title>H1 Whitespace Test</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></meta> <style type="text/css"> #header { background-color: yellow; height: 20px; } #body { background-color: yellow; /* If you uncomment this, it behaves as I would expect. border: 1px solid black; */ } </style> </head> <body> <div id="header"></div> <div id="body"> <

Python: Running __main__ from Another Script

Let's say you have a script def f(n): print "n = %s" % n if __name__ == '__main__': f(5) How do you run the __main__ without calling the script on the command line? That is, how do you call it from within another Python script? Simply importing it isn't good enough. Here's how to run the other module's __main__: >>> import temp >>> execfile(temp.__file__) n = 5
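For newer Python versions, where `execfile` no longer exists, the standard library's `runpy.run_path` can play the same role. The following is a minimal sketch, not the method from the post: the script is written to a temporary file (the name `temp_script.py` is hypothetical), and its `__main__` block is triggered by passing `run_name="__main__"`. It also sidesteps a caveat of `execfile(temp.__file__)`, namely that `__file__` may point at a compiled `.pyc` file rather than the source.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Python 3 spelling of the script from the post (print is a function here).
SCRIPT_SOURCE = '''
def f(n):
    print("n = %s" % n)

if __name__ == '__main__':
    f(5)
'''


def run_as_main(path):
    """Execute the file at `path` as if it were run from the command line.

    Returns whatever the script printed to standard output.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # run_name="__main__" makes the `if __name__ == '__main__'` block fire.
        runpy.run_path(path, run_name="__main__")
    return buffer.getvalue()


def demo():
    """Write the sample script to a temporary directory and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "temp_script.py")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write(SCRIPT_SOURCE)
        return run_as_main(path)
```

Calling `demo()` returns the text the script printed, so the caller can inspect it instead of letting it go straight to the terminal.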

Python: groupbysorted

Updated: It turns out that I was wrong about itertools.groupby. It works exactly the same as this code, so you should use it instead. This is a variation of itertools.groupby. The itertools.groupby iterator assumes that the input is not sorted but will fit in memory. This iterator has the same API, but assumes the opposite. Updated: __docformat__ = "restructuredtext" class peekable: """Make an iterator peekable. This is implemented with an eye toward simplicity. On the downside, you can't do things like peek more than one item ahead in the iterator. On the bright side, it doesn't require anything from itertools, etc., so it's less likely to encounter strange bugs, which occasionally do happen. Example usage:: >>> numbers = peekable(range(6)) >>> 0 >>> 1 >>> numbers.peek() 2 >&
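As the update says, `itertools.groupby` already handles sorted input lazily: it groups consecutive items that share a key, so it only needs to hold one group at a time. Here is a small sketch of that usage; the helper name `group_sorted` and the sample rows are made up for illustration.

```python
import itertools
import operator


def group_sorted(records, key):
    """Yield (key, items) pairs for `records` that are already sorted by `key`.

    groupby only merges *adjacent* items with equal keys, which is why the
    input has to be sorted (or at least clustered) by that key beforehand.
    """
    for group_key, items in itertools.groupby(records, key=key):
        # Materialize each group before advancing; the group iterator is
        # invalidated once groupby moves on to the next key.
        yield group_key, list(items)


# Hypothetical sample data, sorted by the first field.
ROWS = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5)]
GROUPED = list(group_sorted(ROWS, operator.itemgetter(0)))
```

Each group is turned into a list here for convenience; for very large groups you would consume `items` directly instead.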

Apple: Missing My Mac (Display)

I have a Dell Inspiron 6400. It's actually a really nice laptop. It has an Intel Core Duo, and its resolution is 1680x1050. For years, I've used various desktop backgrounds that were mostly gray. They all have something interesting to look at, but they all have very little color. In the past, I've had Apple notebooks, and I really liked the default blue Apple background; I find it quite comforting. I've tried to use the same background on a Dell, but for some reason it just irritates me. I've had two theories about this. One is that the background doesn't match the color of the rest of the notebook. The other is that the Apple display is nicer. Well, I'm sure everyone already knows the answer. Today, I held my Dell up to a big Dell cinema display being driven by a PowerBook. The difference was clear. Having seen them at the store, I wouldn't be surprised if the Apple cinema display was even nicer. It's depressing how faded my laptop loo

Python: Dealing with Huge Data Sets in MySQLdb

I have a table that's about 750 MB. It has 25 million rows. To do what I need to do, I need to pull it all into Python. That's okay; the box has 8 GB of RAM. However, when I run the query, "cursor.execute" never seems to return. I look at top, and I see that Python is taking up 100% of the CPU and a steadily increasing amount of RAM. Tracing through the code, I see that it is hung on: # > /usr/lib/python2.4/site-packages/MySQLdb/ # -> return self._result.fetch_row(size, self._fetch_type) I was hoping to stream data from the server, but it appears some C code is trying to store the whole result set in memory. After a few minutes, "show processlist;" in MySQL reports that the server is done, even with sending the data. So why won't "cursor.execute" hurry up and return? If you're wondering, unfortunately, I can't break this up into multiple queries. If I use limits to go through the data one chunk at a time,
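One way to keep the client from buffering everything is to pull rows in batches through the DB-API `fetchmany` call. The sketch below uses `sqlite3` as a stand-in so that it runs without a MySQL server; the table `numbers` and the helper `stream_rows` are hypothetical. With MySQLdb, the same loop would presumably be paired with a server-side cursor, created via `connection.cursor(MySQLdb.cursors.SSCursor)`, since the default cursor fetches the full result set during `execute`.

```python
import sqlite3


def stream_rows(cursor, query, batch_size=1000):
    """Yield rows one at a time while fetching them in fixed-size batches."""
    cursor.execute(query)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


def demo():
    """Build a tiny in-memory table and sum a column by streaming it."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE numbers (n INTEGER)")
    connection.executemany(
        "INSERT INTO numbers VALUES (?)", [(i,) for i in range(10)]
    )
    cursor = connection.cursor()
    total = sum(
        row[0]
        for row in stream_rows(cursor, "SELECT n FROM numbers ORDER BY n", 3)
    )
    connection.close()
    return total
```

Whether this actually bounds memory depends on the driver: the batch loop only helps if the cursor itself does not load the whole result up front.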

Python: Mako

There's a new Python templating engine called Mako. It's basically a modern, more-Pythonic version of Myghty, which is a Python version of Mason. It makes sense to switch if you're already using Myghty. It also makes sense to use if you're a Python guy who wants to avoid learning something new and just wants to dump a bit of Python in the middle of some HTML. I like Mike Bayer, Mako's author, but I prefer Genshi. Nonetheless, if Mike wants to go out and write another templating engine, more power to him! However, my feeling is that Python needs another templating engine like I need another open source kernel! <sarcasm>Yeah, thanks a lot Apple! Sure, Darwin's great! Too bad I can't use my airport card!</sarcasm> Seriously, I'd be a lot happier if they kept Darwin and released Cocoa. Now, that would be progress. *sigh* ;)

Vim: snippetsEmu

Just this morning, one of my buddies was ragging on me that TextMate was cool because of snippet expansion. I personally think this is optimizing the wrong thing since the typing part of programming is the easy part. Nonetheless, I'm happy to see that Vim has a knockoff. Best of all, it's easy to use and pretty useful. You can get the plugin here. Once you install it per the instructions, you can open up a Python file, insert the text "def", hit tab and get what's shown in the image. Hitting tab again jumps between the fields. Even better, there's a snippets file for Genshi.

Clustering: Hadoop

Google wrote a white paper called MapReduce: Simplified Data Processing on Large Clusters. It's a simple way to write software that works on a cluster of computers. Google also wrote a white paper on The Google File System. Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework. Put simply, Hadoop is an open-source implementation of Google's map/reduce and distributed file system wri
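To make the map/reduce idea concrete, here is a single-process word count in plain Python. It is only an illustration of the paradigm, not Hadoop's API: the function names are hypothetical, and a real cluster would run the map and reduce fragments on different nodes and rerun them on failure.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: each string stands in for one fragment of work.
COUNTS = word_count(["the quick fox", "the lazy dog", "the fox"])
```

Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` only on its own key, the fragments can be executed or reexecuted independently, which is the property the framework relies on.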