Showing posts from October, 2008

Ruby: An Interesting Block Pattern

Ruby has blocks, which enable all sorts of interesting idioms. I'm going to show one that will be familiar to Rails enthusiasts, but was new to me. I was reading some code in a book, and it had the following:

    def if_found(obj)
      if obj
        yield
      else
        render :text => "Not found.", :status => "404 Not Found"
        false
      end
    end

Here's how you call it:

    if_found(obj) do
      # We have a valid obj. Render something with it.
    end

The code in the block will only execute if the obj was found. If it wasn't found, the response will already have been taken care of.

I've been in the same situation in Python (using Pylons), and I coded something like:

    def handle_not_found(obj):
        if not obj:
            return render_404_page()
        return None

Here's how you call it:

    response = handle_not_found(obj)
    if response:
        return response
    # Otherwise, continue normally.

Pylons likes to return responses, whereas render in Ruby works as a side effect
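One way to get closer to the Ruby shape in Python is to pass the "block" in as a function, so the caller no longer has to check the return value by hand. The following is only a minimal sketch, not necessarily where the original post goes next. The names `if_found`, `show`, and the stand-in `render_404_page` are hypothetical, and plain tuples stand in for real framework responses.

```python
def render_404_page():
    # Hypothetical stand-in for a framework's "not found" response.
    return ("404 Not Found", "Not found.")


def if_found(obj, on_found, on_missing=render_404_page):
    """Call on_found(obj) only when obj is truthy.

    Otherwise return whatever on_missing() produces, so the caller
    always gets a response back and can simply return it.
    """
    if not obj:
        return on_missing()
    return on_found(obj)


def show(obj):
    # Hypothetical "block": runs only when obj was found.
    return ("200 OK", "Found %s" % obj)


found = if_found("widget", show)
missing = if_found(None, show)
```

Because both branches return a response, this fits a framework that expects the action to return something, rather than one where rendering happens as a side effect.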

Python: Some Notes on lxml

I wrote a webcrawler that uses lxml, XPath, and Beautiful Soup to easily pull data from a set of poorly formatted Web pages. In summary, it works, and I'm quite happy :)

The script needs to pull data from hundreds of Web pages, but not millions, so I opted to use threads. The script actually takes the list of things to look for as a set of XPath expressions on the command line, which makes it super flexible.

Let me give you some hints for the parts that I found difficult. First of all, here's how to install it. If you're using Ubuntu, then:

    apt-get install libxslt1-dev libxml2-dev
    # I also have python-dev, build-essential, etc. installed.
    easy_install lxml
    easy_install BeautifulSoup

If you're using MacPorts, do:

    port install py25-lxml
    easy_install BeautifulSoup

The FAQ states that if you use MacPorts, you may encounter difficulties because you will have multiple versions of libxml and libxslt installed. For instance, the following may segfault:

    python -c "

Python: Permission denied: '/var/www/.python-eggs'

I have a Pylons app, and I got the following exception in my logs:

    The following error occurred while trying to extract file(s) to the Python egg cache:

      [Errno 13] Permission denied: '/var/www/.python-eggs'

    The Python egg cache directory is currently set to:

      /var/www/.python-eggs

    Perhaps your account does not have write access to this directory? You can change the cache directory by setting the PYTHON_EGG_CACHE environment variable to point to an accessible directory.

The problem is that the app was running as www-data (which was the user created for nginx and Apache). www-data's home directory is /var/www, but it doesn't have write access to it. (I'm afraid of allowing write access so that it can unpack eggs into that directory because that directory is the web root. In general, you should be careful about what you put in the web root.)

There are a few ways to address this problem. One is to make sure to always use --always-unzip when installing eggs. An
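
The error message's own suggestion, pointing PYTHON_EGG_CACHE at a writable directory, can be sketched in Python as follows. This is an illustration under assumptions rather than the fix the post settles on. The helper name `use_private_egg_cache` and the choice of a fresh temporary directory are hypothetical, and the variable has to be set before anything triggers egg extraction (for example, at the very top of the script that starts the app).

```python
import os
import tempfile


def use_private_egg_cache(base_dir=None):
    """Point PYTHON_EGG_CACHE at a directory the current user can write to.

    Assumption: a fresh directory under the system temp dir (or base_dir,
    if given) is acceptable. mkdtemp creates it readable and writable only
    by the user running the process, and it lives outside the web root.
    """
    cache_dir = tempfile.mkdtemp(prefix="python-eggs-", dir=base_dir)
    os.environ["PYTHON_EGG_CACHE"] = cache_dir
    return cache_dir


# Call this before importing any package that is installed as a zipped egg.
cache_dir = use_private_egg_cache()
```

The same effect can be had by exporting the variable in whatever environment launches the app; doing it in code just keeps the setting next to the application.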