However, when I do the query, "cursor.execute" never seems to return. I look at top, and I see that Python is taking up 100% of the CPU and a steadily increasing amount of RAM. Tracing through the code, I see that the code is hung on:
> /usr/lib/python2.4/site-packages/MySQLdb/cursors.py(282)_fetch_row()
-> return self._result.fetch_row(size, self._fetch_type)

I was hoping to stream data from the server, but it appears some C code is trying to store all of it first. After a few minutes, "show processlist;" in MySQL reports that the server is done, including sending the data. So why won't "cursor.execute" hurry up and return?
If you're wondering: unfortunately, I can't break this up into multiple queries. If I use LIMIT clauses to go through the data one chunk at a time, I have to re-sort the data on every query, and I can't do the sorting in Python nearly as conveniently as I can in MySQL. Furthermore, the one simple query results in a single table scan, which is faster than most of the alternatives.

Anyway, I found out that MySQLdb has an under-documented streaming API. It all comes down to using a different type of cursor:
import MySQLdb
from MySQLdb.cursors import SSCursor

connection = MySQLdb.connect(...)

# Normally, you would use:
# cursor = connection.cursor()
# With SSCursor instead, MySQLdb reads rows from the server one at a time
# rather than buffering the entire result set in client memory.
cursor = SSCursor(connection)