Skip to main content

Python: Database Migrations

As part of my day job, I've written a Rails-style database migration script. This lets you write migrations from one version of a schema to the next. This allows you to develop schemas iteratively. It also lets you upgrade or downgrade the schema. Best of all, if an attempted upgrade fails, it can back it out even if you're not using transactions. Of course, this is based on writing "up" and "down" routines--it's practical, not magical.

I'm releasing this code in the hope that others will find it useful. It's well-written, solid, and well-tested. This is the type of thing you could probably write in a day. I took four, and polished the heck out of it.

It uses SQLAlchemy to talk to the database. However, that doesn't mean you have to use SQLAlchemy. Personally, I like writing table create statements by hand. You can do either.

My database configuration is stored in a .ini file ala Paste / Pylons. Hence, the script takes a .ini file to retrieve the database configuration. If you don't use Pylons, but you still want to use my script, that's an easy change to make. Migrations are stored in Python modules named like "${yourpackage}.db.migration_${number}.py". Again, I use Pylons to figure out what "${yourpackage}" is, but that's easy enough to change.

The name of my Pylons app is "multicosmic", and the script is installed in my application. You'll need to change the name to match your app.

Start by creating directories and files for "multicosmic/db" and "multicosmic/scripts".

First, there's a migrate script in "multicosmic/scripts/":
#!/usr/bin/env python

"""This is a script to apply database migrations.

Run this script with the -h flag to get usage information.

Migration Modules

Each migration is a module stored in
``${appname}/db/migration_${revision}.py`` where revision starts at 000
(i.e. an empty database). Each such module should have a module-level
global named migration containing a list of pairs of atoms. For

migration = [
# (up, down)

The up and down atoms may either be SQL strings, or they may be
functions that accept a SQLAlchemy connection.

Since I'm using SQLAlchemy, you might wonder why I'm writing actual SQL.
I like to use the SQLAlchemy ORM. However, when creating tables in
MySQL, there are so many fancy options that I find it easier to write
the SQL by hand.

Error Handling

* If something goes wrong when down migrating, just let the exception

* If something goes wrong when up migrating, complain, try to back it
out, and then let the exception propagate. If backing it out fails,
just let that exception propagate.

* Use transactions as appropriate. There are a lot of cases in
MySQL where transactions aren't supported. Hence, backing things
out is sometimes necessary. However, it's also possible that a
transaction might rollback, and then the code to back things out
runs anyway. It's best to make your down atoms idempotent. For
instance, use "DROP TABLE IF EXISTS" rather than just "DROP

Avoiding SQLAlchemy, Pylons, Paste, and Python 2.5

I'm using SQLAlchemy, but that doesn't force you to use SQLAlchemy in
the rest of your app. I'm using Paste's configuration mechanism because
that's how my database configuration information is stored. Passing a
CONFIG.ini to the script meets the needs of Paste and Pylons users
perfectly. If you're not one of those users and you want to use my
script, it's easy to subclass it and do something differently.
Similarly, if you're not using Python 2.5, I'm happy to remove the
Python 2.5-isms. Let's talk!


# Copyright: "Shannon -jj Behrens <>"
# License: I am contributing this code to the Pylons project under the same license as Pylons.

from __future__ import with_statement

from contextlib import contextmanager, closing
from glob import glob
from optparse import OptionParser
import os
import re
import sys
import traceback

from paste.deploy import loadapp
from pylons import config as conf
from pylons.database import create_engine

__docformat__ = "restructuredtext"

class Migrate:

"""This is the main class that runs the migrations."""

def __init__(self, args=None):
"""Set everything up, but don't run the migrations.

This defaults to ``sys.argv[1:]``.


def setup_option_parser(self, args):
"""Parse command line arguments."""
self.args = args
usage = "usage: %prog [options] CONFIG.ini"
self.parser = OptionParser(usage=usage)
self.parser.add_option('-r', '--revision', type='int',
help='schema revision; defaults to most current')
self.parser.add_option('-p', '--print-revision', action="store_true",
help='print current revision and exit')
self.parser.add_option("-v", "--verbose", action="store_true",
(self.options, self.args) = self.parser.parse_args(self.args)
if len(self.args) != 1:
self.parser.error("Expected exactly one argument for CONFIG.ini")

def run(self):
"""Run the migrations.

All database activity starts from here.

self.engine = create_engine()
self.engine.echo = bool(self.options.verbose)
with closing(self.engine.connect()) as self.connection:
if self.options.print_revision:
print self.current_revision
for migration in self.desired_migrations:

def load_configuration(self):
"""Load the configuration."""
loadapp('config:%s' % self.args[0], relative_to=os.getcwd())
except OSError, e:
dburi = conf.get('sqlalchemy.dburi')
if not dburi:
self.parser.error("%s: No sqlalchemy.dburi found" % self.args[0])

def find_migration_modules(self):
"""Figure out what migrations exist.

They should start at 000.

package = conf['pylons.package']
module = __import__(package + '.db', fromlist=['db'])
dirname = os.path.dirname(module.__file__)
glob_pattern = os.path.join(dirname, 'migration_*.py')
files = glob(glob_pattern)
basenames = map(os.path.basename, files)
for (i, name) in enumerate(basenames):
expected = '' % i
if name != expected:
raise ValueError("Expected %s, got %s" % (expected, name))
self.migration_modules = []
for name in basenames:
name = name[:-len('.py')]
module = __import__('%s.db.%s' % (package, name),

def find_desired_revision(self):
"""Find the target revision."""
len_migration_modules = len(self.migration_modules)
if self.options.revision is None:
self.desired_revision = len_migration_modules - 1
self.desired_revision = self.options.revision
if (self.desired_revision < 0 or
self.desired_revision >= len_migration_modules):
"Revision argument out of range [0, %s]" %
(len_migration_modules - 1))

def find_current_revision(self):
"""Figure out what revision we're currently at."""
if self.connection.execute(
"SHOW TABLES LIKE 'revision'").rowcount == 0:
self.current_revision = 0
result = self.connection.execute(
"SELECT revision_id FROM revision")
self.current_revision = int(result.fetchone()[0])

def find_desired_migrations(self):
"""Figure out which migrations need to be applied."""
self.desired_migrations = [
for i in self.migration_range

def find_migration_range(self):

"""Figure out the range of the migrations that need to be applied."""

if self.current_revision <= self.desired_revision:

# Don't reapply the current revision. Do apply the
# desired revision.

self.step = 1
self.migration_range = range(self.current_revision + self.step,
self.desired_revision + self.step)

# Unapply the current revision. Don't unapply the
# desired revision.

self.step = -1
self.migration_range = range(self.current_revision,
self.desired_revision, self.step)

def print_overview(self):
"""If verbose, tell the user what's going on."""
if self.options.verbose:
print "Current revision:", self.current_revision
print "Desired revision:", self.desired_revision
print "Direction:", ("up" if self.step == 1 else "down")
print "Migrations to be applied:", self.migration_range

def apply_migration(self, migration):
"""Apply the given migration list.

This is a migration module.

name = migration.__name__
revision = self.parse_revision(name)
if self.options.verbose:
print "Applying migration:", name
if self.step == -1:
with self.manage_transaction():
for (up, down) in reversed(migration.migration):
self.record_revision(revision - 1)
undo_atoms = []
with self.manage_transaction():
for (up, down) in migration.migration:
except Exception, e:
print >> sys.stderr, "An exception occurred:"
print >> sys.stderr, "Trying to back out migration:", name
with self.manage_transaction():
for down in reversed(undo_atoms):
print >> sys.stderr, "Backed out migration:", name
print >> sys.stderr, "Re-raising original exception."

def apply_atom(self, atom):
"""Apply the given atom. Let exceptions propagate."""
if isinstance(atom, basestring):

def parse_revision(self, s):
"""Given a string, return the revision number embedded in it.

Raise a ValueError on failure.

match ='(\d+)', s)
if match is None:
raise ValueError("Couldn't find a revision in: %s" % s)
return int(

def record_revision(self, revision):
"""Record the given revision.

The current revision is stored in a table named revision.
There's nothing to do if revision is 0.

if revision != 0:
self.connection.execute("UPDATE revision SET revision_id = %s",
self.current_revision = revision

def manage_transaction(self):
"""Manage a database transaction.


with self.manage_transaction():

transaction = self.connection.begin()

if __name__ == '__main__':
It comes with two migrations.

"""This is the first migration.

It doesn't really do anything; it represents an empty database. It
makes sense that a database at revision 0 should be empty.


__docformat__ = "restructuredtext"

migration = []
"""Create the revision table with a revision_id column."""

__docformat__ = "restructuredtext"

# I'm using a creative whitespace style that makes it readable both here
# and when printed.

migration = [
CREATE TABLE revision (
revision_id INT NOT NULL
DROP TABLE IF EXISTS revision"""),

# Subsequent migrations don't need to manage this value. The
# script will take care of it.

INSERT INTO revision (revision_id) VALUES (1)""",
DELETE FROM revision""")
Last of all, there are test cases in multicosmic/tests/functional/
"""Test that the migrate script works."""

# Copyright: "Shannon -jj Behrens <>"
# License: I am contributing this code to the Pylons project under the same license as Pylons.

from cStringIO import StringIO
import sys

from import assert_raises
from sqlalchemy.exceptions import SQLError

from multicosmic.scripts.migrate import Migrate
from multicosmic.db.migration_001 import migration as migration_001

__docformat__ = "restructuredtext"

BASE_ARGS = ['-v', 'test.ini']

def setup_module():

def teardown_module():

def test_setup_option_parser():
migrate = Migrate(['-r1'] + BASE_ARGS)
assert migrate.options.revision == 1
assert migrate.options.verbose

def test_bad_up_migration():
orig_stderr = sys.stderr
fake_stderr = StringIO()
migration_001.append(("INSERT INTO garbage", "DELETE FROM garbage"))
sys.stderr = fake_stderr
migrate = Migrate(['-r1'] + BASE_ARGS)
sys.stderr = orig_stderr
assert fake_stderr.getvalue()
migrate = Migrate(['-p'] + BASE_ARGS)
assert migrate.current_revision == 0

def test_bad_down_migration():
migration_001.append(("INSERT INTO garbage", "DELETE FROM garbage"))
migrate = Migrate(['-r0'] + BASE_ARGS)
migrate = Migrate(['-p'] + BASE_ARGS)
assert migrate.current_revision == 1

def _do_migration(revision=None):
"""Construct and run the Migrate class. Return it."""
args = BASE_ARGS
if revision is not None:
args = ['-r%s' % revision] + BASE_ARGS
migrate = Migrate(args)
if revision is None:
assert migrate.current_revision > 0
assert migrate.current_revision == revision
return migrate
I use nose for my tests. You can find out more about using nose with Pylons, including things to watch out for, here.

If this code works out for you, leave me a comment :)


Max Ischenko said…
Cool. I need something like this and was about to write one. I'll see if I can use your solution verbatim (the only twist I need -- my upgrade scripts are written as plain .sql files. could you expand your Migrate() class to support this behavior out of the box?). If you setup svn rep I could probably contribute back.
jjinux said…
Cool. Well, I hope it works out for you.

I'm using plain SQL, but I'm embedding it in Python files. There's a reason. A single migration might involve multiple steps. For instance, you might create three tables, and each CREATE TABLE is a step. In MySQL, there are a lot of things that ignore transactions. Hence, you might want to create, create, alter, create, but if anything fails, you want it to back out exactly what succeeded. Using Python, I can create a list of up/down pairs like: [(up, down), (up, down), (up, down)].
Anonymous said…
Any plans to mention this on the pylons or SA lists? Or did I miss it?

As you likely know it's an oft-requested feature...
jjinux said…
I sent email to Ben and Mike Bayer. I asked Ben to mention it on the Pylons mailing list. If you mention it on those mailing lists, I'd be quite grateful. I'm not currently subscribed because I'm in "just-had-a-baby-working-for-a-startup" mode. I'd also be happy to contribute this code to either of those projects.
Noah Gift said…
Nice script. OptionParser is a treat isn't it?

The "one" thing I wish it had was built in support for config file parsing...

Keep up the good work.
jjinux said…
> Nice script.

Thanks ;)
Unknown said…

Thanks !

this script seems to be the closest thing I've been looking for.

And frankly, I'm pretty sure TONS of other people are looking for something similar.

What about trying to make this script a completly standalone project, that is unrelated to a specific web framework ?
Indeed, the feeling of having it tied to another project would only be harmful to widen its use.

It could definitely be seen as the tool of choice to support agile database refactoring development. !!!

Thanks again !
Sami Dalouche
jjinux said…
Thanks for the comments, Sami.

I didn't have time to make this into a full project, so I figured the best I could do was blog about it and release the code. I hope that at least helps you out.

I do think there are a few parts that are going to be specific to your individual situations such as a) how to get the database connection parameters b) whether you want to use SQLAlchemy or something else c) where to get your migrations from, etc.

I think the truly generic parts are in just a few functions, such as knowing how to correctly apply a migration and back it out safely.

Happy Hacking!

Popular posts from this blog

Ubuntu 20.04 on a 2015 15" MacBook Pro

I decided to give Ubuntu 20.04 a try on my 2015 15" MacBook Pro. I didn't actually install it; I just live booted from a USB thumb drive which was enough to try out everything I wanted. In summary, it's not perfect, and issues with my camera would prevent me from switching, but given the right hardware, I think it's a really viable option. The first thing I wanted to try was what would happen if I plugged in a non-HiDPI screen given that my laptop has a HiDPI screen. Without sub-pixel scaling, whatever scale rate I picked for one screen would apply to the other. However, once I turned on sub-pixel scaling, I was able to pick different scale rates for the internal and external displays. That looked ok. I tried plugging in and unplugging multiple times, and it didn't crash. I doubt it'd work with my Thunderbolt display at work, but it worked fine for my HDMI displays at home. I even plugged it into my TV, and it stuck to the 100% scaling I picked for the othe

ERNOS: Erlang Networked Operating System

I've been reading Dreaming in Code lately, and I really like it. If you're not a dreamer, you may safely skip the rest of this post ;) In Chapter 10, "Engineers and Artists", Alan Kay, John Backus, and Jaron Lanier really got me thinking. I've also been thinking a lot about Minix 3 , Erlang , and the original Lisp machine . The ideas are beginning to synthesize into something cohesive--more than just the sum of their parts. Now, I'm sure that many of these ideas have already been envisioned within , LLVM , Microsoft's Singularity project, or in some other place that I haven't managed to discover or fully read, but I'm going to blog them anyway. Rather than wax philosophical, let me just dump out some ideas: Start with Minix 3. It's a new microkernel, and it's meant for real use, unlike the original Minix. "This new OS is extremely small, with the part that runs in kernel mode under 4000 lines of executable code.&quo

Haskell or Erlang?

I've coded in both Erlang and Haskell. Erlang is practical, efficient, and useful. It's got a wonderful niche in the distributed world, and it has some real success stories such as CouchDB and Haskell is elegant and beautiful. It's been successful in various programming language competitions. I have some experience in both, but I'm thinking it's time to really commit to learning one of them on a professional level. They both have good books out now, and it's probably time I read one of those books cover to cover. My question is which? Back in 2000, Perl had established a real niche for systems administration, CGI, and text processing. The syntax wasn't exactly beautiful (unless you're into that sort of thing), but it was popular and mature. Python hadn't really become popular, nor did it really have a strong niche (at least as far as I could see). I went with Python because of its elegance, but since then, I've coded both p