Python: Clearing sys.modules of Stale Modules

I'm posting this here in case someone might find it useful. I also submitted it to the Cookbook.

"""Clear ``sys.modules`` of specific types of modules if one is stale.

BASIC IDEA: Clear ``sys.modules`` of stale code without having to restart your
server. It's a hell of a lot harder to do right than it sounds.

"""

__docformat__ = "restructuredtext"


import os
import sys
import time

# Assumed import: the post never shows where ``properties`` comes from.
import properties

_lastModuleUpdate = time.time()


def clearModules():
    """Clear ``sys.modules`` of specific types of modules if one is stale.

    See ``properties.CLEAR_MODULES``.

    I took this method out of the ``InternalLibrary`` class so that you can
    call it *really* early, even before you create a ``Context`` to pass to
    ``InternalLibrary``.

    History
    -------

    The problem that this method solves is simple: if I change a file, I don't
    want to have to restart the server. It's a simple problem, but it's tough
    to implement right. To prevent repeating mistakes, here's what has failed
    in the past:

    * Remove all modules from ``sys.modules`` on every page load.

      - Some modules have state.

    * Delete only those modules that don't have state.

      - There's no convenient way to know which ones have state.

    * Use module attributes.

      - It's not convenient.

    * Delete only those modules that have grown stale.

      - If a parent class gets reloaded, child classes in other modules will
        need to get reloaded, but we don't know which modules those classes
        are in.

    * Look through all the modules for module references to the modules
      that'll get deleted and delete those too.

      - It's very common to import only the class, not the whole module.
        Hence, we still don't know which modules have child classes that
        need to get reloaded.

    * Just clear out ``sys.modules`` of all modules of certain types on every
      page load.

      - Even a moderate amount of kiddie clicking will result in exceptions.
        I think the browsers hide the problem, but you'll see the exceptions
        in the logs.

    * Clear out ``sys.modules`` of all modules of certain types on every page
      load, but only if at least one of the modules is stale.

      - This is good because it handles the kiddie clicking case, and it also
        handles the superclass case.

    """
    global _lastModuleUpdate
    if not properties.CLEAR_MODULES:
        return
    deleteTheseTypes = properties.CLEAR_MODULES
    if not isinstance(deleteTheseTypes, list):
        # Update Seamstress Exchange's properties file if you change this.
        deleteTheseTypes = ["aquarium.layout", "aquarium.navigation",
                            "aquarium.screen", "aquarium.widget"]
    # Collect the names first so that ``sys.modules`` isn't changed while
    # we're looping over it.
    deleteThese = [
        moduleName
        for moduleType in deleteTheseTypes
        for moduleName in list(sys.modules)
        if (moduleName == moduleType or
            moduleName.startswith(moduleType + "."))
    ]
    for i in deleteThese:
        # Built-in modules have no ``__file__``, and it can also be ``None``.
        filename = getattr(sys.modules[i], "__file__", None)
        if not filename:
            continue
        # Check the source file rather than the compiled ``.pyc``.
        if filename.endswith(".pyc") and os.path.exists(filename[:-1]):
            filename = filename[:-1]
        if _lastModuleUpdate < os.stat(filename).st_mtime:
            staleModules = True
            break
    else:
        # The loop finished without finding a stale module.
        staleModules = False
    if staleModules:
        for i in deleteThese:           # You can't modify a dictionary
            sys.modules.pop(i, None)    # during an iteration.
        _lastModuleUpdate = time.time()
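
If you want to try the staleness check without an Aquarium install, here's a minimal, self-contained sketch of the same idea. It is not the code above: ``stale_modules`` and the throwaway module ``demo_screen`` are hypothetical names I made up for the demo, and the file's mtime is pushed into the future with ``os.utime`` only so the demo doesn't have to sleep.

```python
import importlib
import os
import sys
import tempfile
import time


def stale_modules(prefixes, since):
    """Return names of loaded modules under ``prefixes`` modified after ``since``."""
    stale = []
    for name, module in list(sys.modules.items()):
        if not any(name == p or name.startswith(p + ".") for p in prefixes):
            continue
        filename = getattr(module, "__file__", None)
        if not filename or not os.path.exists(filename):
            continue  # Built-in or otherwise file-less module.
        if os.path.getmtime(filename) > since:
            stale.append(name)
    return stale


# Set up a throwaway module in a temporary directory (hypothetical name).
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
sys.dont_write_bytecode = True  # Keep cached bytecode out of the demo.
path = os.path.join(workdir, "demo_screen.py")

with open(path, "w") as f:
    f.write("VERSION = 1\n")
importlib.invalidate_caches()
first = importlib.import_module("demo_screen")
loaded_at = time.time()

# "Edit" the file and make its mtime clearly newer than ``loaded_at``.
with open(path, "w") as f:
    f.write("VERSION = 2\n")
os.utime(path, (loaded_at + 10, loaded_at + 10))

found = stale_modules(["demo_screen"], loaded_at)
for name in found:
    sys.modules.pop(name, None)

# The next import re-executes the file because the cache entry is gone.
importlib.invalidate_caches()
second = importlib.import_module("demo_screen")
print(first.VERSION, second.VERSION)
```

Note that the old module object (``first``) is still alive and unchanged; only new imports see the new code. That's the reason the real function has to care about who else is holding references.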

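The superclass case from the History list is easier to see in a small experiment. This is only an illustration under assumptions: ``demo_base`` and ``demo_child`` are hypothetical modules written to a temporary directory, standing in for a parent class and a child class that live in different files.

```python
import importlib
import os
import sys
import tempfile

workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
sys.dont_write_bytecode = True  # Keep cached bytecode out of the demo.


def write(name, source):
    """Write a hypothetical demo module into the temporary directory."""
    with open(os.path.join(workdir, name + ".py"), "w") as f:
        f.write(source)
    importlib.invalidate_caches()


write("demo_base",
      "class Base:\n    def greet(self):\n        return 'old'\n")
write("demo_child",
      "from demo_base import Base\n\nclass Child(Base):\n    pass\n")

child = importlib.import_module("demo_child")
before = child.Child().greet()

# Edit the parent class, then delete ONLY the stale module.
write("demo_base",
      "class Base:\n    def greet(self):\n        return 'updated'\n")
del sys.modules["demo_base"]
new_base = importlib.import_module("demo_base")

# ``demo_child`` is still cached, and ``Child`` still inherits from the
# old ``Base`` class object, so the edit is invisible through it.
partial = importlib.import_module("demo_child").Child().greet()

# Clearing the whole group forces ``demo_child`` to re-import the parent.
for name in ("demo_base", "demo_child"):
    sys.modules.pop(name, None)
full = importlib.import_module("demo_child").Child().greet()

print(before, partial, full)
```

Since ``demo_child`` imports the class rather than the module, scanning for module references wouldn't have found it either, which is why clearing every module of the configured types is the approach that stuck.
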
Comments

Anonymous said…
Reloading modules is indeed very hard, even more so when the modules you want to reload are directly associated with web pages in a web application. The code in mod_python tries to do it, but it has a lot of problems. See the following page for a list of the sorts of problems that can occur:

http://www.dscpl.com.au/articles/modpython-003.html

For mod_python at least, a completely new module importer has been written that addresses the bulk of the problems. Hopefully it will be included in a version of mod_python sometime later this year. You may want to keep an eye out for information on it as it becomes available.