Tuesday, April 05, 2011

PyCon: Disqus: Serving 400 million people with Python

Disqus is a commenting system for blogs.

It's reportedly the largest Django application.

They get 500 million visitors a month.

It's used by CNN, IGN, MTV, etc.

Every engineer at the company is also a product manager.

They have a flat company structure.

They're experiencing exponential traffic growth.

They have 100 servers.

They rent hardware.

Their Python code is CPU-bound.

They're using Apache + mod_wsgi.

Their background tasks are I/O-bound. They use Celery + gevent for background tasks. This saves memory compared with running a separate process for each background job.
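
The talk didn't show code for this, so here is a rough sketch of why cooperative concurrency suits I/O-bound jobs. It uses the standard library's asyncio as a stand-in for gevent's greenlets, and it is not Disqus's Celery setup. The names `fake_io_task`, `run_tasks`, and `run_concurrently` are made up for illustration.

```python
import asyncio
import time


async def fake_io_task(task_id, delay=0.05):
    # Hypothetical stand-in for an I/O-bound job such as an HTTP call.
    # While this task sleeps, the event loop runs the other tasks.
    await asyncio.sleep(delay)
    return task_id


async def run_tasks(count):
    # gather() runs all tasks concurrently and returns results in input order.
    return await asyncio.gather(*(fake_io_task(i) for i in range(count)))


def run_concurrently(count=100):
    # Run every task inside one process and one thread, and time the batch.
    start = time.perf_counter()
    results = asyncio.run(run_tasks(count))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_concurrently()
    print(len(results), "tasks finished in", round(elapsed, 2), "seconds")
```

All of the waiting overlaps inside a single process, so the batch takes roughly as long as one task rather than the sum of all of them, and there is no per-job process overhead.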

They use Graphite for monitoring. They use Etsy's statsd proxy for Graphite.
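
To give a feel for how cheap it is to "measure everything", here is a minimal sketch of sending a metric to statsd by hand. statsd accepts plain-text UDP datagrams of the form `name:value|type` (for example `c` for a counter, `ms` for a timer), and listens on port 8125 by default. The helper names `format_metric` and `send_metric` and the metric names are my own, not from the talk. In practice you'd probably use an existing client library.

```python
import socket


def format_metric(name, value, metric_type):
    # statsd line format: <name>:<value>|<type>
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    # Fire-and-forget UDP datagram; nothing is read back from the server.
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    # Hypothetical metric names, assuming a statsd daemon on localhost.
    send_metric("comments.posted", 1, "c")
    send_metric("comments.render_time", 320, "ms")
```

Because it's UDP, the application doesn't wait on the metrics server, which is part of what makes it practical to instrument code heavily.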

"Measure anything and everything."

They had to fork and monkey patch Django in order to scale.

They deploy to production 3-7 times a day.

They use Hudson for continuous integration. They have a large test suite that takes 30 minutes to run. They love Hudson, but they said it's too Java-oriented.

They do not suffer from not-invented-here (NIH) syndrome.

They do rolling deploys with fast rollbacks.

They use the unittest module and nose.

They use coverage.py and Pyflakes.

They love pep8.py and Pyflakes.

They learned Django and then learned Python.

Python package management is a mess.

They're a good Python shop, and they have a good engineering culture.

They have open sourced a bunch of stuff.