Tuesday, April 05, 2011

PyCon: The Data Structures of Python

The Data Structures of Python

Use types idiomatically.

Sometimes you don't have a choice.

Be efficient when it doesn't cost you anything.

Think about set vs. frozenset, mutable vs. immutable.

There is now a collections.OrderedDict class in Python 2.7 and 3.1.
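
To make that concrete, here's a minimal sketch of what OrderedDict gives you. The keys and counts are made up for illustration; only OrderedDict and its popitem(last=False) method come from the standard library.

```python
from collections import OrderedDict

# Keys come back in the order they were inserted.
visits = OrderedDict()
visits["home"] = 3
visits["about"] = 1
visits["contact"] = 2

order = list(visits.keys())

# popitem(last=False) removes the oldest entry first (FIFO);
# the default, last=True, removes the newest (LIFO).
oldest = visits.popitem(last=False)
```

Note that plain dicts also preserve insertion order as of Python 3.7, but OrderedDict still offers extras such as popitem(last=False) and move_to_end().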

Sometimes you need your data structure to address more than one concern. Use combinations of things from the standard library.

collections.deque is a linked list. Its popleft() and appendleft(item) operations are O(1), whereas the equivalent pop(0) and insert(0, item) operations on normal lists are O(n). In Python 2.7, there's a maxlen parameter for the deque class (which I assume turns it into a circular queue).
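
Here's a small sketch of the deque operations in question. The variable names are just for illustration. With maxlen set, appending to a full deque silently drops items from the opposite end, so it behaves like a sliding window.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # O(1) insert at the left end
first = queue.popleft()  # O(1) removal from the left end

# With maxlen, a full deque discards items from the opposite end
# whenever a new one is appended.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)

window = list(recent)    # only the last three items remain
```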

You can use the array type to efficiently represent an array of ints, etc.
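
A quick sketch of the array module, assuming you want signed ints (type code "i"). The values are made up for illustration.

```python
from array import array

# Type code "i" stores signed C ints (at least 2 bytes each,
# typically 4), packed without per-item Python object overhead.
numbers = array("i", [1, 2, 3, 4])
numbers.append(5)

total = sum(numbers)
bytes_used = numbers.itemsize * len(numbers)
```

An array only accepts values that fit its type code, so appending a float or a string to this one raises TypeError.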

heapq is also interesting.
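
As a rough idea of why, here's a sketch that uses heapq as a priority queue. The task names and priorities are hypothetical; heappush, heappop, and nsmallest are the real module functions.

```python
import heapq

# A heap is just a list that heapq keeps partially ordered,
# so the smallest item is always at index 0.
tasks = []
heapq.heappush(tasks, (2, "write tests"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "update docs"))

# heappop always returns the smallest item (lowest priority number here).
next_task = heapq.heappop(tasks)

# nsmallest works on any iterable, not just an existing heap.
smallest_two = heapq.nsmallest(2, [5, 1, 4, 2, 3])
```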

collections.abc contains abstract base classes.

"Don't subclass dict ever!" He said this is true of other containers as well, because there are too many edge cases. You should subclass things like collections.Mapping instead (in Python 3.3 and later, that's collections.abc.Mapping).
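
Here's a minimal sketch of what subclassing the ABC looks like. LowerCaseMapping is a hypothetical class I made up for illustration, not something from the talk. You implement __getitem__, __iter__, and __len__, and Mapping supplies get(), keys(), items(), the "in" operator, and so on, all routed through your methods.

```python
from collections.abc import Mapping


class LowerCaseMapping(Mapping):
    """Read-only mapping with case-insensitive string keys (hypothetical)."""

    def __init__(self, data):
        # Normalize keys once, up front.
        self._data = {key.lower(): value for key, value in data.items()}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


headers = LowerCaseMapping({"Content-Type": "text/html"})
content_type = headers["CONTENT-TYPE"]
missing = headers.get("Accept")  # get() comes from Mapping for free
```

The difference from subclassing dict is that every inherited method goes through __getitem__, so the key normalization can't be bypassed by accident.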

There's an ordered set class on the Python Cookbook site. (Presumably, it combines a set with a list.)
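
I haven't reproduced the Cookbook recipe here. As a rough sketch of the idea, this hypothetical OrderedSet builds on the MutableSet ABC and keeps its items as the keys of a plain dict, which assumes Python 3.7 or later, where dicts preserve insertion order.

```python
from collections.abc import MutableSet


class OrderedSet(MutableSet):
    """Set that remembers insertion order (sketch, not the Cookbook recipe)."""

    def __init__(self, iterable=()):
        # Assumes Python 3.7+, where dict keys keep insertion order.
        self._items = dict.fromkeys(iterable)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        self._items.pop(item, None)


seen = OrderedSet("abracadabra")  # duplicates collapse, first-seen order kept
seen.add("z")
seen.discard("b")
ordered = list(seen)
```

MutableSet fills in the rest of the set interface (remove(), the |, &, and - operators, comparisons) from those five methods.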

Don't do more than necessary. ABCs (abstract base classes) can help.

You can use a frozenset as a dict key, and you can use frozensets as members in other sets.
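
A short sketch of both uses. The keys and values are made up for illustration.

```python
# A frozenset is hashable, so it can be a dict key. Element order
# doesn't matter, which suits unordered pairs like graph edges.
weights = {frozenset(["a", "b"]): 7}
weight = weights[frozenset(["b", "a"])]

# frozensets can also be members of another set.
groups = {frozenset([1, 2]), frozenset([2, 1]), frozenset([3])}

# A regular (mutable) set is unhashable, so it fails as a key.
try:
    {set([1, 2]): "x"}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
```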

Tuples are more efficient than lists: they're immutable, so they never need spare room to grow and take less memory.
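
One way to see this for yourself is sys.getsizeof. This is only a sketch, and the exact byte counts are a CPython implementation detail that varies by version and platform, so I'm not quoting numbers.

```python
import sys

items = range(10)
as_list = list(items)
as_tuple = tuple(items)

# getsizeof reports the container's own size in bytes, not the
# size of the objects it refers to.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

tuple_is_smaller = tuple_size < list_size
```

Tuples are also hashable (as long as their contents are), so they can be used as dict keys, which lists can't.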

Side note: htraf.htsql.org is an insanely good WSGI / REST interface for databases.

4 comments:

Nick said...

Minor point: "maxlen" on a deque turns it into a window rather than a circular queue (i.e. if it is full and you add something on the right, it implicitly drops the left-most elements and vice-versa)

metapundit.net said...

Thanks for the talk summaries JJ - this is great stuff.

I'm pretty sure it's HTSQL you're looking for: I recently saw the demo at http://htraf.htsql.org/

Shannon -jj Behrens said...

> Minor point: "maxlen" on a deque turns it into a window rather than a circular queue (i.e. if it is full and you add something on the right, it implicitly drops the left-most elements and vice-versa)

Ooo, nice! I've needed that data structure multiple times.

Shannon -jj Behrens said...

> Thanks for the talk summaries JJ - this is great stuff.

Thanks.

> I'm pretty sure its HTSQL you're looking for: I recently saw the demo at http://htraf.htsql.org/

Ah, that would explain why I couldn't find it on Google. Thanks!