Showing posts from April, 2011

Call Me Crazy: Calling Conventions

It's been said that the most significant contribution to computer science was the invention of the subroutine. Every major programming language has its own variation on how subroutines work. For instance, there are procedures, functions, methods, coroutines, etc. When implementing a compiled language or a virtual machine, you must decide how to implement subroutines at the assembly language level. For instance, you must decide how to pass arguments, what goes on the stack, what goes on the heap, what goes in registers, what goes in statically allocated memory, etc. These conventions are called calling conventions. You might not think of calling conventions very often, but they're an interesting topic. First of all, what does a subroutine provide over a simple goto? A subroutine provides a way to pass arguments, execute a chunk of code, and return to where you were before you called the subroutine. You can implement each of these yourself in assembly, but standardizing

Perl: Text-based Address Books

Over the years, I've stored my addresses in a variety of formats. I used to have a Palm Pilot. These days, I have an Android phone, and Google keeps track of my addresses. However, I've always had a second copy of my addresses in a text file. The format looks something like this: A & R Stock to Performance: Address: 2849 Willow Pass Rd. #A, Concord, CA 94519 Work Phone: (925) 689-1846 There are many advantages to using a text-based format. For instance, since I'm an expert Vim user, I can search and edit the file extremely quickly. Best of all, the format works on every operating system and never goes obsolete. The only downside is that I have to input the address twice: once for Google and once for my own notes. Long ago, I wrote a Perl script to convert my notes file into a Palm Pilot database. Even though I don't have a Palm Pilot anymore, I keep the script around. I can easily alter it to output the addresses in different formats. If you like t

Of Neuroscience and Jazz

I am neither a neuroscientist nor a very accomplished musician, but I'd like to talk about the intersection of neuroscience and music. I have a theory that there is a neuroscience basis for why it takes a mature musical palate to enjoy jazz. First, let me say a little something about neuroscience (based on the limited understanding I've gained by watching a bunch of talks). One of the things your brain is particularly good at is recognizing patterns and predicting patterns. At the lowest level, if two nerves are close to each other, and they both fire, it's counted as a pattern--i.e. those two things are connected. Similarly, if a nerve fires and then a short while later it fires again, that's a pattern as well. Hence, if both of my fingers feel something, there's a pattern, or if I feel a tapping on a single finger, that's a pattern as well. However, the brain is not limited to low level patterns. Rather, it can respond to a hierarchy of patterns. Para

Gaming: How Do Characters Know What to Say?

My wife and I like to play games like Paper Mario together. Paper Mario is a long game with a lot of dialog. At any point in the game, you can talk to any character, and that character will say something "sensible". For instance, they'll ask you to help them out, or they'll thank you if you've already helped them out. I've always wondered how that's coded. Similarly, I've always wondered how many ways I could think up to code it. The simplest approach is to use a complicated set of possibly nested if/else statements. For instance, if Mario has this item, then say this. Otherwise, if he has beaten this level, say that. Certainly that's a valid approach, and it doesn't even matter if it's slow. Since people read so slowly, trying to optimize how quickly you can come up with what text to show next is absolutely the last thing you would ever need to optimize in a video game. At the opposite extreme, this problem could be solved with

Ruby: My Take on Pivotal Labs, Part II

As I mentioned in my previous post , I have tremendous respect for how Pivotal Labs builds software. In this blog post, I want to cover why practices used at Pivotal Labs may not always be appropriate at other companies. The core of my argument is that Pivotal Labs is a consultancy; hence, their priorities are not always the same as the priorities for a startup building its own software. First of all, let me talk about full-time pair programming. In the book Professional Software Development , Steve McConnell states that NASA discovered that the single most effective way to reduce defects (in manufacturing, etc.) is to always have a second pair of eyes present (i.e. to work in pairs). However, NASA must go to extreme lengths to avoid defects because lives are on the line. That's rarely the case with most startups. Most defects are merely embarrassing. In many cases, code review may be more efficient than full-time pair programming. In some cases involving purely aesthetic

Ruby: My Take on Pivotal Labs, Part I

Pivotal Labs is one of my favorite companies. I have a tremendous amount of respect for how they develop software. They put the "extreme" in extreme programming, and more importantly, they get stuff done. However, there are some things that Pivotal Labs tend to do that I disagree with. I have such high regard for Pivotal Labs that I specifically try to find startups that started at Pivotal Labs when I'm looking for a job. Since these companies often make you do a three hour pair programming session on their code, I've seen the code of multiple companies that started at Pivotal Labs. Hence, although I don't know if Pivotal Labs has an official opinion on these topics, I've seen these things at enough companies that I feel it's worth commenting on. First of all, many of the companies that I interviewed at didn't use database-level foreign key constraints and database constraints in general. I know that it's trendy in the Rails world to try t

Math: Sierpinski's Triangle is a Variation of Pascal's Triangle

Here's a picture of Pascal's triangle from Wikipedia. It's animated just in case you don't remember how Pascal's triangle is created: Here's a picture of Sierpinski's triangle, also from Wikipedia: You can see in the top image that to calculate a spot in Pascal's triangle, you just add the two above spots. To get Sierpinski's triangle instead of Pascal's triangle, you just xor the above two spots. Cute trick, eh? I learned this trick while reading Concepts, Techniques, and Models of Computer Programming , which is a fantastic book, by the way. Here's my Oz code for printing out Pascal's triangle: % This is a generic version of Pascal's triangle that lets you specify the % operation instead of just using "+". declare GenericPascal OpList ShiftLeft ShiftRight fun {GenericPascal Op N} if N==1 then [1] else L in L={GenericPascal Op N-1} {OpList Op {ShiftLeft L} {ShiftRight L}} end end fun {OpList
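For readers who don't know Oz, here is a rough Python sketch of the same idea. It is not a translation of my Oz code; the function name `generic_pascal` is made up for this example, and I'm assuming the usual trick of padding the previous row with a zero on each side before combining the two copies.

```python
from operator import add, xor


def generic_pascal(op, n):
    """Return row n (1-based) of the triangle built with the binary `op`."""
    row = [1]
    for _ in range(n - 1):
        shifted_left = row + [0]    # previous row with a 0 appended
        shifted_right = [0] + row   # previous row with a 0 prepended
        # Each new spot combines the two spots above it.
        row = [op(a, b) for a, b in zip(shifted_left, shifted_right)]
    return row


pascal_row = generic_pascal(add, 5)
sierpinski_row = generic_pascal(xor, 5)

# Print a few xor rows; the 1s trace out Sierpinski's triangle.
for n in range(1, 9):
    cells = generic_pascal(xor, n)
    print("".join("*" if cell else " " for cell in cells))
```

The xor rows are just the Pascal rows taken modulo 2, which is why the two triangles line up.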

JavaScript: Jasmine

I went to a talk the other day on Jasmine : Jasmine is a behavior-driven development framework for testing your JavaScript code. It does not depend on any other JavaScript frameworks. It does not require a DOM. And it has a clean, obvious syntax so that you can easily write tests. describe("Jasmine", function() { it("makes testing JavaScript awesome!", function() { expect(yourCode).toBeLotsBetter(); }); }); Jasmine started life at Pivotal Labs and is like RSpec for JavaScript. Thanks to the flexibility of JavaScript, it has a powerful mocking / stubbing framework. Jasmine is not a replacement for Selenium. It's good for testing JavaScript functions that calculate things, but it doesn't try to make testing the DOM any easier. Just as unit tests written using RSpec are not a replacement for integration tests written using Cucumber and Webrat, similarly unit tests written using Jasmine are not a replacement for integration tests written using Seleni

Ruby: nil is a Billion Dollar Mistake

The fact is, I really like Ruby. However, there are some ways in which it uses nil that I really disagree with. For instance, in Ruby, if you try to look up something in a hash that doesn't exist, you get a nil. Similarly, if you try to reference an @attribute that hasn't been set yet, you'll get a nil. That reminds me of this article, Null References: The Billion Dollar Mistake : Tony Hoare introduced Null references in ALGOL W back in 1965 “simply because it was so easy to implement”, says Mr. Hoare. He talks about that decision considering it “my billion-dollar mistake”. Compounding this problem is Ruby's current lack of real keyword arguments (although, I know they're coming). Hence, if you pass a keyword argument like f(:foo => 1), and then try to use the keyword argument in the function like options[:foooo], the misspelling will result in a nil as if the argument hadn't been passed. This masks a real problem. All of these have resulted in real bu

Python: Adding New Methods to an Instance

It's easy to dynamically add new methods to a class in Python, but all my attempts to add new methods directly to an instance had always involved hacks. When I discovered that Ruby had syntax to add methods directly to an instance, my curiosity in the subject was rekindled. Fortunately, someone pointed me to this blog post that shows how. Why should you care? Most of the time, you shouldn't. You might want to use this trick if you were trying to do something like JavaScript's prototypal-based inheritance system. In my case, I think I was trying to do some funky monkey patching where I only wanted to monkey patch a particular instance instead of the class as a whole. Anyway, even if I can't come up with a good use case, I'm still glad I know how to do it ;)
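Here's a minimal sketch of one common way to do it, using `types.MethodType` from the standard library. I'm not claiming this is exactly what the linked post shows, and the `Greeter` class and `shout` function are made up for the example.

```python
import types


class Greeter:
    """A hypothetical class used only for illustration."""

    def __init__(self, name):
        self.name = name


def shout(self):
    # An ordinary function; `self` will be the instance it gets bound to.
    return self.name.upper() + "!"


alice = Greeter("alice")
bob = Greeter("bob")

# Bind the function to one instance only; the class itself is untouched.
alice.shout = types.MethodType(shout, alice)

print(alice.shout())
print(hasattr(bob, "shout"))
```

Only `alice` gets the new method, so this is monkey patching at the instance level rather than the class level.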

Linux: Fish Hanging

fish is a "user friendly command line shell for UNIX-like operating systems such as Linux." I've been using fish for about a year, and I really like it. Unfortunately, until recently, I had a problem: I couldn't log into virtual consoles in Linux. If I hit Ctrl-Alt-F1 and tried to log in, fish would just hang. This was mentioned on the mailing list a long time ago. I'm pleased to say that I finally solved the problem. I had the following code in my ~/.config/fish/ type fortune > /dev/null and begin fortune echo end Basically, this code says, "If fortune exists, run it. Otherwise, don't complain." I figured out through a process of elimination that was the culprit. I replaced it with: if status --is-interactive fortune echo end This code will result in an error if fortune doesn't exist, but it won't die. It turns out to be fine for me since I only run fish on my laptop, and I usually install fortune at the sa

JavaScript: Perfectly Encapsulated May Mean Perfectly Untestable

It's no secret that JavaScript is like Lisp in that you can accomplish amazing things using a huge number of small, nested functions. In fact, you can write things like: (function () { function pickNose() { } function fart() { } pickNose(); fart(); })(); In this code, an anonymous function is defined and immediately called. pickNose() and fart() are two internal functions that are used by the outer function, but they are not available to the outside world. It's amazing what you can get done using nested closures like this, but there's a cost. How do you write tests for pickNose() and fart()? Certainly, you can write a test for the outer function as a whole, but there's no way to test those inner functions in a standalone way without doing some refactoring. In a certain sense, the code is like a script in that you can test the thing as a whole, but you can't test the parts in a standalone way. What's the solution? I'm sure th

Python: Using Louie with Twisted

Louie "provides Python programmers with a straightforward way to dispatch signals between objects in a wide variety of contexts. It is based on PyDispatcher, which in turn was based on a highly-rated recipe in the Python Cookbook." Louie is like an event system used in a GUI toolkit. Similarly, you can think of Louie as an internal pubsub system. Twisted is a framework for building asynchronous network servers. Using Louie in your Twisted applications can make your applications a little less "twisted". When you code a Twisted application, you often put a lot of your logic in custom Factory and Protocol classes. However, if you have one application that has to talk to, say, three different types of servers, and you have a custom Protocol and Factory class for each server (perhaps each server speaks a different network protocol), it can be confusing to have your application logic broken up all over the place. Louie can help with that. When you are implementi

JavaScript: JS.IO is Solid, Hookbox Might be the Bee's Knees

Recently, I needed to add realtime (i.e. comet, websockets, flash sockets, etc.) support to an application. JS.IO is a JavaScript library that provides a long polling system using the Comet Session Protocol (CSP). It doesn't try to do what Socket.IO does, i.e. abstract all the various transport mechanisms such as websockets and flash sockets. Rather, it provides long polling and leaves the switch to websockets or flash sockets to the user. There are server-side libraries to integrate with JS.IO using Twisted, Eventlet, Erlang, etc. My experiences with JS.IO were very positive. Although the documentation was sorely lacking, Michael Carter (the author) was extremely helpful when I was getting started. Furthermore, my testing with proved that JS.IO was very reliable across an incredible range of browsers. It was the most reliable comet library I tried. By the way, I was doing this cross-domain. These days, Michael Carter has started a new project call

JavaScript: Socket.IO Didn't Meet Needs

Recently, I needed to add realtime (i.e. comet, websockets, flash sockets, etc.) support to an application. Socket.IO is a library built on top of Node.JS that "aims to make realtime apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms." Since Socket.IO did exactly what I needed, I was hoping it would solve my problems easily and that I wouldn't have to implement what Socket.IO did myself. Unfortunately, things didn't work out so well. I had to do things in a cross-domain manner. Although the browser support list for Socket.IO is very good, that didn't match up with my actual experience. I built a simple application that tried to send and receive a message using Socket.IO, and then it reported on which transport was used. Unfortunately, many of the browsers that I wanted to support such as IE 6 and 7 and Opera just didn't work, even though they were supposed to. Here are some of my result

ZeroMQ is Amazing!

Recently, I had to improve the performance of mesh networking in a mesh of, say, 10 nodes. The original code used a simple RPC system built using JSON on top of netstring. Every message to every node involved a new connection. On a cluster of 10 nodes, I was getting 30 messages per second. I enhanced the code by using persistent connections. I also switched from RPC (i.e. using a whole roundtrip that blocks the whole connection) to message passing (i.e. passing a message doesn't necessarily result in a response and doesn't tie up the socket). This improved the performance to 300 messages per second. Next, my buddy encouraged me to try out ZeroMQ. Man was I amazed! I hit something like 1800 messages per second on a cluster of 10 nodes! I can only imagine what ZeroMQ was doing in order to hit this number. Perhaps it was batching messages more intelligently (from my experience, that's an amazingly effective technique). I ran the same test on a range of cluster size

Linux: ^\

Am I the only one who didn't know that you could use ^\ (i.e. control backslash) to kill a process when ^c doesn't work? Usually, I have to use ^z to background the process, and then type kill -9 %1. I think ^\ makes the process dump core, but since dumping core seems to be turned off by default, it works out well. Here's an example of my killing a process under fish (my shell): fish: Job 1, “nosetests” terminated by signal SIGQUIT (Quit request from job control with core dump (^\)) Thanks to Jeff Lindsay for the tip.
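If you want to see the same thing without a keyboard, here's a small sketch that sends SIGQUIT (the signal the terminal delivers on ^\) to a child process from Python. It assumes a POSIX system; the sleeping child and the `reset_sigquit` helper are just stand-ins I made up for the example.

```python
import signal
import subprocess
import sys


def reset_sigquit():
    # Runs in the child just before exec: make sure SIGQUIT has its default
    # action even if the parent environment had it ignored.
    signal.signal(signal.SIGQUIT, signal.SIG_DFL)


# Start a child that would otherwise sit there for a minute.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    preexec_fn=reset_sigquit,
)

# This is the signal the terminal sends to the foreground job on ^\.
child.send_signal(signal.SIGQUIT)
child.wait(timeout=10)

# On POSIX, a negative return code -N means "terminated by signal N".
killed_by = signal.Signals(-child.returncode)
print(killed_by.name)
```

Whether a core file actually gets written depends on the core size limit (`ulimit -c`), which matches my guess above about core dumps being turned off by default.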

Videos: IBM Centennial Film: 100 X 100 - A century of achievements that have changed the world

IBM Centennial Film: 100 X 100 - A century of achievements that have changed the world The film features one hundred people, who each present the IBM achievement recorded in the year they were born. The film chronology flows from the oldest person to the youngest, offering a whirlwind history of the company and culminating with its prospects for the future. I found the film to be very moving.

Python: I'm Looking for a Python Instructor

I'm looking for a talented, friendly Python programmer to do corporate training. I'm helping out a company called Marakana. They do corporate training and have multiple gigs lined up for Python. Unfortunately, their normal Python instructor is getting a little bit busy right now, and so am I. If you're interested in giving a four day training session on Python, please send email to me at jjinux at gmail dot com. My buddy Robert Zuber has already prepared the course materials, and the pay is pretty decent. They're also looking for people to teach Ruby on Rails, HTML5, JavaScript, and Android, but I figured most of my readers are Python programmers. I'm sure this is obvious, but you must be comfortable with public speaking. You get bonus points if you've given a talk at PyCon or a local users group. You get double bonus points if you're well known in the Python community, especially if I know you ;) Update: The response was pretty overwhelming, so I

PyCon: Closing Lightning Talks

PyCon will be held in Montreal in 2014 and 2015. Twiggy is a new Pythonic logger. It has a totally new design (i.e. it's not like log4j). It's the first really new logging design in 15 years. It uses lots of chained method calls like jQuery. It makes parsing logs easier. It has a modern config system. It has better traceback printing. It has an asynchronous logger. Askbot is a Stack Overflow clone in Python. The Python Miro Community has Python videos. They're rolling out universal subtitle support. Minuteman is a tool to replace your It acts as a workspace and project manager. It is like zc.buildout. It's also like Maven and Gentoo. It's a "metabuild system." It has no docs, no tests, and no users. How Old is My Kid? is a website that helps you figure out your kid's age in days, months, and years. flufl.enum is an enum library written by Barry Warsaw. MOE_write is a library for dealing with Python2 vs. Python3. It'

PyCon: Hidden Treasures in the Standard Library

Hidden Treasures in the Standard Library The talk was by Doug Hellmann. He writes Python Module of the Week . By the way, that is the most popular Python blog according to Google Reader, at least the last time I checked. Doug is publishing the series of blog posts as a book. Oh, and he's a nice guy :) Side note: I met a guy who worked at a company that provided mapping software. The software was used by Google maps. He worked on the routing algorithms. He said the whole system consisted of 200 million lines of code. (I'm not sure how that's possible.) However, he also said that they distribute an SDK that contains .NET, the JRE, and Python within it. Use the csv module with your own "dialects" to handle data that has fields with characters between each field. SQLite3 has been added to the standard library. You can create custom column types in Python. You can create signed, serialized data using the hmac library. You can serialize data using the

PyCon: Greasing the Wheels of Exploration with Python

Greasing the Wheels of Exploration with Python Here's the talk summary: The control of the Mars Exploration Rovers (MER) requires a complex set of coordinated activities by a team. Early in the MER mission the author automated in Python much of the task of one of the operation positions, the Payload Uplink Lead, for 7 of the 9 cameras on each rover. This talk describes the MER rovers, the operation tasks and that implemented system. They used gigapan images. They use virtual reality to visualize what's going on. Dust was a serious problem for the rovers. There's lots of Python on the rovers used to control the rovers. The speaker's background is in machine learning and robotics. The rovers have been running for 6-7 years. They find 1-2 bugs a year. Bugs are usually fixed in a matter of hours. They use Ames Vision Workbench, Nebula, and OpenStack. All three of these are open source. The speaker was from Ames Research Center, NASA. Side note: unfortunately,

PyCon: Fun with Python's Newer Tools

Fun with Python's Newer Tools collections.namedtuple works just like normal tuples, but it lets you assign a name to each field. It's fast, and there is no additional per-tuple space cost. There's a lot of cool stuff in the collections package. Named tuples can be used for "instance prototypes" by using "other_tuple._replace(field=5)". Don't be discouraged by the "_"--they had to do that so as to avoid a namespace conflict. You can subclass a named tuple. It's based on __slots__. In Python 3.2, there is a functools.lru_cache. It's a decorator. It accepts a maxsize argument. Side note: I missed the first five minutes of the talk.
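Here's a short sketch of the two tools from my notes, assuming Python 3.2 or later. The `Point` type and the `fib` function are made up for the example and were not part of the talk.

```python
from collections import namedtuple
from functools import lru_cache

# A named tuple behaves like a normal tuple, but each field has a name.
Point = namedtuple("Point", ["x", "y"])  # hypothetical example type

origin = Point(x=0, y=0)

# _replace returns a new tuple with one field changed ("instance prototype").
moved = origin._replace(x=5)

# It still unpacks and compares like a plain tuple.
x, y = moved


class LabeledPoint(Point):
    """Named tuples can be subclassed to add behavior."""

    __slots__ = ()  # keep instances as small as the plain named tuple

    def label(self):
        return "({}, {})".format(self.x, self.y)


@lru_cache(maxsize=128)
def fib(n):
    # Results are memoized, so each distinct n is computed only once.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(30)
print(moved, LabeledPoint(1, 2).label(), result, fib.cache_info())
```

`cache_info()` is handy for checking whether the cache is actually being hit.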

PyCon: An Open success for the cloud: OpenStack

Side note: I missed part of the talk because I had to leave early to go to mass. OpenStack turns a pile of hardware into a cloud. Swift is their object storage system. Nova is their system for provisioning virtual machines. Glance is their system for image storage. Burrow is their distributed message system. Dash is a Django user interface for managing virtual machines. NASA uses OpenStack. Rackspace is going to use it. They were acquired by Rackspace.

PyCon: Disqus: Serving 400 million people with Python

Disqus is a commenting system for blogs. It's the largest Django application. They get 500 million visitors a month. It's used by CNN, IGN, MTV, etc. Every engineer at the company is also a product manager. They have a flat company structure. They're experiencing exponential traffic growth. They have 100 servers. They rent hardware. Their Python code is CPU bound. They're using Apache + mod_wsgi. Their background tasks are IO bound. They use Celery + gevent for background tasks. This saves memory over using separate processes for each background job. They use Graphite for monitoring. They use Etsy's statsd proxy for Graphite. "Measure anything and everything." They had to fork and monkey patch Django in order to scale. They deploy to production 3-7 times a day. They use Hudson for continuous integration. They have a large test suite that takes 30 minutes to run. They love Hudson, but they said it's too Java oriented. They do not s

PyCon: Going Full Python - Threadless

The keynote was by a guy from Threadless. Threadless sells clever, artistic T-shirts. The company has lots of culture. It's very hip. The designs are community driven. The community votes on them. They don't actually print the shirts themselves. Threadless is in Chicago. Rahm Emanuel visited. After he visited, he Twittered , "Seriously, this city used to build things. Now we're just assholes with novelty t-shirts. I'm with motherfucking stupid." They switched from PHP to Django. They're still fighting a lot of technical debt. Threadless is an "art community". The speaker was hilarious.

PyCon: Lightning Talks

Read the Docs is a site that hosts documentation. The speaker said "shit" a lot. Chef or Puppet--choose one. Haystack is a project for doing full-text search. "Emacs pinky" is a real problem. DjangoZoom enables turnkey deployment for Django. PyCon did a donation drive for Japan. People could donate by texting to 90999. It wasn't actually very successful, but the Python Software Foundation pitched in to help out. JavaScript is like English--it's a real mess, but it works. In ECMAScript 5 (ES5), you can start your script with '"use strict";'. Firefox (because of the ES Harmony project) has added all sorts of weird additions to JavaScript. Many of them were taken from Python. flufl.i18n is a high level API for i18n. It's higher level than gettext. It makes it easy to handle multiple languages at the same time. It was written by Barry Warsaw at Canonical. PyWO is the "Python Window Organizer". It works on top of

PyCon: Supercomputer and Cluster Application Performance Analysis using Python

Supercomputer and Cluster Application Performance Analysis using Python The speaker was from Sandia labs. The goal of his project was to measure and track the performance of supercomputers. His software used Python, matplotlib, MySQL, etc. It was a Tkinter application. The interface was definitely created by an engineer ;) It was somewhat overwhelming. There was some UI in the application to manage database schemas. It's weird to see Tk applications when you're so accustomed to "dot-comish" web apps. It seems painful to me to have to recreate an admin to manage database schemas when good ones already exist. I think the speaker was running Windows 2000. There was some UI in the application for creating queries to query the database. The application could show graphs plotted by matplotlib. Kiviat charts are hard to code, but they're very useful for plotting multivariable data. His projects were called Pylot, Co-Pylot, and eCo-Pylot. He plans on open so

PyCon: HTTP in Python: which library for what task?

HTTP in Python: which library for what task? The talk was by a Mercurial guy. He works on Google Code. When you think of HTTP, what you're probably thinking of is RFC 1945, i.e. HTTP/1.0. HTTP/1.1 has features that you're probably not remembering to handle correctly. This resulted in a bug that took him a very long time to diagnose. HTTP/1.1 allows pipelining, which means you send all the requests right away and then wait for the responses to come in serialized over the same socket. Using chunked encoding lets you keep the connection open between requests even if you don't know how much data will be streamed. Did you know that you can specify additional headers between chunks? Did you know the "100 Continue" response gets sent before you finish sending the body? httplib (used by urllib2) is very minimal. It doesn't even do SSL certificate validation! It doesn't support keepalive. It doesn't have unit tests. httplib2 is just a wrapper aro

PyCon: Advanced Network Architectures With ZeroMQ

Advanced Network Architectures With ZeroMQ This was a talk on ZeroMQ by Zed Shaw. He and his talk were a lot more sedate than I expected. He's also older than I thought he was. He uses Emacs, screen, and Ubuntu. ZeroMQ servers should not be exposed to the naked internet. If you send it invalid protocol data, assertions will kill the process. The talk was very fast, but most of it was like the ZeroMQ guide.

PyCon: Exhibition of Atrocity

Exhibition of Atrocity Here's the summary: Believe it or not, but you can write pretty horrendously awful code even in a language as elegant as Python. Over the years, I've committed my share of sins; now it's time to come clean. Step right up for a tour of twisted, evil, and downright wrong code, and learn some strategies to avoid writing criminally bad code--if you dare! This was a fun talk full of fun examples. The speaker showed code he had worked on over the course of 11 years. They used Hungarian notation at one point! snake-guice is a simple, lightweight Python dependency injection framework based on google-guice.

PyCon: Using Coroutines to Create Efficient, High-Concurrency Web Applications

Using Coroutines to Create Efficient, High-Concurrency Web Applications Here's the summary: Creating high-concurrency python web applications is inherently difficult for a variety of reasons. In this talk, I'll discuss the various iterations of application server paradigms we've used at meebo, the advantages/disadvantages of each approach, and why we've settled on a coroutine-based WSGI setup to handle our high-concurrency web applications going forward. They started with CGI. Then they switched to mod_wsgi. Then they switched to Twisted. Finally, they switched to gevent and Gunicorn. They said that this was the best of both worlds. Guido said, "I hate callback based programming." The speaker showed a great chart which showed the strengths and weaknesses of the various approaches. Gunicorn is a lightweight WSGI server. It has worker processes. It supports gevent. mod_wsgi is fast. However, if you use Gunicorn with multiple processes, it'll bea

I Code Sooooo Slowly!

One thing I've learned over and over is that a programmer's skill with his preferred editor is no indication of his skill as a programmer. One of the best programmers I know stuck with Pico for years! Certainly many of the programmers mentioned in "Coders at Work" use Emacs without even trying to learn it well. If you've been reading my blog for a while, you know that I'm obsessed with productivity, especially when it comes to editors. There's a reason why. I'm slow...really slow! When I was at PyCon, I participated in a coding challenge. The grand prize was a MacBook Air, and there were only 8 participants. I figured I stood a good chance at winning the MacBook Air which I needed since I was leaving Twilio. The challenge used SingPath : SingPath is the most FUN way to practice software languages! SingPath provides a platform to those that want to test their programming skills in a competitive and fun environment. (By the way, did you notice

Python: Python IDEs Panel

Python IDEs Panel Side note: There were surprisingly few people at this talk. It seems like most Python programmers get started with either Vim or Emacs and then don't change. It's ironic that I'm so obsessed with programmer productivity given that I'm such a slow coder ;) The panel consisted of representatives who worked on Python Tools for Visual Studio, PyCharm, Komodo IDE, Wing IDE, and a Python mode for Emacs (pythonmode.el). There was no one present to champion PyDev or NetBeans. Michael Foord prefers Wing IDE. Python Tools for Visual Studio has debugging support for high performance computing (HPC). It supports MPI. It can debug a program that uses multiple processes. It supports both IronPython and CPython. You can use IPython within Visual Studio to control a cluster of machines. You can write Python code to analyze the variables in the individual frames of a stack. PyCharm makes test driven development (TDD) fast! The speaker was using PyCharm t

PyCon: The Data Structures of Python

The Data Structures of Python Use types idiomatically. Sometimes you don't have a choice. Be efficient when it doesn't cost you anything. Think about set vs. frozenset, mutable vs. immutable. There is now a collections.OrderedDict class in Python 2.7 and 3.1. Sometimes you need your data structure to address more than one concern. Use combinations of things from the standard library. collections.deque is a linked list. Its popleft() and appendleft(item) operations are O(1), whereas the equivalent pop(0) and insert(0, item) operations are O(n) with normal lists. In Python 2.7, there's a maxlen parameter for the deque class (which I assume turns it into a circular queue). You can use the array type to efficiently represent an array of ints, etc. heapq is also interesting. contains abstract base classes. "Don't subclass dict ever!" He said this is true of other containers as well. He said there are too many edge cases. You should instead subclass things like col
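The deque, OrderedDict, and heapq notes above can be sketched as follows. This is my own small example rather than anything shown in the talk, and the variable names are made up.

```python
import heapq
from collections import OrderedDict, deque

# A deque supports O(1) appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)      # cheap, unlike list.insert(0, item)
first = queue.popleft()  # cheap, unlike list.pop(0)

# With maxlen set, old items fall off the far end as new ones arrive.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)

# OrderedDict remembers the order in which keys were inserted.
settings = OrderedDict()
settings["b"] = 1
settings["a"] = 2

# heapq keeps the smallest item at index 0 of a plain list.
heap = []
for value in [5, 1, 4]:
    heapq.heappush(heap, value)
smallest = heapq.heappop(heap)

print(list(queue), first, list(recent), list(settings), smallest)
```

So a bounded deque does behave like a circular buffer that keeps only the most recent items.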

PyCon: Ten Years of Twisted

Ten Years of Twisted The talk was by Glyph Lefkowitz, the original author of Twisted. He mentioned Medusa and Sam Rushing. Twisted started because Glyph wanted to add email and HTTP handling to his video game, Twisted Reality. He started by using the "select" function call. Twisted is good because it unifies the protocols. Glyph is only the 3rd most prolific Twisted committer. When you write tests for Twisted code, you can have both the client and server in the same process. Twisted is event-driven and asynchronous. Before switching to Python, Glyph wrote Twisted Reality in Java using threads. Twisted is powerful and flexible. Twisted is switching from the term "framework" to "engine". "Reactor included." Use "twistd --help" to see which servers are available for free with Twisted. Glyph said, "I hope we have another 10 years of Twisted." Side note: Gevent was really popular this year. During his keynote (whic

PyCon: Lightning Talks

Qtile is a tiling window manager written in Python. It's an alternative to Awesome. In Tunisia, they were using flame throwers against protesters. In a "do-ocracy", you don't govern, you just do. PyParsing and SPARK are good parsing libraries. Resolver is a Python-powered spreadsheet that works on Windows. Dirigible is the same thing as Resolver, but it runs as a web application.

PyCon: Hookbox: All Python web-frameworks, now real-time. Batteries Included

Hookbox: All Python web-frameworks, now real-time. Batteries Included Hookbox is a new server written by Michael Carter (of JS.IO and Orbited fame) for doing comet. It is a simple comet server that uses web hooks to talk to your normal web server. Your normal web server can take care of all the business logic, and Hookbox can take care of the comet. Side note: uses MongoDB and has ugly denormalization problems. Hookbox replaces what you would have had to do with RabbitMQ + Orbited. Michael called Hookbox a "web-enabled message queue". It's based on Eventlet. It uses the comet session protocol (CSP). It takes care of pub/sub, history, presence, moderation, etc. The documentation is not very complete. It doesn't support clustering. Hookbox is really good at delegating business logic to your application.

PyCon: Porting to Python 3

Porting to Python 3 has a list of the top 50 Python projects and which of them support Python 3. As of the talk, 34% of the top 50 Python projects supported Python 3. I just checked, and it's up to 54%. There are multiple strategies for porting to Python 3: Only support Python 3. Use separate trees for Python 2 and Python 3. Include both versions in a single download and set package_dir in Implement "continuous conversion" using 2to3. This approach is recommended for libraries. Distribute can help. Use a single codebase with no conversion. This requires loads of compatibility hacks. It's fun, but it's ugly. Check out the "six" project if that's what you want to do. Try 2to3 first. If in doubt, use distribute. Libraries should port as soon as possible. In order to prepare, use Python 2.7 with the -3 flag. Fix all the deprecation warnings. Use separate variables for string vs. binary data. Add "b" an
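The advice above about keeping string and binary data in separate variables can be sketched roughly like this. The variable names are hypothetical, and this is only my reading of the advice, not code from the talk.

```python
# Binary data: the b prefix marks a bytes literal (valid in Python 2.6+ and 3).
payload = b"caf\xc3\xa9"

# Text: decode once, at the boundary where the bytes enter the program.
text = payload.decode("utf-8")

# Going back out, encode explicitly instead of relying on implicit conversion.
roundtrip = text.encode("utf-8")

# The two values are different kinds of things: 5 bytes vs. 4 characters.
byte_count = len(payload)
char_count = len(text)
```

Keeping the two kinds of data in separately named variables makes it much easier to see where the encodes and decodes belong.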

PyCon: Status of Unicode in Python 3

Status of Unicode in Python 3 The talk was by Victor Stinner. I went to dinner with him and a few other people. He was a nice, French guy. The encoding for source code defaults to UTF-8 in Python 3. Surrogate escapes are a new feature in Python 3.2. They let you deal with stuff that can't be decoded as UTF-8. For instance, you can decode a filename string to a unicode object without losing data even if the decoding isn't clean. There are still issues to work on. Victor had bootstrap issues implementing all this stuff. It took a lot of hard work to improve all this stuff. Check out Programming with Unicode , which is a book that Victor wrote. Victor has even more Unicode fixes in store for Python 3.3. Side note: I had an idea. It'd be cool to create a tool that shows you a call tree for your application. In the call tree, it can show you where all the encodes and decodes are done. This would help you know where to do encodes and decodes. This would really
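The surrogate escapes mentioned above can be tried out with the "surrogateescape" error handler in Python 3. This is a small sketch with a made-up filename, not an example from the talk.

```python
# A made-up filename containing a byte (0xff) that is not valid UTF-8.
raw_name = b"report-\xff.txt"

# The undecodable byte is smuggled through as a lone surrogate code point
# instead of raising UnicodeDecodeError.
name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")
```

The round trip is lossless, which is the point: you can carry an undecodable filename around as a str and still hand the original bytes back to the operating system.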

PyCon: Opening the Flask

Opening the Flask The talk was by Armin Ronacher, one of the authors of Flask. Flask started as an elaborate April Fool's joke. The guys who did Flask did a lot of cool stuff before doing Flask, such as Jinja2. Flask was originally Jinja2, Werkzeug, and some glue code cleverly embedded in a single file download. Marketing beats quality. Originally, they didn't do any testing or any code review. That's changed. They wanted to restart with good docs and good tests. Their documentation and tests are pretty good these days. The name "Flask" is a play on another framework, "Bottle". Flask is still based on Werkzeug and Jinja2. Flask can use Blinker as a signalling system (?). Flask is about 800 lines of code, 1500 lines of tests, and 200 A4-sized pages of documentation. There are lots of extensions. Flask uses decorator-based routes, but there are other options. Flask supports URL routing as well as URL generation. Jinja2 uses template inheri
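To illustrate the general idea of decorator-based routes and URL generation mentioned above, here is a toy sketch in plain Python. The Router class and its methods are hypothetical and written just for this note. This is not Flask's API or implementation, only the pattern in miniature.

```python
class Router:
    """Hypothetical toy router; it only illustrates the decorator pattern."""

    def __init__(self):
        self.views = {}   # URL path -> view function (routing)
        self.paths = {}   # view function name -> URL path (generation)

    def route(self, path):
        # Returns a decorator that registers the function under the path.
        def decorator(func):
            self.views[path] = func
            self.paths[func.__name__] = path
            return func
        return decorator

    def dispatch(self, path):
        # URL routing: find the view for a path and call it.
        return self.views[path]()

    def url_for(self, name):
        # URL generation: look up the path registered for a view name.
        return self.paths[name]


app = Router()


@app.route("/")
def index():
    return "Hello"


@app.route("/about")
def about():
    return "About"
```

With this in place, app.dispatch("/about") calls the about view, and app.url_for("about") gives back "/about", so templates and redirects never need to hard-code paths.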

PyCon: State of Pylons/TurboGears 2/repoze.bfg

State of Pylons/TurboGears 2/repoze.bfg There are about 2000 people on the Pylons mailing list. Ben Bangert said that Pylons relies too heavily on subclassing. When people subclass and override stuff in a framework's parent class, it makes it difficult to alter the parent class without breaking people's code. Side note: The new Pyramid T-shirt is beautifully done, but evil aliens (or anything else for that matter) are a big turn-off for me. Pylons is big, but Ben said that too much of Pylons is dependent on him. Pylons is merging with repoze.bfg. It's going to be called "The Pylons Project", but it's based on the code in repoze.bfg. Chris McDonough wrote repoze.bfg. It's a great but relatively unknown framework. Pylons has better name recognition. TurboGears 2 is built on Pylons. The new framework is called Pyramid and is part of "The Pylons Project". TurboGears 2 and Pylons are going to be maintained together so as not to strand a