Wednesday, March 26, 2008

Vim: Why I Like Vim

I'm not trying to start a flame war. I'm trying to be honest and open-minded. Here are things I really like about Vim:
  • I actually really like modal editing. Like most Vim experts, I spend almost all my time in command mode. I exit insert mode as soon as I'm done adding new text. That means I don't have to spend a lot of time holding down the control (i.e. caps lock) key.

  • I find hjkl to be a very convenient replacement for the arrow keys. They're only one key press each, and they're on the home row.

  • Vim's notion of combinable commands is intuitive, fast, and powerful. For instance, >aB means shift the block around the current cursor to the right. It is not a function in and of itself. Rather, it's a combination of a few pieces. Like in UNIX, you use small tools and put things together to achieve big things. >aB is like a UNIX pipeline in a way.

  • I love the fact that Vim has simple syntax highlighting built in for even very strange languages. Sometimes I don't need a full mode. Nice syntax highlighting is enough.

  • I like the fact that it's trivially easy to tell Vim that for such and such a type of file, it should be indented in such and such a way. The type of file need not be code. Even in cases where there is no special mode, Vim can still be very helpful in making the tab do what I want. Sometimes I don't want the editor to indent the code for me. I want it to help me indent the code by understanding how I want the tab key to behave. Of course, smart indentation is available for mainstream languages if I want it.

  • I like that Vim has a builtin scripting language, but it can also be scripted with Python. Unlike Vi, it's enough like Emacs that I find it to be a very happy middle ground.

  • I like the fact that Vim has such a nice infrastructure for supporting multiple widget toolkits. It looks reasonably good on any platform.
Ok, like I said, this is not a flame war. If you say nice things about your editor, that's cool too.

Emacs Question: Indenting Non-code

I can't figure out how to get Emacs to do what I want to do. I've tried Googling and reading the manual, but the problem is that I want to control indentation of something that isn't code and doesn't have an existing mode devoted to it.

I have a bunch of files that end in .otl that are basically outlines. They are indented using tabs. Tabs are set to 4 spaces. (Yes, I know. Setting tabs to be anything other than 8 spaces is evil.)

When I'm editing an .otl file, I want Emacs to:
  • Treat tabs as 4 spaces wide.
  • Actually insert a tab whenever I hit tab.
  • Indent newlines to the same indentation level as the line above.
I don't want these rules applied to anything other than .otl files.

Can any Emacs gurus out there help me? Thanks.

Tuesday, March 25, 2008

Vim: Weird OS X (10.5) Problem

I went to check something into Subversion using Vi (i.e. Vim) as my EDITOR, and I got the following:
$ svn ci
svn: Commit failed (details follow):
svn: system('vi svn-commit.tmp') returned 256
Sure enough, even entering Vim and immediately exiting would return a non-zero exit status:
$ vi
$ echo $?
I read somewhere that this might be because of a bad plugin. Sure enough, the following fixed it:
cd ~/.vim
mv doc/rails.txt plugin/rails.vim ~/.Trash

Friday, March 21, 2008

PyCon: IronPython: The Road Ahead

IronPython: The Road Ahead

Jim Hugunin seemed a little less enthusiastic than normal. I hope he's okay.

IronPython makes it really easy to use SQLServer.

He showed Django running under IronPython with only minimal patching necessary.

Microsoft doesn't ship Silverlight for Linux. You have to use Mono's Moonlight project.

Silverlight will soon be coming to Nokia and Windows Mobile.

The "Dynamic Language Runtime" makes it easier to code a scripting language for the CLR.

IronPython's license was approved by the OSI. However, Microsoft's .NET is still proprietary, of course. This reminds me of Flex and Flash.

IronPython now supports Python 2.5.

If you save and reload, you'll see your updates when using IronPython under Silverlight in the browser.

He showed a Python shell running in the browser. It supported method autocomplete using a drop down. What was even neater was when he showed Python interacting with JavaScript thanks to the CLR. He called a Python function from JavaScript.

Expect IronPython running under Silverlight 2 to be released before the next Olympics.

Microsoft apparently contributed some codecs to the Moonlight project since it's difficult to implement those without the specs.

PyCon: Core Python Containers--Under the Hood

Core Python Containers--Under the Hood

This was perhaps my favorite talk.

A list is a fixed-length array of pointers.

realloc is called occasionally to grow the list. However, overallocation is used in a very intelligent way to minimize the number of times this is necessary. Rather than simply doubling the size of the list, it's more of a curve. No more than 12.5% of the list is ever empty. The list shrinks when it is half empty.
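
Here's a rough way to watch the overallocation happen. This is just a sketch, and it assumes CPython, where sys.getsizeof reports the size of the list object including its allocated pointer array. The helper name allocation_steps is my own invention. It appends items one at a time and records the lengths at which the reported size changes, which is where a reallocation took place.

```python
import sys

def allocation_steps(n):
    """Append n items one at a time and record each length at which the
    list's reported size in bytes changes (i.e. the list was regrown)."""
    items = []
    last_size = sys.getsizeof(items)
    steps = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            steps.append(len(items))
            last_size = size
    return steps

steps = allocation_steps(1000)
print(len(steps), "reallocations for 1000 appends")
print(steps[:10])
```

The exact step positions vary between Python versions, but the number of reallocations should be far smaller than the number of appends.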

Memory allocation under Windows is slow.

Inserting to the middle of or shrinking from the middle of a list is O(n). Appending is (on average) O(1).

Use a deque if you want to push or pop from the ends of a list. That's much more efficient than trying to insert an element at index 0 of a normal list.
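
A minimal sketch of the deque advice, using collections.deque from the standard library. The variable names are just for illustration.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # push on the left end, O(1)
queue.append(4)          # push on the right end, O(1)
first = queue.popleft()  # pop from the left end, O(1)
last = queue.pop()       # pop from the right end, O(1)

# The list equivalent of appendleft has to shift every existing element over.
items = [1, 2, 3]
items.insert(0, 0)       # O(n)
```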

Sets are based on fixed-length hash tables. They are kept very sparse. Any time a table becomes 2/3 full, it is grown by a factor of 4. The hash table never needs to be resized for the keyword args dict (assuming you don't modify it).

The builtin set type is faster than implementing your own with dicts because he's able to use some shortcuts when implementing set operations.

On average, there are no more than 1.5 probes per lookup.

Building dicts and sets is expensive.

Dicts are the most finely tuned data structure in the language.

Using a dict in Python is way faster than using a poorly implemented map-like object in C.

PyCon: Consuming HTML

Consuming HTML

The HTML out in the wild may be messy, but it is of vital importance.

Don't use HTMLParser. minidom is horrible. Beautiful Soup is nicer. html5lib is theoretically fantastic, but it's very slow. libxml is really nice. It's similar to html5lib, but way, way faster.

PyCon: Plenary: OLPC Update

Plenary: OLPC Update

In summary, there are still some rough edges on the software, but the hardware is production-quality.

About 360,000 laptops are being deployed.

The UI work is still in flux.

They're still having problems with software bloat. The machine doesn't have much memory. They're preforking Python.

OLPC drastically changed things in one village. It engaged the kids and the teachers. "The fathers [who wanted their kids to come work in the fields] were ultimately convinced."

Unfortunately, Ivan Krstic has left the OLPC project due to some political changes.

PyCon: Plenary Keynote: Mark Hammond, "Snake Charming the Dragon: the Past, Present, and Future of Python and Mozilla"

Plenary Keynote: Mark Hammond, "Snake Charming the Dragon: the Past, Present, and Future of Python and Mozilla"

Brendan Eich, the author of JavaScript, said that he was "standing on Python's shoulders" in order to add generators and iterators to the newest version of JavaScript.

Python is now a first-class language for XUL development. Unfortunately, CPython can't be used for normal Web page development since it's not sandboxable.

Python + XUL + XHTML + CSS is nice.

Tamarin is a unified language runtime, like .NET. Unfortunately, compared to CPython, the batteries are not included. Tamarin could allow Python to be used for normal Web page development in the same way Silverlight permits IronPython.

Proper cross-language garbage collection is very difficult, if not impossible. (I assume he is not referring to environments like the JVM and the CLR.)

The speaker urged the audience to continue using PyXPCOM to implement components.

"Mozilla loves Python."

PyCon: Lightning Talks

Resolver is a Rapid Application Development tool with a spreadsheet interface. Imagine a spreadsheet where you can embed Python objects in the cells. It's written in IronPython. It's commercial.

FIVEDASH is fully-featured, general-purpose, open source accounting software. It's brilliant. Competing with QuickBooks Pro has got to be tough. However, QuickBooks Pro can't possibly address the long tail of accounting needs. For instance, who's going to write tax software for Brazil? Someone can come along and extend FIVEDASH to do it. Since FIVEDASH is GPLed, they'll benefit anytime someone else extends it. It's the classic win win situation for open source. Accounting software is one thing I never thought I'd ever see someone bother doing open source, but now I see that it makes perfect sense for a company to do so.

Dragon NaturallySpeaking is speech recognition software. They code their prototypes in Python. It's used by Google 411 and Microsoft Sync.

Leo is an outline-based Python IDE. Imagine the entire project being treated as an outline with project-wide code folding. One drawback is that it leaves metadata in the code in the form of Python comments.

"Why Does Client Side Python Suck" was a lightning talk focused on improving software distribution for Python-based Windows applications. Python is a large download. The output of py2exe isn't extensible once it's downloaded. The speaker wants to treat Python as a platform and have it installed separately from the main application under Windows.

Python: Python in Your Browser with IronPython and Silverlight

Python in Your Browser with IronPython and Silverlight

Considering IronPython and Silverlight are from Microsoft, it was interesting to see Michael Foord using a Mac and TextMate. Apparently, a lot of Windows developers use Macs with VMware. Unfortunately, VMware crashed on one of the other speakers, and he couldn't get it to work with the projector.

Silverlight is both cross-platform and cross-browser. Microsoft even supports Safari. However, they do not support Linux. The Moonlight project is an open source project to clone Silverlight for Linux using Mono.

Silverlight 2 contains a cut down version of the .NET CLR.

It's now possible to write Silverlight applications in IronPython. That means you can script the browser with Python :)

A "Hello World" application written in Python for Silverlight is 700K if the IronPython DLL isn't cached yet.

Silverlight 2 now has nice XAML widgets.

Foord prefers writing code by hand over using XAML.

The demo was pretty cool.

He showed a Python shell running in the browser.

There's a book on IronPython on the way.

PyCon: Case Study of Python Application Development--Humanized Enso

Case Study of Python Application Development--Humanized Enso

Humanized Enso isn't a normal application. It's more like Quicksilver on the Mac. It's open source.

They used lots of unit tests, Buildbot, and design by contract.

It's a desktop application.

When Python exceptions occur, they get mailed to a support address.

They used SCons and liked it. However, the speaker admitted that he didn't understand make.

The speaker is "young", but he's friendly and easy to understand.

Pycairo is a cross-platform 2D graphics library (Python bindings for cairo).

They used py2exe + NSIS to make a Windows installer.

PyCon: Don't Call Us, We'll Call You: Callback Patterns and Idioms in Python

Don't Call Us, We'll Call You: Callback Patterns and Idioms in Python

This was a typical, high-quality, interesting talk by Alex Martelli.

Using the "key" argument of the "[].sort" function is far more efficient than using the "cmp" argument because the key only needs to be calculated once.
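
To see the difference for yourself, here's a small sketch that counts calls. Note that modern Python no longer has a "cmp" argument, so I'm assuming functools.cmp_to_key as a stand-in for it. The word list and the function names are made up for the example.

```python
from functools import cmp_to_key

words = ["pear", "Fig", "apple", "Banana", "cherry", "date", "Elderberry", "grape"]
calls = {"key": 0, "cmp": 0}

def lowercase_key(word):
    calls["key"] += 1
    return word.lower()

def lowercase_cmp(a, b):
    calls["cmp"] += 1
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)

by_key = sorted(words, key=lowercase_key)
# Stand-in for the old cmp argument: wrap the comparison function as a key.
by_cmp = sorted(words, key=cmp_to_key(lowercase_cmp))

print("key calls:", calls["key"])
print("cmp calls:", calls["cmp"])
```

The key function runs exactly once per element. The comparison function runs once per comparison, and it lowercases both of its arguments every time.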

There are all sorts of uses for callbacks.

Callbacks come from a functional programming mindset.

The template method design pattern is more rigid than simply passing a callback. (2 points for Mike Cheponis if he's reading this.)

Callbacks can be used to customize a function's behavior.

Callbacks can be used for event handling.

Use "functools.partial(callable, *args, **kw)" for partial function application.
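
Here's a quick sketch of functools.partial used to build a callback. The notify function and its arguments are hypothetical. The point is that on_error ends up being a callable that takes only the message.

```python
from functools import partial

def notify(channel, message, urgent=False):
    """Hypothetical handler that formats a message for a channel."""
    prefix = "!" if urgent else ""
    return f"[{channel}] {prefix}{message}"

# Freeze the channel and the urgent flag; the caller supplies only the message.
on_error = partial(notify, "errors", urgent=True)

result = on_error("disk full")
print(result)
```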

Twisted error callbacks are awesome.

PyCon: Managing Complexity (and Testing)

Managing Complexity (and Testing)

This was a talk on various metrics for code complexity.

Knowing the number of lines of code is not enough. For instance, how many of those lines are well tested? sloccount is a project to count lines of code.

The number of unit tests that a project has is also not enough information.

There are many approaches to testing code coverage. For instance, do your tests test every line? Every branch? Every path? Each of these is successively harder. Even 100% path coverage doesn't guarantee you have no bugs.
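
To make the line / branch / path distinction concrete, here's a toy example of my own (it's not from the talk). The function has two independent if statements. One call is enough to run every line, two calls are enough to take both sides of each branch, but it takes four calls to walk every path.

```python
def describe(n):
    parts = []
    if n < 0:
        parts.append("negative")
    if n % 2 == 0:
        parts.append("even")
    return " ".join(parts) or "plain"

# One call executes every line, but never takes the False side of either if.
line_coverage = [describe(-2)]

# Two calls take both sides of each branch...
branch_coverage = [describe(-2), describe(1)]

# ...but there are four distinct paths through the two independent ifs.
path_coverage = [describe(-2), describe(-1), describe(2), describe(1)]
```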

Your gut feeling for how much code coverage you have is usually too optimistic.

Remember that human brains have not kept up with Moore's law ;)

He mentioned McCabe complexity as a way to measure code complexity.

Complexity != length. A function can be very long without being complex. This is the case if it has no if statements, no loops, no early returns, etc.
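
Here's a very rough sketch of the idea, using the standard ast module. I'm not claiming this is how PyMetrics or any real tool computes McCabe complexity. It just counts one plus the number of decision points, and the choice of which node types count is my own simplification.

```python
import ast

# Node types treated as decision points in this rough approximation.
_DECISIONS = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def rough_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    score = 1
    for node in ast.walk(tree):
        if isinstance(node, _DECISIONS):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches.
            score += len(node.values) - 1
    return score

# Fifty lines of straight-line assignments.
long_but_simple = "\n".join(f"x{i} = {i}" for i in range(50))

# Five lines with a loop, an if/elif, and a boolean operator.
short_but_branchy = """
for i in range(3):
    if i and i % 2:
        pass
    elif i > 1:
        pass
"""

print(rough_complexity(long_but_simple))
print(rough_complexity(short_but_branchy))
```

The fifty-line snippet scores lower than the five-line one, which is the whole point of "complexity != length".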

There is no way to test path coverage in Python.

He mentioned PyMetrics as a Python package to test code complexity. It was written by Reg Charney, a fellow member of BayPiggies.

High code complexity is correlated with a high bug count. Duh! ;)

Dead and redundant code is 40-100% more likely to be buggy.

Halstead is another metric, but the speaker said it was difficult for him to understand.

Code reviews are good.

Figleaf is a tool for testing code coverage.

PyCon: SQLAlchemy 0.4 and Beyond

SQLAlchemy 0.4 and Beyond

This was a whirlwind talk by Mike Bayer that covered both the past and the future of features in SQLAlchemy. It was good, but I'd need about three times as much time in order to understand it all.

0.1 was released in 2006.

0.4 will be faster and have better, smarter support for transactions.

SQLAlchemy now has crazy SQL generators for all sorts of weird situations. Inheritance is the least of these. There are tons of weird tricks that SQLAlchemy can automate for you.

SQLAlchemy now has a new "declarative" layer. You can use it if you don't want to go as far as Elixir goes. It's pretty simple. I think it's only 90 lines of code, or something like that. It unifies the tables and the classes.

SQLAlchemy provides lots and lots of abstraction. For me, at least, it's helpful to step back and look at what the SQL is really doing.

SQLAlchemy now has support for transaction sessions and transaction nesting (i.e. savepoints). However, it doesn't know how to roll back in-memory object state yet.
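
In case savepoints are new to you, here's what they look like at the SQL level. To be clear, this is not SQLAlchemy's API. It's a sketch using the standard library's sqlite3 module with hand-written SQL, and the table and savepoint names are made up.

```python
import sqlite3

# isolation_level=None turns off implicit transactions so we control them by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE notes (body TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO notes VALUES ('outer')")

conn.execute("SAVEPOINT inner_work")             # start the nested unit of work
conn.execute("INSERT INTO notes VALUES ('inner')")
conn.execute("ROLLBACK TO SAVEPOINT inner_work")  # undo only the nested part

conn.execute("COMMIT")                            # the outer insert survives

rows = [row[0] for row in conn.execute("SELECT body FROM notes")]
print(rows)
```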

It also has support for two-phase commits. Use this if you need to commit a transaction to multiple databases.

It has support for lots of databases.

It now has some horizontal sharding support, modeled after Hibernate.

Support for migrating your database schema using Python syntax is now back.

There are multiple SQLAlchemy books on the way.

PyCon: Plenary Keynote: Van Lindberg, "Intellectual Property and Open Source"

Plenary Keynote: Van Lindberg, "Intellectual Property and Open Source"

I was up really late, so I missed the first two keynotes and half of this one.

The talk was interesting. He referred to a lot of interesting concepts such as zero sum games, the tragedy of the commons, the free rider problem, the prisoner's dilemma, etc.

He said that open source is not a zero sum game. It can be win win. Furthermore, it doesn't suffer from the prisoner's dilemma because it rewards cooperation and punishes defectors--or at least the GPL does.

He said that as far as economics goes, open source really is revolutionary.

PyCon: High Performance Network IO with Python + Libevent

High Performance Network IO with Python + Libevent

This is a library to wrap low-level asynchronous APIs like kqueue.

It's faster than Twisted; however, it doesn't provide the whole deferred infrastructure. It's still based on callbacks.

It can be used underneath Django to make it three times faster.

It is incompatible with the Python GIL. That is, don't try to make use of it with Python threads.

There is little documentation.

They can do Comet.

If you are considering using Libevent, please read my article Concurrency and Python. Also, have a look at Eventlet. Asynchronous APIs are a good start, but there's so much more you can do on top of them than simple callbacks!

PyCon: The State of Django

The State of Django

A surprising number of Django users are new to Python.

Adrian said that "everyone should use trunk". The latest release is a year old.

Updating Django to use unicode properly was a herculean task.

Django now has autoescaping in templates.

There are lots of cool things going on, often on their own repository branch.

There are 2 existing Django books, and there are 5 more on the way.

They are refactoring the query module. Apparently, it was a big hairy mess.

They are removing the direct coupling between the model and the admin interface. However, he's not promising that you'll be able to use the admin interface on top of SQLAlchemy.

They are adding model validation. This is one area where Django is very different than Ruby on Rails. Rails tends to put more stuff in the model, and validation is just one example.

1.0 is coming.

They started a non-profit Django foundation. (So did Twisted, by the way.)

They're adding database inheritance.

They're adding better support for eager loading to the ORM.

It seems to me that Django's ORM isn't nearly as sophisticated as SQLAlchemy.

PyCon: Python - All a Scientist Needs

Python - All a Scientist Needs

He used Python to gather data, organize it, and run computations against it.

Unfortunately, Biopython isn't as nice, as large, or as well organized as BioPerl. Despite this, he still preferred to code in Python over Perl.

He also made use of Matplotlib and Numpy.

He rewrote some performance critical code in C and interfaced with it using SWIG. This saved about a week of computation time.

PyCon: Using Optparse, Subprocess, and Doctest to Make Agile Unix Utilities

Using Optparse, Subprocess, and Doctest to Make Agile Unix Utilities

Use if you don't care about stdout. Otherwise, use subprocess.Popen.
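
Here's a small sketch of the two cases. The notes above don't say which call is meant for the case where you don't care about stdout, so I'm assuming subprocess.call here. The child command is hypothetical: it just runs the current Python interpreter on a one-liner.

```python
import subprocess
import sys

# Hypothetical child command: run this same Python interpreter on a one-liner.
command = [sys.executable, "-c", "print('hello from the child')"]

# Only the exit status matters here, so the child's stdout is discarded.
status = subprocess.call(command, stdout=subprocess.DEVNULL)

# Here the output matters, so capture it through a pipe.
process = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
output, _ = process.communicate()

print(status, repr(output))
```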

Noah really emphasized the importance of making full use of the Python standard library. There's a lot of good stuff in there!

When giving a talk, remember to use large fonts and small code snippets.

Here's a neat idea: consider adding an --upgrade flag to your program that uses setuptools to do a software update.

PyCon: Plenary Keynote: Guido van Rossum, "Python 3000 and You"

Plenary Keynote: Guido van Rossum, "Python 3000 and You"

Matz (the author of Ruby) has a great quote, "Open Source needs to move or die."

Python is 18 years old.

Py3K will bring more predictable unicode handling.

It will be a slightly smaller language, since they're getting rid of a lot of things such as the difference between ints and longs.

One goal is to remove common traps. There will be fewer exceptions to rules.

Some changes, such as changing print from a statement to a function, are aimed at allowing future evolution of the language. After all, it's a lot easier to add another keyword argument to a function than it is to add new syntax related to the print statement.
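
For example, the print function in Python 3 takes sep, end, and file as ordinary keyword arguments. Here's a tiny sketch that prints into an in-memory buffer instead of the terminal.

```python
import io

buffer = io.StringIO()

# sep, end, and file are plain keyword arguments, not special syntax.
print("a", "b", "c", sep=", ", end="!\n", file=buffer)

text = buffer.getvalue()
```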

Python 2.6 will be around and supported for about 5 years, so there's no rush to convert to Py3K. Take your time!

One strategy for open source libraries is to update their Python 2.6 code in such a way that the Python 2 to 3 converter can be used over and over again. That way, they can provide both a Python 2.6 and a Py3K release.

PyCon: Plenary Diamond Keynote: Why Python Sucks (But Works Great For Us)

Plenary Diamond Keynote: Why Python Sucks (But Works Great For Us)

There aren't enough Python coders, but on average they're of higher quality.

The company prefers smart, open-minded coders over Python zealots.

Sometimes switching to C++ for speed is worth it. In one case, they had to process 10 billion records. They gave up after waiting 15 hours for the Python version to finish. After rewriting it in C++, they were done in 30 minutes. This makes sense when the C++ code doesn't take too long to write.

PyCon: Plenary: Chair's Opening Remarks

Plenary: Chair's Opening Remarks

PyCon is all volunteer.

There were over 1000 people this time. This is up 70%.

They converted the women's restrooms to a men's restroom downstairs. I wonder why ;)

The Python Software Foundation provided $20,000 in financial aid this year to enable 40 Pythonistas to make it to PyCon.

PyCon: Tutorial: Mastering Pylons and TurboGears 2: Moving Beyond the Basics

Tutorial: Mastering Pylons and TurboGears 2: Moving Beyond the Basics

The tutorial was centered on creating a wiki in TurboGears 2. As you may know, TurboGears 2 is based on Pylons.

When in a template, TG2 doesn't have anything like Pylons' url_for function.

In TG2 you receive the form parameters as method arguments. This is arguably more tightly coupled with the form. If you expect to receive some random arguments, make sure you use **kargs.

I asked about the code snippet "redirect('/' + pagename)". The speaker didn't know if this was a temporary redirect or a see other redirect. Make sure you don't pass a full URL to the redirect function.

The whole talk was sort of strange. They would say, "Here's how you do it in TG2, and here's how you do it in Pylons." TG2 still has a fairly different API than Pylons. The idea is to sell Pylons as the stable, low-level API and to provide tons of extras on top of it in TG2. The extras might include an authentication framework, an admin interface, etc.

The new Pylons project creation template now asks questions during project creation such as whether to use SQLAlchemy and what templating engine to use.

The version of Pylons they were using was created just that morning. This resulted in a few crashes. Eventually, they passed around the egg on a USB key.

The tutorial wasn't quite as smooth as Titus's testing tutorial. I think trying to get people to actually follow along with a sample exercise by typing code on their computers is a tough thing to do.

Paste's exception handling still amazes the newbies.

Functions such as "pylons.templating.render_mako" are replacing Buffet, the previous templating engine abstraction layer.

Pylons is beginning to feel very polished.

TG1 had an entire infrastructure for forms that they're hoping to port to TG2. It could even introspect the database to create the form.

CherryPy is still considered very nice and very fast.

PyCon: Random Comments

First, here are some extremely random comments not related to any of the talks.

There were even more Ubuntu users than last year. Among Windows, Mac, and Linux, Windows was the least common.

I was grateful to find out that lots of people read my blog. One person even stored two years of my blog on his laptop so that he would have something to read on a plane trip! I'll do my best to try to keep you all informed and entertained. Thanks for reading!

I was not pleased when the hotel tried to charge me $3.25 for a plain bagel. I brought 4 Clif Bars with me for the trip. Next time, I will bring a lot more. I eventually gave up on getting enough food at the hotel and started going to good restaurants during lunch ;) Yes, I can attest to the fact that Chicago-style, deep-dish pizza at Gino's East rules!

There was one speaker whose computer became "possessed". It started typing random characters all over the place. That's kind of a pain when you're trying to do a demo ;)

All the PyCon talks were videotaped. Expect them to be released at some point.

Python: PyCon 2008

Stand back! I'm going to start blogging about PyCon!

Rather than try to summarize every talk I attended (which would take me forever), I'm going to just throw out random bits of information that I found interesting. I'm going to organize them one talk per post so that you can skip the ones you're not interested in.

Wednesday, March 19, 2008

Emacs: Mixing it Up

A lot of you know that I'm a hardcore Vim fanatic. However, I'm also burnt out right now, so I'm mixing things up. I'm going to switch to Emacs for a while. Help me out by leaving a comment with a couple of your favorite "power" commands.

I'm especially interested in figuring out how to tell Emacs things like "When coding in C, the tab key indents 4 spaces, but change every run of 8 spaces into a real tab. Also, when I go down a line, indent to exactly where I was on the line above." Intelligent indentation is nice, but for cases where it doesn't do what I want, I'd like it to still be helpful. In Vim, I can just enter ":set shiftwidth=4 tabstop=8 autoindent".

Vi is like having caps lock for your control key.

Tuesday, March 18, 2008

Ideas: A Plugin

My wife wants a plugin.

Don't you get tired of people sending you urban legends? It takes a minute or two to recognize that something is an urban legend, find the right page on, and then send an elitist response that includes a link to it. It sure would be nice to have software that could automate this process.

It'd be nice if IronPort and Gmail could add this feature. I don't think it'd be that hard to create a Thunderbird plugin.

Apple: MacBook Manual Surprises

This is a list of things that surprised me while reading the manual for my MacBook:
  • Use two fingers on the touchpad to scroll.

  • It doesn't come with a modem.

  • The lowest model MacBook can burn CDs but not DVDs.

  • Hit F3 (without fn) or F10 (with fn) to use Exposé.

  • Turn it off if you're not going to use it for a day or two.

  • Putting it to sleep decreases the chances of damaging the hard drive while moving it.

  • Use "fn delete" to delete characters to the right of the cursor.

  • The manual used to say that if you added memory yourself you would void your warranty. It now says that you'll only void your warranty if you mess up ;-)

  • Hold down D while booting to use Apple Hardware Test.

  • The printed manual is only in English.