
Python: Limitations of Coroutines via Enhanced Generators

I've been thinking a lot about tricks such as the coroutine system in this article. PEP 342 improves the situation for Python 2.5. In fact, Example 3 in the PEP is exactly what was on my mind. I came to the same conclusion as the PEP:
In effect, this example emulates simple tasklets as are used in Stackless Python, as long as you use a yield expression to invoke routines that would otherwise "block".
However, what happens if you can't use a yield expression to invoke routines that would otherwise block?

For instance, say coro() is a coroutine that calls third_party_1(), which calls third_party_2(), which calls blocker(), where blocker() blocks. If third_party_1() and third_party_2() are intermediary functions that exist in third-party code, they won't know how to "do the right thing" using yield. And if third_party_1() and third_party_2() don't do the right thing with yield, there's no way for blocker() to yield back to the scheduler.

Ideally, I'd like for coro() to be a coroutine that blocks within blocker(), leaving third_party_1() and third_party_2() ignorant of the whole process. If you know how, please tell me!
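To make the failure concrete, here's a minimal Python 3 sketch (the names mirror the example above and are purely illustrative): making blocker() a generator doesn't help, because the intermediary functions call it like a plain function and never iterate it.

```python
import types

def blocker():
    # We'd like this yield to suspend the whole call chain, but making
    # blocker() a generator function means calling it merely *creates*
    # a generator object -- the body never runs unless someone iterates it.
    yield "block"

def third_party_2():
    # Third-party code calls blocker() like an ordinary function, so it
    # just gets back an un-started generator object.
    return blocker()

def third_party_1():
    return third_party_2()

def coro():
    yield third_party_1()

g = coro()
value = next(g)
# The scheduler receives an un-run generator object instead of a "block"
# signal: the suspension never propagated through the plain calls.
assert isinstance(value, types.GeneratorType)
```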


Anonymous said…
Let's assume blocker() is Python code and that it is blocking on file I/O. You should be able to replace the __builtin__.file() function to return a file object that replaces read(), write(), etc. with implementations using yield and nonblocking_read(), nonblocking_write(), etc.

You could do the same trick with socket.socket() and any other common blocking offenders.

In which case, you could have completely-transparent non-blocking coroutine goodness. Which means that third-party (python) code wouldn't need to be re-written to take advantage of your non-blocking coroutine scheduler. Assuming this works (disclaimer: I haven't tried it), you could *almost* write the echo server example in PEP 342 as a textbook accept/read/write loop, but which actually handles multiple concurrent connections via non-blocking I/O under the hood. I can't think of a sneaky way offhand to hide the Trampoline run loop, though.
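For what it's worth, a toy Python 3 sketch of the idea (the generator-based recv replacement and the scheduler protocol here are invented for illustration, not a real API):

```python
def yieldy_recv(sock, bufsize):
    # A recv() replacement that yields a wait-request to the scheduler
    # instead of blocking, then resumes when the data is sent back in.
    data = yield ("read", sock, bufsize)
    return data

def run(gen):
    # A stand-in "scheduler" that services the request immediately with
    # canned data, just to show the control flow.
    request = next(gen)            # generator yields ("read", sock, n)
    assert request[0] == "read"
    try:
        gen.send(b"hello")         # scheduler delivers the data
    except StopIteration as e:
        return e.value             # the generator's return value

result = run(yieldy_recv(None, 1024))
assert result == b"hello"
```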

Of course, in your example, if blocker() was implemented via a C library that did blocking I/O, you're SOL.
jjinux said…
I absolutely had envisioned overriding the socket module and file module. That's a given ;) However, the problem is that a context switch can only happen with a yield *directly back to the scheduler*, and the third party code doesn't know how to follow this "coding style".

If coro(), third_party_1(), third_party_2(), and blocker() all called yield, then the "chaining" would work. However, if third_party_1() and third_party_2() don't call yield (they don't know about this coro business), then there's no way for blocker() to call yield. After all, third_party_2() doesn't know that blocker() is a generator. blocker() has to be called from the scheduler, which does know that blocker() is a generator.

Confusing, eh?
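To make the chaining concrete, here's a minimal trampoline sketch (in the spirit of PEP 342's Example 3, written in modern Python 3, where generators can return values). It only works because every level plays the yield game:

```python
import types

def trampoline(gen):
    # The trampoline keeps its own call stack of suspended generators,
    # so a yield at any depth returns control here -- but only because
    # *every* level yields generators instead of calling functions.
    stack = []
    value = None
    while True:
        try:
            result = gen.send(value)
        except StopIteration as e:
            if not stack:
                return e.value
            gen, value = stack.pop(), e.value   # "return" to the caller
            continue
        if isinstance(result, types.GeneratorType):
            stack.append(gen)                   # "call" the sub-generator
            gen, value = result, None
        else:
            value = None   # plain yield: a real scheduler would switch tasks

def blocker():
    yield "pretend-to-block"   # a real scheduler would park us here
    return 42

def third_party_2():
    result = yield blocker()   # only works because this level yields too
    return result

def third_party_1():
    result = yield third_party_2()
    return result

def coro():
    result = yield third_party_1()
    return result

assert trampoline(coro()) == 42
```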
simon said…
The scheduler system I use was derived from the round-robin scheduler presented in the deque documentation. I added extra facilities to reschedule tasks based on the yielded value from a task. I can choose the value the task is going to yield on its next iteration, from other parts of the code, as the generator is wrapped in a class which can dynamically switch the .next method.

This lets me perform actions inside the scheduler based on yielded task values. The values might be anything; however, if an UNBLOCK instance is yielded, the scheduler spawns a thread which iterates the task once, then joins back into the scheduler thread (which is usually the main thread).

This means that if I have to make a blocking call, and it really bothers me, I surround the potential blocking call with yield statements which look like this:

yield UNBLOCK()
i_might_block()
yield None

This approach does have some problems; however, in my simple use cases, I haven't experienced any.
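If I understand the scheme, a stripped-down Python 3 sketch might look like this (UNBLOCK and the scheduler loop are reconstructed from the description above, so treat the details as guesses). For simplicity the scheduler joins the worker thread immediately; a real one would keep iterating other tasks while the thread runs:

```python
import threading
from collections import deque

class UNBLOCK:
    """Marker: run my next step in a worker thread (it may block)."""

def scheduler(tasks):
    # Round-robin scheduler after the deque recipe, with the UNBLOCK
    # extension: when a task yields UNBLOCK, its next iteration (the
    # potentially blocking step) runs in a thread, which we join.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            value = next(task)
        except StopIteration:
            continue                       # task finished; drop it
        if isinstance(value, UNBLOCK):
            t = threading.Thread(target=lambda: next(task, None))
            t.start()
            t.join()                       # simplified: block until done
        queue.append(task)

results = []

def worker():
    yield UNBLOCK()
    results.append("did blocking work in a thread")  # pretend this blocks
    yield None
    results.append("back in the scheduler")

scheduler([worker()])
assert results == ["did blocking work in a thread", "back in the scheduler"]
```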

The framework I use does have some limitations. I currently treat all tasks as a big flat list. If a task yields another task, it gets ignored. I'll probably change this sometime soon, now that we have real coroutines to play with. You can see the implementation in google SVN.
Anonymous said…
Ah, I see. The problem is how to return a value back to a frame several frames above the current frame while retaining a reference to the current frame so you can resume execution later (which then leads to the question of how to identify the outer frame to return to).

Generators fit the frame resumption half of this description without supporting returning a value to a different (parent) frame. I guess that is why it feels like they should do the trick, but don't. :(
jjinux said…
Guido has said that "generators get you 80% of the way to continuations." In this case, the 20% is clearly missing. Continuations would totally solve this problem, but I guess that's old news.
jjinux said…
> yield UNBLOCK()
> i_might_block()
> yield None

This is nice and probably the best that can be done given the circumstances. However, if you assume that at some point during handling the request, you're *always* going to need to do this, you've really decayed into normal multi-threading :-/ Your technique is very helpful in cases where you can *mostly* stay away from blocking code.

Thank you so much for taking the time to answer my question.

Richard said…
> I absolutely had envisioned overriding
> the socket module and file module.

This is something I am trying to get done for Stackless at the moment, providing versions of socket, file and maybe more which operate asynchronously in the background while stalling the caller on a Stackless channel.
jjinux said…
/me is quite interested.
Brandon L. Golm said…
JJ asked me to comment, so I will. Normally I try to say something funny (usually making fun of JJ), but that seems difficult considering the fact that I don't know what he's talking about.

That said, you are talking about reconciling two opposites in one process. This is like having half your threads use cooperative multitasking, and half of them use preemptive.

That's a nice idea, but the preemption is going to happen with a lower-level scheduler than the cooperative stuff; that's by the definition of "cooperative multitasking." That is to say, the executing code does the scheduling.

In a way, JJ, you are asking how to impose cooperative multitasking on code that assumes a preemptive environment. "kelly" et al. would suggest ways to re-superclass, or subclass, your third-party libraries (akin to the C++ superclass pointer-change trick of old versions of uControl).

I would suggest this: put your third_party stuff in its own process, and talk to it however you like. The kernel, after all, has a pretty good preemptive scheduler.

I do have plans for a kernel-support for userland scheduling that will support both preemption and cooperation, but it won't be released in time for you.
B. said…
I thought about using the PEP to implement coroutines, but Armin Rigo has such a nice implementation in his "greenlets" C module that I chose to use that for my Python coroutine needs.

> However, the problem is that a context switch can only happen with a yield *directly back to the scheduler*

Let's take an example - a socket read. Could you split the read into two parts? I think so. The trick is that you'll need to set up a trampoline in the socket interface to avoid trying to yield back through the third-party code.

"socket.recv" calls "socket._yieldy_recv". The internal function is a generator. The "recv" call first needs to set up a trampoline: first, "recv" creates a generator object "r = self._yieldy_recv()". "recv" then needs to give the scheduler a read "s.sched_read(r)". Finally, "recv" will wait on the scheduler for some data: "". The next call then needs to ask the scheduler to do some useful work until it returns ... maybe "s.poller_gen.send(None)" or something like that. The poller eventually finds work and then calls "r.send(n_bytes)" or whatever.
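If I'm reading this right, a toy Python 3 reconstruction might look like the following (FakeScheduler, sched_read, and poll are invented placeholders for the real scheduler interface being described):

```python
class FakeScheduler:
    def __init__(self):
        self.pending = []

    def sched_read(self, gen):
        # Register a suspended read generator with the scheduler.
        self.pending.append(gen)

    def poll(self):
        # Pretend the poller found readable data for every pending read
        # and delivers it by resuming each generator.
        for gen in self.pending:
            try:
                gen.send(b"data from the wire")
            except StopIteration:
                pass
        self.pending.clear()

class YieldySocket:
    def __init__(self, sched):
        self.sched = sched
        self._result = None

    def _yieldy_recv(self):
        # Inner generator: suspends until the poller sends the data in.
        data = yield
        self._result = data

    def recv(self, bufsize):
        # Outer wrapper: hides the generator machinery, so third-party
        # code sees an ordinary recv() call.
        r = self._yieldy_recv()
        next(r)                    # advance to the yield
        self.sched.sched_read(r)   # register the read with the scheduler
        self.sched.poll()          # stand-in for "do useful work until ready"
        return self._result

sched = FakeScheduler()
sock = YieldySocket(sched)
assert sock.recv(1024) == b"data from the wire"
```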
jjinux said…
I've read it twice, but I'm having a hard time understanding that post. Can you use the coro(),third_party_1(), etc. example that I used above?

Thanks for taking the comment.
jjinux said…
Yep, it sounds like the greenlets code has the "smart" stack copying that I was looking for.
