
Python: Limitations of Coroutines via Enhanced Generators

I've been thinking a lot about tricks such as the coroutine system in this article. PEP 342 improves the situation for Python 2.5. In fact, Example 3 in the PEP is exactly what was on my mind. I came to the same conclusion as the PEP:
In effect, this example emulates simple tasklets as are used in Stackless Python, as long as you use a yield expression to invoke routines that would otherwise "block".
However, what happens if you can't use a yield expression to invoke routines that would otherwise block?

For instance, say coro() is a coroutine that calls third_party_1(), which calls third_party_2(), which calls blocker(), where blocker() blocks. If third_party_1() and third_party_2() are intermediary functions that exist in third-party code, they won't know how to "do the right thing" using yield. And if third_party_1() and third_party_2() don't do the right thing with yield, there's no way for blocker() to yield back to the scheduler.

Ideally, I'd like for coro() to be a coroutine that blocks within blocker(), leaving third_party_1() and third_party_2() ignorant of the whole process. If you know how, please tell me!
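To make the failure concrete, here's a minimal Python 3 sketch (the names mirror the example above and are purely illustrative): making blocker() a generator doesn't help, because the intermediary functions call it like a plain function and never iterate it.

```python
import types

def blocker():
    # We'd like this yield to suspend the whole call chain, but making
    # blocker() a generator function means calling it merely *creates*
    # a generator object -- the body never runs unless someone iterates it.
    yield "block"

def third_party_2():
    # Third-party code calls blocker() like an ordinary function, so it
    # just gets back an un-started generator object.
    return blocker()

def third_party_1():
    return third_party_2()

def coro():
    yield third_party_1()

g = coro()
value = next(g)
# The scheduler receives an un-run generator object instead of a "block"
# signal: the suspension never propagated through the plain calls.
assert isinstance(value, types.GeneratorType)
```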


Anonymous said…
Let's assume blocker() is Python code and that it is blocking on file I/O. You should be able to replace the __builtin__.file() function to return a file object that replaces read(), write(), etc. with implementations using yield and nonblocking_read(), nonblocking_write(), etc.

You could do the same trick with socket.socket() and any other common blocking offenders.

In which case, you could have completely-transparent non-blocking coroutine goodness. Which means that third-party (python) code wouldn't need to be re-written to take advantage of your non-blocking coroutine scheduler. Assuming this works (disclaimer: I haven't tried it), you could *almost* write the echo server example in PEP 342 as a textbook accept/read/write loop, but which actually handles multiple concurrent connections via non-blocking I/O under the hood. I can't think of a sneaky way offhand to hide the Trampoline run loop, though.
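For what it's worth, a toy Python 3 sketch of the idea (the generator-based recv replacement and the scheduler protocol here are invented for illustration, not a real API):

```python
def yieldy_recv(sock, bufsize):
    # A recv() replacement that yields a wait-request to the scheduler
    # instead of blocking, then resumes when the data is sent back in.
    data = yield ("read", sock, bufsize)
    return data

def run(gen):
    # A stand-in "scheduler" that services the request immediately with
    # canned data, just to show the control flow.
    request = next(gen)            # generator yields ("read", sock, n)
    assert request[0] == "read"
    try:
        gen.send(b"hello")         # scheduler delivers the data
    except StopIteration as e:
        return e.value             # the generator's return value

result = run(yieldy_recv(None, 1024))
assert result == b"hello"
```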

Of course, in your example, if blocker() was implemented via a C library that did blocking I/O, you're SOL.
jjinux said…
I absolutely had envisioned overriding the socket module and file module. That's a given ;) However, the problem is that a context switch can only happen with a yield *directly back to the scheduler*, and the third party code doesn't know how to follow this "coding style".

If coro(), third_party_1(), third_party_2(), and blocker() all called yield, then the "chaining" would work. However, if third_party_1() and third_party_2() don't call yield (they don't know about this coro business), then there's no way for blocker() to call yield. After all, third_party_2() doesn't know that blocker() is a generator. blocker() has to be called from the scheduler, which does know that blocker() is a generator.

Confusing, eh?
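To make the chaining concrete, here's a minimal trampoline sketch (in the spirit of PEP 342's Example 3, written in modern Python 3, where generators can return values). It only works because every level plays the yield game:

```python
import types

def trampoline(gen):
    # The trampoline keeps its own call stack of suspended generators,
    # so a yield at any depth returns control here -- but only because
    # *every* level yields generators instead of calling functions.
    stack = []
    value = None
    while True:
        try:
            result = gen.send(value)
        except StopIteration as e:
            if not stack:
                return e.value
            gen, value = stack.pop(), e.value   # "return" to the caller
            continue
        if isinstance(result, types.GeneratorType):
            stack.append(gen)                   # "call" the sub-generator
            gen, value = result, None
        else:
            value = None   # plain yield: a real scheduler would switch tasks

def blocker():
    yield "pretend-to-block"   # a real scheduler would park us here
    return 42

def third_party_2():
    result = yield blocker()   # only works because this level yields too
    return result

def third_party_1():
    result = yield third_party_2()
    return result

def coro():
    result = yield third_party_1()
    return result

assert trampoline(coro()) == 42
```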
simon said…
The scheduler system I use was derived from the round-robin scheduler presented in the deque documentation. I added extra facilities to reschedule tasks based on the yielded value from a task. I can choose the value the task is going to yield on its next iteration, from other parts of the code, as the generator is wrapped in a class which can dynamically switch the .next method.

This lets me perform actions inside the scheduler based on yielded task values. The values might be anything; however, if an UNBLOCK instance is yielded, the scheduler spawns a thread which iterates the task once, then joins back into the scheduler thread (which is usually the main thread).

This means that if I have to make a blocking call, and it really bothers me, I surround the potential blocking call with yield statements which look like this:

yield UNBLOCK()
i_might_block()
yield None

This approach does have some problems; however, in my simple use cases, I haven't experienced any.
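If I understand the scheme, a stripped-down Python 3 sketch might look like this (UNBLOCK and the scheduler loop are reconstructed from the description above, so treat the details as guesses). For simplicity the scheduler joins the worker thread immediately; a real one would keep iterating other tasks while the thread runs:

```python
import threading
from collections import deque

class UNBLOCK:
    """Marker: run my next step in a worker thread (it may block)."""

def scheduler(tasks):
    # Round-robin scheduler after the deque recipe, with the UNBLOCK
    # extension: when a task yields UNBLOCK, its next iteration (the
    # potentially blocking step) runs in a thread, which we join.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            value = next(task)
        except StopIteration:
            continue                       # task finished; drop it
        if isinstance(value, UNBLOCK):
            t = threading.Thread(target=lambda: next(task, None))
            t.start()
            t.join()                       # simplified: block until done
        queue.append(task)

results = []

def worker():
    yield UNBLOCK()
    results.append("did blocking work in a thread")  # pretend this blocks
    yield None
    results.append("back in the scheduler")

scheduler([worker()])
assert results == ["did blocking work in a thread", "back in the scheduler"]
```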

The framework I use does have some limitations. I currently treat all tasks as a big flat list. If a task yields another task, it gets ignored. I'll probably change this sometime soon, now that we have real coroutines to play with. You can see the implementation in google SVN.
Anonymous said…
Ah, I see. The problem is how to return a value back to a frame several frames above the current frame while retaining a reference to the current frame so you can resume execution later (which then leads to the question of how to identify the outer frame to return to).

Generators fit the frame resumption half of this description without supporting returning a value to a different (parent) frame. I guess that is why it feels like they should do the trick, but don't. :(
jjinux said…
Guido has said that "generators get you 80% of the way to continuations." In this case, the 20% is clearly missing. Continuations would totally solve this problem, but I guess that's old news.
jjinux said…
> yield UNBLOCK()
> i_might_block()
> yield None

This is nice and probably the best that can be done given the circumstances. However, if you assume that at some point during handling the request, you're *always* going to need to do this, you've really decayed into normal multi-threading :-/ Your technique is very helpful in cases where you can *mostly* stay away from blocking code.

Thank you so much for taking the time to answer my question.

Richard said…
> I absolutely had envisioned overriding
> the socket module and file module.

This is something I am trying to get done for Stackless at the moment, providing versions of socket, file and maybe more which operate asynchronously in the background while stalling the caller on a Stackless channel.
jjinux said…
/me is quite interested.
Brandon L. Golm said…
JJ asked me to comment, so I will. Normally I try to say something funny (usually making fun of JJ), but that seems difficult considering the fact that I don't know what he's talking about.

That said, you are talking about reconciling two opposites in one process. This is like having half your threads use cooperative multitasking, and half of them use preemptive.

That's a nice idea, but the preemption is going to happen with a lower-level scheduler than the cooperative stuff; that's by the definition of "cooperative multitasking." That is to say, the executing code does the scheduling.

In a way, JJ, you are asking how to impose cooperative multitasking on code that assumes a preemptive environment. "kelly" et al. would suggest ways to re-superclass, or subclass, your third-party libraries (akin to the C++ superclass pointer-change trick of old versions of uControl).

I would suggest this: put your third_party stuff in its own process, and talk to it however you like. The kernel, after all, has a pretty good preemptive scheduler.

I do have plans for a kernel-support for userland scheduling that will support both preemption and cooperation, but it won't be released in time for you.
B. said…
I thought about using the PEP to implement coroutines, but Armin Rigo has such a nice implementation in his "greenlets" C module that I chose to use that for my Python coroutine needs.

> However, the problem is that a context switch can only happen with a yield *directly back to the scheduler*

Let's take an example - a socket read. Could you split the read into two parts? I think so. The trick is that you'll need to set up a trampoline in the socket interface to avoid trying to yield back through the third-party code.

"socket.recv" calls "socket._yieldy_recv". The internal function is a generator. The "recv" call first needs to set up a trampoline: first, "recv" creates a generator object "r = self._yieldy_recv()". "recv" then needs to give the scheduler a read "s.sched_read(r)". Finally, "recv" will wait on the scheduler for some data: "". The next call then needs to ask the scheduler to do some useful work until it returns ... maybe "s.poller_gen.send(None)" or something like that. The poller eventually finds work and then calls "r.send(n_bytes)" or whatever.
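If I'm reading this right, a toy Python 3 reconstruction might look like the following (FakeScheduler, sched_read, and poll are invented placeholders for the real scheduler interface being described):

```python
class FakeScheduler:
    def __init__(self):
        self.pending = []

    def sched_read(self, gen):
        # Register a suspended read generator with the scheduler.
        self.pending.append(gen)

    def poll(self):
        # Pretend the poller found readable data for every pending read
        # and delivers it by resuming each generator.
        for gen in self.pending:
            try:
                gen.send(b"data from the wire")
            except StopIteration:
                pass
        self.pending.clear()

class YieldySocket:
    def __init__(self, sched):
        self.sched = sched
        self._result = None

    def _yieldy_recv(self):
        # Inner generator: suspends until the poller sends the data in.
        data = yield
        self._result = data

    def recv(self, bufsize):
        # Outer wrapper: hides the generator machinery, so third-party
        # code sees an ordinary recv() call.
        r = self._yieldy_recv()
        next(r)                    # advance to the yield
        self.sched.sched_read(r)   # register the read with the scheduler
        self.sched.poll()          # stand-in for "do useful work until ready"
        return self._result

sched = FakeScheduler()
sock = YieldySocket(sched)
assert sock.recv(1024) == b"data from the wire"
```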
jjinux said…
I've read it twice, but I'm having a hard time understanding that post. Can you use the coro(),third_party_1(), etc. example that I used above?

Thanks for taking the comment.
jjinux said…
Yep, it sounds like the greenlets code has the "smart" stack copying that I was looking for.
