Non-Preemptive Threads in Python

Simon Funk
Jul, 2008


Python could benefit from syntactically-integral cooperative threading. And here they are.


Usually when people think of threads*, they're thinking of ways to use multiple CPUs to make their program run faster (graphics rendering being a classic example), or splitting their program into multiple threads to allow each to interact synchronously with some I/O without interfering with each other (such as a web server allocating a thread to each request that comes in). In those cases there is usually some added complexity in exchange for the benefit of speed or responsiveness. (Though very occasionally highly separable cases like a web server are actually easier to code threaded.) The main problems that threading introduces are the need for semaphores (to keep two threads from conflicting over shared memory) and other synchronization issues, and possibly communication between threads if there is no shared memory. Those sound like minor issues, but threaded applications are notoriously aggravating to develop, maintain, and debug.

But that's all for what's called preemptive threading, where the threads are all running simultaneously (or effectively so, if not in fact).

Cooperative threading, while similar at a glance, is a completely different beast, and has been unduly overlooked in computer language design for decades. Cooperative threading actually makes many common things far easier than without--so much so that entire classes of programming, such as graphical user-interfaces, games, simulations, and more, would probably be done completely differently if preemptive threads were standard in the language. People are starting to cotton on with the rise of co-routines*, but every example and implementation I've seen of those makes them feel like a trick for the wily, not a central feature of the language that you'd want to use every day.

In the past, I have used Modsim* extensively for simulation work, but it's a pretty obscure language. I once wrote my own cooperative threading package for C, but that requires some stack twiddling and other assembly-level hacks which need to be maintained from architecture to architecture, and even then it's a butt-puckering solution because if you allocate pseudo-stacks small enough not to waste memory, you run the risk they'll overflow (try to debug that). The problem lies in the stack-allocated frame* premise of most modern languages--a premise which is embodied all the way down to the CPUs which provide specialized instructions for stack manipulation. However, if we are willing to forgo the stack and switch to heap-allocated* frames, then it's a cinch.

Yarns in Python

Python, it turns out, almost provides this, via their generators*. All that's missing are a couple of simple syntactic additions, which I might call spawn and waitfor. To illustrate how it might work, I cobbled together a little shell script called spin which will "compile" a would-be-python source file (.pyy) into an is-python source file (.py).

Here is the doc string from my supporting library Yarn.py:

    A Yarn is just a non-preemptive thread.

    It is recommended that you spin .py files from .pyy files
    using the 'spin' compiler (which is really just a short shell
    script), which also requires that you "import Yarn".

    Assuming you have done those things, this module provides the following:

    Starting and stopping Yarns:

        spawn func(params)  # Call func(params) asynchronously now.
        spawn func(p) at t  # Call func(p) at sim-time t.
        waitfor func(p)     # Call func(p) synchronously (wait for it)

        Note that in all cases above, func() must be a Yarn func.
        Using waitfor in any function makes it a Yarn func.

    Pre-defined Yarns (must be waitfor'd or spawn'd):

        Yarn.sleep(s)           # Sleep s seconds in sim-time.

    Other functions:

        Yarn.run()              # Start the simulation.  Returns when last Yarn done.
        Yarn.simtime()          # The current simulation time, starts at 0.

Here's an example of using waitfor and spawn (I also had to throw in a new keyword "exit", but in a final version this would probably just be "return"). The example is a bit ugly 'cause I'm trying to demo all the features (like exception handling) in one short program. But use your imagination... truly lots of neat stuff you can do with this:

# This is yarn.pyy
# Run this file through "spin" to get a .py file.

import Yarn
from Yarn import simtime
import random

def trylickin(thing):
        print "%f: I'm gonna try lickin' %s"%(simtime(), thing)
        for i in range(1,21):
                r = random.uniform(1, 2)
                waitfor Yarn.sleep(r)
                print "%f: Lick %d of %s took %f seconds..."%(simtime(),i,thing,r)
                if random.uniform(0, 10) < 1:
                        print "%f: But that was enough, I'm satisfied!"%(simtime())
                        exit i
                if random.uniform(0, 10) < 1:
                        print "%f: But... gasp... poison!"%(simtime())
                        raise Exception, thing

def reportLicks(thing):
                tries = waitfor trylickin(thing)
        except Exception, e:
                print "%f: Lickin' tester of %s DIED! (exception=%s)"%(simtime(), thing, e)
                print "%f: Starting over..."%(simtime())
                spawn reportLicks(thing)
                if tries:
                        print "%f: Lickin' tester reports back %d tries for %s"%(simtime(), tries, thing)
                        print "%f: Lickin' tester didn't report # of tries for %s!"%(simtime(), thing)

spawn reportLicks("Tootsie Roll Pop")
spawn reportLicks("Rover") at 5.


When run, the output looks something like this:

0.000000: I'm gonna try lickin' Tootsie Roll Pop
1.815134: Lick 1 of Tootsie Roll Pop took 1.815134 seconds...
3.291259: Lick 2 of Tootsie Roll Pop took 1.476125 seconds...
4.689975: Lick 3 of Tootsie Roll Pop took 1.398716 seconds...
5.000000: I'm gonna try lickin' Rover
5.787508: Lick 4 of Tootsie Roll Pop took 1.097533 seconds...
5.787508: But... gasp... poison!
5.787508: Lickin' tester of Tootsie Roll Pop DIED! (exception=Tootsie Roll Pop)
5.787508: Starting over...
5.787508: I'm gonna try lickin' Tootsie Roll Pop
6.107890: Lick 1 of Rover took 1.107890 seconds...
6.107890: But that was enough, I'm satisfied!
6.107890: Lickin' tester reports back 1 tries for Rover
6.893336: Lick 1 of Tootsie Roll Pop took 1.105828 seconds...
8.111944: Lick 2 of Tootsie Roll Pop took 1.218607 seconds...
9.127466: Lick 3 of Tootsie Roll Pop took 1.015523 seconds...
10.196320: Lick 4 of Tootsie Roll Pop took 1.068853 seconds...
11.719473: Lick 5 of Tootsie Roll Pop took 1.523153 seconds...
13.562226: Lick 6 of Tootsie Roll Pop took 1.842754 seconds...
14.660472: Lick 7 of Tootsie Roll Pop took 1.098245 seconds...
14.660472: But that was enough, I'm satisfied!
14.660472: Lickin' tester reports back 7 tries for Tootsie Roll Pop

You can poke around the source to see how it works or try it out. Aside from the fact that the crufty spin script might not always work, this should be ready to use. If you do try it, let me know how it goes and what you do with it. I did all this back in January of 2007 and then forgot about it until some recent reminder, so there might be some gotcha I've forgotten about (but none that I recall...).

Simon Funk / simonfunk@gmail.com