- Trio is a modern Python library for writing asynchronous applications
- https://trio.readthedocs.io/en/stable/tutorial.html
Summary
- The “sandwich” structure is typical for async code; in general, it looks like:
trio.run -> [async function] -> ... -> [async function] -> trio.whatever
- It’s exactly the functions on the path between `trio.run()` and `trio.whatever` that have to be async.
- Trio provides the async bread, and then your code makes up the async sandwich’s tasty async filling.
- Other functions (e.g., helpers you call along the way) should generally be regular, non-async functions.
Example usage
- Running multiple async functions at the same time
- This code takes about 1 second to run, not 2, because the two children sleep at the same time.
# tasks-intro.py
import trio

async def child1():
    print(" child1: started! sleeping now...")
    await trio.sleep(1)
    print(" child1: exiting!")

async def child2():
    print(" child2: started! sleeping now...")
    await trio.sleep(1)
    print(" child2: exiting!")

async def parent():
    print("parent: started!")
    async with trio.open_nursery() as nursery:
        print("parent: spawning child1...")
        nursery.start_soon(child1)
        print("parent: spawning child2...")
        nursery.start_soon(child2)
        print("parent: waiting for children to finish...")
        # -- we exit the nursery block here --
    print("parent: all done!")

trio.run(parent)
-
async with
- It’s actually pretty simple. In regular Python, a statement like `with someobj: ...` instructs the interpreter to call `someobj.__enter__()` at the beginning of the block, and to call `someobj.__exit__()` at the end of the block. We call `someobj` a “context manager”.
- An `async with` does exactly the same thing, except that where a regular `with` statement calls regular methods, an `async with` statement calls async methods: at the start of the block it does `await someobj.__aenter__()` and at the end of the block it does `await someobj.__aexit__()`. In this case we call `someobj` an “async context manager”.
-
nursery object
- In `parent`, we use `trio.open_nursery()` to get a “nursery” object, and then inside the `async with` block we call `nursery.start_soon` twice, once for each child.
- There are actually two ways to call an async function: the first one is the one we already saw, using `await async_fn()`; the new one is `nursery.start_soon(async_fn)`: it asks Trio to start running this async function, but then returns immediately without waiting for the function to finish.
- So after our two calls to `nursery.start_soon`, `child1` and `child2` are now running in the background. Then at the commented line (`# -- we exit the nursery block here --`) we hit the end of the `async with` block, and the nursery’s `__aexit__` function runs.
- What this does is force `parent` to stop here and wait for all the children in the nursery to exit. This is why you have to use `async with` to get a nursery: it gives us a way to make sure that the child calls can’t run away and get lost.
- One reason this is important is that if there’s a bug or other problem in one of the children, and it raises an exception, then it lets us propagate that exception into the parent; in many other frameworks, exceptions like this are just discarded. Trio never discards exceptions.
How does it work?
-
Now, if you’re familiar with programming using threads, this might look familiar – and that’s intentional. But it’s important to realize that there are no threads here.
-
All of this is happening in a single thread.
- To remind ourselves of this, we use slightly different terminology: instead of spawning two “threads”, we say that we spawned two “tasks”.
- There are two differences between tasks and threads:
- (1) many tasks can take turns running on a single thread
- (2) with threads, the Python interpreter/operating system can switch which thread is running whenever they feel like it; with tasks, we can only switch at certain designated places we call “checkpoints”.
-
Trio gives each of the two children a chance to run:
>>> about to run one step of task: __main__.child2
child2: started! sleeping now...
<<< task step finished: __main__.child2
>>> about to run one step of task: __main__.child1
child1: started! sleeping now...
<<< task step finished: __main__.child1
-
Each task runs until it hits the call to `trio.sleep()`, and then suddenly we’re back in `trio.run()` deciding what to run next. How does this happen?
- The secret is that `trio.run()` and `trio.sleep()` work together to make it happen: `trio.sleep()` has access to some special magic that lets it pause itself, so it sends a note to `trio.run()` requesting to be woken again after 1 second, and then suspends the task.
- And once the task is suspended, Python gives control back to `trio.run()`, which decides what to do next.
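The cooperation between the run loop and the sleep function can be modelled with a toy scheduler that uses only the standard library. This is not how Trio is implemented; `toy_sleep` and `toy_run` are hypothetical stand-ins for `trio.sleep()` and `trio.run()`, meant only to show the “send a note, then suspend” idea.

```python
import heapq
import time
import types


@types.coroutine
def toy_sleep(seconds):
    # The "special magic": a bare yield passes a note up to the run loop
    # and suspends the whole chain of awaiting coroutines.
    yield ("sleep", seconds)


def toy_run(*async_fns):
    finished = []                       # return values, in completion order
    ready = [fn() for fn in async_fns]  # tasks that can run a step right now
    sleeping = []                       # heap of (wake_time, seq, coroutine)
    seq = 0                             # tie-breaker so coroutines are never compared
    while ready or sleeping:
        if not ready:
            # Nothing runnable: wait for the earliest wake-up request.
            wake_time, _, coro = heapq.heappop(sleeping)
            time.sleep(max(0.0, wake_time - time.monotonic()))
            ready.append(coro)
        coro = ready.pop(0)
        try:
            _kind, seconds = coro.send(None)  # run one step of the task
        except StopIteration as done:
            finished.append(done.value)       # the task returned
            continue
        seq += 1
        heapq.heappush(sleeping, (time.monotonic() + seconds, seq, coro))
    return finished


async def slow():
    await toy_sleep(0.05)
    return "slow"


async def fast():
    await toy_sleep(0.01)
    return "fast"


finished = toy_run(slow, fast)
print(finished)  # ['fast', 'slow']
```

Both tasks are started before either one finishes, which is why the total runtime is about the longest sleep rather than the sum of the two.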
-
Only async functions have access to the special magic for suspending a task, so only async functions can cause the program to switch to a different task.
-
Each place where a task can be suspended like this is a checkpoint!
-
What this means is that if a call doesn’t have an `await` on it, then you know that it can’t be a place where your task will be suspended.
- This makes tasks much easier to reason about than threads, because there are far fewer ways that tasks can be interleaved with each other and stomp on each others’ state.
- (For example, in Trio a statement like `a += 1` is always atomic, even if `a` is some arbitrarily complicated custom object!) Trio also makes some further guarantees beyond that, but that’s the big one.
Checkpoints
- When writing code using Trio, it’s very important to understand the concept of a checkpoint. Many of Trio’s functions act as checkpoints.
A checkpoint is two things:
-
It’s a point where Trio checks for cancellation. For example, if the code that called your function set a timeout, and that timeout has expired, then the next time your function executes a checkpoint Trio will raise a `Cancelled` exception. See Cancellation and timeouts below for more details.
-
It’s a point where the Trio scheduler checks its scheduling policy to see if it’s a good time to switch to another task, and potentially does so. (Currently, this check is very simple: the scheduler always switches at every checkpoint. But this might change in the future.)
Since checkpoints are important and ubiquitous, we make it as simple as possible to keep track of them. Here are the rules:
-
Regular (synchronous) functions never contain any checkpoints.
-
If you call an async function provided by Trio (`await <something in trio>`), and it doesn’t raise an exception, then it always acts as a checkpoint. (If it does raise an exception, it might act as a checkpoint or might not.)
- This includes async iterators: If you write `async for ... in <a trio object>`, then there will be at least one checkpoint in each iteration of the loop, and it will still checkpoint if the iterable is empty.
- Partial exception for async context managers: Both the entry and exit of an `async with` block are defined as async functions; but for a particular type of async context manager, it’s often the case that only one of them is able to block, which means only that one will act as a checkpoint. This is documented on a case-by-case basis.
- `trio.open_nursery()` is a further exception to this rule: only the exit blocks and is a checkpoint.
-
Third-party async functions / iterators / context managers can act as checkpoints; if you see `await <something>` or one of its friends, then that might be a checkpoint. So to be safe, you should prepare for scheduling or cancellation happening there.
-
The reason we distinguish between Trio functions and other functions is that we can’t make any guarantees about third party code.