In Structured Concurrency, I talk about what structured concurrency is and why it’s a big deal for C++ especially. In this post I discuss some more interesting properties of asynchronous code that is structured: async stacks and async scopes.
Concurrency is structured when “callee” async functions complete before their “caller” functions resume. This can be done without blocking a thread: the caller (parent) launches the callee (child) task and passes it a handle to itself, effectively telling the child, “When you have your result, call me back. Until then, I’m going to sleep.”
Immediately after the parent launches the child, the parent function does an ordinary return, often to something like an event loop that is churning through async tasks.
When we talk about parent/child async tasks, we are talking about a notional caller/callee relationship: there is a sequence of async operations that has caused the current one to be executing. This chain of operations is exactly like a call stack, but asynchronous. The actual program stack will look nothing like it.
Anyone who has debugged a multithreaded application knows that the actual program stack doesn’t really tell you what you want to know: How did I get here? All it generally shows is that some event loop is currently processing a certain function. The notional async stack tells you why. From the PoV of the event loop, async work is getting scheduled onto it willy-nilly. The structure of the async computation is a higher-level property of your program’s execution.
Or it isn’t, as is often the case in multithreaded C++ applications written today. Until C++20, C++ provided no language support for writing structured async code, so that code is typically unstructured: no parent/child relationships exist at all. Work is scheduled with fire-and-forget semantics, using ad hoc out-of-band mechanisms to synchronize work, propagate values and errors, and keep data alive. It’s like programming with `jmp` instructions instead of functions — no stack at all.
C++ programmers have simply accepted this state of affairs because they didn’t have anything better. Until C++20 introduced coroutines, that is. Coroutines are transformative, not because the syntax is nice, but because they cause async scopes to coincide with lexical scopes.
What’s an async scope? If an async stack is a chain of async function activations, then an async scope corresponds to the activation of a single async function. It encompasses all the state — variables and whatnot — that need to live for the duration of an async operation and all of its nested child operations. With callbacks, the async scope spans disjoint lexical scopes: it starts when an async function is called and ends when the callback returns — that is, if your code is structured.
If your async code is unstructured, there are no async scopes at all because there’s no notion of child operations that nest within parents. Or you could say there are overlapping scopes. Unsurprisingly, this makes resource management hard, which is why so much async C++ is littered with `shared_ptr`s.
Which brings us back to coroutines. For coroutines, the async scope starts when the coroutine is first called and ends when the coroutine returns (or `co_return`s, I should say). Well, that’s just like ordinary functions with ordinary scopes! Which is exactly the point.
Forget that coroutines make async code read like synchronous code. Forget that the syntax is nice. The overwhelming benefit of coroutines in C++ is their ability to make your async scopes line up with lexical scopes, because now we get to leverage everything we already know about functions, scopes, and resource management. Do you need some piece of data to live as long as this async operation? No problem: make it a local variable in a coroutine.
Coroutines make the idea of structured concurrency obvious by manifesting it in code. We don’t have to worry about notional stacks and scopes.1 There’s the scope right there, between the curly braces! Here’s the mindbender though: Just as Dorothy could have gone home to Kansas any time she wanted, so too could we have been structuring our async code all along.
Here’s a dirty secret about coroutines: they’re just sugar over callbacks; everything after the `co_await` in a coroutine is a callback. The compiler makes it so. And damn, we’ve had callbacks forever; we’ve just been misusing them. Structured concurrency has been just three heel-clicks away all this time.
Language support makes it much easier to ensure that child operations nest within parents, but with the right library abstractions, structured concurrency in C++ is totally possible without coroutines — and damn efficient.
In the next post, I’ll introduce these library abstractions, which are the subject of the C++ standardization proposal P2300, and discuss what they bring over and above C++20 coroutines.
Well, actually we still do until debuggers grok coroutines and can let us view the async stack. ↩
“A sender factory is an algorithm that takes no senders as parameters and returns a sender.”
Not being facetious, but I’m struggling to parse “takes no senders”. Is that “zero or more non-sender parameters”?
And section 4.1.11 concerns me, “Most sender adaptors are pipeable”
Highlighting a specific operator overload in a vaguely normalize-an-informal-convention way, that needs you to specifically highlight that it can be confusing and shouldn’t be used sometimes, feels awfully wrong.
You could almost transcribe it with an early pitch for “NULL” with demonstrations of how much more readable some constructs with “NULL” vs “0” and then after 2 pages a brief line noting there could be some confusion if people use this integer in a pointer context.
It means, “takes exactly zero arguments that are senders.”
It’s true for C++20 ranges as well. For the most part, `a | b(c)` is equivalent to `b(a, c)`. But take the case of `views::zip` (which hasn’t been proposed yet, but it does what `zip` does in functional languages). If you say `zip(a, b, c)`, there’s an ambiguity: does `zip(a, b, c)` make a range, or does it simply curry the arguments so that `r | zip(a, b, c)` performs `zip(r, a, b, c)`?

So we support pipe syntax where it makes sense, but not everywhere universally. Pipe syntax makes sense when reading the code left-to-right is illuminating. If it isn’t, as is the case for the `on()` algorithm, then we’d do harm by supporting it.
Clarified my comments about the latter in a twitter thread, but “zero arguments that are senders” still flies over my head; specifically, I don’t understand the intent of specifying the type of zero arguments.
It takes no sender arguments, but it takes other kinds of arguments. Like `just(42)` returns a sender that, when started, immediately completes with the value 42. `just` is a factory, not an adaptor.
Hi Eric, do you accept suggestions for future work on ranges here?
If so, please consider fixing this error-prone std::iota problem:
const int64_t n = 12345LL * 1024 * 1024 * 1024;
auto vals = std::ranges::views::iota(0, n); // oops, should have been (0LL, n);
If not, then thank you for the blogs on libunifex; I was just reading some proposals and wished somebody would produce an easier-to-read intro.