I like how he finished with "I am Fertig." I also appreciate the explanation for us non-German-speakers.
In my opinion this is the best presentation about coroutines, with a few basic examples that cover different ways of passing values to and getting values from a coroutine. Thank you Andreas
My pleasure! Thank you!
I loved the singing 'Hello ....', 'How are you?' ... with the German accent :)
nobody: How complicated do you want it to be?
committee: yes
The amount of existing knowledge about C++ and concepts like multithreading, async I/O, futures, and promises needed just to understand what goes on behind the scenes of a coroutine is enormous. Good luck collaborating with beginners and developers from different backgrounds with this. The learning curve is really steep, and it's a recipe for a debugging nightmare if not everyone fully understands all the concepts!
To understand coroutines you don't need any multithreading or async; I don't know where this idea comes from. They are just functions. So you need to understand function frames, the stack/automatic variables, and lambdas. They have just been taught very badly so far. They are a state machine, and you have to give some code points to the compiler.
It's really not that complex. It's literally just a function which you can suspend and jump out of, and then jump back into and continue where you left off.
After having to tweak a coroutine once for work, I’ve watched a bunch of videos to try to understand the concepts and how they work. Most videos have been confusing. Some I kind of get. One explained it in a way that made it easy to understand, talking about things like the way the compiler sees things and what-not, with the audience asking great questions, and it really was simple. I think the problem with a lot of videos is they start with use cases (sometimes good ones) without explaining what coroutines are first.
It’s as simple as Invader said: as a tool, a coroutine is a function that can suspend and resume itself, can be resumed, and can optionally return data. That’s it at its core. Part of what I think makes it off-putting is that I’m not used to functions being able to exit mid-function and then resume later. It stores its local data in a frame that, in C++, is usually stored on the heap (making it “stackless”). The compiler adds control code to the function; for me, knowing that made it easier to understand.
What makes it important are its use cases. There are 2 I keep seeing come up. As a generator, where you repeatedly call it to generate the next piece of data, such as in a sequence. And, when paired with a good multi-threaded scheduler, as a task that lets you write a chain of asynchronous calls the way you would normally write blocking synchronous ones, making it easier to read, but without actually blocking (because of the suspend and resume functionality).
Since coroutines themselves are really flexible, I’m sure there are other use cases. I’m still watching videos to learn more, and I still find many confusing even after learning the basics. I think a lot of it is that using them requires thinking differently, and the videos presume you know things you don’t, or try to explain them in the context of confusing use examples. Many presume a multi-threaded scheduler but don’t explain it, because libraries like Boost come with one; schedulers are useful with coroutines, but they are not a part of coroutines themselves.
That’s been my experience so far. I’m not sure how clear I was as I’m still a novice with coroutines, but hopefully it’s helpful to someone. I’m going to go watch my next video to see if I can learn more.
Very useful for beginners like me. These new features being added always take some time to get your head around
This was by a wide margin the best presentation I have seen on the topic. Thank you
Very pleased to hear that the presentation was helpful.
Thank you! That means a lot to me!
I agree. Being new to C++ Coroutines I found this very helpful.
9:50 Technically both co_yield and co_await can be used for input and output. In fact co_yield can be considered as a co_await for the coroutine’s promise object (‘co_yield x’ is like ‘co_await promise.yield_value(x)’)
Funny!
Was just reading chapter two from your book and took a break and this video popped up.
I hope you enjoyed both. This talk uses a different approach and different examples compared to the one in my C++20 book.
The best ever explanation of what the beast it is!
I'm glad I could shed some light on coroutines!
Thank you Andreas. A very nice presentation!
Thank you! I'm very pleased to hear that you liked my presentation.
11:55 Technically, the compiler deduces the promise type as ‘std::coroutine_traits<wrapper_type, Args...>::promise_type’, which defaults to ‘wrapper_type::promise_type’
Andreas puts the Fun in Function.
Thank you for this lecture, very informative
This is an amazing introduction. Thank you all who participated in this!
I'm glad you enjoyed my presentation!
Has the lambda foot gun been addressed yet?
Now I want to know how the compiler interprets coroutines as machine code, and therefore how they interact with CPU cache and pipelining (branch prediction).
I am assuming the primary benefit of coroutines is essentially to combine the tasks of two threads into a single thread. This way a generator would create the generated object directly in a core register, ready for the caller to use, as opposed to passing an object by writing the output to memory so another thread can load it from memory (even as a cache hit, the write-then-load of a shared object will require several cycles, plus a few more cycles for locks or atomics to sync).
As such, will the coroutine instructions and caller instructions be in direct contention for the limited L1i? I mean will the caller and a large coro constantly be displacing each other as big chunks whenever the coro is called and suspended, or will the overall thread be unrolled and caller interleaved with the coro leading to a fairly linear set of instructions from a machine perspective (and thus simpler cache behavior)?
"primary benefit of coroutines is essentially to combine the tasks of two threads into a single thread. "
Well, my use case was to use coroutines to buffer reads from the Web and writes to a database, where the coroutine would make the request and yield to the thread while awaiting the result of the HTTP request or the write to the DB. (I did this in GoLang, where there is less customization, in fact almost none, and what seems like less boilerplate.) The general theory is that you use a coroutine where the cost of the call is significant, i.e. the program could do useful work while awaiting the result. The way GoLang recommends communicating with the coroutine is through a "channel", which is essentially a vector with a mutex on it.
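A "channel" in that sense can be sketched in C++ as a mutex-guarded queue plus a condition variable so receivers can block until a value arrives. This is a hedged illustration of the comment's description, not Go's actual runtime; the Channel name and API are invented here.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// A minimal channel: a queue protected by a mutex, with a condition
// variable for blocking receives. close() lets receivers drain and stop.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard lock{mutex_};
            queue_.push_back(std::move(value));
        }
        cv_.notify_one();
    }

    void close() {
        {
            std::lock_guard lock{mutex_};
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a value is available; returns nullopt once the
    // channel is closed and drained.
    std::optional<T> receive() {
        std::unique_lock lock{mutex_};
        cv_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return std::nullopt;
        T value = std::move(queue_.front());
        queue_.pop_front();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<T> queue_;
    bool closed_{false};
};
```

The producer (e.g. the HTTP-reading coroutine) calls send(), the consumer calls receive(), and neither needs to know the other's scheduling.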
Why in C++ things are always more complex than in other languages? Why not have a ready to use solution without coding like in Node.js, Python or even Rust? IMHO the language (or the committee?) has lost its way long time ago.
This is just the framework around which the ready solution will be built (it is coming in C++23, as far as I know). The availability of the underlying framework allows some really crazy shit, such as sharing a single thread between multiple coroutines or batching yielded results before actually suspending.
Python's coroutines create their own stack, so they are huge and clunky when speed is required.
Like, f(i for i in range()) can be a lot slower than f([i for i in range()]) just because of the generator use (and that's without comparing creation costs).
C++ coroutines are the bone structure for multiple high-speed applications,
and normal users will get their std::generator in C++23.
For historical reasons probably. C++ is a really old language and keeping it modern is really hard.
You get convenience by sacrificing performance in some cases; for most programmers that's enough. But C++ is for those who might need performance in those cases, and that's why it looks how it looks.
I would assume the best application for this is for hybrid CPU + accelerators ???
It's *SUPER POWERFUL* now I get it.
Thanks a lot Mr. Fertig for this informative talk, and
a huge thanks to CppCon for being a great platform for these wonderful knowledge :)
Your comments are much appreciated, thank you.
Great! I'm glad that I could show the powers of coroutines.
Great talk! Can we get the presentation PDF ?
Is the code available for download?
It's a great talk about coroutine. where can i find the keynote in the talk ?
Great and helpful video 👍👍
Well elaborated.
Are the slides uploaded for this talk? There is so much to keep track of at the same time.
Very nice talk! I'm curious as to how the coroutine lifetime is handled in the scheduler example, as no one is calling handle.destroy() explicitly. Is this a leak, or do we have some sort of built-in lifetime management?
I believe that the fact that final_suspend() returns std::suspend_never (slide 27) indicates to the compiler to destroy the frame (and destruct the promise) when the coroutine ends
No, it will leak if you don't provide promise_type::return_void(). That's why you wrap your coroutine handle and provide all the fancy destructor / std::exchange stuff.
23:30 I think performing#include inside a struct is quite unorthodox
Not if it's just a presentation
Isn't there a problem with this 'Scheduler'?
Let's imagine you have 3 functions:
a() -> co_await b;
b() -> co_await c;
c() -> while(true) co_yield;
we run a(), and what happens is that the coroutines for [c, b, a] are pushed into the scheduler's list in that order; then we yield in c once more, and now our order becomes [b, a, c], and it would erroneously continue with b ahead of time.
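The reordering described above can be simulated without real coroutines, using a deque of labels standing in for coroutine handles. This is a sketch of the queue behavior only, not the talk's actual Scheduler.

```cpp
#include <deque>
#include <string>

// The scheduler resumes the front of the ready list; a co_yield inside
// the resumed coroutine re-queues it at the back. Starting from the
// order [c, b, a], one yield of c leaves [b, a, c], so b runs next even
// though (semantically) it is still awaiting c.
std::string run_next(std::deque<std::string>& ready) {
    std::string current = ready.front();  // resume the front coroutine
    ready.pop_front();
    ready.push_back(current);             // its co_yield re-queues it at the back
    return current;
}
```

Running it once on {"c", "b", "a"} returns "c" and leaves "b" at the front, which is exactly the out-of-order resumption the comment points out.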
Good stuff for a technical interview :-( I hope to find some library wrapper to simplify this mess, thanks.
Is this pseudo code? Are there really "not" and "or" keywords added to the language?
Please make short videos for beginners.
I feel like more slides were needed
not even for more info, but to pull attention to specific parts of the code as the presentation goes along
I think it is a design issue if we need a talk titled for beginners.
Code is unreadable. At least on Chrome/ios
Disappointed by how verbose and "loaded" this new C++ co-routine feature is. It's a far cry from how easily you can define these same concepts in Rust or Python.
I could see co-routines being used by API authors of a very core library but the majority of people are better off writing a class with some state, than they are spending time trying to grok this mess.
This isn't Python's generators; that would be C++23's std::generator. These coroutines are the bone structure for many more, much faster constructions.
Could you point me to the rust coroutines? Last time I was looking there were none.
This feature is complete cryptic garbage. I really hope no one ever uses this nonsense. It would be way easier to write a class that holds state and has functions that you call to get values out.
What you are used to is always easier than something you have to learn first. That's why "easy" often is not a quality criterion, in either direction. After having written the wrapper type ("Chat"), writing different coroutines is much easier than writing different classes for implementing different behaviours.
While I agree, I hope you first read some Boost.Asio code, so you will see things can be much, much worse...
What the committee has done is to provide:
a) a transformation of the code that uses co_await, co_yield and co_return.
b) a way to save and restore the CPU context.
That's all they did.
The actual implementations are left to the user of the language, i.e. the provider of libraries.
Eventually the language will provide simple coroutine objects, but first they need to provide the low level API.
You can certainly do without coroutines...if I need to program my own object that generates numbers or iterates a vector or two vectors, why not simply write an object that does the job and holds the appropriate state?
But then there is stuff like the scheduler and tasks, which are not doable with simple objects, because C++ does not provide an official way to save and restore contexts (there is stuff from C, like setjmp/longjmp, but that has a lot of limitations). C++ coroutines can do this kind of task scheduling, though, because they keep the appropriate contexts and allow suspension and resumption of those contexts.
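The transformation in (a) can be sketched, very roughly, as the awaiter protocol. This is commented pseudocode of what the compiler generates for a co_await expression, with many details (coroutine_traits lookup, await_transform, symmetric transfer, exception paths) deliberately omitted.

```cpp
// Pseudocode sketch of: result = co_await expr;
auto&& awaiter = /* obtain awaiter from expr,
                    possibly via promise.await_transform(expr) */;
if (!awaiter.await_ready()) {
    // save locals into the coroutine frame, then suspend:
    awaiter.await_suspend(handle_to_this_coroutine);
    // <suspended: control returns to the caller or resumer>
    // <resumption point: execution continues here on resume()>
}
result = awaiter.await_resume();
```

This is exactly the "save and restore" in (b): the frame holds the locals, and the resumption point is the saved position in the function.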