Coroutine
from Wikipedia

Coroutines are computer program components that allow execution to be suspended and resumed, generalizing subroutines for cooperative multitasking. Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes.

They have been described as "functions whose execution you can pause".[1]

Melvin Conway coined the term coroutine in 1958 when he applied it to the construction of an assembly program.[2] The first published explanation of the coroutine appeared later, in 1963.[3]

Definition and types

There is no single precise definition of coroutine. In 1980 Christopher D. Marlin[4] summarized two widely-acknowledged fundamental characteristics of a coroutine:

  1. the values of data local to a coroutine persist between successive calls;
  2. the execution of a coroutine is suspended as control leaves it, only to carry on where it left off when control re-enters the coroutine at some later stage.
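
Both characteristics can be observed with a Python generator, used here only as a convenient stand-in for a coroutine. This is a minimal sketch; the name `counter` and its starting value are illustrative and not from any source.

```python
def counter(start):
    """Generator whose local variable `n` survives between resumptions."""
    n = start
    while True:
        yield n   # suspend here; control returns to the caller
        n += 1    # the next resumption continues from this line


c = counter(10)
first = next(c)    # runs until the first yield
second = next(c)   # resumes after the yield, with n still alive
third = next(c)
print(first, second, third)  # 10 11 12
```

The local `n` is never reinitialized, and each `next` call picks up immediately after the `yield` where the previous one stopped.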

Beyond these two characteristics, coroutine implementations differ in three features:

  1. the control-transfer mechanism. Asymmetric coroutines usually provide keywords like yield and resume. Programmers cannot freely choose which frame to yield to; the runtime only yields to the nearest caller of the current coroutine. In symmetric coroutines, on the other hand, programmers must specify a yield destination;
  2. whether coroutines are provided in the language as first-class objects, which can be freely manipulated by the programmer, or as constrained constructs;
  3. whether a coroutine is able to suspend its execution from within nested function calls. Such a coroutine is a stackful coroutine. The opposite kind is a stackless coroutine, in which a regular function cannot use the yield keyword unless it is marked as a coroutine.

The paper "Revisiting Coroutines",[5] published in 2009, proposed the term full coroutine to denote one that is both first-class and stackful. Full coroutines deserve their own name in that they have the same expressive power as one-shot continuations and delimited continuations. Full coroutines are either symmetric or asymmetric. Importantly, whether a coroutine is symmetric or asymmetric has no bearing on how expressive it can be, though full coroutines are more expressive than non-full coroutines. While their expressive power is the same, asymmetric coroutines more closely resemble routine-based control structures in the sense that control is always passed back to the invoker, which programmers may find more familiar.

Comparison with

Subroutines

Subroutines are special cases of coroutines.[6] When subroutines are invoked, execution begins at the start, and once a subroutine exits, it is finished; an instance of a subroutine only returns once, and does not hold state between invocations. By contrast, coroutines can exit by calling other coroutines, which may later return to the point where they were invoked in the original coroutine; from the coroutine's point of view, it is not exiting but calling another coroutine.[6] Thus, a coroutine instance holds state, and varies between invocations; there can be multiple instances of a given coroutine at once. The difference between calling another coroutine by means of "yielding" to it and simply calling another routine (which then, also, would return to the original point), is that the relationship between two coroutines which yield to each other is not that of caller-callee, but instead symmetric.

Any subroutine can be translated to a coroutine which does not call yield.[7]

Here is a simple example of how coroutines can be useful. Suppose you have a consumer-producer relationship where one routine creates items and adds them to a queue and another removes items from the queue and uses them. For reasons of efficiency, you want to add and remove several items at once. The code might look like this:

var q := new queue

coroutine produce
    loop
        while q is not full
            create some new items
            add the items to q
        yield to consume

coroutine consume
    loop
        while q is not empty
            remove some items from q
            use the items
        yield to produce

call produce

The queue is then completely filled or emptied before control is handed to the other coroutine with the yield command. Each later resumption of a coroutine continues right after its yield, in the coroutine's outer loop.

Although this example is often used as an introduction to multithreading, two threads are not needed for this: the yield statement can be implemented by a jump directly from one routine into the other.

Threads

Coroutines are very similar to threads. However, coroutines are cooperatively multitasked, whereas threads are typically preemptively multitasked. Coroutines provide concurrency, because they allow tasks to be performed out of order or in a changeable order, without changing the overall outcome, but they do not provide parallelism, because they do not execute multiple tasks simultaneously. The advantages of coroutines over threads are that they may be used in a hard-realtime context (switching between coroutines need not involve any system calls or any blocking calls whatsoever), there is no need for synchronization primitives such as mutexes, semaphores, etc. in order to guard critical sections, and there is no need for support from the operating system.

It is possible to implement coroutines using preemptively-scheduled threads, in a way that will be transparent to the calling code, but some of the advantages (particularly the suitability for hard-realtime operation and relative cheapness of switching between them) will be lost.

Generators

Generators, also known as semicoroutines,[8] are a subset of coroutines. Both can yield multiple times, suspending their execution and allowing re-entry at multiple entry points, but they differ in what happens after a yield: a coroutine can control where execution continues immediately after it yields, while a generator cannot and instead transfers control back to its caller.[9] That is, since generators are primarily used to simplify the writing of iterators, the yield statement in a generator does not specify a coroutine to jump to, but rather passes a value back to a parent routine.

However, it is still possible to implement coroutines on top of a generator facility, with the aid of a top-level dispatcher routine (a trampoline, essentially) that passes control explicitly to child generators identified by tokens passed back from the generators:

var q := new queue

generator produce
    loop
        while q is not full
            create some new items
            add the items to q
        yield

generator consume
    loop
        while q is not empty
            remove some items from q
            use the items
        yield

subroutine dispatcher
    var d := new dictionary(generator → iterator)
    d[produce] := start consume
    d[consume] := start produce
    var current := produce
    loop
        call current
        current := next d[current]

call dispatcher

A number of implementations of coroutines for languages with generator support but no native coroutines (e.g. Python[10] before 2.5) use this or a similar model.
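
The dispatcher idea above can be sketched in runnable Python. This is one possible rendering, not the code used by any particular library: the queue capacity, the item count, the `log` list and the string tokens `"produce"` and `"consume"` are all assumptions made for the illustration.

```python
from collections import deque

CAPACITY = 3   # hypothetical queue size for this sketch
q = deque()
log = []       # records events so the interleaving can be inspected


def produce(n_items):
    """Fill the queue, then yield the token of the routine to run next."""
    produced = 0
    while produced < n_items:
        while len(q) < CAPACITY and produced < n_items:
            q.append(produced)
            log.append(f"produced {produced}")
            produced += 1
        yield "consume"


def consume():
    """Drain the queue, then hand control back to the producer."""
    while True:
        while q:
            log.append(f"consumed {q.popleft()}")
        yield "produce"


def dispatcher(n_items):
    """Trampoline: resume whichever generator the last one named."""
    routines = {"produce": produce(n_items), "consume": consume()}
    current = "produce"
    while True:
        try:
            current = next(routines[current])
        except StopIteration:
            break  # the producer has nothing left to produce


dispatcher(5)
print(log)
```

Each generator only ever returns to the dispatcher, yet the token it yields lets it choose its successor, which is how the trampoline recovers the "yield to" behaviour of the earlier coroutine example.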

Mutual recursion

Using coroutines for state machines or concurrency is similar to using mutual recursion with tail calls, as in both cases the control changes to a different one of a set of routines. However, coroutines are more flexible and generally more efficient. Since coroutines yield rather than return, and then resume execution rather than restarting from the beginning, they are able to hold state, both variables (as in a closure) and execution point, and yields are not limited to being in tail position; mutually recursive subroutines must either use shared variables or pass state as parameters. Further, each mutually recursive call of a subroutine requires a new stack frame (unless tail call elimination is implemented), while passing control between coroutines uses the existing contexts and can be implemented simply by a jump.

Common uses

Coroutines are useful to implement the following:

  • State machines within a single subroutine, where the state is determined by the current entry/exit point of the procedure; this can result in more readable code compared to use of goto, and may also be implemented via mutual recursion with tail calls.
  • Actor model of concurrency, for instance in video games. Each actor has its own procedures (this again logically separates the code), but they voluntarily give up control to a central scheduler, which executes them sequentially (this is a form of cooperative multitasking).
  • Generators, and these are useful for streams – particularly input/output – and for generic traversal of data structures.
  • Communicating sequential processes where each sub-process is a coroutine. Channel inputs/outputs and blocking operations yield coroutines and a scheduler unblocks them on completion events. Alternatively, each sub-process may be the parent of the one following it in the data pipeline (or preceding it, in which case the pattern can be expressed as nested generators).
  • Reverse communication, commonly used in mathematical software, wherein a procedure such as a solver, integral evaluator, ... needs the using process to make a computation, such as evaluating an equation or integrand.
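
The first use in the list, a state machine whose state is the current suspension point, can be illustrated with a Python generator driven by `send`. The turnstile and its event names are hypothetical, chosen only to keep the sketch short.

```python
def turnstile():
    """Coin-operated turnstile; the paused `yield` encodes the current state."""
    event = yield "locked"
    while True:
        # Locked state: only a coin unlocks the turnstile.
        while event != "coin":
            event = yield "locked"
        # Unlocked state: extra coins are ignored, a push locks it again.
        event = yield "unlocked"
        while event != "push":
            event = yield "unlocked"
        event = yield "locked"


machine = turnstile()
states = [next(machine)]  # run to the first yield: initial state
for event in ["push", "coin", "coin", "push", "coin"]:
    states.append(machine.send(event))
print(states)
# ['locked', 'locked', 'unlocked', 'unlocked', 'locked', 'unlocked']
```

No explicit state variable is needed: which `yield` the generator is paused at determines how the next event is interpreted.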

Native support

Coroutines originated as an assembly language method, but are supported in some high-level programming languages.

Java does not have native or library support for coroutines, but Java code can call Kotlin coroutines from the kotlinx.coroutines library (though this is not ideal and would require a Java wrapper over Kotlin).

Since continuations can be used to implement coroutines, programming languages that support them can also quite easily support coroutines.

Implementations

As of 2003, many of the most popular programming languages, including C and its derivatives, do not have built-in support for coroutines within the language or their standard libraries. This is, in large part, due to the limitations of stack-based subroutine implementation. An exception is the C++ library Boost.Context, part of boost libraries, which supports context swapping on ARM, MIPS, PowerPC, SPARC and x86 on POSIX, Mac OS X and Windows. Coroutines can be built upon Boost.Context.

In situations where a coroutine would be the natural implementation of a mechanism, but is not available, the typical response is to use a closure – a subroutine with state variables (static variables, often boolean flags) to maintain an internal state between calls, and to transfer control to the correct point. Conditionals within the code result in the execution of different code paths on successive calls, based on the values of the state variables. Another typical response is to implement an explicit state machine in the form of a large and complex switch statement or via a goto statement, particularly a computed goto. Such implementations are considered difficult to understand and maintain, and a motivation for coroutine support.
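
The contrast can be made concrete with a small Python sketch that emits a header line, one line per row, and a footer line. The class `ReportByHand`, the function `report_as_generator`, and the output strings are hypothetical names invented for this comparison.

```python
class ReportByHand:
    """Explicit state machine: a state variable selects the code path per call."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.state = "header"
        self.index = 0

    def next_line(self):
        if self.state == "header":
            self.state = "rows" if self.rows else "footer"
            return "BEGIN"
        if self.state == "rows":
            line = f"row {self.rows[self.index]}"
            self.index += 1
            if self.index == len(self.rows):
                self.state = "footer"
            return line
        if self.state == "footer":
            self.state = "finished"
            return "END"
        return None  # finished: nothing more to emit


def report_as_generator(rows):
    """Same behaviour; the suspension point replaces the state variable."""
    yield "BEGIN"
    for row in rows:
        yield f"row {row}"
    yield "END"


by_hand = ReportByHand([1, 2])
manual_lines = [by_hand.next_line() for _ in range(4)]
generator_lines = list(report_as_generator([1, 2]))
print(manual_lines == generator_lines)  # True
```

The hand-written version has to record where it was in `state` and `index` and re-dispatch on every call, while the generator version keeps that information implicitly in its suspended frame.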

Threads, and to a lesser extent fibers, are an alternative to coroutines in mainstream programming environments today. Threads provide facilities for managing the real-time cooperative interaction of simultaneously executing pieces of code. Threads are widely available in environments that support C (and are supported natively in many other modern languages), are familiar to many programmers, and are usually well-implemented, well-documented and well-supported. However, as they solve a large and difficult problem they include many powerful and complex facilities and have a correspondingly difficult learning curve. As such, when a coroutine is all that is needed, using a thread can be overkill.

One important difference between threads and coroutines is that threads are typically preemptively scheduled while coroutines are not. Because threads can be rescheduled at any instant and can execute concurrently, programs using threads must be careful about locking. In contrast, because coroutines can only be rescheduled at specific points in the program and do not execute concurrently, programs using coroutines can often avoid locking entirely. This property is also cited as a benefit of event-driven or asynchronous programming.
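
The following sketch illustrates why locking can often be avoided. It assumes a simple round-robin scheduler over Python generators; `worker`, `run_round_robin`, and the step counts are hypothetical names and values chosen for the example.

```python
from collections import deque

counter = 0  # shared state, deliberately not protected by any lock


def worker(steps):
    """Increment the shared counter, yielding after each read-modify-write."""
    global counter
    for _ in range(steps):
        value = counter       # read
        counter = value + 1   # write; no task switch can occur before this
        yield                 # the only point where another task may run


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)  # still has work: put it back in the queue
        except StopIteration:
            pass                # finished: drop it


run_round_robin([worker(1000), worker(1000), worker(1000)])
print(counter)  # 3000
```

Because a task can only be switched out at its `yield`, the read and the write always execute as one unit. With preemptive threads the same unguarded update could lose increments.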

Since fibers are cooperatively scheduled, they provide an ideal base for implementing coroutines above.[23] However, system support for fibers is often lacking compared to that for threads.

C

In order to implement general-purpose coroutines, a second call stack must be obtained, which is a feature not directly supported by the C language. A reliable (albeit platform-specific) way to achieve this is to use a small amount of inline assembly to explicitly manipulate the stack pointer during initial creation of the coroutine. This is the approach recommended by Tom Duff in a discussion on its relative merits vs. the method used by Protothreads.[24] On platforms which provide the POSIX sigaltstack system call, a second call stack can be obtained by calling a springboard function from within a signal handler[25][26] to achieve the same goal in portable C, at the cost of some extra complexity. C libraries complying with POSIX or the Single Unix Specification (SUSv3) provided such routines as getcontext, setcontext, makecontext and swapcontext, but these functions were declared obsolete in POSIX.1-2008.[27]

Once a second call stack has been obtained with one of the methods listed above, the setjmp and longjmp functions in the standard C library can then be used to implement the switches between coroutines. These functions save and restore, respectively, the stack pointer, program counter, callee-saved registers, and any other internal state as required by the ABI, such that returning to a coroutine after having yielded restores all the state that would be restored upon returning from a function call. Minimalist implementations, which do not piggyback off the setjmp and longjmp functions, may achieve the same result via a small block of inline assembly which swaps merely the stack pointer and program counter, and clobbers all other registers. This can be significantly faster, as setjmp and longjmp must conservatively store all registers which may be in use according to the ABI, whereas the clobber method allows the compiler to store (by spilling to the stack) only what it knows is actually in use.

Due to the lack of direct language support, many authors have written their own libraries for coroutines which hide the above details. Russ Cox's libtask library[28] is a good example of this genre. It uses the context functions if they are provided by the native C library; otherwise it provides its own implementations for ARM, PowerPC, Sparc, and x86. Other notable implementations include libpcl,[29] coro,[30] lthread,[31] libCoroutine,[32] libconcurrency,[33] libcoro,[34] ribs2,[35] libdill,[36] libaco,[37] and libco.[26]

In addition to the general approach above, several attempts have been made to approximate coroutines in C with combinations of subroutines and macros. Simon Tatham's contribution,[38] based on Duff's device, is a notable example of the genre, and is the basis for Protothreads and similar implementations.[39] In addition to Duff's objections,[24] Tatham's own comments provide a frank evaluation of the limitations of this approach: "As far as I know, this is the worst piece of C hackery ever seen in serious production code."[38] The main shortcomings of this approximation are that, in not maintaining a separate stack frame for each coroutine, local variables are not preserved across yields from the function, it is not possible to have multiple entries to the function, and control can only be yielded from the top-level routine.[24]

C++

  • C++20 introduced standardized coroutines as stackless functions that can be suspended in the middle of execution and resumed at a later point. The suspended state of a coroutine is stored on the heap.[40] Implementation of this standard is ongoing, with the G++ and MSVC compilers currently fully supporting standard coroutines in recent versions.[41]
  • concurrencpp - a C++20 library which provides third-party support for C++20 coroutines, in the form of awaitable-tasks and executors that run them.
  • Boost.Coroutine - created by Oliver Kowalke, is the official released portable coroutine library of boost since version 1.53. The library relies on Boost.Context and supports ARM, MIPS, PowerPC, SPARC and X86 on POSIX, Mac OS X and Windows.
  • Boost.Coroutine2 - also created by Oliver Kowalke, is a modernized portable coroutine library since boost version 1.59. It takes advantage of C++11 features, but removes the support for symmetric coroutines.
  • Mordor - In 2010, Mozy open sourced a C++ library implementing coroutines, with an emphasis on using them to abstract asynchronous I/O into a more familiar sequential model.[42]
  • CO2 - stackless coroutine based on C++ preprocessor tricks, providing await/yield emulation.
  • ScummVM - The ScummVM project implements a light-weight version of stackless coroutines based on Simon Tatham's article.
  • tonbit::coroutine - C++11 single .h asymmetric coroutine implementation via ucontext / fiber
  • Coroutines landed in Clang in May 2017, with libc++ implementation ongoing.[43]
  • elle by Docker
  • oatpp-coroutines - stackless coroutines with scheduling designed for high-concurrency level I/O operations. Used in the 5-million WebSocket connections experiment by Oat++. Part of the Oat++ web framework.

C#

C# 2.0 added semi-coroutine (generator) functionality through the iterator pattern and yield keyword.[44][45] C# 5.0 includes await syntax support. In addition:

  • The MindTouch Dream REST framework provides an implementation of coroutines based on the C# 2.0 iterator pattern.
  • The Caliburn (Archived 2013-01-19 at archive.today) screen patterns framework for WPF uses C# 2.0 iterators to ease UI programming, particularly in asynchronous scenarios.
  • The Power Threading Library (Archived 2010-03-24 at the Wayback Machine) by Jeffrey Richter implements an AsyncEnumerator that provides simplified Asynchronous Programming Model using iterator-based coroutines.
  • The Unity game engine implements coroutines.
  • The Servelat Pieces project by Yevhen Bobrov provides transparent asynchrony for Silverlight WCF services and ability to asynchronously call any synchronous method. The implementation is based on Caliburn's Coroutines iterator and C# iterator blocks.
  • StreamThreads is an open-source, light-weight C# co-routine library based on iterator extension methods. It supports error handling and return values.

Clojure

Cloroutine is a third-party library providing support for stackless coroutines in Clojure. It's implemented as a macro, statically splitting an arbitrary code block on arbitrary var calls and emitting the coroutine as a stateful function.

D

D implements coroutines as its standard library class Fiber. A generator makes it trivial to expose a fiber function as an input range, making any fiber compatible with existing range algorithms.

Go

Go has a built-in concept of "goroutines", which are lightweight, independent processes managed by the Go runtime. A new goroutine can be started using the "go" keyword. Each goroutine has a variable-size stack which can be expanded as needed. Goroutines generally communicate using Go's built-in channels.[46][47][48][49] However, goroutines are not coroutines (for instance, local data does not persist between successive calls).[50]

Java

There are several implementations for coroutines in Java. Despite the constraints imposed by Java's abstractions, the JVM does not preclude the possibility.[51] There are four general methods used, but two break bytecode portability among standards-compliant JVMs.

  • Modified JVMs. It is possible to build a patched JVM to support coroutines more natively. The Da Vinci JVM has had patches created.[52]
  • Modified bytecode. Coroutine functionality is possible by rewriting regular Java bytecode, either on the fly or at compile time. Toolkits include Javaflow, Java Coroutines, and Coroutines.
  • Platform-specific JNI mechanisms. These use JNI methods implemented in the OS or C libraries to provide the functionality to the JVM.[citation needed]
  • Thread abstractions. Coroutine libraries which are implemented using threads may be heavyweight, though performance will vary based on the JVM's thread implementation.

JavaScript

Since ECMAScript 2015, JavaScript has support for generators, which are a special case of coroutines.[53]

Kotlin

Kotlin implements coroutines as part of a first-party library.

import kotlinx.coroutines.*

fun main() = runBlocking {
    launch {
        delay(1000L)
        print("Hello from within coroutine!")
    }

    print("Hello world!")
}

Lua

Lua has supported first-class stackful asymmetric coroutines since version 5.0 (2003),[54] in the standard library coroutine.[55][56]

Modula-2

Modula-2 as defined by Wirth implements coroutines as part of the standard SYSTEM library.

The procedure NEWPROCESS() fills in a context given a code block and space for a stack as parameters, and the procedure TRANSFER() transfers control to a coroutine given the coroutine's context as its parameter.

Mono

The Mono Common Language Runtime has support for continuations,[57] from which coroutines can be built.

.NET Framework

During the development of the .NET Framework 2.0, Microsoft extended the design of the Common Language Runtime (CLR) hosting APIs to handle fiber-based scheduling with an eye towards its use in fiber-mode for SQL server.[58] Before release, support for the task switching hook ICLRTask::SwitchOut was removed due to time constraints.[59] Consequently, the use of the fiber API to switch tasks is currently not a viable option in the .NET Framework.

OCaml

OCaml supports coroutines through its Thread module.[60] These coroutines provide concurrency without parallelism, and are scheduled preemptively on a single operating system thread. Since OCaml 5.0, green threads are also available, provided by different modules.

Perl

Coroutines are natively implemented in all Raku backends.[61]

PHP

Python

Python has support for coroutines using the asyncio.create_task function.[62]

  • Python 2.5 implements better support for coroutine-like functionality, based on extended generators (PEP 342)
  • Python 3.3 improves this ability, by supporting delegating to a subgenerator (PEP 380)
  • Python 3.4 introduces a comprehensive asynchronous I/O framework as standardized in PEP 3156, which includes coroutines that leverage subgenerator delegation
  • Python 3.5 introduces explicit support for coroutines with async/await syntax (PEP 492).
  • Since Python 3.7, async/await have become reserved keywords.[63]
  • Eventlet
  • Greenlet
  • gevent
  • Stackless Python
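
import asyncio
import time

async def say_after(delay, what):
    # Sleep without blocking the event loop, then print the message.
    await asyncio.sleep(delay)
    print(what)

async def main():
    task1 = asyncio.create_task(say_after(1, "hello"))
    task2 = asyncio.create_task(say_after(2, "world"))

    print(f"started at {time.strftime('%X')}")

    # Wait until both tasks are completed (should take around 2 seconds)
    await task1
    await task2

    print(f"finished at {time.strftime('%X')}")

asyncio.run(main())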

Racket

Racket provides native continuations, with a trivial implementation of coroutines (by S. De Gabrielle) provided in the official package catalog.

Ruby

Scheme

Since Scheme provides full support for continuations, implementing coroutines is nearly trivial, requiring only that a queue of continuations be maintained.

Smalltalk

Since, in most Smalltalk environments, the execution stack is a first-class citizen, coroutines can be implemented without additional library or VM support.

Tcl

Since version 8.6, Tcl supports coroutines in the core language.[65]

Vala

Vala implements native support for coroutines. They are designed to be used with a GTK Main Loop, but can be used alone if care is taken to ensure that the end callback will never have to be called before doing, at least, one yield.

Assembly languages

Machine-dependent assembly languages often provide direct methods for coroutine execution. For example, in MACRO-11, the assembly language of the PDP-11 family of minicomputers, the "classic" coroutine switch is effected by the instruction "JSR PC,@(SP)+", which jumps to the address popped from the stack and pushes the current (i.e., that of the next) instruction address onto the stack. On VAXen (in VAX MACRO) the comparable instruction is "JSB @(SP)+". Even on a Motorola 6809 there is the instruction "JSR [,S++]"; note the "++", as 2 bytes (of address) are popped from the stack. This instruction is much used in the (standard) 'monitor' Assist 09.

See also

  • Async/await
  • Pipeline, a kind of coroutine used for communicating between programs[66]
  • Protothreads, a stackless lightweight thread implementation using a coroutine-like mechanism

from Grokipedia
A coroutine is a computer program component that generalizes subroutines by allowing execution to be suspended and later resumed from the point of suspension, facilitating cooperative multitasking without preemption. The term was coined by Melvin Conway in 1958 and first detailed in his 1963 paper, where coroutines are defined as autonomous subroutines operating at the same level, communicating through discrete information along fixed, one-way paths without a central master program.

Coroutines enable modular program designs, as exemplified in Conway's work on a one-pass COBOL compiler, where they allow separable modules to transfer control via read/write operations, supporting efficient compilation through flexible pass configurations. They differ from traditional subroutines by maintaining local state across suspensions and resumptions, providing expressive power equivalent to one-shot continuations while being simpler to implement in procedural languages.

Classified into symmetric and asymmetric types, coroutines support varied control flows; asymmetric coroutines, common in modern implementations, establish a hierarchical relationship where one coroutine resumes a subordinate, as seen in Lua's facilities for first-class coroutines with separate stacks and yield/resume operations. In Lua, they enable applications like generators for iterative data production and cooperative scheduling for multitasking.

Contemporary languages integrate coroutines for asynchronous programming: Python uses them for generators, C++20 exposes primitives like co_await and coroutine lambdas for suspendible functions with explicit state management, and Kotlin employs them for non-blocking concurrency. These features underscore coroutines' role in simplifying complex control structures, from compilers to high-performance async systems.

Fundamentals

Definition

A coroutine is a generalization of a subroutine that permits multiple entry points for suspending and resuming execution, allowing cooperative multitasking without blocking the caller. This enables a program component to pause at designated points and later continue from exactly where it left off, treating coroutines as peer-level routines rather than hierarchical calls.

Key characteristics of coroutines include non-preemptive scheduling, where execution yields control explicitly rather than through interrupts; explicit control transfer managed via yield operations (to suspend and return values) and resume operations (to restart and pass values); and bidirectional communication, allowing data exchange between the coroutine and its invoker during these transfers. The term "coroutine" was coined by Conway in 1958 and first detailed in his 1963 paper on the design of a separable transition-diagram compiler.

In formal distinction from processes, coroutines operate within the same address space as the main program, sharing memory and resources, which makes them lighter-weight with minimal overhead for suspension and resumption compared to the separate address spaces and heavier context switches of processes.

The following pseudocode illustrates a basic asymmetric coroutine using yield and resume, where a producer coroutine yields values to its caller:

coroutine producer = create(function()
    local i = 1
    while i <= 3 do
        yield(i)  -- Suspends and sends i to the resumer
        i = i + 1
    end
    return "done"
end)

-- In the main program:
resume(producer)  -- Starts execution, receives 1 from first yield
resume(producer)  -- Resumes, receives 2 from second yield
resume(producer)  -- Resumes, receives 3 from third yield
resume(producer)  -- Resumes, the loop ends, receives "done" from return

This example demonstrates suspension at each yield and resumption with value passing, as implemented in languages like Lua.
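
For comparison, roughly the same producer can be written as a Python generator. This is a sketch under the assumption that a generator is an acceptable stand-in for an asymmetric coroutine; the `resume` helper is hypothetical and exists only to mirror the pseudocode's vocabulary.

```python
def producer():
    i = 1
    while i <= 3:
        yield i       # suspend and hand i to whoever resumed us
        i += 1
    return "done"     # in Python this surfaces as StopIteration.value


def resume(gen):
    """Hypothetical helper: resume `gen` and report (finished, value)."""
    try:
        return (False, next(gen))
    except StopIteration as stop:
        return (True, stop.value)


p = producer()
results = [resume(p) for _ in range(4)]
print(results)  # [(False, 1), (False, 2), (False, 3), (True, 'done')]
```

In this Python rendering the first three resumptions deliver the yielded values, and a fourth resumption is what runs the coroutine to completion and delivers the return value.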

Historical Development

The concept of coroutines originated with Melvin Conway's 1963 paper, where he proposed them as cooperative subroutines that could suspend and resume execution to facilitate modular compiler design, particularly for handling separable transition diagrams in a COBOL compiler. In the mid-1960s, coroutines saw early practical implementations in simulation languages, most notably Simula, developed by Ole-Johan Dahl and Kristen Nygaard at the Norwegian Computing Center, where they enabled efficient modeling of concurrent processes in discrete event simulations.

By the 1970s, coroutines gained broader adoption in systems and general-purpose programming languages, such as BLISS, a typeless language for systems programming introduced at Carnegie Mellon University, which included explicit coroutine calls to support flexible control flow in operating system development. Similarly, the Icon programming language, conceived in 1977 by Ralph and Madge Griswold, popularized coroutines through its "co-expressions" feature, which allowed goal-directed execution and backtracking for string processing and non-numerical applications.

During the 1980s and 1990s, coroutines' prominence waned as hardware advances, including the rise of affordable multiprocessor systems, favored threads for concurrency, which offered better alignment with parallel architectures and kernel-level support in operating systems like Unix and Windows.

Interest in coroutines revived in the early 2000s with their integration into scripting languages for lightweight concurrency; Lua 5.0, released in 2003, introduced full asymmetric coroutines to enhance scripting and embedded applications without threads. This trend continued in 2015 with Python 3.5, which added native support for coroutines via the async/await syntax through PEP 492, facilitating asynchronous programming in settings such as web servers.

Types and Variations

Stackful and Stackless Coroutines

Coroutines can be implemented using either stackful or stackless architectures, which differ fundamentally in how they manage execution context and suspension. Stackful coroutines allocate a separate call stack for each instance, enabling the preservation of the entire call chain during suspension and resumption. This allows a coroutine to suspend execution from deep within nested function calls and resume precisely at that point, mimicking the behavior of cooperative multitasking without relying on operating system threads. In contrast, stackless coroutines do not maintain independent stacks; instead, they rely on the program's single call stack and suspend only at explicit yield points, typically at the top level of the coroutine function. Suspension in this model transforms the coroutine into a state machine or continuation, where local state is explicitly saved and restored by the compiler or runtime, limiting resumption to tail-call-like positions rather than arbitrary nested depths. Technically, stackful coroutines operate with their own stack pointer, facilitating full switching that includes all nested , which supports recursive or deeply nested invocations without additional machinery. Stackless coroutines, however, achieve state preservation through compiler-assisted rewrites that compile the coroutine body into a series of states, often using heap-allocated for , which avoids the overhead of stack duplication but requires language-level support for such transformations. The trade-offs between these approaches center on flexibility versus efficiency. Stackful coroutines provide greater expressiveness for complex control flows, such as recursive coroutines or simulations requiring nested suspensions, but incur higher memory and creation overhead due to per-instance stacks—typically larger frames that can consume more resources in multi-coroutine scenarios. 
Stackless coroutines are lighter-weight, with faster context switching (up to 3.5 times quicker in some runtimes) and smaller memory footprints, making them suitable for high-concurrency environments, though they demand careful design to handle state across yields and may require auxiliary structures for nested behaviors. These stack management choices are orthogonal to whether coroutines employ symmetric or asymmetric control transfer.
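The restriction on nested suspension can be seen in a small sketch. Python generators are used here as a stand-in for stackless coroutines in general; the function names `broken_walk` and `walk` are illustrative and not taken from any library. A nested call cannot suspend the outer coroutine on its own, so every level must delegate explicitly.

```python
def broken_walk(tree):
    """Attempt to yield the leaves of a nested list (does not work)."""
    for node in tree:
        if isinstance(node, list):
            # This call only creates a new generator object and discards it.
            # The inner yields never reach our caller.
            broken_walk(node)
        else:
            yield node


def walk(tree):
    """Yield the leaves of a nested list, depth first."""
    for node in tree:
        if isinstance(node, list):
            # Each level must explicitly delegate so that inner
            # suspensions propagate outward one frame at a time.
            yield from walk(node)
        else:
            yield node


nested = [1, [2, [3, 4]], 5]
broken_leaves = list(broken_walk(nested))
leaves = list(walk(nested))
print(broken_leaves)  # [1, 5]
print(leaves)         # [1, 2, 3, 4, 5]
```

A stackful coroutine would not need the explicit delegation, because a suspension inside the nested call would capture the whole call chain.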

Symmetric and Asymmetric Coroutines

Symmetric coroutines treat all instances as peers, allowing any coroutine to directly resume any other through a single control-transfer operation, which facilitates mutual suspensions and complex interactions among equal entities. This peer-to-peer model enables flexible collaboration, as seen in applications requiring independent units to cooperate without a fixed hierarchy, such as simulation models where multiple entities advance in tandem through cooperative multitasking. For instance, in full-duplex communication channels, symmetric coroutines support bidirectional data exchange by allowing each party to suspend and resume the other seamlessly, mimicking concurrent processes. In contrast, asymmetric coroutines impose a hierarchical structure, where a master coroutine resumes subordinate "slave" coroutines using a resume operation, but the subordinates can only yield control back to their invoker via a suspend operation, preventing direct resumption of the master. This directed control flow simplifies management in scenarios with clear caller-callee relationships, making it ideal for structured workflows like producer-consumer patterns, where one coroutine generates data and yields it to a consumer that processes and resumes the producer only indirectly. An example is iterator-like behaviors, such as traversing a binary tree, where the master iterates by resuming the subordinate coroutine that yields successive nodes without the ability to control the master directly. The primary advantage of symmetric coroutines lies in their support for egalitarian interactions, enabling intricate mutual dependencies that are harder to express asymmetrically without additional machinery, though asymmetric models offer simpler, more predictable directed execution suitable for sequential or pipelined tasks. 
Both models provide equivalent expressive power, as asymmetric coroutines can simulate symmetric ones through layered invocations, but symmetric designs promote cleaner code for peer collaborations. Stackful implementations often support symmetry more naturally due to their ability to handle direct peer transfers.
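One way to picture how asymmetric coroutines can simulate symmetric ones is a small dispatcher: each coroutine yields the name of the peer it wants to run next, and the dispatcher performs the transfer. The following is a minimal sketch using Python generators; the names `ping`, `pong`, and `run` are hypothetical.

```python
def ping(log):
    for i in range(3):
        log.append(f"ping {i}")
        yield "pong"          # name the peer that should run next


def pong(log):
    while True:
        log.append("pong")
        yield "ping"          # transfer control back to the other peer


def run(start, coroutines):
    """Dispatcher: resume the named coroutine, then whichever one it names."""
    current = start
    while True:
        try:
            current = next(coroutines[current])
        except StopIteration:
            return            # a peer finished, so the exchange ends


log = []
run("ping", {"ping": ping(log), "pong": pong(log)})
print(log)
# ['ping 0', 'pong', 'ping 1', 'pong', 'ping 2', 'pong']
```

From the point of view of `ping` and `pong`, control passes directly between peers; the hierarchy exists only inside `run`.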

Comparisons

With Subroutines

Traditional subroutines, also known as procedures or functions, are fundamental program components characterized by a single entry point and a single exit point, executing from start to completion without intermediate suspension. Upon invocation, a subroutine blocks the calling program until it returns control, maintaining no persistent state between separate calls beyond any explicitly passed parameters or global variables. Coroutines generalize and extend subroutines by introducing multiple entry and exit points, facilitated through operations like yield and resume, which allow execution to pause at designated points and resume later without losing the subroutine's internal state. This suspension capability enables coroutines to transfer control voluntarily to another coroutine, rather than always returning to the original caller, thus supporting symmetric interactions among multiple program units. The key distinction lies in their control flow: while subroutines enforce a blocking, hierarchical caller-callee relationship that completes fully before proceeding, coroutines permit cooperative yielding, allowing non-blocking pauses that enhance efficiency in sequential multitasking scenarios. Historically, coroutines were conceived as a means to generalize subroutines for multitasking applications without relying on operating system intervention, as introduced by Melvin Conway in his 1963 work on separable compilers.
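The idea of multiple exit and entry points with persistent local state can be illustrated with a short sketch. A Python generator is assumed here as the coroutine mechanism, and `conversation` is an illustrative name.

```python
def conversation():
    turns = 0                 # local state survives every suspension
    turns += 1
    yield "hello"             # exit point 1; the next resume enters here
    turns += 1
    yield "how are you?"      # exit point 2
    turns += 1
    yield f"goodbye after {turns} turns"   # exit point 3


talk = conversation()
first = next(talk)
second = next(talk)
third = next(talk)
print(first, "|", second, "|", third)
# hello | how are you? | goodbye after 3 turns
```

An ordinary function would have to return all three strings at once, or keep `turns` in external state between calls.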

With Threads

Threads represent operating system-managed units of execution that operate preemptively, each allocated a separate stack and capable of true parallelism across multiple processor cores. This preemptive scheduling, handled by the kernel, allows threads to interleave execution automatically but introduces significant overhead from context switches involving register saves, stack manipulations, and potential cache invalidations. In contrast, coroutines are lightweight, user-space constructs that rely on cooperative scheduling, where execution yields control explicitly at defined points, often sharing a single stack in stackless implementations or using minimal, segmented stacks in stackful variants to avoid the full resource allocation of threads. This design enables efficient simulation of concurrency within a single thread, bypassing kernel involvement and reducing synchronization complexities like locks, though it forfeits automatic parallelism. Stackful coroutines more closely mimic thread behavior by supporting full subroutine calls and returns across yield points. The primary distinction in resource usage lies in overhead: thread context switches typically incur costs in the microseconds range due to kernel transitions, whereas coroutine switches operate at user level in nanoseconds, often below 50 ns, making coroutines far cheaper for frequent yielding scenarios but unsuitable for inherent parallelism. Threads demand more memory per unit—often kilobytes to megabytes for stacks—limiting scalability to thousands at most, while coroutines support tens or hundreds of thousands with minimal footprint. Coroutines excel in I/O-bound workloads, where tasks spend time waiting on external events like network or disk operations, allowing efficient multiplexing of many concurrent activities without blocking the underlying thread. 
Threads, however, are preferable for CPU-bound tasks requiring parallel computation across cores to leverage hardware concurrency and maximize throughput.
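As a rough illustration of many coroutines sharing one thread, the sketch below starts 10,000 coroutines with Python's `asyncio` and records which OS thread each one ran on. The names `worker` and `main` are illustrative, and the example says nothing about absolute switch costs.

```python
import asyncio
import threading


async def worker(i, seen_threads):
    await asyncio.sleep(0)                 # explicit yield point: let others run
    seen_threads.add(threading.get_ident())
    return i * i


async def main(n):
    seen_threads = set()
    results = await asyncio.gather(
        *(worker(i, seen_threads) for i in range(n))
    )
    return results, seen_threads


results, seen_threads = asyncio.run(main(10_000))
print(len(results), len(seen_threads))     # 10000 1
```

All coroutines are interleaved cooperatively on the thread that runs the event loop, so no locks are needed around `seen_threads`.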

With Generators

Generators represent a common implementation of asymmetric, stackless coroutines, where execution yields control and values unidirectionally to the caller, suspending the routine until resumption via iteration or explicit next calls. In this model, the generator function uses a yield statement to produce a sequence of values on demand, maintaining its local state across suspensions without requiring a separate stack, which makes it memory-efficient for iterative computations. For instance, in Python, introduced via PEP 255 in version 2.2, a generator function like def count_up_to(n): for i in range(n): yield i allows lazy evaluation, returning one value at a time when iterated over. Full coroutines build upon generators by enabling bidirectional communication, allowing the caller to send values back into the suspended routine upon resumption, thus supporting more interactive control flows beyond simple value production. This extension addresses key limitations of plain generators, which cannot accept input values when resumed—resumption simply continues from the yield point without altering the yielded expression's result. In Python, PEP 342 enhanced generators to function as coroutines by redefining yield as an expression that can receive sent values and introducing the send(value) method; for example, a coroutine might use result = yield value to both output value and input result from the caller. The introduction of generators via the yield keyword in languages like Python popularized coroutine-like abstractions in mainstream programming, paving the way for their use in asynchronous and event-driven patterns while remaining limited to unidirectional yields in their basic form. Asymmetric coroutines are often implemented as these enhanced generators, providing a lightweight mechanism for cooperative multitasking.
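The difference between a plain generator and a PEP 342 style coroutine can be sketched as follows. `count_up_to` is the generator mentioned above, written out in full; `running_average` is a hypothetical example of receiving values through `send()`.

```python
def count_up_to(n):
    """Plain generator: values flow only from the generator to the caller."""
    for i in range(n):
        yield i


def running_average():
    """Coroutine: each resume delivers a value and returns the new average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average      # emit the average, then receive a value
        total += value
        count += 1
        average = total / count


numbers = list(count_up_to(3))     # [0, 1, 2]

avg = running_average()
next(avg)                          # prime: advance to the first yield
a1 = avg.send(10)                  # 10.0
a2 = avg.send(20)                  # 15.0
print(numbers, a1, a2)
```

The initial `next(avg)` is needed because a value can only be sent to a generator that is already suspended at a `yield` expression.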

With Mutual Recursion

Mutual recursion involves two or more functions that invoke one another, potentially leading to deep call nesting and stack overflow in the absence of tail-call optimization, as each recursive call allocates a new stack frame. Coroutines address this limitation by allowing suspension at explicit points across recursive boundaries, thereby preserving the execution state without requiring a full stack unwind upon resumption. This mechanism simulates mutual recursion efficiently, avoiding the unbounded stack growth that can occur in traditional recursive implementations, such as in producer-consumer scenarios where alternating calls between functions would otherwise exhaust stack resources quickly.

A key application of coroutines in this context is modeling state machines or parsers, where recursive interactions alternate between components without risking overflow; for instance, a coroutine-based parser can suspend after processing input from one module and resume in another, maintaining interleaved state across multiple "recursive" levels. Symmetric coroutines particularly facilitate such mutual interactions by enabling bidirectional control transfer. Compared to plain recursion, coroutines offer explicit yield points at which the computation can be suspended and checked from outside (for example, to stop a loop that fails to terminate), while also promoting cooperation in multitasking environments through controlled resumption. This explicit control enhances reliability in interdependent recursive flows, as demonstrated in coroutine designs that use snapshots to capture and restore state without stack dependencies.
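A minimal sketch of this idea, assuming Python generators and a hypothetical `trampoline` driver: each function hands control to its peer by yielding a new generator instead of calling it, so the Python call stack stays at constant depth.

```python
def is_even(n):
    if n == 0:
        return True            # delivered to the driver via StopIteration
    yield is_odd(n - 1)        # hand control to the peer instead of calling it


def is_odd(n):
    if n == 0:
        return False
    yield is_even(n - 1)


def trampoline(gen):
    """Run a chain of hand-offs without growing the call stack."""
    while True:
        try:
            gen = next(gen)    # run until the current step hands off
        except StopIteration as stop:
            return stop.value  # the final step returned a result


small = trampoline(is_even(7))        # False
large = trampoline(is_even(100_000))  # True
print(small, large)
```

The depth of 100,000 is far beyond Python's default recursion limit, so the same logic written as direct mutual recursion would raise `RecursionError`. This sketch only covers hand-offs in tail position, where the caller has nothing left to do after the transfer.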

Applications

Common Uses

Coroutines facilitate cooperative multitasking in event-driven systems, such as GUI frameworks, where they enable routines to yield control voluntarily at designated points, allowing the event loop to process user input or other events without blocking the main thread. This approach keeps interfaces responsive by suspending execution during idle periods, such as awaiting mouse clicks or keyboard events, and resuming when the awaited event arrives.

In producer-consumer patterns, coroutines support efficient data streaming by allowing one coroutine to yield produced data items while another suspends to await and process them, enabling non-blocking coordination without shared mutable state or locks. Channels implemented via coroutines optimize this interaction, using buffered or rendezvous mechanisms to handle variable production rates, as seen in message-passing systems where sends and receives synchronize cooperatively.

For simulation and modeling, coroutines enable alternating execution in discrete event simulations, such as modeling network traffic or system behaviors, by suspending at event boundaries and resuming based on simulated time advances. Stackless coroutines, in particular, provide a portable way to implement process-oriented models, bridging the gap between high-level abstractions and efficient event-driven execution without requiring full thread stacks.

Coroutines aid in parsing and state machines by suspending at token boundaries in compilers or interpreters, allowing modular handling of nested or interleaved syntactic constructs without recursive descent or explicit stack management. This coroutine-based approach supports one-pass parsing of complex grammars, such as those involving comments, macros, or conditional compilation, where multiple parsing routines cooperate by yielding control to process overlapping syntaxes.
In error handling, coroutines allow try-catch mechanisms to span suspension points without full stack unwinding, propagating exceptions across yields and resumptions to maintain structured control flow in suspended computations. This preserves local state during error recovery, avoiding the overhead of restarting from checkpoints in long-running or interleaved routines. Modern extensions of coroutines often integrate with asynchronous I/O for non-blocking operations, though their core utility remains in these general-purpose patterns.
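The error-handling point can be sketched with a Python generator, whose `throw()` method raises an exception at the suspension point. The name `resilient_worker` is illustrative. The `try` block spans the `yield`, and local state survives the recovery.

```python
def resilient_worker(log):
    processed = 0                          # local state kept across errors
    while True:
        try:
            item = yield processed         # suspend inside the try block
            processed += 1
            log.append(f"processed {item}")
        except ValueError as exc:
            # The exception was injected at the yield; the coroutine
            # recovers and keeps its counter.
            log.append(f"recovered from {exc}")


log = []
w = resilient_worker(log)
next(w)                                    # advance to the first yield
w.send("a")
w.throw(ValueError("bad input"))           # raise at the suspension point
count = w.send("b")
print(count, log)
# 2 ['processed a', 'recovered from bad input', 'processed b']
```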

Relation to Asynchronous Programming

Coroutines form the foundational mechanism for modern asynchronous programming paradigms, particularly through the async/await syntax, which serves as syntactic sugar over coroutine-like state machines or promise chains. In this model, an async function desugars into a state machine that suspends execution (yielding control) at await points, allowing non-blocking operations while maintaining readable, sequential code structure. This approach enables efficient handling of I/O-bound tasks, such as in web servers where multiple concurrent requests can be processed without callback hell or thread proliferation; for instance, a single-threaded event loop can manage thousands of suspended coroutines awaiting network responses, improving scalability over traditional blocking calls. The evolution of coroutines in asynchronous programming traces back to Lua's introduction of lightweight, cooperative coroutines in version 5.0 (2003), which facilitated non-blocking I/O in scripting environments like game engines. This influenced later adoptions, such as Python's asyncio module in 3.4 (2014) and native async/await coroutines in 3.5 (2015) via PEP 492, enabling structured concurrency in high-level applications. JavaScript followed with async/await in ECMAScript 2017, building on ES6 generators to integrate promise-based asynchrony seamlessly into browser and server-side code. Despite these advances, challenges persist in debugging coroutine-based async code, particularly tracing execution across suspended states where control flow appears non-linear and thread switches obscure stack traces. Coroutines also relate closely to green threads or fibers in runtimes like Go, where goroutines provide user-space scheduling akin to stackful coroutines, multiplexing many lightweight tasks onto fewer OS threads for better resource efficiency. 
In contemporary systems, coroutines underpin scalable event-driven architectures in cloud computing and microservices, allowing services to handle high-throughput, reactive workloads—such as real-time data processing—by suspending on events like API calls or message queues without blocking underlying infrastructure. As of August 2025, research has identified security risks in C++ coroutines, including vulnerability to code-reuse attacks despite control flow obfuscation, prompting ongoing improvements in secure implementations.
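To make the link between `async`/`await` and coroutines concrete, the sketch below drives a Python `async` function by hand with `send()`, without any event loop. The `Suspend` class is a hypothetical awaitable written for this example; a real event loop would resume the coroutine when I/O completes rather than immediately.

```python
class Suspend:
    """Hypothetical awaitable: suspends once, resumes with the sent value."""

    def __init__(self, label):
        self.label = label

    def __await__(self):
        received = yield self.label   # suspend; the driver sees the label
        return received               # becomes the value of the await


async def fetch_both():
    a = await Suspend("request A")
    b = await Suspend("request B")
    return a + b


coro = fetch_both()
steps = [coro.send(None)]             # run to the first await
steps.append(coro.send(1))            # resume with 1, run to the second await
try:
    coro.send(2)                      # resume with 2, run to completion
except StopIteration as done:
    result = done.value

print(steps, result)                  # ['request A', 'request B'] 3
```

Each `await` is a suspension point, and the driver plays the role that an event loop such as `asyncio` normally fills.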

Implementations

Native Language Support

Several programming languages provide built-in support for coroutines through core syntax keywords or primitives, enabling developers to implement cooperative multitasking without external libraries. This native integration often simplifies asynchronous programming by reducing boilerplate code compared to library-based approaches. Lua has supported first-class, stackful, and asymmetric coroutines since version 5.0, released in 2003, via functions like coroutine.create, coroutine.yield, and coroutine.resume in its standard library. These primitives allow pausing and resuming execution at yield points, making Lua suitable for scripting in resource-constrained environments, such as game development; for instance, Lua coroutines facilitate non-blocking operations in World of Warcraft add-ons. Python introduced generators with the yield keyword in version 2.2 (2001), which provide a form of one-way coroutine for lazy iteration and memory-efficient data production. Bidirectional coroutines, supporting both yielding and receiving values, arrived in version 2.5 (2006) with the send() method of PEP 342, and native coroutines with dedicated async and await syntax followed in version 3.5 (2015) via PEP 492, designed to work with the asyncio module for concurrent programming. Kotlin has integrated coroutine support directly into its language features since version 1.3 (2018), using suspend functions that can pause execution without blocking threads, resulting in a stackless implementation for lightweight concurrency. Go has offered goroutines as lightweight, managed threads since its initial release in 2009, with channel-based communication and the select statement providing coroutine-like multiplexing for non-blocking I/O operations. Ruby introduced Fibers in version 1.9 (2009), which act as stackful coroutines for implementing cooperative concurrency, allowing code blocks to pause and resume for handling blocking operations without full threading overhead. 
C++ introduced standardized coroutines in C++20 (2020), providing stackless coroutines through the core language keywords co_await, co_yield, and co_return. These enable functions to suspend execution and resume later, facilitating efficient asynchronous and non-blocking programming without requiring separate library frameworks.

Library-Based Implementations

In languages lacking native coroutine support, libraries provide mechanisms for implementing coroutines through manual context switching and stack management. In C, the ucontext.h header offers functions like makecontext() and swapcontext() to create and switch between execution contexts, enabling stackful coroutines where each coroutine maintains its own stack allocated manually by the programmer. These functions save and restore the processor state, allowing suspension and resumption, but require explicit stack allocation to avoid overflows or underflows. Libraries such as libtask build on similar principles, providing a higher-level API for coroutine creation and scheduling on Unix-like systems, with portable assembly for context switching across architectures like x86 and ARM. For C++, the Boost.Coroutine library (now in version 2) implements stackful coroutines using assembly-based context switching, allowing users to define pull- and push-type coroutines without native language keywords. It handles stack allocation and deallocation automatically, supporting symmetric and asymmetric coroutine models, and was widely used prior to C++20 standardization. While C++20 introduces native coroutines via keywords like co_await and co_yield, their implementation relies on standard library components such as promise_type and awaitable objects, often integrated with libraries like Boost.Asio for asynchronous I/O. These library extensions address pre-standardization gaps by providing portable, exception-safe coroutine traits. Java, without built-in coroutines, uses libraries like Kilim, which employs bytecode instrumentation at compile-time to transform methods into lightweight tasks capable of suspending and resuming like coroutines. Kilim's actors model integrates coroutine-style suspension for non-blocking operations, enabling efficient concurrency without OS threads. 
Similarly, the Quasar library introduces fibers as user-mode threads that support coroutine-like pausing via instrumented continuations, allowing millions of concurrent tasks with minimal overhead compared to traditional threads. In JavaScript, prior to ES2017's async/await, libraries such as co leveraged ES6 generators to simulate coroutines in Node.js, wrapping generator functions to handle yield-based suspension and automatic continuation. The fibers library extends V8 to support true stackful coroutines, enabling synchronous-looking code for asynchronous tasks through cooperative multitasking without callbacks. For .NET and C#, the async/await pattern, introduced in .NET 4.5, compiles to state machines that implement stackless coroutines, where the compiler generates a struct to track execution state across await points, suspending without full stack unwinding. This library-based approach, part of the Task Parallel Library, transforms asynchronous methods into resumable state machines for scalable I/O-bound operations. Other languages employ specialized libraries for coroutine-like behavior. Clojure's core.async provides channels for asynchronous communication, abstracting coroutines into go blocks that compile to state machines for non-blocking coordination, inspired by Communicating Sequential Processes. In PHP, the Amp library facilitates coroutines through promises and event loops, using generators for suspension in pre-fiber versions and integrating with PHP 8.1's native fibers for lightweight concurrency. Implementing coroutines via libraries often faces portability challenges, particularly in low-level languages requiring assembly for context switches; functions like setjmp() and longjmp() offer simple non-local jumps but lack full register preservation, leading to architecture-specific issues and undefined behavior with exceptions or signals. 
ucontext-based approaches improve portability on POSIX systems but are deprecated in some modern standards, necessitating custom implementations for cross-platform support.
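The state-machine transformation mentioned above for compiler-generated stackless coroutines can be imitated by hand. The following Python sketch is an illustration of the idea only, not the output of any real compiler; `countdown_generator` and `CountdownStateMachine` are hypothetical names. Local variables become fields, and each suspension point becomes a numbered state.

```python
def countdown_generator(n):
    """Coroutine form: the runtime remembers where execution stopped."""
    while n > 0:
        yield n
        n -= 1


class CountdownStateMachine:
    """Hand-written equivalent with explicit state instead of a suspended frame."""

    START, AFTER_YIELD, DONE = range(3)

    def __init__(self, n):
        self.n = n                     # former local variable
        self.state = self.START        # records the resume position

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == self.AFTER_YIELD:
            self.n -= 1                # code that followed the yield
        if self.state != self.DONE and self.n > 0:
            self.state = self.AFTER_YIELD
            return self.n              # corresponds to `yield n`
        self.state = self.DONE
        raise StopIteration


from_generator = list(countdown_generator(3))
from_machine = list(CountdownStateMachine(3))
print(from_generator, from_machine)    # [3, 2, 1] [3, 2, 1]
```

Both versions produce the same sequence; the generator simply lets the language keep the bookkeeping that the class spells out.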
