Cython
| Cython | |
|---|---|
| Developer | Robert Bradshaw, Stefan Behnel, et al. |
| First appeared | 28 July 2007[1] |
| Stable release | 3.1.4 |
| Preview release | 3.0.0 beta 2 (27 March 2023[2]) |
| Implementation language | Python |
| OS | Windows, macOS, Linux |
| License | Apache License 2.0 |
| Filename extensions | .pyx, .pxd, .pxi[3] |
| Website | cython |
| Influenced by | C, Python |
Cython (/ˈsaɪθɒn/) is a superset of the programming language Python, which allows developers to write Python code (with optional, C-inspired syntax extensions) that yields performance comparable to that of C.[4][5]
Cython is a compiled language that is typically used to generate CPython extension modules. Annotated Python-like code is compiled to C and then automatically wrapped in interface code, producing extension modules that can be loaded and used by regular Python code using the import statement, but with significantly less computational overhead at run time. Cython also facilitates wrapping independent C or C++ code into Python-importable modules.
Cython is written in Python and C and works on Windows, macOS, and Linux, producing C source files compatible with CPython 2.6, 2.7, and 3.3 and later versions. The Cython source code that Cython compiles (to C) can use both Python 2 and Python 3 syntax, defaulting to Python 2 syntax in Cython 0.x and Python 3 syntax in Cython 3.x. The default can be overridden (e.g. in a source code comment) to Python 3 (or 2) syntax. Because Python 3 syntax continues to change in new releases, Cython may lag behind the latest additions. Cython has "native support for most of the C++ language" and "compiles almost all existing Python code".[6]
Cython 3.0.0 was released on 17 July 2023.[7]
Design
Cython works by producing a standard Python module. However, the behavior differs from standard Python in that the module code, originally written in Python, is translated into C. While the resulting code is fast, it makes many calls into the CPython interpreter and CPython standard libraries to perform actual work. Choosing this arrangement saved considerably on Cython's development time, but modules have a dependency on the Python interpreter and standard library.
Although most of the code is C-based, a small stub loader written in interpreted Python is usually required (unless the goal is to create a loader written entirely in C, which may involve work with the undocumented internals of CPython). However, this is not a major problem due to the presence of the Python interpreter.[8]
Cython has a foreign function interface for invoking C/C++ routines and the ability to declare the static type of subroutine parameters and results, local variables, and class attributes.
A Cython program that implements the same algorithm as a corresponding Python program may consume fewer computing resources such as core memory and processing cycles due to differences between the CPython and Cython execution models. A basic Python program is loaded and executed by the CPython virtual machine, so both the runtime and the program itself consume computing resources. A Cython program is compiled to C code, which is further compiled to machine code, so the virtual machine is used only briefly when the program is loaded.[9][10][11][12]
Cython employs:
- Optimistic optimizations
- Type inference (optional)
- Low overhead in control structures
- Low function call overhead[13][14]
Performance depends both on what C code is generated by Cython and how that code is compiled by the C compiler.[15]
History
Cython is a derivative of the Pyrex language, but it supports more features and optimizations than Pyrex.[16][17] Cython was forked from Pyrex in 2007 by developers of the Sage computer algebra package, because they were unhappy with Pyrex's limitations and could not get patches accepted by Pyrex's maintainer Greg Ewing, who envisioned a much smaller scope for his tool than the Sage developers had in mind. They then forked Pyrex as SageX. When they found people were downloading Sage just to get SageX, and developers of other packages (including Stefan Behnel, who maintains the XML library lxml) were also maintaining forks of Pyrex, SageX was split off the Sage project and merged with cython-lxml to become Cython.[18]
Cython files have a .pyx extension. At its most basic, Cython code looks exactly like Python code. However, whereas standard Python is dynamically typed, in Cython types can optionally be provided, which improves performance and allows loops to be converted into C loops where possible. For example:
    # The argument will be converted to int or raise a TypeError.
    def primes(int kmax):
        # These variables are declared with C types.
        cdef int n, k, i
        # Another C type
        cdef int p[1000]
        # A Python type
        result = []
        if kmax > 1000:
            kmax = 1000
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p[k] = n
                k = k + 1
                result.append(n)
            n = n + 1
        return result
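For comparison, the same trial-division algorithm can be written in plain Python. The version below is only a sketch of a baseline, not part of the original listing: it drops the C declarations, so every variable is an ordinary Python object, and it folds the `p` array and the `result` list into a single list.

```python
def primes(kmax):
    """Return the first kmax primes (capped at 1000) by trial division."""
    kmax = min(kmax, 1000)   # same cap as the fixed-size C array in the Cython version
    found = []               # plays the role of both p[] and result
    n = 2
    while len(found) < kmax:
        # n is prime if no previously found prime divides it
        if all(n % p != 0 for p in found):
            found.append(n)
        n += 1
    return found


print(primes(10))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Unlike the typed version, this function does not reject non-integer arguments with a TypeError up front, which is one of the behavioral differences the `int kmax` declaration introduces.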
Example
A sample hello world program for Cython is more complex than in most languages because it interfaces with the Python C API and setuptools or other PEP 517-compliant extension building facilities. At least three files are required for a basic project:
- A setup.py file to invoke the setuptools build process that generates the extension module
- A main Python program to load the extension module
- Cython source file(s)
The following code listings demonstrate the build and launch process:
    # hello.pyx - Python module, this code will be translated to C by Cython.
    def say_hello():
        print("Hello World!")

    # launch.py - Python stub loader, loads the module that was made by Cython.
    # This code is always interpreted, like normal Python.
    # It is not compiled to C.
    import hello
    hello.say_hello()

    # setup.py - unnecessary if not redistributing the code, see below
    from setuptools import setup
    from Cython.Build import cythonize
    setup(name = "Hello world app",
          ext_modules = cythonize("*.pyx"))
These commands build and launch the program:
$ python setup.py build_ext --inplace
$ python launch.py
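If the build step has not been run, `import hello` in launch.py fails with ImportError. One possible variation of the launcher, shown here as a sketch, tries the compiled module first and otherwise falls back to a pure-Python stand-in. The helper name `load_say_hello` and the fallback function are hypothetical and not part of the original example.

```python
import importlib


def load_say_hello():
    """Return say_hello from the compiled module, or a pure-Python stand-in."""
    try:
        # Succeeds only after `python setup.py build_ext --inplace` has produced
        # an importable hello extension module next to this script.
        hello = importlib.import_module("hello")
        return hello.say_hello
    except (ImportError, AttributeError):
        # Hypothetical fallback, not part of the original example.
        def say_hello():
            print("Hello World! (pure-Python fallback)")
        return say_hello


say_hello = load_say_hello()
say_hello()
```

This pattern keeps a project usable on machines without a C compiler, at the cost of maintaining two implementations.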
Using in IPython/Jupyter notebook
A more straightforward way to start with Cython is through command-line IPython (or through the in-browser Python console called Jupyter Notebook):
    In [1]: %load_ext Cython

    In [2]: %%cython
       ...: def f(n):
       ...:     a = 0
       ...:     for i in range(n):
       ...:         a += i
       ...:     return a
       ...:
       ...: cpdef g(int n):
       ...:     cdef long a = 0
       ...:     cdef int i
       ...:     for i in range(n):
       ...:         a += i
       ...:     return a
       ...:

    In [3]: %timeit f(1000000)
    10 loops, best of 3: 26.5 ms per loop

    In [4]: %timeit g(1000000)
    1000 loops, best of 3: 279 µs per loop
This gives a roughly 95-fold improvement over the pure-Python version (26.5 ms versus 279 µs). More details on the subject are available on the official quickstart page.[19]
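The speedup figure follows directly from the two quoted timings. The sketch below restates the untyped function `f` in plain Python, checks it against the closed form n(n-1)/2 that both `f` and `g` compute, and redoes the arithmetic. The timing constants are copied from the session above and are not re-measured here.

```python
def f(n):
    """Untyped loop from the session above: sums 0 .. n-1."""
    a = 0
    for i in range(n):
        a += i
    return a


# Both f and g compute the same value, which has a closed form.
n = 1_000_000
assert f(n) == n * (n - 1) // 2

# Timings quoted in the session, converted to seconds.
t_python = 26.5e-3   # 26.5 ms for f
t_cython = 279e-6    # 279 µs for g
speedup = t_python / t_cython
print(f"speedup = {speedup:.1f}x")   # speedup = 95.0x
```

Note that the result for n = 1000000 is 499999500000, which does not fit in 32 bits. The typed version therefore relies on C `long` being 64 bits wide, an assumption that does not hold on every platform.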
Uses
Cython is particularly popular among scientific users of Python,[11][20][21] where it has "the perfect audience" according to Python creator Guido van Rossum.[22] Of particular note:
- The free software SageMath computer algebra system depends on Cython, both for performance and to interface with other libraries.[23]
- Significant parts of the scientific computing libraries SciPy, pandas and scikit-learn are written in Cython.[24][25]
- Some high-traffic websites such as Quora use Cython.[26]
Cython's domain is not limited to just numerical computing. For example, the lxml XML toolkit is written mostly in Cython, and like its predecessor Pyrex, Cython is used to provide Python bindings for many C and C++ libraries such as the messaging library ZeroMQ.[27] Cython can also be used to develop parallel programs for multi-core processor machines; this feature makes use of the OpenMP library.
References
- ^ Behnel, Stefan (2008). "The Cython Compiler for C-Extensions in Python". EuroPython (28 July 2007: official Cython launch). Vilnius/Lietuva.
- ^ Cython Changelog, cython, 15 May 2023, retrieved 19 May 2023
- ^ "Language Basics — Cython 3.0.0a9 documentation". cython.readthedocs.io. Retrieved 9 September 2021.
- ^ "Cython - an overview — Cython 0.19.1 documentation". Docs.cython.org. Retrieved 21 July 2013.
- ^ Smith, Kurt (2015). Cython: A Guide for Python Programmers. O'Reilly Media. ISBN 978-1-4919-0155-7.
- ^ "FAQ · cython/cython Wiki". GitHub. Retrieved 11 January 2023.
- ^ "Cython Changelog". cython.org. Retrieved 21 July 2023.
- ^ "Basic Tutorial — Cython 3.0a6 documentation". cython.readthedocs.io. Retrieved 11 December 2020.
- ^ Oliphant, Travis (20 June 2011). "Technical Discovery: Speeding up Python (NumPy, Cython, and Weave)". Technicaldiscovery.blogspot.com. Retrieved 21 July 2013.
- ^ Behnel, Stefan; Bradshaw, Robert; Citro, Craig; Dalcin, Lisandro; Seljebotn, Dag Sverre; Smith, Kurt (2011). "Cython: The Best of Both Worlds". Computing in Science and Engineering. 13 (2): 31–39. Bibcode:2011CSE....13b..31B. doi:10.1109/MCSE.2010.118. hdl:11336/13103. S2CID 14292107.
- ^ a b Seljebot, Dag Sverre (2009). "Fast numerical computations with Cython". Proceedings of the 8th Python in Science Conference. pp. 15–22. doi:10.25080/GTCA8577.
- ^ Wilbers, I.; Langtangen, H. P.; Ødegård, Å. (2009). Skallerud, B.; Andersson, H. I. (eds.). "Using Cython to Speed up Numerical Python Programs". Proceedings of MekIT'09: 495–512. Archived from the original (PDF) on 4 January 2017. Retrieved 14 June 2011.
- ^ "wrapper benchmarks for several Python wrapper generators (except Cython)". Archived from the original on 4 April 2015. Retrieved 28 May 2010.
- ^ "wrapper benchmarks for Cython, Boost.Python and PyBindGen". Archived from the original on 3 March 2016. Retrieved 28 May 2010.
- ^ "Cython: C-Extensions for Python". Retrieved 22 November 2015.
- ^ "Differences between Cython and Pyrex". GitHub.
- ^ Ewing, Greg (21 March 2011). "Re: VM and Language summit info for those not at Pycon (and those that are!)" (Message to the electronic mailing-list python-dev). Retrieved 5 May 2011.
- ^ Says Sage and Cython developer Robert Bradshaw at the Sage Days 29 conference (22 March 2011). "Cython: Past, Present and Future". Archived from the original on 21 December 2021. Retrieved 5 May 2011 – via YouTube.
- ^ "Building Cython code". cython.readthedocs.io. Retrieved 24 April 2017.
- ^ "inSCIght: The Scientific Computing Podcast" (Episode 6). Archived from the original on 10 October 2014. Retrieved 29 May 2011.
- ^ Millman, Jarrod; Aivazis, Michael (2011). "Python for Scientists and Engineers". Computing in Science and Engineering. 13 (2): 9–12. Bibcode:2011CSE....13b...9M. doi:10.1109/MCSE.2011.36.
- ^ Guido Van Rossum (21 March 2011). "Re: VM and Language summit info for those not at Pycon (and those that are!)" (Message to the electronic mailing-list python-dev). Retrieved 5 May 2011.
- ^ Erocal, Burcin; Stein, William (2010). "The Sage Project: Unifying Free Mathematical Software to Create a Viable Alternative to Magma, Maple, Mathematica and MATLAB". Mathematical Software – ICMS 2010 (PDF). Lecture Notes in Computer Science. Vol. 6327. Springer Berlin / Heidelberg. pp. 12–27. CiteSeerX 10.1.1.172.624. doi:10.1007/978-3-642-15582-6_4. ISBN 978-3-642-15581-9.
- ^ "SciPy 0.7.2 release notes". Archived from the original on 4 March 2016. Retrieved 29 May 2011.
- ^ Pedregosa, Fabian; Varoquaux, Gaël; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Grisel, Olivier; Blondel, Mathieu; Prettenhofer, Peter; Weiss, Ron; Dubourg, Vincent; Vanderplas, Jake; Passos, Alexandre; Cournapeau, David (2011). "Scikit-learn: Machine Learning in Python". Journal of Machine Learning Research. 12: 2825–2830. arXiv:1201.0490. Bibcode:2011JMLR...12.2825P.
- ^ "Is Quora still running on PyPy?".
- ^ "ØMQ: Python binding".
Fundamentals
Definition and Purpose
Cython is an optimizing static compiler for both the Python programming language and the extended Cython programming language, which generates C code that can be compiled into efficient extension modules for the CPython interpreter.[1] As a superset of Python, it allows developers to write code that is largely compatible with standard Python syntax while incorporating C-like features for performance enhancement.[3]
The primary purpose of Cython is to enable Python programmers to achieve near-C-level performance without fully abandoning Python's productivity advantages, by facilitating optional static typing, direct interoperation with C and C++ code, and the creation of high-speed extensions.[1] This addresses key limitations of pure Python, such as interpreter overhead from dynamic typing and function calls, allowing for substantial speedups. For instance, adding static types can yield orders-of-magnitude improvements over interpreted Python in compute-intensive loops.[4] Compared to writing extension modules directly in C, Cython simplifies the process by reducing boilerplate and error-prone manual memory management.
Key benefits include seamless wrapping of existing C libraries for use in Python applications, embedding of the CPython interpreter into C programs, and overall acceleration of numerical or data-processing tasks common in scientific computing.[1] Cython supports CPython 3.8 and later versions as its primary runtime, with experimental compatibility for alternatives like PyPy.[5] Originating as a successor to the Pyrex project, it builds on that foundation to provide more robust Python integration.[1]
Relationship to Python and Pyrex
Cython is a superset of the Python 3 programming language, meaning that almost all valid Python 3 code is also valid Cython code and can be compiled unchanged.[1][6] This design ensures seamless integration with existing Python codebases while introducing optional C-inspired extensions, such as static type declarations, to enable performance optimizations through compilation to C.
Cython originated as a fork of Pyrex, a language developed by Greg Ewing and first released in April 2002, which aimed to simplify the creation of Python extension modules in C.[7][2] While Pyrex provided a Python-like syntax for generating C code to interface with the Python interpreter, it had limitations, including incomplete support for C++ and no native compatibility with Python 3.[8] Cython evolved from Pyrex to address these shortcomings, enhancing features like C++ integration and full Python 3 support, thereby positioning itself as a more robust tool for high-performance extensions without sacrificing Python's core usability.[7]
In terms of compatibility, Cython source files, typically with the .pyx extension, can directly import and extend standard Python modules, allowing developers to mix dynamic Python features with compiled C elements in the same project.[9] Upon compilation, Cython generates C code that uses the Python/C API to interact with the Python interpreter runtime, preserving access to Python's dynamic behaviors such as object-oriented programming and exception handling even in performance-critical sections.[10] This hybrid approach lets Cython modules function as drop-in replacements for pure Python code within larger applications.
Unlike standard Python, which relies solely on interpretation, Cython introduces compiler directives to fine-tune behavior, such as # cython: language_level=3 to explicitly set compatibility with Python 3 semantics and avoid ambiguities across versions.[11] These directives, which can also be specified via command-line options or build configurations, enable precise control over aspects like bounds checking and type inference, further bridging the gap between Python's flexibility and C's efficiency.[9]
Historical Development
Origins and Early Influences
Cython traces its roots to Pyrex, a programming language developed by Greg Ewing in 2002 to streamline the process of writing Python extension modules. Pyrex allowed developers to combine Python-like syntax with C data types and operations, compiling the result into efficient C code that interfaces directly with Python's C API. This approach addressed the inherent complexities and boilerplate code required when manually crafting C extensions, particularly for performance-critical applications such as numerical computations where direct API manipulation proved cumbersome and error-prone.[12][2]
The motivation behind Pyrex stemmed from the growing demand in scientific computing for tools that could accelerate Python's execution without abandoning its ease of use. In domains like mathematics and data analysis, pure Python's interpreted overhead often resulted in unacceptably slow performance for compute-intensive tasks, such as simulations or large-scale array operations. Projects in this space, including early efforts in computer algebra systems, highlighted the need for a bridge between Python's high-level abstractions and C's low-level efficiency, making Pyrex a valuable asset for integrating legacy C libraries and optimizing bottlenecks.[13][14]
By 2007, limitations in Pyrex, such as restricted support for C++ features and lack of compatibility with the emerging Python 3, prompted a fork by developers associated with the SageMath project, leading to the birth of Cython. This divergence was driven by the necessity to support more advanced language integrations and broader Python ecosystem evolution, while preserving Pyrex's core philosophy of static compilation for speed gains in scientific workflows. The fork enabled ongoing enhancements tailored to the demands of high-performance computing, setting Cython apart as a more versatile superset of Python.[15][16]
Key Milestones and Contributors
Cython originated as a fork of the Pyrex project in 2007, initiated primarily to support the development needs of the SageMath mathematical software system, with early enhancements merging community patches for better Python compatibility and performance.[2] The first stable release, version 0.11, arrived in March 2009, marking a significant step forward with improved error handling during compilation and initial support for C++ class integration, which facilitated more robust extension module development. This release solidified Cython's divergence from Pyrex by emphasizing Pythonic syntax while enabling direct C-level optimizations.
Subsequent milestones advanced Cython's alignment with evolving Python ecosystems. Version 0.14, released in 2011, introduced full support for Python 3 syntax and semantics, allowing seamless compilation across Python 2 and 3 runtimes and easing the transition for users adopting the newer Python version.[17] A major overhaul came with version 3.0 in July 2023, which dropped compatibility with Python 2 entirely and refocused on modern CPython implementations starting from Python 3.8, incorporating updates like exception propagation and vectorcall protocol support to enhance runtime efficiency.[17] Building on this, version 3.1, released in May 2025, added experimental support for free-threading in CPython 3.13 and the Limited C API, enabling better compatibility with alternative Python implementations and reducing ABI dependencies.[1][17] Version 3.2.0, released on November 5, 2025, further improved support for the Limited API and free-threading in CPython, along with other optimizations.[17] As of November 2025, the latest stable release is 3.2.1, issued on November 12, 2025, which includes bug fixes building on the enhancements in 3.2.0 for Limited API and free-threading support.[18]
Key contributors have driven Cython's evolution since its inception. Stefan Behnel has served as the lead maintainer since 2007, overseeing core language design and optimization efforts.[2] Robert Bradshaw played a pivotal role in early integration with SageMath, while Dag Sverre Seljebotn enhanced NumPy array handling through Google Summer of Code funding.[1][2] Other notable figures include Lisandro Dalcín for Python 3 advancements and Vitja Makarov for expanded Python feature coverage.[2] Funding from organizations such as Google, Enthought, and Tidelift has supported these developments, including targeted improvements for scientific computing libraries.[1]
The project boasts a vibrant community, with over 496 contributors on GitHub as of late 2025, reflecting broad collaboration. Cython was integrated into SageMath by 2008, becoming a foundational tool for its high-performance mathematical computations.[2]
Core Features
Syntax and Semantic Extensions
Cython extends the Python programming language, supporting both traditional .pyx files as a superset and pure Python mode for .py files. It parses source using Python's grammar for dynamic features (such as duck typing and runtime polymorphism) alongside static C-like elements in .pyx mode or type annotations in pure mode for enhanced performance and low-level control.[19][20] This design allows seamless integration of Python code with C constructs, where standard Python syntax remains valid; in .pyx files, new keywords enable C-level operations, while in pure mode, PEP 484/526 type hints provide static typing without syntax extensions.[21]
Key syntactic extensions in .pyx mode include the cdef keyword, which declares C-level variables, functions, and structs directly in the source code, bypassing Python's dynamic dispatch for faster execution and direct memory access.[22] For functions that need to be accessible from both Python and C contexts, the cpdef keyword provides a hybrid declaration, allowing calls from Python space while using C calling conventions internally for efficiency.[23] Additionally, the nogil directive can be applied to functions or code blocks to indicate that they do not require the Python Global Interpreter Lock (GIL), enabling safe execution in multithreaded environments without explicit GIL management, including compatibility with free-threaded Python as of version 3.1.[24][1] The cimport directive further supports modularity by importing declarations from .pxd files or C header files as if they were Python modules, facilitating the reuse of C definitions across Cython modules.[25]
Semantically, Cython maintains Python's automatic memory management through reference counting for Python objects passed as parameters or returned from functions, ensuring compatibility with Python's garbage collection model.[26] However, it permits optional manual memory handling using C standard library functions like malloc and free for raw C types, allowing developers to opt into lower-level control when needed.[27] For C++ integration, Cython introduces cdef cppclass within cdef extern from blocks to declare and wrap C++ classes, exposing their methods and attributes to Cython code while handling constructors, namespaces, and exceptions appropriately.[28] Compiler directives such as language_level enforce specific Python version semantics—such as treating strings as Unicode in Python 3 mode—affecting type interpretation and compatibility without altering the core syntax.[11]
Extension types, declared using cdef class, represent a semantic extension to Python's class system by implementing them as C structs with vtables, which provides contiguous memory layout for attributes and accelerates access compared to Python's dictionary-based storage.[29] These types support single inheritance and allow Python subclasses to override methods, blending object-oriented Python semantics with C's structural efficiency.[29]
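A loose pure-Python analogy for the difference between dictionary-based attribute storage and a fixed attribute layout is `__slots__`. The sketch below is only an analogy: it does not produce a C struct or a vtable, and the class names are hypothetical. It does show the same trade-off, where a declared attribute set replaces the per-instance dictionary and undeclared attributes are rejected.

```python
class DictPoint:
    """Ordinary class: attributes live in a per-instance dictionary."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    """Fixed attribute layout, loosely comparable to a cdef class."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


d = DictPoint(1.0, 2.0)
s = SlotPoint(1.0, 2.0)

d.z = 3.0                      # allowed: the instance dict grows
print(hasattr(d, "__dict__"))  # True
print(hasattr(s, "__dict__"))  # False

try:
    s.z = 3.0                  # rejected: no slot named z was declared
except AttributeError as exc:
    print("AttributeError:", exc)
```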
Type Declarations and Optimizations
In .pyx files, Cython employs the cdef keyword to declare variables, functions, and types with static semantics, enabling direct mapping to C constructs and bypassing Python's dynamic type system; additionally, in pure Python mode, static typing can be achieved using standard Python type annotations from the typing module. This includes support for fundamental C types such as int, float, double, char, and void*, as well as pointers like int* and arrays such as int[10]. For Python interoperability, declarations can reference objects using object or PyObject*, allowing seamless handling of Python data structures within statically typed contexts.[30][20]
Memoryviews provide an efficient mechanism for accessing array-like data structures, declared using syntax like double[:] for one-dimensional views or int[:,:] for multi-dimensional ones, integrating with the Python buffer protocol (PEP 3118) to support NumPy arrays and other buffers without copying data. This facilitates zero-overhead access to contiguous memory regions, with optional bounds checking that can be disabled via compiler directives like boundscheck=False to further enhance performance in inner loops.[31]
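The zero-copy behavior comes from the buffer protocol itself and can be observed from plain Python with the built-in `memoryview`. The sketch below uses the standard-library `array` module as the buffer provider. It illustrates the shared-memory idea only and is not Cython's typed memoryview syntax such as `double[:]`.

```python
from array import array

data = array("d", [1.0, 2.0, 3.0, 4.0])   # contiguous C doubles
view = memoryview(data)                    # no copy: shares data's buffer

view[0] = 10.0          # writing through the view changes the array
tail = view[2:]         # slicing a view also shares memory
tail[1] = 40.0

print(data.tolist())                              # [10.0, 2.0, 3.0, 40.0]
print(view.format, view.itemsize, view.shape)     # d 8 (4,)
```

The built-in view always checks indices at run time. The `boundscheck=False` directive mentioned above is what lets compiled Cython code skip that check.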
Static type declarations optimize code by eliminating runtime type checks, object boxing, and dynamic dispatch, allowing the compiler to generate direct C calls and inline operations, which can result in significant speedups—often by orders of magnitude in compute-intensive sections. For instance, typing loop variables and accumulators as C types transforms Python loops into equivalent C loops, reducing overhead from Python's interpreter. Fused types, declared with cython.fused_type(), extend this by enabling generic functions over sets of compatible types (e.g., numeric types like int and double), where the compiler specializes implementations at compile time for optimal performance without runtime type resolution. In practice, such optimizations have demonstrated up to 150x speedups for simple typed functions compared to pure Python equivalents.[32][33]
Building and Deployment
Compilation Process
The compilation of Cython code occurs in two primary stages: first, a source file with a .pyx extension (or .py for pure Python code) is processed by the Cython compiler to generate an equivalent C file, and second, that C file is compiled by a standard C compiler into a platform-specific shared library, such as .so on Linux and macOS or .pyd on Windows.[9] This workflow enables Cython modules to function as efficient Python extension modules, leveraging the Python C API for interoperability.[9]
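The exact file name of the resulting shared library depends on the interpreter and platform. As a rough sketch, the standard library exposes the suffixes the import system accepts, which makes it possible to predict what the second stage should produce. The helper name `expected_extension_filename` is hypothetical, and taking the first list entry as the fully tagged suffix is an assumption about typical CPython builds.

```python
import importlib.machinery


def expected_extension_filename(module_name):
    """Guess the file name a built extension module gets on this interpreter."""
    # EXTENSION_SUFFIXES holds every suffix the import system recognizes for
    # extension modules; on typical CPython builds the first entry carries
    # the interpreter and platform tag.
    suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    return module_name + suffix


print(expected_extension_filename("hello"))
print(importlib.machinery.EXTENSION_SUFFIXES)
```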
To initiate compilation from the command line, developers typically use the cythonize script, which is included with Cython installations and automates both the translation to C and the subsequent compilation to a shared library when provided with appropriate compiler flags.[9] For more complex projects, integration with Python's packaging tools is recommended; a setup.py file can employ setuptools or distutils by importing cythonize from Cython.Build and specifying extension modules, as in ext_modules = cythonize("example.pyx"), followed by invoking setup() with the build_ext command.[9] This approach handles dependencies and ensures the generated C code includes the necessary #include <Python.h> directive to embed access to the Python interpreter and its C API.[9]
Compiler directives allow fine-tuned control over the compilation process, influencing optimizations applied during the Cython-to-C translation stage.[9] These directives, such as boundscheck=False to disable array bounds checking or wraparound=False to skip negative indexing adjustments, can be specified in special comments at the top of .pyx or .pxd files (e.g., #cython: boundscheck=False), passed via the command line with cython -X boundscheck=False, or set globally in setup.py through the cythonize function's compiler_directives parameter.[9] Type declarations, as discussed in the Core Features section, further guide the compiler in generating optimized C code by enabling static typing where applicable.[9]
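The same set of directives can be written in each of the three places described above. The sketch below starts from one dictionary, which has the shape expected by the `compiler_directives` parameter, and renders the equivalent header comment and command-line flags as strings. The helper `render_directives` is hypothetical and only formats text; it does not invoke Cython.

```python
def render_directives(directives):
    """Render a directive dict in the header-comment and command-line forms."""
    pairs = [f"{name}={value}" for name, value in directives.items()]
    header_comment = "# cython: " + ", ".join(pairs)
    # One -X flag per directive keeps each override explicit.
    cli_args = ["cython"] + [arg for pair in pairs for arg in ("-X", pair)]
    return header_comment, cli_args


directives = {"boundscheck": False, "wraparound": False}
comment, cli = render_directives(directives)
print(comment)         # # cython: boundscheck=False, wraparound=False
print(" ".join(cli))   # cython -X boundscheck=False -X wraparound=False
```

Keeping the directives in one dictionary makes it easier to apply identical settings whether a module is built through setup.py or compiled by hand.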
Cython's build process supports cross-platform development across Windows, Linux, and macOS, provided that a compatible C compiler is available (such as GCC or Clang on Unix-like systems and MSVC on Windows) and the Python development headers are installed to provide access to the Python C API.[9] These headers, typically included in Python's "dev" or "devel" packages (e.g., python3-dev on Debian-based Linux distributions), are essential for linking against the Python interpreter during the final C compilation step.[9] Starting with version 3.0, Cython incorporates a new frontend parser, which enhances compatibility with modern Python features and contributes to improved compilation performance in subsequent releases.[17] As of November 2025, the latest stable release series is Cython 3.2, which includes further optimizations to the build process.[18]
Generated Code and Runtime Behavior
The Cython compiler translates source code into C code that forms a standard Python extension module, incorporating calls to the Python C API to interface with the Python runtime. For functions declared with def, which are visible to Python code, the generated C includes argument-unpacking boilerplate (comparable in role to PyArg_ParseTuple in hand-written extensions) that converts the incoming Python objects to the declared types, ensuring compatibility with Python's dynamic calling conventions. In contrast, sections with static type declarations, such as cdef functions or typed variables, generate direct C operations without Python object overhead, bypassing API calls for those parts to enable efficient computation. The resulting module exports an initialization function, typically named PyInit_<module_name>, which Python's import mechanism invokes to register the module's contents, such as functions and classes, into the Python namespace.[34][9]
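The name of the initialization function is fixed by CPython's import rules rather than by Cython. The sketch below follows the naming rule given in PEP 489: the last component of the dotted module name is used, ASCII names get the `PyInit_` prefix, and non-ASCII names get `PyInitU_` followed by a punycode encoding with hyphens replaced by underscores. The helper name `init_symbol` is hypothetical.

```python
def init_symbol(module_name):
    """Return the C symbol the import system looks up for an extension module."""
    short = module_name.rpartition(".")[2]   # submodules use the last component
    try:
        suffix = b"_" + short.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII module names are punycode-encoded (PEP 489).
        suffix = b"U_" + short.encode("punycode").replace(b"-", b"_")
    return (b"PyInit" + suffix).decode("ascii")


print(init_symbol("hello"))        # PyInit_hello
print(init_symbol("pkg.hello"))    # PyInit_hello
```

This is why renaming a compiled extension file after the build usually breaks the import: the exported symbol no longer matches the file name.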
At runtime, compiled Cython extensions load seamlessly as Python modules via the standard import statement, integrating with the Python interpreter as if they were pure Python code. Memory management in these modules relies on Python's reference counting for objects interacting with Python code, while C-level allocations in typed sections use manual calls to functions like PyMem_Malloc and PyMem_Free or standard C malloc/free to avoid garbage collection overhead where possible. Cython supports releasing the Global Interpreter Lock (GIL) using the nogil context manager or function qualifier, allowing parallel execution in multithreaded scenarios, such as within OpenMP directives or prange loops, provided the code avoids Python API calls that require the GIL. This enables true concurrency in CPU-bound tasks but requires careful design to prevent race conditions on shared Python objects.[35][36][24]
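The work-splitting structure behind a parallel loop can be sketched in plain Python with a thread pool. This is an illustration of the chunking pattern only, under the assumption that each chunk is independent. On a regular CPython build these pure-Python threads still take turns holding the GIL, so the sketch shows the shape of the computation rather than the speedup that a compiled nogil section could provide. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Sum the integers in [start, stop); stands in for a nogil loop body."""
    start, stop = bounds
    return sum(range(start, stop))


def chunked_sum(n, workers=4):
    """Split range(n) into contiguous chunks and sum them on a thread pool."""
    step = -(-n // workers) or 1          # ceiling division, at least 1
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker touches only its own chunk, so no shared state is mutated.
        return sum(pool.map(partial_sum, chunks))


print(chunked_sum(1_000_000))   # 499999500000
```

Because every worker returns its own partial result and the reduction happens once at the end, the design avoids the race conditions on shared objects that the paragraph above warns about.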
Performance benefits arise from eliminating Python's interpreter overhead in compiled sections, where typed loops and operations execute as native C code, often yielding speedups of 10x to 100x over equivalent pure Python for numerical tasks. However, the runtime retains Python's garbage collector for any Python objects, which can introduce pauses in long-running computations, and untyped code still incurs API call costs. Cython-annotated pure Python code, compilable directly from .py files, provides partial speedups by optimizing only the typed portions while preserving full Python compatibility. In free-threading mode, available in Python 3.13 and later (officially supported starting with Python 3.14) without the GIL, Cython extensions can achieve true thread parallelism, enhancing scalability on multicore systems.[37] Additionally, Cython's support for the Python Limited API ensures binary compatibility across minor Python versions by restricting use of stable C API subsets, reducing ABI breakage risks during upgrades.[34][38][39]
Practical Usage
Writing and Executing Code
Cython code is typically written in source files with the .pyx extension, which can contain a mix of Python code and Cython-specific extensions for performance optimization. For declaring pure-mode C interfaces or external C library headers without implementation details, separate declaration files with the .pxd extension are used, allowing modular separation of interfaces from implementations. The basic structure of a .pyx file includes standard Python imports for modules, def statements for functions that are directly callable from Python with full Python object semantics, and cdef statements for internal C-level functions that offer faster execution but require wrappers to expose them to Python code.[9][30]
To illustrate, consider a simple addition function in a file named add.pyx:
cdef int add(int a, int b):
    return a + b
def py_add(a, b):
    return add(a, b)
add is a cdef function optimized for C-level speed, while py_add is a def wrapper that makes it accessible from Python. This setup allows type declarations like int to enable static typing, reducing Python overhead.[30][32]
Once written, Cython modules are compiled to extension modules (.so on Unix-like systems or .pyd on Windows) before execution. Compilation can be performed directly using the command cythonize -i add.pyx, which translates the .pyx file to C and builds the extension in place. Alternatively, a setup.py file can be used with from setuptools import setup; from Cython.Build import cythonize; setup(ext_modules=cythonize("add.pyx")), followed by python setup.py build_ext --inplace for building. The resulting module is then imported and executed like any Python module, for example, in a script test.py: import add; print(add.py_add(3, 4)). Cython supports incremental compilation during development, recompiling only modified files unless forced otherwise, and, when enabled, reports errors via Python-like tracebacks that include C line numbers.[9]
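As a small illustration of the file names mentioned above, the standard library can report which extension-module suffixes the running interpreter will import. This is only a sketch: the helper name expected_extension_names is made up for this example, and the exact suffixes depend on the platform and Python version.

```python
import importlib.machinery


def expected_extension_names(module_name):
    """Return candidate file names for a compiled extension module.

    `expected_extension_names` is a hypothetical helper for illustration;
    it only combines the module name with the suffixes that the running
    interpreter recognizes for extension modules.
    """
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


if __name__ == "__main__":
    # After an in-place build of add.pyx, one of these files should sit
    # next to the source (for instance a version-tagged .so or .pyd file).
    for name in expected_extension_names("add"):
        print(name)
```

If import add fails after a build, comparing the files in the directory against this list is a quick way to check whether the extension was built for a different interpreter.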
To demonstrate performance benefits, compare the Cython version against pure Python. A pure Python equivalent might be def py_add(a, b): return a + b. When py_add(1, 2) is called a million times in a loop, the pure Python version is reported to take approximately 50-100 milliseconds, while the typed Cython version is reported to complete in under 1 millisecond, a speedup of over 50 times attributed to eliminated type checking and object overhead. Exact figures depend on hardware and on the Python and Cython versions used. The gain is most pronounced when the loop or numerical computation itself runs inside compiled code; isolated single calls show only modest improvements.[32]
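A comparison of this kind can be reproduced with the standard timeit module. The following is a minimal sketch, not a rigorous benchmark: time_calls is a name chosen here for illustration, and the compiled module add is assumed to have been built from the add.pyx example; if it is missing, only the pure Python baseline is measured.

```python
import timeit


def py_add(a, b):
    """Pure Python baseline matching the example in the text."""
    return a + b


def time_calls(func, number=1_000_000):
    """Return the seconds needed to evaluate func(1, 2) `number` times."""
    return timeit.timeit("f(1, 2)", globals={"f": func}, number=number)


if __name__ == "__main__":
    baseline = time_calls(py_add)
    print(f"pure Python: {baseline * 1000:.1f} ms")
    try:
        # Only importable after add.pyx has been compiled (assumption).
        import add
    except ImportError:
        print("compiled module 'add' not found; build it to compare")
    else:
        compiled = time_calls(add.py_add)
        print(f"Cython:      {compiled * 1000:.1f} ms")
```

Because the timing loop here runs in the interpreter, each call still pays Python's function-call overhead, so measured ratios may differ from the figures quoted above.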
Integration with Tools and Ecosystems
Cython provides robust integration with interactive computing environments, particularly Jupyter and IPython, facilitating seamless development and execution of Cython code within notebooks. To enable this, users install Cython via pip or conda and load the extension using the %load_ext Cython command in a notebook cell. The %%cython magic then allows direct compilation and import of Cython code blocks, supporting features like type declarations (e.g., cdef int i) and inline annotations for performance analysis. For instance, a simple loop can be compiled on-the-fly, demonstrating speedups without leaving the notebook interface. This workflow supports rapid iteration, with options like %%cython --annotate to generate HTML visualizations of the compiled C code for optimization insights.[35]
In scientific computing ecosystems, Cython excels in compatibility with libraries such as NumPy and SciPy, enabling efficient array operations and extension building. Integration with NumPy occurs through cimport numpy, which grants access to the NumPy C API, and through typed memoryviews (e.g., double[:, :]), which rely on the buffer protocol to give direct access to array data without Python object overhead. This is particularly useful for numerical algorithms, where memoryviews can yield performance comparable to native C code. For SciPy, Cython files (.pyx) are incorporated into the build system via Meson, allowing developers to write performance-critical components like special functions or optimizers in Cython while leveraging SciPy's Python interface. Cython's support for fused types further enhances generality across NumPy dtypes, such as integers and doubles.[40][41]
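Typed memoryviews build on Python's buffer protocol, and the zero-copy behaviour can be observed with the built-in memoryview type. The sketch below is plain Python rather than Cython and uses array.array in place of a NumPy array so that it runs with the standard library alone; scale_in_place is a hypothetical helper, and the example shows the sharing of memory, not C-level speed.

```python
from array import array


def scale_in_place(view, factor):
    """Multiply every element of a writable 1-D buffer by `factor`."""
    for i in range(len(view)):
        view[i] = view[i] * factor


if __name__ == "__main__":
    data = array("d", [1.0, 2.0, 3.0, 4.0])
    # memoryview exposes the array's memory without copying it.
    view = memoryview(data)
    # Slicing a memoryview is also zero-copy: the slice aliases `data`.
    scale_in_place(view[1:3], 10.0)
    print(data.tolist())
```

The writes made through the sliced view appear in the original array, which is the property a Cython function taking a double[:] argument relies on when it modifies a NumPy array in place.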
Cython modules are packaged and distributed using standard Python tools, ensuring broad accessibility. The setuptools library, combined with Cython.Build.cythonize, automates compilation of .pyx files into extension modules during installation via pip install or building wheels for binary distribution. This process supports platform-specific optimizations and dependency management, with wheels enabling quick deployment without requiring users to have Cython or compilers installed. For environments like Conda, Cython is available through the conda-forge channel, allowing binary installations and packaging of Cython-based projects into conda recipes for reproducible scientific workflows. Continuous integration tools, such as GitHub Actions, streamline builds by automating wheel creation and testing across platforms using actions like cibuildwheel.[9][42][43][44]
Integrated development environments (IDEs) enhance Cython development through syntax highlighting, code completion, and error detection for .pyx files. PyCharm offers native support, recognizing Cython syntax and providing features like navigation to definitions and integration with its debugger for stepping through compiled code. In Visual Studio Code, the vscode-cython extension delivers syntax highlighting and static checking, making it suitable for lighter-weight editing. These tools facilitate collaborative development while maintaining compatibility with Python workflows.[45][46]
Debugging compiled Cython modules is supported through tools like GDB via the cygdb wrapper, which enables setting breakpoints in Cython source lines (e.g., cy break function:line), stepping through code, and inspecting variables with Python-aware commands. Compilation with debug flags (e.g., cython --gdb) embeds symbols for this purpose, allowing integration with Python's pdb for hybrid debugging sessions where Python calls trigger Cython execution. This setup is essential for troubleshooting performance issues in extension modules.[47]
Cython 3.x introduces enhancements tailored for NumPy interoperability, including the @cython.ufunc decorator for rapidly generating NumPy universal functions from scalar Cython functions, supporting vectorized operations across arrays without manual loop unrolling. Typed memoryviews receive further optimizations for array manipulations, such as contiguous slicing and fused type compatibility, improving efficiency in numerical tasks. These features, combined with Conda's binary distribution capabilities, make Cython 3.x a cornerstone for modern scientific Python ecosystems.[17][31]
Advanced Topics
Interfacing with C/C++ Libraries
Cython facilitates the integration of existing C libraries by allowing developers to declare and call external C functions and variables directly within Cython code, enabling seamless wrapping without the overhead of Python's C API in many cases.[48] This is primarily achieved through the cdef extern from directive, which declares C entities and instructs the compiler to include the corresponding header file in the generated C code.[48] For instance, to interface with the standard C math library, one can declare the sin function as follows:
cdef extern from "math.h":
    double sin(double x)
This declaration allows direct calls to sin from Cython functions, with the generated code invoking the native C implementation.[48] To expose such wrapped functions to Python, they can be called from a def or cpdef function, ensuring compatibility with Python's dynamic typing while maintaining C-level performance.[49]
For more complex libraries, Cython supports the creation of .pxd files to mirror C header declarations, promoting modularity and reuse across modules.[48] A .pxd file contains cdef declarations without implementation, which can then be imported via cimport in .pyx files. This approach avoids naming conflicts and facilitates the distribution of interface definitions separately from implementations. For example, wrapping a simple C queue library might involve a queue.pxd file declaring the struct and functions like queue_new and queue_push_head, followed by a .pyx file that imports these and wraps them in a Python class using @cython.cclass for memory management via __cinit__ and __dealloc__.[49] Error handling in such wrappers typically involves checking for null pointers and raising Python exceptions, such as MemoryError or IndexError, to propagate C-level errors idiomatically.[49]
Cython extends these capabilities to C++ libraries, providing native support for C++ features like classes, templates, and the Standard Template Library (STL).[50] Declarations use cdef extern from with C++ headers, and namespaces are specified directly, such as cdef extern from "lib.hpp" namespace "std".[50] STL containers are accessible through dedicated libcpp modules, for example, from libcpp.map cimport map as cpp_map, allowing coercion between Python sequences and C++ types like vector[T] or map[Key, Value].[50] C++ classes can be wrapped using cdef cppclass in .pxd files, enabling the creation of Python extensions that instantiate and manipulate C++ objects, with dynamic allocation via new and deallocation via del.[50]
A key aspect of C/C++ interfacing in Cython is handling callbacks from C code back to Python, which is managed through PyObject* pointers and the Global Interpreter Lock (GIL).[48] Callbacks declared as cdef void callback(PyObject* data) require explicit GIL acquisition using with gil: to safely interact with Python objects, preventing thread-safety issues.[48] For templated C++ code, Cython supports parameterization via bracket notation, such as vector[int], and integrates with fused types—compile-time generics in Cython—to generate specialized code that maps to C++ templates, enhancing type safety and performance.[50] These mechanisms allow Cython to serve as a robust bridge for leveraging optimized C/C++ libraries in Python ecosystems.[48]
Support for Alternative Runtimes and Features
Cython offers experimental support for alternative Python runtimes beyond the standard CPython interpreter, enabling developers to compile code for environments like PyPy while adapting to their unique behaviors. This compatibility is achieved through adaptations in the generated C code, such as using PyPy's cpyext layer, which emulates parts of the CPython C API but introduces differences in reference counting, object lifetimes, and borrowed references. However, the benefits of Cython's static typing are limited in PyPy due to its just-in-time (JIT) compiler and garbage collection mechanisms, which can reduce the performance gains from type annotations compared to CPython.[51]
For PyPy specifically, Cython compiles modules to C code that is compatible via cpyext, allowing much of the codebase to run unchanged after minor adjustments for PyPy-specific issues, such as non-reentrant GIL handling or performance quirks in functions like PyTuple_GET_ITEM. The support is considered usable with the latest PyPy versions and is tested in Cython's continuous integration, marking it as officially supported in recent releases. A notable improvement came in Cython 3.1.4, which resolved crashes during tracing of C function returns, enhancing reliability for debugging and profiling in PyPy environments.[51][1][17]
Cython 3.1 and later versions provide experimental compatibility with Python's free-threaded builds, introduced in CPython 3.13, where the global interpreter lock (GIL) is disabled to enable true parallelism. In these builds, Cython's nogil blocks (sections of code declared to run without holding the GIL) function without re-enabling the lock, allowing for multi-threaded execution in performance-critical sections. To ensure compatibility, modules can be marked with the # cython: freethreading_compatible = True directive, and testing is recommended using flags like PYTHON_GIL=0. This support leverages Python's critical section API and synchronization primitives for thread safety, though manual locking is often required to avoid race conditions.[39][1]
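When testing an extension on a free-threaded interpreter, it helps to confirm first that the build supports free threading and that the GIL is actually off at run time. The following sketch uses only the standard library; the two function names are chosen here for illustration, and sys._is_gil_enabled() is assumed to exist only on CPython 3.13 and later, so older interpreters fall back to reporting the GIL as enabled.

```python
import sys
import sysconfig


def is_free_threaded_build():
    """True if this interpreter was compiled with free-threading support."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_currently_enabled():
    """True if the GIL is active in the running process.

    sys._is_gil_enabled() is available from CPython 3.13; interpreters
    without it always run with the GIL, so that case returns True.
    """
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


if __name__ == "__main__":
    print("free-threaded build:", is_free_threaded_build())
    print("GIL enabled now:    ", gil_currently_enabled())
```

On a free-threaded build, running this check after importing an extension shows whether the import caused the interpreter to switch the GIL back on, which is what happens for modules that do not declare themselves compatible.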
Additionally, Cython supports the Python Limited API and stable ABI starting from version 3.1, allowing extension modules to be built against a fixed subset of the C API for binary compatibility across Python versions from 3.8 onward. By setting the Py_LIMITED_API macro during compilation (e.g., to 0x03080000), modules can use the .abi3.so naming convention and avoid recompilation when deploying to different Python minor versions, though testing across targets is essential due to potential runtime differences. This feature restricts certain advanced usages, such as inheriting from builtin types in cdef classes or full profiling support, but it minimizes overhead for C-level operations while enabling broader distribution.[38]
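The value given to Py_LIMITED_API follows the same layout as CPython's version hex number, with the major version in the top byte and the minor version in the next. A small sketch of that arithmetic, using a hypothetical helper named limited_api_macro:

```python
def limited_api_macro(major, minor):
    """Return the Py_LIMITED_API value for a minimum Python version.

    The layout mirrors CPython's PY_VERSION_HEX: major version in the top
    byte, minor version in the next byte, remaining bytes zero.
    """
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in one byte")
    return f"0x{(major << 24) | (minor << 16):08X}"


if __name__ == "__main__":
    print(limited_api_macro(3, 8))   # the value quoted in the text
    print(limited_api_macro(3, 12))
```

With setuptools, a value like this would typically be passed through an Extension's define_macros argument together with py_limited_api=True; the details of a given project's build may differ.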
Cython provides partial support for asynchronous programming features like async/await, primarily in Python-mode functions where coroutines can be defined and used with the asyncio library. Async generators, for instance, require CPython 3.6 or later for proper finalization during cleanup. However, cdef functions have limited coroutine capabilities, as they cannot directly use await expressions; instead, they can construct and return awaitable objects for integration with event loops. This enables efficient async code in Cython but falls short of full C-level coroutine support.[52][53]
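The pattern described for cdef functions, building and returning an awaitable rather than awaiting, can be illustrated in plain Python with asyncio. In this sketch make_delayed_result is a hypothetical stand-in for such a function: it is an ordinary function that never uses await, yet a coroutine can await the object it returns.

```python
import asyncio


def make_delayed_result(loop, value, delay=0.01):
    """Plain (non-async) function that builds and returns an awaitable.

    It cannot use `await` itself; instead it creates a Future and asks
    the event loop to complete it after `delay` seconds.
    """
    future = loop.create_future()
    loop.call_later(delay, future.set_result, value)
    return future


async def main():
    loop = asyncio.get_running_loop()
    # The coroutine awaits the object returned by the plain function.
    return await make_delayed_result(loop, 42)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a Cython module the role of make_delayed_result could be played by a C-level function returning a Python object, with the awaiting done in an async def function.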
Compatibility with other Python implementations like Jython and IronPython is partial at best and not recommended for production use, as Cython relies on the CPython C API, which these JVM- and .NET-based runtimes do not fully emulate. While pure Python subsets of Cython code may execute, the generated C extensions fail due to incompatible internals, limiting portability to experimental or legacy scenarios.[54]
Applications and Impact
Scientific and Numerical Computing
Cython plays a pivotal role in scientific and numerical computing by enabling the acceleration of performance-intensive operations, such as loops over large datasets and array manipulations in simulations, through its compilation to optimized C code. This allows researchers to retain Python's expressiveness while achieving near-native speeds for tasks like finite element simulations or Monte Carlo methods, where pure Python would bottleneck due to interpreter overhead. For example, in numerical simulations involving iterative array processing, Cython can deliver speedups of up to 800 times compared to unoptimized Python code for in-cache matrix multiplications.[55]
A key strength of Cython in this domain lies in its seamless integration with NumPy and SciPy, facilitating the creation of typed universal functions (ufuncs) and efficient distance computations on multidimensional arrays. Typed memoryviews provide zero-copy access to NumPy arrays, eliminating data copying overhead and enabling direct manipulation at the C level, which is essential for memory-efficient processing in large-scale scientific workflows. This integration supports operations like vectorized distance metrics in clustering algorithms, where Cython code can outperform pure NumPy implementations by factors of 9 to 11, particularly when combined with contiguous array layouts.[40]
Cython is integral to major projects in scientific computing, including SageMath, where it has been used since 2008 to compile performance-critical mathematical routines for symbolic and numerical computations. In scikit-learn, Cython implements core machine learning algorithms, such as those involving matrix operations for classification and regression. Similarly, spaCy leverages Cython for its NLP pipelines, optimizing numerical aspects like token vector operations and batch processing in language models. These applications highlight Cython's ability to achieve speedups of 10x to 1000x in matrix operations, as demonstrated in benchmarks for linear algebra routines.[56][57][58][55]
Advanced techniques in Cython further enhance its utility for numerical computing, including memoryviews for efficient, zero-copy NumPy array access, which minimizes memory bandwidth usage in simulation loops. Parallelization is achieved through Cython's prange construct, which emits OpenMP directives such as #pragma omp parallel for in the generated C code, allowing multi-threaded execution of independent array operations without the global interpreter lock. Cython is also employed in NumPy's build system via Cython-based distutils extensions for compiling random number generators and other low-level modules. In recent machine learning applications, Cython facilitates GPU acceleration by interfacing with C++ code and libraries like CUDA via its C++ support, enabling faster computations on multi-dimensional arrays.[40][59][60][61]