| PyMC | |
|---|---|
| Other names | PyMC2, PyMC3 |
| Original author | PyMC Development Team |
| Initial release | April 6, 2012 |
| Stable release | 5.26.1[1] |
| Repository | https://github.com/pymc-devs/pymc |
| Written in | Python |
| Operating system | Unix-like, Mac OS X, Microsoft Windows |
| Platform | Intel x86 – 32-bit, x64 |
| Type | Statistical package |
| License | Apache License, Version 2.0 |
| Website | www |
PyMC (formerly known as PyMC3) is a probabilistic programming library for Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.
PyMC performs inference based on advanced Markov chain Monte Carlo and/or variational fitting algorithms.[2][3][4][5][6] It is a rewrite from scratch of the previous version of the PyMC software.[7] Unlike PyMC2, which had used Fortran extensions for performing computations, PyMC relies on PyTensor, a Python library that allows defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays. From version 3.8 PyMC relies on ArviZ to handle plotting, diagnostics, and statistical checks. PyMC and Stan are the two most popular probabilistic programming tools.[8] PyMC is an open source project, developed by the community and has been fiscally sponsored by NumFOCUS.[9]
PyMC has been used to solve inference problems in several scientific domains, including astronomy,[10][11] epidemiology,[12][13] molecular biology,[14] crystallography,[15][16] chemistry,[17] ecology[18][19] and psychology.[20] Previous versions of PyMC were also used widely, for example in climate science,[21] public health,[22] neuroscience,[23] and parasitology.[24][25]
After Theano announced plans to discontinue development in 2017,[26] the PyMC team evaluated TensorFlow Probability as a computational backend,[27] but decided in 2020 to fork Theano under the name Aesara.[28] Large parts of the Theano codebase have been refactored and compilation through JAX[29] and Numba were added. The PyMC team has released the revised computational backend under the name PyTensor and continues the development of PyMC.[30]
Inference engines
PyMC implements non-gradient-based and gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference and stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference.
- MCMC-based algorithms:
- No-U-Turn sampler[31] (NUTS), a variant of Hamiltonian Monte Carlo and PyMC's default engine for continuous variables
- Metropolis–Hastings, PyMC's default engine for discrete variables
- Sequential Monte Carlo for static posteriors
- Sequential Monte Carlo for approximate Bayesian computation
- Variational inference algorithms:
- Black-box Variational Inference[32]
References
[edit]- ^ "Release 5.26.1". 17 October 2025. Retrieved 20 October 2025.
- ^ Abril-Pla O, Andreani V, Carroll C, Dong L, Fonnesbeck CJ, Kochurov M, Kumar R, Lao J, Luhmann CC, Martin OA, Osthege M, Vieira R, Wiecki T, Zinkov R. (2023) PyMC: a modern, and comprehensive probabilistic programming framework in Python. PeerJ Comput. Sci. 9:e1516 doi:10.7717/peerj-cs.1516
- ^ Salvatier J, Wiecki TV, Fonnesbeck C. (2016) Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2:e55 doi:10.7717/peerj-cs.55
- ^ Martin, Osvaldo (2024). Bayesian Analysis with Python. Packt Publishing Ltd. ISBN 978-1-80512-716-1. Retrieved 24 February 2024.
- ^ Martin, Osvaldo; Kumar, Ravin; Lao, Junpeng (2021). Bayesian Modeling and Computation in Python. CRC-press. pp. 1–420. ISBN 978-0-367-89436-8. Retrieved 7 July 2022.
- ^ Davidson-Pilon, Cameron (2015-09-30). Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference. Addison-Wesley Professional. ISBN 978-0-13-390292-1.
- ^ "documentation". Retrieved 2017-09-20.
- ^ "The Algorithms Behind Probabilistic Programming". Retrieved 2017-03-10.
- ^ "NumFOCUS Announces New Fiscally Sponsored Project: PyMC3". NumFOCUS | Open Code = Better Science. Retrieved 2017-03-10.
- ^ Greiner, J.; Burgess, J. M.; Savchenko, V.; Yu, H.-F. (2016). "On the Fermi-GBM Event 0.4 s after GW150914". The Astrophysical Journal Letters. 827 (2): L38. arXiv:1606.00314. Bibcode:2016ApJ...827L..38G. doi:10.3847/2041-8205/827/2/L38. ISSN 2041-8205. S2CID 3529170.
- ^ Hilbe, Joseph M.; Souza, Rafael S. de; Ishida, Emille E. O. (2017-04-30). Bayesian Models for Astrophysical Data: Using R, JAGS, Python, and Stan. Cambridge University Press. ISBN 978-1-108-21074-4.
- ^ Brauner, Jan M.; Mindermann, Sören; Sharma, Mrinank; Johnston, David; Salvatier, John; Gavenčiak, Tom; Stephenson, Anna B.; Leech, Gavin; Altman, George; Mikulik, Vladimir; Norman, Alexander John; Monrad, Joshua Teperowski; Besiroglu, Tamay; Ge, Hong; Hartwick, Meghan A.; Teh, Yee Whye; Chindelevitch, Leonid; Gal, Yarin; Kulveit, Jan (2020-12-15). "Inferring the effectiveness of government interventions against COVID-19". Science. 371 (6531) eabd9338. doi:10.1126/science.abd9338. PMC 7877495. PMID 33323424.
- ^ Systrom, Kevin; Vladek, Thomas; Krieger, Mike. "Rt.live Github repository". Rt.live. Retrieved 10 January 2021.
- ^ Wagner, Stacey D.; Struck, Adam J.; Gupta, Riti; Farnsworth, Dylan R.; Mahady, Amy E.; Eichinger, Katy; Thornton, Charles A.; Wang, Eric T.; Berglund, J. Andrew (2016-09-28). "Dose-Dependent Regulation of Alternative Splicing by MBNL Proteins Reveals Biomarkers for Myotonic Dystrophy". PLOS Genetics. 12 (9) e1006316. doi:10.1371/journal.pgen.1006316. ISSN 1553-7404. PMC 5082313. PMID 27681373.
- ^ Sharma, Amit; Johansson, Linda; Dunevall, Elin; Wahlgren, Weixiao Y.; Neutze, Richard; Katona, Gergely (2017-03-01). "Asymmetry in serial femtosecond crystallography data". Acta Crystallographica Section A. 73 (2): 93–101. Bibcode:2017AcCry..73...93S. doi:10.1107/s2053273316018696. ISSN 2053-2733. PMC 5332129. PMID 28248658.
- ^ Katona, Gergely; Garcia-Bonete, Maria-Jose; Lundholm, Ida (2016-05-01). "Estimating the difference between structure-factor amplitudes using multivariate Bayesian inference". Acta Crystallographica Section A. 72 (3): 406–411. Bibcode:2016AcCry..72..406K. doi:10.1107/S2053273316003430. ISSN 2053-2733. PMC 4850660. PMID 27126118.
- ^ Garay, Pablo G.; Martin, Osvaldo A.; Scheraga, Harold A.; Vila, Jorge A. (2016-07-21). "Detection of methylation, acetylation and glycosylation of protein residues by monitoring 13C chemical-shift changes: A quantum-chemical study". PeerJ. 4 e2253. doi:10.7717/peerj.2253. ISSN 2167-8359. PMC 4963218. PMID 27547559.
- ^ Wang, Yan; Huang, Hong; Huang, Lida; Ristic, Branko (2017). "Evaluation of Bayesian source estimation methods with Prairie Grass observations and Gaussian plume model: A comparison of likelihood functions and distance measures". Atmospheric Environment. 152: 519–530. Bibcode:2017AtmEn.152..519W. doi:10.1016/j.atmosenv.2017.01.014.
- ^ MacNeil, M. Aaron; Chong-Seng, Karen M.; Pratchett, Deborah J.; Thompson, Casssandra A.; Messmer, Vanessa; Pratchett, Morgan S. (2017-03-14). "Age and Growth of An Outbreaking Acanthaster cf. solaris Population within the Great Barrier Reef" (PDF). Diversity. 9 (1): 18. doi:10.3390/d9010018.
- ^ Tünnermann, Jan; Scharlau, Ingrid (2016). "Peripheral Visual Cues: Their Fate in Processing and Effects on Attention and Temporal-Order Perception". Frontiers in Psychology. 7: 1442. doi:10.3389/fpsyg.2016.01442. ISSN 1664-1078. PMC 5052275. PMID 27766086.
- ^ Graham, Nicholas A. J.; Jennings, Simon; MacNeil, M. Aaron; Mouillot, David; Wilson, Shaun K. (2015). "Predicting climate-driven regime shifts versus rebound potential in coral reefs". Nature. 518 (7537): 94–97. Bibcode:2015Natur.518...94G. doi:10.1038/nature14140. PMID 25607371. S2CID 4453338.
- ^ Mascarenhas, Maya N.; Flaxman, Seth R.; Boerma, Ties; Vanderpoel, Sheryl; Stevens, Gretchen A. (2012-12-18). "National, Regional, and Global Trends in Infertility Prevalence Since 1990: A Systematic Analysis of 277 Health Surveys". PLOS Medicine. 9 (12) e1001356. doi:10.1371/journal.pmed.1001356. ISSN 1549-1676. PMC 3525527. PMID 23271957.
- ^ Cavanagh, James F; Wiecki, Thomas V; Cohen, Michael X; Figueroa, Christina M; Samanta, Johan; Sherman, Scott J; Frank, Michael J (2011). "Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold". Nature Neuroscience. 14 (11): 1462–1467. doi:10.1038/nn.2925. PMC 3394226. PMID 21946325.
- ^ Gething, Peter W.; Elyazar, Iqbal R. F.; Moyes, Catherine L.; Smith, David L.; Battle, Katherine E.; Guerra, Carlos A.; Patil, Anand P.; Tatem, Andrew J.; Howes, Rosalind E. (2012-09-06). "A Long Neglected World Malaria Map: Plasmodium vivax Endemicity in 2010". PLOS Neglected Tropical Diseases. 6 (9) e1814. Bibcode:2012PNTDi...6.1814G. doi:10.1371/journal.pntd.0001814. ISSN 1935-2735. PMC 3435256. PMID 22970336.
- ^ Pullan, Rachel L.; Smith, Jennifer L.; Jasrasaria, Rashmi; Brooker, Simon J. (2014-01-21). "Global numbers of infection and disease burden of soil transmitted helminth infections in 2010". Parasites & Vectors. 7: 37. Bibcode:2014PVec....7...37P. doi:10.1186/1756-3305-7-37. ISSN 1756-3305. PMC 3905661. PMID 24447578.
- ^ Lamblin, Pascal (28 September 2017). "MILA and the future of Theano". theano-users (Mailing list). Retrieved 28 September 2017.
- ^ Developers, PyMC (2018-05-17). "Theano, TensorFlow and the Future of PyMC". PyMC Developers. Retrieved 2019-01-25.
- ^ "The Future of PyMC3, or: Theano is Dead, Long Live Theano". PyMC Developers. 27 October 2020. Retrieved 10 January 2021.
- ^ Bradbury, James; Frostig, Roy; Hawkins, Peter; Johnson, Matthew James; Leary, Chris; Maclaurin, Dougal; Necula, George; Paszke, Adam; VanderPlas, Jake; Wanderman-Milne, Skye; Zhang, Qiao. "JAX". GitHub. Retrieved 10 January 2021.
- ^ "PyMC Timeline". PyMC Timeline. Retrieved 10 January 2021.
- ^ Hoffman, Matthew D.; Gelman, Andrew (April 2014). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research. 15: 1593–1623.
- ^ Kucukelbir, Alp; Ranganath, Rajesh; Blei, David M. (June 2015). "Automatic Variational Inference in Stan". arXiv:1506.03431. Bibcode:2015arXiv150603431K.
Further reading
- Martin, Osvaldo (2024). Bayesian Analysis with Python: A Practical Guide to Probabilistic Modeling (Third ed.). Packt. ISBN 978-1-80512-716-1.
External links
- PyMC website
- PyMC source, a Git repository hosted on GitHub
- PyTensor is a Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays.
Overview
Definition and Purpose
PyMC is an open-source probabilistic programming library implemented in Python, designed for constructing and estimating Bayesian statistical models.[2] It facilitates the declarative specification of probabilistic models through random variables and their dependencies, enabling users to represent complex uncertainty in data-driven applications.[2] The primary purpose of PyMC is to allow statisticians and researchers to define intricate probabilistic models intuitively while performing posterior inference to quantify uncertainty in fields such as statistical modeling and probabilistic machine learning.[4] By supporting automatic Bayesian inference on user-defined models, it bridges theoretical model design with practical computation, making it suitable for tasks requiring robust uncertainty estimates.[2]

PyMC emphasizes accessibility for users familiar with statistical notation, featuring a syntax that mirrors natural mathematical descriptions of models without relying on a domain-specific language.[2] This approach, rooted in the probabilistic programming paradigm, promotes ease of learning, customization, and debugging.[2] Over time, PyMC has evolved from its earlier iterations to integrate advanced computational backends, including PyTensor for automatic differentiation, enhancing its flexibility for modern workflows.[2]
Core Components
PyMC relies on PyTensor as its primary tensor computation backend, which enables automatic differentiation, gradient computation, and support for GPU acceleration through optimized expression evaluation of multi-dimensional arrays.[5] PyTensor, a fork of the earlier Aesara library (itself derived from Theano), forms the foundational layer for all symbolic computations in PyMC, allowing for efficient handling of complex probabilistic expressions.[2] At the heart of PyMC's architecture is the pm.Model class, which serves as the central API for defining and encapsulating probabilistic models by grouping random variables, priors, and likelihood functions within a structured context.[6] This class manages the model's components, including deterministic transformations and observed data, ensuring that all elements are interconnected for joint posterior inference.[7]
PyMC provides an extensive distribution library comprising over 50 built-in probability distributions, such as the Normal for continuous outcomes and Bernoulli for binary events, covering both univariate and multivariate cases.[8] Users can extend this library by implementing custom distributions directly through PyTensor's graph-based operations, facilitating the integration of domain-specific probabilistic components.[9]
For inference, PyMC exposes high-level APIs including pm.sample() for Markov chain Monte Carlo (MCMC) sampling and pm.fit() for variational inference approximations, which abstract the underlying computational details while leveraging the PyTensor backend.[10] These functions streamline the process of obtaining posterior samples or approximations from a defined model.[2]
The release of PyMC 4.0 in 2022 marked a significant modular rewrite, decoupling the modeling interface from inference engines to enhance extensibility and allow independent development of each layer.[11] This architectural shift has enabled greater flexibility in integrating new backends and inference methods while maintaining backward compatibility for core modeling tasks.[11]
These components collectively support Bayesian workflows by providing a robust foundation for specifying and computing posteriors in probabilistic models.[2]
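A minimal sketch of how these pieces fit together (the variable names and data below are illustrative, not taken from the PyMC documentation):

```python
import numpy as np
import pymc as pm

data = np.random.normal(loc=1.0, scale=2.0, size=200)  # illustrative observations

with pm.Model() as model:                     # pm.Model groups variables and data
    mu = pm.Normal("mu", mu=0, sigma=10)      # prior from the distribution library
    sigma = pm.HalfNormal("sigma", sigma=5)   # positive-constrained prior
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)  # likelihood bound to data

    idata = pm.sample()                       # MCMC (NUTS) posterior samples
    approx = pm.fit(n=10_000, method="advi")  # variational approximation (ADVI)
```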
History
Origins and Early Versions
PyMC's development originated in 2003, initiated by Chris Fonnesbeck, then a graduate student at the University of Georgia, with the primary goal of generalizing the construction of Metropolis-Hastings samplers to enhance accessibility of Markov chain Monte Carlo (MCMC) methods within Python for Bayesian modeling.[12][13] This effort addressed the need for a flexible tool in ecological and statistical applications, where Fonnesbeck's work focused on probabilistic modeling.[14] By 2005, the package had matured sufficiently for the public release of PyMC 1.0, which incorporated refinements based on feedback from a core group of users primarily affiliated with the University of Georgia's ecology and statistics communities.[12][13] These early adopters applied the software to ecological modeling tasks, such as population dynamics and environmental inference, helping to identify usability improvements in sampler implementation and model specification.[14]

In 2006, Fonnesbeck was joined by David Huard and Anand Patil, leading to the release of PyMC 2.0 in January 2009, which prioritized greater flexibility in model building, performance enhancements through optimized sampling algorithms, and more intuitive user interfaces to broaden adoption beyond niche academic uses.[12] Fonnesbeck remained the lead developer through subsequent iterations, with notable contributions from John Salvatier beginning in 2011, who focused on integrating gradient-based inference methods to complement the existing MCMC capabilities.[12] PyMC 2.2 followed in April 2012, featuring extensive bug fixes, computational optimizations, enhanced plotting tools for trace visualization, support for CSV table output, and facilities for posterior predictive checks to aid model validation.[12] The final update in this early phase, PyMC 2.3, was released on October 31, 2013, introducing compatibility with Python 3 and refined summary plots for better interpretive diagnostics.[12]
Modern Development and Transitions
In the early 2010s, significant advancements in PyMC's capabilities emerged through contributions from key developers. In 2011, John Salvatier initiated work on gradient-based Markov chain Monte Carlo (MCMC) samplers via the mcex package, laying groundwork for more efficient inference methods.[12] By 2012, Salvatier had begun developing Hamiltonian Monte Carlo (HMC) with Theano for automatic differentiation, work that was integrated into PyMC3 as a key feature, enabling advanced gradient-based sampling that improved performance over earlier Metropolis-Hastings approaches.[12][13]

The release of PyMC3 marked a pivotal shift toward modern probabilistic programming. An alpha version of PyMC 3.0 was introduced in June 2015, followed by the full stable release in January 2017, developed by a core team of 12 contributors.[12][15] This version introduced a redesigned object-oriented model specification, the No-U-Turn Sampler (NUTS) for efficient HMC exploration, Automatic Differentiation Variational Inference (ADVI) for scalable approximation, a generalized linear models (GLM) interface for hierarchical modeling, and specialized distributions for time series analysis.[12][16] These enhancements prioritized usability and computational efficiency while maintaining compatibility with prior workflows.

Backend maintenance challenges later prompted strategic forks. In 2020, due to waning upstream support for Theano, PyMC developers created a fork to ensure continued reliability for automatic differentiation and tensor operations.[12][15] This effort culminated in 2021 with the renamed Aesara project, which preserved Theano's core functionality while adding optimizations like just-in-time compilation support.[12][17]

PyMC underwent rebranding and architectural overhaul in the early 2020s to enhance modularity. In June 2022, PyMC 4.0 was released as a major rewrite of PyMC3 (now simply branded as PyMC), retaining the established API for backward compatibility while introducing a more extensible class structure and performance improvements through better integration with Aesara.[11] This update decoupled core components for easier customization, such as sampling engines and model building blocks, without disrupting existing models.[4] PyMC 5.0 was released in December 2022, switching the backend from Aesara to PyTensor, a fork providing improved optimizations and support for backends like JAX and Numba. The PyMC 5.x series has continued iterative enhancements focused on advanced inference and domain-specific tools, reaching version 5.26.1 as of October 2025. A notable addition, introduced in 2023, was the 'do' operator, enabling interventional queries for causal inference by intervening on model variables to simulate counterfactual scenarios.[13][18] These updates build on prior stability, with over 100 GitHub releases since inception, reflecting sustained refinement in areas like gradient computation and sampler diagnostics.

Community growth has paralleled technical evolution, fostering broader adoption. PyMC secured NumFOCUS sponsorship in 2016, providing fiscal and infrastructural support as a nonprofit project.[13][19] Engagement expanded through dedicated Discourse forums for user discussions, regular office hours with core developers, and collaborative GitHub contributions, which have driven feature requests and bug resolutions across versions.[13] This ecosystem has ensured modeling syntax remains intuitive and stable, facilitating transitions across releases.[11]
Modeling Framework
Syntax and Model Specification
PyMC employs a Pythonic syntax for model specification that closely mirrors statistical notation, enabling users to define probabilistic models declaratively within context managers. The core structure revolves around the pm.Model() class, which encapsulates the model's components, including random variables, deterministic transformations, and observed data. This approach leverages PyTensor for symbolic computation, allowing for automatic differentiation and efficient evaluation. All variables defined inside a with pm.Model(): block are automatically associated with that model instance, facilitating modular and hierarchical constructions.[2]
Priors are specified by instantiating distribution objects from the PyMC library, assigning them names and hyperparameters that reflect their role in the model. For instance, a normal prior for an intercept parameter can be defined as alpha = pm.Normal('alpha', mu=0, sigma=10), where the name 'alpha' aids in traceability and posterior analysis. Likelihoods are similarly defined using distributions with an observed argument to bind data, such as obs = pm.Normal('obs', mu=mu, sigma=sigma, observed=data), where mu and sigma are typically functions of prior variables. This syntax ensures that the joint probability model is constructed as a directed acyclic graph (DAG), with dependencies flowing from priors to likelihoods.[2]
Hierarchical models are built using nested sampling statements, where group-level parameters draw from hyperpriors, enabling partial pooling across levels. For example, in a longitudinal growth model, global parameters like intercepts and slopes are defined first, followed by subject-specific deviations: subject_intercept = pm.Normal('subject_intercept', 0, subject_intercept_sigma, dims='ids'), indexed by subject identifiers. The likelihood then incorporates these via indexing, such as growth_model = pm.Deterministic('growth_model', (global_intercept + subject_intercept[id_indx]) + (global_age_beta + subject_age_beta[id_indx]) * age_14). This structure captures variation at multiple levels while sharing information efficiently.[20]
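A compact, self-contained sketch of this partial-pooling pattern with hypothetical group labels (the names and data are illustrative, not from the cited example):

```python
import numpy as np
import pymc as pm

# Illustrative data: observations grouped by subject
n_subjects = 8
subject_idx = np.repeat(np.arange(n_subjects), 20)
x = np.tile(np.linspace(0, 1, 20), n_subjects)
y = 1.0 + 0.5 * x + np.random.normal(0, 0.1, size=x.size)

coords = {"subjects": np.arange(n_subjects)}
with pm.Model(coords=coords) as hierarchical_model:
    # Hyperpriors for the group-level distribution
    mu_intercept = pm.Normal("mu_intercept", 0, 5)
    sigma_intercept = pm.HalfNormal("sigma_intercept", 1)

    # Subject-level intercepts drawn from the group-level distribution
    intercept = pm.Normal("intercept", mu_intercept, sigma_intercept, dims="subjects")
    slope = pm.Normal("slope", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)

    mu = intercept[subject_idx] + slope * x
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)
```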
For non-standard likelihoods, models can be extended using PyTensor operations to define custom deterministic nodes or likelihoods. Users implement custom Op classes to wrap external functions, ensuring compatibility with PyMC's computational graph. An example involves a black-box log-likelihood: define a LogLike class inheriting from pytensor.Op, with a perform method computing the log-likelihood from inputs like parameters and data, then integrate it as loglike = LogLike()(m, c, sigma, x, data) within the model context, using uniform priors on parameters. This allows incorporation of complex, user-defined probability densities without altering core distributions.[21]
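A condensed sketch of that pattern, here for a straight-line model with a hand-written Gaussian log-likelihood; the helper names (my_loglike, LogLike) follow the tutorial's naming, but the code is an illustrative outline rather than the official implementation:

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op

def my_loglike(m, c, sigma, x, data):
    # "Black-box" Gaussian log-likelihood evaluated with plain NumPy
    model = m * x + c
    return -0.5 * np.sum(((data - model) / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

class LogLike(Op):
    def make_node(self, m, c, sigma, x, data):
        inputs = [pt.as_tensor_variable(v) for v in (m, c, sigma, x, data)]
        outputs = [pt.dscalar()]            # a single scalar log-likelihood value
        return Apply(self, inputs, outputs)

    def perform(self, node, inputs, output_storage):
        m, c, sigma, x, data = inputs
        output_storage[0][0] = np.asarray(my_loglike(m, c, sigma, x, data))

x = np.linspace(0, 9, 50)
data = 0.4 * x + 3.0 + np.random.normal(0, 1, size=x.size)

with pm.Model() as blackbox_model:
    m = pm.Uniform("m", -10, 10)
    c = pm.Uniform("c", -10, 10)
    # pm.Potential adds the custom log-likelihood term to the model's joint density.
    # Note: the Op defines no gradient, so gradient-free step methods would be needed to sample.
    pm.Potential("likelihood", LogLike()(m, c, 1.0, x, data))
```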
Best practices emphasize clarity and efficiency in model assembly. Coordinate systems are specified via the coords argument in pm.Model(), such as coords={'predictors': X.columns.values}, to name dimensions and improve interpretability in posterior summaries. Indexing follows NumPy conventions, e.g., beta[0] * X1, for subsetting parameters, while vectorization uses PyTensor functions like pt.dot(X.values, beta) to handle array operations scalably, avoiding loops and enhancing performance in high-dimensional settings. These techniques promote reproducible and computationally tractable models.[2]
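A short sketch combining named coordinates with a vectorized linear predictor (data frame and column names are hypothetical):

```python
import numpy as np
import pandas as pd
import pymc as pm
import pytensor.tensor as pt

X = pd.DataFrame(np.random.randn(100, 3), columns=["x1", "x2", "x3"])
y = np.random.randn(100)

with pm.Model(coords={"predictors": X.columns.values}) as vec_model:
    beta = pm.Normal("beta", 0, 1, dims="predictors")  # one coefficient per named column
    sigma = pm.HalfNormal("sigma", 1)
    mu = pt.dot(X.values, beta)                        # vectorized linear predictor, no Python loop
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)
```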
A representative example is the classic coin flip model, estimating the bias of a Bernoulli process with a beta prior. The following code snippet defines the model:
```python
import pymc as pm
import numpy as np

# Simulated data: 100 flips, roughly 60 heads
data = np.random.binomial(1, 0.6, 100)

with pm.Model() as coin_model:
    # Prior: Beta distribution for the probability of heads
    p = pm.Beta('p', alpha=1, beta=1)
    # Likelihood: Bernoulli distribution with observed flips
    obs = pm.Bernoulli('obs', p=p, observed=data)
```
Here, p represents uncertainty in the coin's fairness, while the Bernoulli likelihood binds the observed binary outcomes (heads as 1, tails as 0). This simple structure exemplifies the declarative syntax, where the model graph is implicitly built for subsequent inference.[22]
Distributions and Custom Components
PyMC offers a rich library of built-in probability distributions that serve as foundational building blocks for constructing probabilistic models, enabling users to specify priors, likelihoods, and latent variables with high flexibility. These distributions are implemented using PyTensor for symbolic computation, allowing seamless integration into the model's computational graph for automatic differentiation and inference.

Among the continuous distributions, key examples include the Normal distribution, parameterized by mean $\mu$ and standard deviation $\sigma$, which is widely used for modeling symmetric, unbounded data; the StudentT distribution, with location $\mu$, scale $\sigma$, and degrees of freedom $\nu$, suitable for heavier-tailed observations; and the HalfNormal distribution, a truncated version of the Normal restricted to positive values, often applied to variance parameters like standard deviations.[23] PyMC provides 34 such continuous distributions, covering a range from standard forms like Exponential and Gamma to specialized ones like SkewNormal and TruncatedNormal.[23] For discrete data, PyMC includes distributions such as the Bernoulli, which models binary outcomes with success probability $p$; the Binomial, for the number of successes in $n$ trials each with probability $p$; and the Poisson, ideal for count data with rate parameter $\lambda$. There are 12 discrete distributions in total, supporting applications like categorical outcomes via Categorical and ordered data via OrderedLogistic.[24]

Multivariate and advanced distributions extend this capability to higher dimensions and complex structures. The MultivariateNormal (MvNormal) distribution, defined by mean vector $\mu$ and covariance matrix $\Sigma$, handles correlated continuous variables, while the gp module facilitates Gaussian processes through classes like Latent for prior specification on functions and Marginal for likelihood modeling, with covariance functions such as ExpQuad for squared exponential kernels and Matern52 for less smooth alternatives. For scalable approximations in Gaussian processes, PyMC supports inducing point methods via pm.gp.MarginalApprox, the updated version of the older MarginalSparse class, which implements deterministic sparse approximations including FITC (Fully Independent Training Conditional), DTC (Deterministic Training Conditional), and VFE (Variational Free Energy). These methods reduce computational complexity from O(n^3) to O(n m^2), where n is the number of data points and m is the number of inducing points, by using a small set of inducing points to avoid the full n × n covariance matrix; this is a marginal likelihood-based sparse approximation, not stochastic variational inference. Additionally, PyMC provides Hilbert Space Gaussian Processes (HSGP) as a scalable GP approximation using a low-rank basis function expansion with eigenfunctions of the Laplacian, similar to random Fourier features or spectral methods, scaling linearly with the number of data points (O(n m), where m ≪ n is the number of basis functions), making it suitable for large datasets of thousands to tens of thousands of points.
HSGP works best for 1D/2D inputs and stationary kernels, supports any likelihood, and is implemented via the pm.gp.HSGP and pm.gp.HSGPPeriodic classes, the latter supporting periodic covariance functions.[25][26][27][28][29] In total, PyMC includes over 50 distributions across these categories, many of which support conjugate prior relationships (such as Normal with Normal likelihoods or Gamma with Poisson) and provide methods for moment calculations, including .mean(), .var(), and .median() for analytical summaries without simulation.[8]

To enhance model flexibility, users can define custom distributions by subclassing pm.Distribution (or pm.Continuous/Discrete for appropriate support) and implementing the logp method using PyTensor operations to compute the log-probability density symbolically. For instance, the logp function must handle vectorized inputs and bounds, often incorporating PyTensor functions like pt.switch for domain checks. A helper class, pm.CustomDist, simplifies this for black-box logp and random methods when full symbolic support is unnecessary. Additionally, for parameter constraints, transforms such as Log (mapping the real line back to positive reals via $e^x$) and Softplus (ensuring positivity through $\log(1 + e^x)$, computable via pm.math.softplus) are applied automatically or manually to maintain valid support during sampling.[9][30] In joint models composed of multiple distributions, the overall log-probability is the sum of the individual log-probabilities, $\log p(y, \theta) = \sum_i \log p_i$; this is computed efficiently in PyMC using PyTensor's summation over the logp outputs of each component.
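A brief sketch of the pm.CustomDist route, here for a Laplace-like density written directly as a logp function (the distribution, names, and data are illustrative):

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

def laplace_logp(value, mu, b):
    # Log-density of a Laplace distribution, written with PyTensor ops
    return -pt.log(2 * b) - pt.abs(value - mu) / b

data = np.random.laplace(loc=0.0, scale=1.0, size=200)

with pm.Model() as custom_model:
    mu = pm.Normal("mu", 0, 10)
    b = pm.HalfNormal("b", 5)  # scale parameter, kept positive by an automatic transform
    pm.CustomDist("obs", mu, b, logp=laplace_logp, observed=data)
```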
Inference Techniques
Sampling Methods
PyMC employs Markov chain Monte Carlo (MCMC) methods to perform exact Bayesian inference by generating a sequence of samples from the posterior distribution through Markov chains that converge to the target distribution under mild conditions.[2] These chains are constructed such that each sample depends only on the previous one, ensuring ergodicity and allowing the empirical distribution of samples to approximate the posterior as the chain length increases.[2]

The Metropolis-Hastings algorithm serves as a foundational MCMC sampler in PyMC, implementing a random walk proposal mechanism where candidate states are accepted or rejected based on the posterior ratio to maintain detailed balance.[31] Tuning is essential for efficiency, with the acceptance rate targeted between 20% and 50% to balance exploration and exploitation; PyMC's adaptive variants adjust proposal scales automatically during warmup, approaching the asymptotically optimal rate of about 0.234 for high-dimensional Gaussian targets.[31]

For more efficient sampling in continuous parameter spaces, PyMC implements Hamiltonian Monte Carlo (HMC), which leverages gradient information from the log-posterior to propose distant moves while preserving the target distribution through Hamiltonian dynamics.[2] The dynamics are governed by the Hamiltonian function $H(q, p) = U(q) + K(p)$, where $q$ denotes position variables (model parameters), $p$ momentum variables, $U(q)$ the potential energy (the negative log-posterior up to normalization), and $K(p) = \tfrac{1}{2} p^{\top} M^{-1} p$ the kinetic energy, typically with mass matrix $M$.[32] This approach reduces random walk behavior, enabling faster exploration of high-dimensional posteriors compared to Metropolis-Hastings.[2]

The No-U-Turn Sampler (NUTS), an adaptive extension of HMC, is PyMC's default sampler for continuous variables, automatically tuning the integration step size, trajectory length, and mass matrix to avoid inefficient U-turns in the Hamiltonian path.[2] By employing a binary tree exploration and stopping criteria based on generalized No-U-Turn conditions, NUTS achieves robust performance without manual specification of HMC hyperparameters.[33] NUTS was introduced as the primary sampler in PyMC3 in 2016, markedly improving efficiency over prior methods for complex models.[31]

PyMC also supports Sequential Monte Carlo (SMC) methods for inference, particularly suited for dynamic models, multimodal posteriors, or estimating model evidence. SMC uses a population of particles that evolve through sequential importance sampling with resampling and optional MCMC kernels within each step to maintain diversity. In PyMC, SMC is invoked via the pm.sample_smc function, which implements adaptive tempering schedules and supports parallel execution across multiple chains. Key parameters include draws for the number of particles, kernel for the choice of proposal (e.g., Metropolis), and parallel for multiprocessing. SMC provides exact inference like MCMC but can be more efficient for certain hierarchical or time-series models by avoiding autocorrelation issues in long chains.[34]
Sampling in PyMC is invoked via the pm.sample function, which by default uses NUTS for eligible variables and falls back to Metropolis for discrete ones; key parameters include tune for warmup iterations (default 1000), draws for post-warmup samples, chains for the number of chains (default 4), and cores for how many chains to run in parallel.[10] Running several independent chains in this way supports convergence assessment across chains.[10]
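A minimal, self-contained sketch of these calls (the model and data are illustrative placeholders):

```python
import numpy as np
import pymc as pm

data = np.random.normal(0.5, 1.0, size=100)  # illustrative observations

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    # NUTS is assigned to the continuous parameters; Metropolis would be used for discrete ones
    idata = pm.sample(draws=2000, tune=1000, chains=4, random_seed=42)

    # Sequential Monte Carlo as an alternative, e.g. for multimodal posteriors
    idata_smc = pm.sample_smc(draws=2000)
```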
Convergence and quality of MCMC samples in PyMC are evaluated using diagnostics such as the Gelman-Rubin statistic (R-hat), which compares between-chain and within-chain variances across multiple chains; values below 1.1 indicate adequate convergence. Additionally, the effective sample size (ESS) measures autocorrelation-adjusted independence, with recommendations of at least 100 per chain for reliable inference.[35] These metrics, computed via integrated tools like ArviZ, guide practitioners in discarding burn-in or increasing chain length if needed.[2] Similar diagnostics apply to SMC outputs, including effective sample size and weight degeneracy checks.
Approximation Methods
Variational inference (VI) in PyMC provides a scalable approach to approximating the posterior distribution by optimizing a variational family of distributions to closely match the true posterior. The core principle involves maximizing the Evidence Lower Bound (ELBO), a lower bound on the log marginal likelihood, defined as $\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}[\log p(y \mid \theta)] - \mathrm{KL}(q(\theta) \,\|\, p(\theta))$, where $q(\theta)$ is the variational posterior approximation, $p(\theta)$ is the prior, and KL denotes the Kullback-Leibler divergence. This optimization turns the intractable integration of Bayesian inference into a tractable optimization problem, often solved using stochastic gradient methods.[36]

Automatic Differentiation Variational Inference (ADVI) is a key implementation in PyMC, employing a mean-field approximation that assumes independence among parameters in the variational posterior, typically modeled as a fully factorized Gaussian distribution. It leverages automatic differentiation to compute gradients efficiently and optimizes the ELBO via stochastic gradient descent or variants like Adam. This approach enables rapid inference, particularly for high-dimensional models.[36][37] For more accurate approximations that capture posterior correlations, PyMC supports FullRankADVI, which uses a multivariate Gaussian variational family with a full-rank covariance matrix, avoiding the independence assumption of mean-field ADVI at the cost of increased computational demands. Additionally, black-box VI in PyMC allows flexible variational families, such as those based on normalizing flows, which transform a simple base distribution (e.g., Gaussian) through a series of invertible mappings to better fit complex posteriors. Normalizing flows enhance expressiveness, enabling multimodal or non-Gaussian approximations without custom derivations.[38][39]

The PyMC API simplifies VI usage through the pm.fit function, which fits the variational approximation and returns an approximation object from which samples can be drawn; for example, approx = pm.fit(n=10000, method='advi') performs 10,000 optimization steps using ADVI and allows subsequent sampling via approx.sample(). Users can specify methods like 'fullrank_advi' or customize approximations with flows.[40][41]
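A short sketch of this API on a toy model (model and data are placeholders, not from the cited examples):

```python
import numpy as np
import pymc as pm

data = np.random.normal(0.5, 1.0, size=1000)  # illustrative observations

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    approx = pm.fit(n=10_000, method="advi")             # mean-field ADVI
    # approx = pm.fit(n=10_000, method="fullrank_advi")  # full-rank Gaussian alternative
    idata_vi = approx.sample(1000)                        # draw samples from the fitted approximation
```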
VI methods in PyMC offer advantages in scalability to large datasets and high-dimensional parameters, providing faster approximations compared to exact MCMC sampling, though they trade off some accuracy for speed. ADVI and the more general Operator Variational Inference (OPVI) framework were introduced in PyMC3 in 2017, with enhancements in PyMC 4.0 (released 2022) including JAX backend support for accelerated computation on GPUs.[42][43][11] Subsequent releases in PyMC 5.x (starting 2023) have further improved these capabilities, such as enhanced normalizing flow support and better integration with JAX and Numba for GPU acceleration, as of version 5.26 in October 2025.[44][45]
Ecosystem and Integration
Supporting Libraries
PyMC's functionality is extended through a rich ecosystem of supporting libraries that handle backend computations, data management, visualization, and specialized modeling interfaces. These tools integrate seamlessly with PyMC, enabling users to build, fit, and analyze complex Bayesian models more efficiently.[46]

Central to PyMC is PyTensor, an open-source Python library serving as the backend for tensor operations, automatic differentiation, and just-in-time (JIT) compilation of mathematical expressions involving multi-dimensional arrays. PyTensor allows PyMC to optimize and evaluate probabilistic expressions efficiently, supporting dynamic graph construction and multiple execution backends like NumPy or JAX.[5] ArviZ provides exploratory analysis capabilities for Bayesian models, offering tools for posterior analysis, data storage, model checking, and visualization such as trace plots and posterior summaries. Developed starting in 2018, ArviZ aims to standardize diagnostics and workflows across probabilistic programming libraries like PyMC, PyStan, and TensorFlow Probability, using a unified InferenceData structure based on xarray for interoperability. Bambi offers a high-level interface for specifying and fitting generalized linear models (GLMs) and extensions like mixed-effects models directly on top of PyMC, simplifying syntax for common statistical tasks while leveraging PyMC's inference engines.[47][48]

Additional tools in the ecosystem include xarray, which facilitates multidimensional data handling and is integral to ArviZ's data structures for storing and manipulating Bayesian outputs. The pymc-numpyro bridge enables integration with JAX-based inference from NumPyro, allowing users to leverage accelerated computations within PyMC workflows.[46] The PyMC ecosystem has grown to include professional support through PyMC Labs, a consultancy composed of core PyMC developers offering expertise in applied Bayesian modeling for industry applications. PyMC models are commonly integrated with Jupyter notebooks for interactive development and visualization, enhancing reproducibility and exploration in data science pipelines.[49][50][51]
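For instance, a Bambi formula-style model (the data frame and column names below are hypothetical) compiles down to an ordinary PyMC model and is fitted with PyMC's samplers:

```python
import bambi as bmb
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.linspace(0, 1, 50)})
df["y"] = 2.0 + 0.5 * df["x"] + np.random.normal(0, 0.1, size=50)

model = bmb.Model("y ~ x", df)  # R-style formula interface
idata = model.fit(draws=1000)   # runs PyMC's NUTS sampler under the hood
```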
Tools for Analysis and Diagnostics
PyMC provides a suite of integrated tools for posterior analysis, convergence diagnostics, and model validation, enabling users to assess the reliability and performance of Bayesian models after inference. These tools leverage the tight integration with ArviZ, a companion library for exploratory analysis of Bayesian models, which has been supported since the PyMC3 era to facilitate seamless workflows from sampling to diagnostics.[11][52] Key functionalities include posterior predictive checks, divergence detection in sampling traces, summary statistics for trace inspection, leave-one-out cross-validation for model comparison, and methods for prior sensitivity analysis, all designed to identify potential biases, ensure convergence, and evaluate predictive accuracy without requiring extensive custom coding.[53]

Posterior predictive checks (PPCs) are a fundamental validation technique in PyMC, where simulated data generated from the posterior distribution is compared to observed data to verify model fit. The pm.sample_posterior_predictive() function draws from the posterior predictive distribution using samples from the fitted model's trace, allowing users to assess whether the model can reproduce key features of the data, such as distributions or patterns.[54] For instance, in a linear regression model, PPCs might plot replicated datasets against observations to detect systematic discrepancies, highlighting issues like underdispersion or multimodality in the model's predictions.[55] This process supports iterative model refinement by quantifying discrepancies through metrics like root mean squared error between simulated and observed summaries.[55]
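A minimal sketch of this check on a toy model (names and data are illustrative):

```python
import arviz as az
import numpy as np
import pymc as pm

y = np.random.normal(1.0, 2.0, size=200)  # illustrative observations

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)

    idata = pm.sample()
    idata.extend(pm.sample_posterior_predictive(idata))

az.plot_ppc(idata)  # overlay replicated datasets on the observed data
```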
Divergence warnings in PyMC's No-U-Turn Sampler (NUTS) flag regions of the posterior where the Hamiltonian Monte Carlo trajectory encounters numerical instability, often due to funnel-like geometries or high curvature. These warnings arise when the simulated energy exceeds a threshold during leapfrog integration steps, indicating potential biased sampling.[56] To diagnose problematic areas, users can generate energy plots via ArviZ's az.plot_energy() function, which visualizes the difference between observed and expected energies across iterations; a "funnel" pattern in these plots reveals regions where the sampler struggles to explore, such as near parameter boundaries.[2] Mitigating divergences typically involves reparameterization, like centering predictors in hierarchical models, to flatten the posterior landscape and reduce the warning rate below 1-2% of samples.[57]
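In practice these checks are a couple of ArviZ calls on the trace; here idata is a placeholder for the InferenceData returned by pm.sample() with NUTS:

```python
import arviz as az

az.plot_energy(idata)  # a funnel-shaped mismatch between the two distributions indicates problematic geometry
n_divergent = int(idata.sample_stats["diverging"].sum())  # count of divergent transitions
```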
Trace inspection is facilitated through ArviZ's az.summary() function, which computes essential diagnostics from PyMC traces, including the Gelman-Rubin statistic (R-hat) for chain convergence, effective sample size (ESS) for autocorrelation assessment, and highest density intervals (HDIs) as credible intervals. R-hat values close to 1 (ideally below 1.01) indicate well-mixed chains, while ESS should exceed 100-400 per parameter to ensure reliable Monte Carlo estimates.[58] For example, in a multilevel model, az.summary(trace) might reveal low ESS for variance components, prompting increased thinning or longer chains. HDIs provide probabilistic bounds, such as a 94% interval capturing the range where the true parameter lies with high posterior probability.[59] These summaries are output as pandas DataFrames, enabling quick tabular review of means, standard deviations, and diagnostics across multiple variables.[58]
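A typical summary call, again on a placeholder idata from pm.sample():

```python
import arviz as az

summary = az.summary(idata, hdi_prob=0.94)  # mean, sd, HDI bounds, ESS, and R-hat per variable
print(summary[["mean", "sd", "hdi_3%", "hdi_97%", "ess_bulk", "r_hat"]])
```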
Model comparison in PyMC employs Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO) via az.loo(), approximating out-of-sample predictive performance without refitting the model for each data point. This method computes the expected log pointwise predictive density (ELPD), where higher values indicate better predictive accuracy, stabilized by Pareto smoothing to handle influential observations.[53] For comparing multiple models, az.compare() ranks them by LOO differences and standard errors, favoring those with substantial ELPD gains (e.g., >2 SE difference) while penalizing complexity implicitly through the cross-validation. In practice, for competing hierarchical and pooled models, LOO might show the hierarchical variant improving ELPD by 5-10 nats, signifying superior generalization.[53] Enhancements in PyMC 4.0, including JAX backend optimizations, accelerated LOO computations by enabling faster log-likelihood evaluations on GPU hardware.[11]
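A sketch of such a comparison, assuming idata_pooled and idata_hierarchical are fitted models whose traces include pointwise log-likelihoods:

```python
import arviz as az

# Pointwise log-likelihoods can be requested at sampling time,
# e.g. pm.sample(idata_kwargs={"log_likelihood": True})
comparison = az.compare(
    {"pooled": idata_pooled, "hierarchical": idata_hierarchical}, ic="loo"
)
print(comparison)  # models ranked by ELPD, with standard errors of the differences
```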
Sensitivity analysis assesses the robustness of inferences to prior specifications by systematically varying prior distributions and re-running models to observe changes in posterior summaries. In PyMC workflows, this involves fitting the model with alternative priors—such as shifting from weakly informative Normal(0, 10) to more diffuse or informative alternatives—and comparing posterior means or credible intervals via tools like az.summary().[60] For example, in regression coefficients, if posteriors remain stable across a range of prior variances, the analysis demonstrates low sensitivity; otherwise, it signals the need for data-driven prior elicitation. Power-scaling priors offer a formal approach, scaling prior precision to quantify sensitivity without exhaustive refits.[61] This practice ensures conclusions are not overly influenced by subjective prior choices, promoting reproducible Bayesian modeling.[61]
Applications
Statistical Modeling Examples
PyMC facilitates the implementation of standard Bayesian statistical models through its probabilistic programming interface, allowing users to specify priors, likelihoods, and observed data in a concise manner. These models provide full posterior distributions, enabling quantification of uncertainty in parameter estimates, which is particularly useful for hypothesis testing and decision-making under uncertainty. Examples from the PyMC documentation and example gallery illustrate these concepts, often emphasizing the role of prior choices in model behavior via prior predictive checks.[55]

In linear regression, PyMC models the relationship between a predictor and outcome using normal priors on the intercept and slope coefficients, with a normal likelihood centered on the linear predictor. For instance, weakly informative priors such as Normal(0, 20) for the slope and intercept, and HalfCauchy(10) for the noise standard deviation, are specified to reflect plausible ranges without undue influence. The model is defined as:

```python
with pm.Model() as linear_model:
    sigma = pm.HalfCauchy("sigma", beta=10)
    intercept = pm.Normal("Intercept", 0, sigma=20)
    slope = pm.Normal("slope", 0, sigma=20)
    mu = intercept + slope * x
    y_obs = pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    trace = pm.sample(3000)
```
For binary or binomial outcomes, a Bayesian logistic regression follows the same pattern: standard normal priors on the coefficients, a linear predictor passed through the inverse-logit link, and a Binomial likelihood for the observed counts:

```python
with pm.Model() as logistic_model:
    beta0 = pm.Normal("beta0", 0, 1)
    beta1 = pm.Normal("beta1", 0, 1)
    mu = beta0 + beta1 * x
    p = pm.Deterministic("p", pm.math.invlogit(mu))
    y_obs = pm.Binomial("y", n=20, p=p, observed=y)
    trace = pm.sample(1000)
```
Likewise, a beta–Bernoulli model estimates the bias of a coin from observed flips, placing a uniform Beta(1, 1) prior on the probability of heads:

```python
with pm.Model() as coin_model:
    p = pm.Beta("p", 1, 1)
    y_obs = pm.Bernoulli("y", p=p, observed=y_heads_tails)
    trace = pm.sample()
```
Advanced Use Cases
PyMC supports advanced applications in non-parametric regression through its Gaussian process (GP) module, which facilitates modeling of complex spatial and temporal dependencies. The gp submodule provides implementations such as gp.Latent for latent function priors and gp.Marginal for likelihood-based inference, enabling the construction of multivariate normal distributions with customizable mean and covariance functions like the exponential quadratic or Matérn kernels. These features allow users to compose GPs additively or multiplicatively, making them suitable for hierarchical spatial models, such as predicting radon concentrations across counties using chordal distances on a spherical Earth, where a Matérn32 kernel captures smooth spatial variations with a length scale prior of approximately 200 km.[64][65]
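A brief sketch of the gp.Marginal pattern with a Matérn 3/2 kernel on synthetic one-dimensional data (not the radon example cited above; in recent PyMC versions the observation-noise argument of marginal_likelihood is named sigma):

```python
import numpy as np
import pymc as pm

X = np.linspace(0, 10, 50)[:, None]                # inputs must be (n, 1)-shaped
y = np.sin(X).ravel() + 0.3 * np.random.randn(50)

with pm.Model() as gp_model:
    ell = pm.Gamma("ell", alpha=2, beta=1)         # length scale
    eta = pm.HalfNormal("eta", sigma=2)            # signal amplitude
    cov = eta**2 * pm.gp.cov.Matern32(input_dim=1, ls=ell)

    gp = pm.gp.Marginal(cov_func=cov)
    noise = pm.HalfNormal("noise", sigma=1)
    y_ = gp.marginal_likelihood("y", X=X, y=y, sigma=noise)

    idata = pm.sample()
```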
In causal inference, PyMC incorporates the do operator, introduced in September 2023, to perform interventional queries within do-calculus frameworks. This operator transforms a Bayesian model by fixing specified random variables to constant values, effectively simulating interventions while severing confounding edges in the causal graph. For instance, in marketing analysis, applying do(ads=0) isolates the effect of turning off Google Ads on sales, yielding an average treatment effect of 0.06 units after sampling from the intervened posterior.[18]
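A minimal sketch of an interventional query, assuming the do operator is exposed as pm.do (as in recent releases) and using hypothetical variable names rather than the marketing model described above:

```python
import pymc as pm

with pm.Model() as m:
    ads = pm.Normal("ads", mu=1.0, sigma=0.5)
    sales = pm.Normal("sales", mu=2.0 + 0.06 * ads, sigma=0.3)

# Intervene: fix ads to zero, removing its dependence on upstream causes
m_do = pm.do(m, {ads: 0.0})

with m_do:
    idata_do = pm.sample_prior_predictive()  # simulate from the intervened model
```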
For time series analysis, PyMC enables robust forecasting via autoregressive (AR) models with heavy-tailed innovations, such as those using the Student T distribution to handle outliers. An AR(2) process can be specified using the scan mechanism for sequential dependencies, where innovations follow a Student T distribution with degrees of freedom greater than 2 to ensure finite variance, improving predictions in noisy data like patient disease progression. Conditional posterior predictives from such models produce tighter credible intervals compared to unconditional ones, enhancing forecast reliability.[66]
PyMC extends to probabilistic machine learning by integrating with PyTensor, allowing hybrid models that combine Bayesian inference with deep learning architectures. Users can define neural networks using PyTensor operations, such as tanh activations in hidden layers, within PyMC's probabilistic framework, as seen in Bayesian neural networks for binary classification tasks. This integration supports scalable inference through variational methods like automatic differentiation variational inference (ADVI), which approximates posteriors efficiently on large datasets via mini-batches, achieving over 94% accuracy in under 10 seconds on 30,000 iterations. PyMC thus plays a key role in scalable Bayesian deep learning by leveraging these variational techniques to quantify uncertainty in neural network predictions.[67]
A notable real-world application involves epidemiological modeling of COVID-19 transmission, where PyMC employs hierarchical priors to capture evolving reproduction rates across regions. Thomas Wiecki's model uses a Gaussian random walk prior on the reproduction number to flexibly account for interventions like social distancing, structuring variability temporally and potentially across locations through hierarchical levels. This approach fits observed case data while enabling policy scenario comparisons, demonstrating PyMC's utility in public health forecasting during the pandemic.[68]