PyMC

from Wikipedia
Other names: PyMC2, PyMC3
Original author: PyMC Development Team
Initial release: April 6, 2012
Stable release: 5.26.1[1] / 17 October 2025
Repository: https://github.com/pymc-devs/pymc
Written in: Python
Operating system: Unix-like, Mac OS X, Microsoft Windows
Platform: Intel x86 – 32-bit, x64
Type: Statistical package
License: Apache License, Version 2.0
Website: www.pymc.io

PyMC (formerly known as PyMC3) is a probabilistic programming library for Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.

PyMC performs inference based on advanced Markov chain Monte Carlo and/or variational fitting algorithms.[2][3][4][5][6] It is a rewrite from scratch of the previous version of the PyMC software.[7] Unlike PyMC2, which used Fortran extensions for performing computations, PyMC relies on PyTensor, a Python library that allows defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays. From version 3.8, PyMC relies on ArviZ to handle plotting, diagnostics, and statistical checks. PyMC and Stan are the two most popular probabilistic programming tools.[8] PyMC is an open-source project developed by the community and fiscally sponsored by NumFOCUS.[9]

PyMC has been used to solve inference problems in several scientific domains, including astronomy,[10][11] epidemiology,[12][13] molecular biology,[14] crystallography,[15][16] chemistry,[17] ecology[18][19] and psychology.[20] Previous versions of PyMC were also used widely, for example in climate science,[21] public health,[22] neuroscience,[23] and parasitology.[24][25]

After Theano announced plans to discontinue development in 2017,[26] the PyMC team evaluated TensorFlow Probability as a computational backend,[27] but decided in 2020 to fork Theano under the name Aesara.[28] Large parts of the Theano codebase have been refactored and compilation through JAX[29] and Numba were added. The PyMC team has released the revised computational backend under the name PyTensor and continues the development of PyMC.[30]

Inference engines


PyMC implements non-gradient-based and gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference and stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference.

See also

  • Stan is a probabilistic programming language for statistical inference written in C++
  • ArviZ is a Python library for exploratory analysis of Bayesian models
  • Bambi is a high-level Bayesian model-building interface based on PyMC

from Grokipedia
PyMC is an open-source framework written in Python that enables the specification, fitting, and analysis of complex Bayesian statistical models and probabilistic applications through an intuitive, high-level API. It supports a wide range of inference algorithms, including advanced Markov chain Monte Carlo (MCMC) methods such as the No-U-Turn Sampler (NUTS) and Hamiltonian Monte Carlo (HMC), as well as variational inference (VI) and sequential Monte Carlo (SMC), allowing users to perform posterior inference on hierarchical models, Gaussian processes, and differential equations without deep expertise in the underlying numerical methods.

Originally developed in 2003 as a tool for Bayesian modeling in Python, PyMC has undergone several major revisions, with PyMC3 released in 2017 introducing gradient-based sampling via integration with Theano, and PyMC 5 launched in 2022 as a comprehensive rewrite that shifted to PyTensor, a fork of Theano, for enhanced symbolic computation and compatibility with backends like JAX and Numba. This evolution addressed performance limitations in earlier versions, such as PyMC 2.0's reliance on basic Metropolis-Hastings sampling, by incorporating state-of-the-art algorithms and optimizations for scalability on CPUs and GPUs. As of 2025, PyMC is maintained under version 5.x by the PyMC development team, with ongoing contributions from a global community hosted on GitHub and supported by NumFOCUS.

At its core, PyMC leverages PyTensor for compiling probabilistic models into efficient computational graphs, supporting numerous probability distributions, custom likelihoods, and deterministic transformations while integrating seamlessly with the Python scientific ecosystem, including NumPy for array operations and pandas for data handling. It emphasizes modularity, allowing extensions in languages like C++ for specialized samplers, and provides built-in diagnostics for convergence assessment, such as effective sample size and R-hat statistics. Compared to alternatives like Stan or Pyro, PyMC prioritizes Python-native syntax and ease of use for rapid prototyping, making it particularly suitable for researchers across scientific fields.

PyMC's ecosystem extends through companion libraries such as ArviZ for posterior analysis and visualization, Bambi for generalized linear mixed models, and PyMC-Experimental for cutting-edge features like JAX-accelerated sampling, fostering a collaborative environment via community forums and annual meetups. Licensed under the Apache 2.0 license, it encourages open contributions and has been cited in thousands of peer-reviewed publications across a wide range of applications.

Overview

Definition and Purpose

PyMC is an open-source library implemented in Python, designed for constructing and estimating Bayesian statistical models. It facilitates the declarative specification of probabilistic models through random variables and their dependencies, enabling users to represent complex uncertainty in data-driven applications. The primary purpose of PyMC is to allow statisticians and researchers to define intricate probabilistic models intuitively while performing posterior inference to quantify uncertainty in fields such as statistical modeling and probabilistic machine learning. By supporting automatic posterior inference on user-defined models, it bridges theoretical model design with practical computation, making it suitable for tasks requiring robust uncertainty estimates.

PyMC emphasizes accessibility for users familiar with statistical notation, featuring a syntax that mirrors natural mathematical descriptions of models without relying on a domain-specific language. This approach, rooted in the probabilistic programming paradigm, promotes ease of learning, customization, and debugging. Over time, PyMC has evolved from its earlier iterations to integrate advanced computational backends, including PyTensor for automatic differentiation, enhancing its flexibility for modern workflows.

Core Components

PyMC relies on PyTensor as its primary tensor computation backend, which enables automatic differentiation, gradient computation, and support for GPU acceleration through optimized expression evaluation of multi-dimensional arrays. PyTensor, a fork of the earlier Aesara library (itself derived from Theano), forms the foundational layer for all symbolic computations in PyMC, allowing for efficient handling of complex probabilistic expressions.

At the heart of PyMC's architecture is the pm.Model class, which serves as the central context for defining and encapsulating probabilistic models, grouping random variables, priors, and likelihood functions within a structured container. This class manages the model's components, including deterministic transformations and observed data, ensuring that all elements are interconnected for joint posterior inference.

PyMC provides an extensive distribution library comprising over 50 built-in probability distributions, such as Normal for continuous outcomes and Bernoulli for binary events, covering both univariate and multivariate cases. Users can extend this library by implementing custom distributions directly through PyTensor's graph-based operations, facilitating the integration of domain-specific probabilistic components.

For inference, PyMC exposes high-level APIs including pm.sample() for Markov chain Monte Carlo (MCMC) sampling and pm.fit() for variational inference approximations, which abstract the underlying computational details while leveraging the PyTensor backend. These functions streamline the process of obtaining posterior samples or approximations from a defined model.

The release of PyMC 4.0 in 2022 marked a significant modular rewrite, decoupling the modeling interface from the inference engines to enhance extensibility and allow independent development of each layer. This architectural shift has enabled greater flexibility in integrating new backends and inference methods while maintaining backward compatibility for core modeling tasks. These components collectively support Bayesian workflows by providing a robust foundation for specifying and computing posteriors in probabilistic models.
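
As a minimal sketch of how these pieces fit together, the following toy model (the data, priors, and iteration counts are illustrative) defines priors and a likelihood inside a pm.Model context and then calls both high-level inference APIs:

python

import numpy as np
import pymc as pm

# Illustrative data: 200 noisy observations around a true mean of 3
rng = np.random.default_rng(0)
data = rng.normal(3.0, 1.0, size=200)

with pm.Model() as model:                    # pm.Model groups all random variables
    mu = pm.Normal("mu", mu=0, sigma=10)     # prior on the mean
    sigma = pm.HalfNormal("sigma", sigma=5)  # prior on the noise scale
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)  # likelihood bound to data

    idata = pm.sample(draws=1000)            # MCMC posterior samples (NUTS)
    approx = pm.fit(n=10_000)                # ADVI variational approximation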

History

Origins and Early Versions

PyMC's development originated in 2003, initiated by Chris Fonnesbeck, then a graduate student at the University of Georgia, with the primary goal of generalizing the construction of Metropolis-Hastings samplers to make Markov chain Monte Carlo (MCMC) methods more accessible within Python for Bayesian modeling. This effort addressed the need for a flexible tool in ecological and statistical applications, where Fonnesbeck's work focused on probabilistic modeling. By 2005, the package had matured sufficiently for the public release of PyMC 1.0, which incorporated refinements based on feedback from a core group of users primarily affiliated with the University of Georgia's ecology and statistics communities. These early adopters applied the software to ecological modeling tasks, such as population dynamics and environmental inference, helping to identify usability improvements in sampler implementation and model specification.

In 2006, Fonnesbeck was joined by David Huard and Anand Patil, leading to the release of PyMC 2.0 in January 2009, which prioritized greater flexibility in model specification, performance enhancements through optimized sampling algorithms, and more intuitive user interfaces to broaden adoption beyond niche academic uses. Fonnesbeck remained the lead developer through subsequent iterations, with notable contributions from John Salvatier beginning in 2011, who focused on integrating gradient-based inference methods to complement the existing MCMC capabilities. PyMC 2.2 followed in April 2012, featuring extensive bug fixes, computational optimizations, enhanced plotting tools for trace visualization, support for CSV table output, and facilities for posterior predictive checks to aid model validation. The final update in this early phase, PyMC 2.3, was released on October 31, 2013, introducing compatibility with Python 3 and refined summary plots for better interpretive diagnostics.

Modern Development and Transitions

In the early 2010s, significant advancements in PyMC's capabilities emerged through contributions from key developers. In 2011, John Salvatier initiated work on gradient-based Markov chain Monte Carlo (MCMC) samplers via the mcex package, laying groundwork for more efficient inference methods. By 2012, Salvatier had begun developing Hamiltonian Monte Carlo (HMC) samplers with Theano for automatic differentiation, which were integrated as key features in PyMC3, enabling advanced gradient-based sampling that improved performance over earlier Metropolis-Hastings approaches.

The release of PyMC3 marked a pivotal shift toward modern probabilistic programming. An alpha version of PyMC 3.0 was introduced in June 2015, followed by the full stable release in January 2017, developed by a core team of 12 contributors. This version introduced a redesigned object-oriented model specification, the No-U-Turn Sampler (NUTS) for efficient HMC exploration, Automatic Differentiation Variational Inference (ADVI) for scalable approximation, a generalized linear models (GLM) interface for hierarchical modeling, and specialized distributions for time series analysis. These enhancements prioritized usability and computational efficiency while maintaining compatibility with prior workflows.

Backend maintenance challenges prompted strategic forks in the late 2010s. In 2020, due to waning upstream support for Theano, PyMC developers created a fork to ensure continued reliability of automatic differentiation and tensor operations. This effort culminated in 2021 with the renamed Aesara project, which preserved Theano's core functionality while adding optimizations like JAX support.

PyMC underwent rebranding and an architectural overhaul in the early 2020s to enhance modularity. In June 2022, PyMC 4.0 was released as a major rewrite of PyMC3 (now simply branded as PyMC), retaining the established API for model specification while introducing a more extensible class structure and performance improvements through better integration with Aesara. This update decoupled core components for easier customization, such as sampling engines and model building blocks, without disrupting existing models. PyMC 5.0 was released in December 2022, switching the backend from Aesara to PyTensor, a fork providing improved optimizations and support for backends like JAX and Numba.

The PyMC 5.x series has continued iterative enhancements focused on advanced inference and domain-specific tools, reaching version 5.26.1 as of October 2025. A notable addition was the 'do' operator, enabling interventional queries for causal inference by intervening on model variables to simulate counterfactual scenarios. These updates build on prior stability, with over 100 releases since inception, reflecting sustained refinement in areas like gradient computation and sampler diagnostics.

Community growth has paralleled the technical evolution, fostering broader adoption. PyMC secured NumFOCUS sponsorship in 2016, providing fiscal and infrastructural support as a nonprofit project. Engagement expanded through dedicated forums for user discussions, regular office hours with core developers, and collaborative contributions, which have driven feature requests and bug resolutions across versions. This ecosystem has ensured that the modeling syntax remains intuitive and stable, facilitating transitions across releases.

Modeling Framework

Syntax and Model Specification

PyMC employs a Pythonic syntax for model specification that closely mirrors statistical notation, enabling users to define probabilistic models declaratively within context managers. The core structure revolves around the pm.Model() class, which encapsulates the model's components, including random variables, deterministic transformations, and observed data. This approach leverages PyTensor for symbolic computation, allowing for automatic differentiation and efficient evaluation. All variables defined inside a with pm.Model(): block are automatically associated with that model instance, facilitating modular and hierarchical constructions.

Priors are specified by instantiating distribution objects from the PyMC library, assigning them names and hyperparameters that reflect their role in the model. For instance, a normal prior for an intercept parameter can be defined as alpha = pm.Normal('alpha', mu=0, sigma=10), where the name 'alpha' aids traceability and posterior analysis. Likelihoods are similarly defined using distributions with an observed argument to bind data, such as obs = pm.Normal('obs', mu=mu, sigma=sigma, observed=data), where mu and sigma are typically functions of prior variables. This syntax ensures that the joint probability model is constructed as a directed acyclic graph (DAG), with dependencies flowing from priors to likelihoods.

Hierarchical models are built using nested sampling statements, where group-level parameters draw from hyperpriors, enabling partial pooling across levels. For example, in a longitudinal growth model, global parameters like intercepts and slopes are defined first, followed by subject-specific deviations: subject_intercept = pm.Normal('subject_intercept', 0, subject_intercept_sigma, dims='ids'), indexed by subject identifiers. The likelihood then incorporates these via indexing, such as growth_model = pm.Deterministic('growth_model', (global_intercept + subject_intercept[id_indx]) + (global_age_beta + subject_age_beta[id_indx]) * age_14). This structure captures variation at multiple levels while sharing information efficiently.

For non-standard likelihoods, models can be extended using PyTensor operations to define custom deterministic nodes or likelihoods. Users implement custom Op classes to wrap external functions, ensuring compatibility with PyMC's computational graph. An example involves a black-box log-likelihood: define a LogLike class inheriting from pytensor.Op, with a perform method computing the log-likelihood from inputs like parameters and data, then integrate it as loglike = LogLike()(m, c, sigma, x, data) within the model context, using uniform priors on the parameters. This allows incorporation of complex, user-defined probability densities without altering core distributions.

Best practices emphasize clarity and efficiency in model assembly. Coordinate systems are specified via the coords argument of pm.Model(), such as coords={'predictors': X.columns.values}, to name dimensions and improve interpretability in posterior summaries. Indexing follows NumPy conventions, e.g., beta[0] * X1, for subsetting parameters, while vectorization uses PyTensor functions like pt.dot(X.values, beta) to handle array operations scalably, avoiding loops and enhancing performance in high-dimensional settings. These techniques promote reproducible and computationally tractable models.

A representative example is the classic coin flip model, estimating the bias of a coin with a beta prior. The following code snippet defines the model:

python

import pymc as pm
import numpy as np

# Simulated data: 100 flips with a 0.6 probability of heads
data = np.random.binomial(1, 0.6, 100)

with pm.Model() as coin_model:
    # Prior: Beta distribution for probability of heads
    p = pm.Beta('p', alpha=1, beta=1)
    # Likelihood: Bernoulli distribution with observed flips
    obs = pm.Bernoulli('obs', p=p, observed=data)

Here, the beta prior p represents uncertainty in the coin's fairness, while the Bernoulli likelihood binds the observed binary outcomes (heads as 1, tails as 0). This simple structure exemplifies the declarative syntax, where the model graph is implicitly built for subsequent inference.

Distributions and Custom Components

PyMC offers a rich library of built-in probability distributions that serve as foundational building blocks for constructing probabilistic models, enabling users to specify priors, likelihoods, and latent variables with high flexibility. These distributions are implemented using PyTensor for symbolic computation, allowing seamless integration into the model's computational graph for gradient computation and efficient sampling.

Among the continuous distributions, key examples include the Normal distribution, parameterized by mean μ and standard deviation σ, which is widely used for modeling symmetric, unbounded data; the StudentT distribution, with location μ, scale σ, and degrees of freedom ν, suitable for heavier-tailed observations; and the HalfNormal distribution, a truncated version of the Normal restricted to positive values, often applied to variance parameters like standard deviations. PyMC provides 34 such continuous distributions, ranging from standard forms like the Exponential to specialized ones like SkewNormal and TruncatedNormal. For discrete data, PyMC includes distributions such as the Bernoulli, which models binary outcomes with success probability p; the Binomial, for the number of successes in n trials each with probability p; and the Poisson, ideal for count data with rate parameter μ. There are 12 discrete distributions in total, supporting applications like categorical outcomes via Categorical and ordered data via OrderedLogistic.

Multivariate and advanced distributions extend this capability to higher dimensions and complex structures. The multivariate normal (MvNormal) distribution, defined by a mean vector μ and covariance matrix Σ, handles correlated continuous variables, while the gp module facilitates Gaussian processes through classes like Latent for prior specification on functions and Marginal for likelihood modeling, with covariance functions such as ExpQuad for squared exponential kernels and Matern52 for smoother alternatives. For scalable approximations in Gaussian processes, PyMC supports inducing point methods via pm.gp.MarginalApprox, the updated version of the older MarginalSparse class, which implements deterministic sparse approximations including FITC (Fully Independent Training Conditional), DTC (Deterministic Training Conditional), and VFE (Variational Free Energy). These methods reduce computational complexity from O(n^3) to O(n m^2), where n is the number of data points and m is the number of inducing points, by using a small set of inducing points to avoid the full n × n covariance matrix; this is a marginal likelihood-based sparse approximation, not stochastic variational inference. Additionally, PyMC provides Hilbert Space Gaussian Processes (HSGP) as a scalable GP approximation using a low-rank basis function expansion with eigenfunctions of the Laplacian, similar to random Fourier features or spectral methods, scaling linearly with the number of data points (O(n m), where m ≪ n is the number of basis functions), making it suitable for large datasets of thousands to tens of thousands of points. HSGP works best for 1D/2D inputs and stationary kernels, supports any likelihood, and is implemented via the pm.gp.HSGP and pm.gp.HSGPPeriodic classes, the latter supporting periodic covariance functions.

In total, PyMC includes over 50 distributions across these categories, many of which support conjugate relationships, such as Normal priors with Normal likelihoods or Gamma priors with Poisson likelihoods, and provide methods for moment calculations, including .mean(), .var(), and .median() for analytical summaries without sampling.

To enhance model flexibility, users can define custom distributions by subclassing pm.Distribution (or pm.Continuous/pm.Discrete for the appropriate support) and implementing the logp method using PyTensor operations to compute the log-probability density symbolically. For instance, the logp function must handle vectorized inputs and bounds, often incorporating PyTensor functions like pt.switch for domain checks. A helper class, pm.CustomDist, simplifies this for black-box logp and random methods when full symbolic support is unnecessary. Additionally, for parameter constraints, transforms such as Log (mapping the real line to positive reals via exp(x)) and Softplus (ensuring positivity through log(1 + exp(x)), computable via pm.math.softplus) are applied automatically or manually to maintain valid support during sampling.

In joint models composed of multiple distributions, the overall log-probability is typically the sum of the individual log-probabilities,

log p(x) = Σ_i log p(x_i | θ_i),

which PyMC computes efficiently using PyTensor's summation over the logp outputs of each component.
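
As a hedged illustration of pm.CustomDist, the sketch below re-implements the Laplace log-density with PyTensor operations; the density, data, and parameter names are illustrative (PyMC already ships pm.Laplace, so a real use case would target a density the library lacks):

python

import numpy as np
import pymc as pm
import pytensor.tensor as pt

def laplace_logp(value, mu, b):
    # Log-density of a Laplace(mu, b) distribution, written with PyTensor ops
    return -pt.log(2 * b) - pt.abs(value - mu) / b

data = np.array([0.3, -1.2, 0.8, 2.1, -0.5])

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 5)
    b = pm.HalfNormal("b", 5)
    # Positional parameters after the name are forwarded to logp after `value`
    pm.CustomDist("obs", mu, b, logp=laplace_logp, observed=data)
    idata = pm.sample()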

Inference Techniques

Sampling Methods

PyMC employs Markov chain Monte Carlo (MCMC) methods to perform exact Bayesian inference by generating a sequence of samples from the posterior distribution through Markov chains that converge to the target distribution under mild conditions. These chains are constructed such that each sample depends only on the previous one, ensuring ergodicity and allowing the empirical distribution of samples to approximate the posterior as the chain length increases.

The Metropolis-Hastings algorithm serves as a foundational MCMC sampler in PyMC, implementing a proposal mechanism where candidate states are accepted or rejected based on the posterior ratio to maintain detailed balance. Tuning is essential for efficiency, with the acceptance rate targeted between 20% and 50% to balance exploration and exploitation; PyMC's adaptive variants adjust proposal scales automatically during warmup to approach optimal rates around 0.234 for univariate normals.

For more efficient sampling in continuous parameter spaces, PyMC implements Hamiltonian Monte Carlo (HMC), which leverages gradient information from the log-posterior to propose distant moves while preserving the target distribution through Hamiltonian dynamics. The dynamics are governed by the Hamiltonian function

H(q, p) = U(q) + K(p),

where q denotes the position variables (the model parameters), p the momentum variables, U(q) = -log π(q) the potential energy (negative log-posterior up to normalization), and K(p) the kinetic energy, typically (1/2) p^T M^{-1} p with mass matrix M. This approach reduces random walk behavior, enabling faster exploration of high-dimensional posteriors compared to Metropolis-Hastings.

The No-U-Turn Sampler (NUTS), an adaptive extension of HMC, is PyMC's default sampler for continuous variables, automatically tuning the integration step size, trajectory length, and mass matrix to avoid inefficient U-turns in the simulated trajectory. By employing a tree-based exploration with stopping criteria based on generalized No-U-Turn conditions, NUTS achieves robust performance without manual specification of HMC hyperparameters. NUTS was introduced as the primary sampler in PyMC3, markedly improving efficiency over prior methods for complex models.

PyMC also supports Sequential Monte Carlo (SMC) methods for posterior inference, particularly suited for dynamic models, multimodal posteriors, or estimating model evidence. SMC uses a population of particles that evolve through sequential importance sampling with resampling and optional MCMC kernels within each step to maintain diversity. In PyMC, SMC is invoked via the pm.sample_smc function, which implements adaptive tempering schedules and supports parallel execution across multiple chains. Key parameters include draws for the number of particles, kernel for the choice of transition kernel, and options for running chains in parallel. SMC provides asymptotically exact inference like MCMC but can be more efficient for certain hierarchical or time-series models by avoiding autocorrelation issues in long chains.

Sampling in PyMC is invoked via the pm.sample function, which by default uses NUTS for eligible variables and falls back to Metropolis-style steps for discrete ones; key parameters include tune for warmup iterations (default 1000), draws for post-warmup samples, chains for the number of chains (default 4), and cores for multiprocessing. This supports running chains in parallel or independently to enhance convergence assessment.

Convergence and quality of MCMC samples in PyMC are evaluated using diagnostics such as the Gelman-Rubin statistic (R-hat), which compares between-chain and within-chain variances across multiple chains; values below 1.1 indicate adequate convergence. Additionally, the effective sample size (ESS) measures autocorrelation-adjusted independence, with recommendations of at least 100 per chain for reliable inference. These metrics, computed via integrated tools like ArviZ, guide practitioners in discarding warmup draws or increasing chain length if needed. Similar diagnostics apply to SMC outputs, including effective sample size and weight degeneracy checks.
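
A minimal sampling-and-diagnostics sketch (the data, priors, and draw counts are illustrative) showing pm.sample together with ArviZ's R-hat and ESS summaries:

python

import arviz as az
import numpy as np
import pymc as pm

rng = np.random.default_rng(42)
data = rng.normal(loc=1.0, scale=2.0, size=200)

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)
    # 1000 tuning steps, 1000 draws, 4 chains
    idata = pm.sample(tune=1000, draws=1000, chains=4)

# R-hat near 1 and large effective sample sizes indicate good mixing
print(az.summary(idata, var_names=["mu", "sigma"]))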

Approximation Methods

Variational inference (VI) in PyMC provides a scalable approach to approximating the posterior distribution by optimizing a variational family of distributions to closely match the true posterior. The core principle involves maximizing the Evidence Lower Bound (ELBO), a lower bound on the log marginal likelihood, defined as

ELBO = E_{q(θ)}[log p(data | θ)] - KL(q(θ) || p(θ)),

where q(θ) is the variational posterior approximation, p(θ) is the prior, and KL denotes the Kullback-Leibler divergence. This optimization turns the intractable integration of Bayesian inference into a tractable optimization problem, often solved using stochastic gradient methods.

Automatic Differentiation Variational Inference (ADVI) is a key implementation in PyMC, employing a mean-field approximation that assumes independence among parameters in the variational posterior, typically modeled as a fully factorized Gaussian distribution. It leverages automatic differentiation to compute gradients efficiently and optimizes the ELBO via stochastic gradient ascent or adaptive variants. This approach enables rapid approximate inference, particularly for high-dimensional models.

For more accurate approximations that capture posterior correlations, PyMC supports FullRankADVI, which uses a multivariate Gaussian variational family with a full-rank covariance matrix, avoiding the independence assumption of mean-field ADVI at the cost of increased computational demands. Additionally, black-box VI in PyMC allows flexible variational families, such as those based on normalizing flows, which transform a simple base distribution (e.g., Gaussian) through a series of invertible mappings to better fit complex posteriors. Normalizing flows enhance expressiveness, enabling multimodal or non-Gaussian approximations without custom derivations.

The PyMC API simplifies VI usage through the pm.fit function, which fits the variational approximation and returns an object that can be sampled from; for example, approx = pm.fit(n=10000, method='advi') performs 10,000 optimization steps using ADVI and allows subsequent sampling via approx.sample(). Users can specify methods like 'fullrank_advi' or customize approximations with flows.

VI methods in PyMC offer advantages in scalability to large datasets and high-dimensional parameter spaces, providing faster approximations compared to exact MCMC sampling, though they trade some accuracy for speed. ADVI and the more general operator variational inference (OPVI) framework were introduced in PyMC3 in 2017, with enhancements in PyMC 4.0 (released 2022) including JAX backend support for accelerated computation on GPUs. Subsequent releases in PyMC 5.x (starting 2023) have further improved these capabilities, such as enhanced normalizing flow support and better integration with JAX and Numba for GPU acceleration, as of version 5.26 in October 2025.
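
A minimal sketch of the pm.fit workflow described above (the data and step counts are illustrative):

python

import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=1000)

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    # Mean-field ADVI: maximize the ELBO for 10,000 steps, then draw from q(theta)
    approx = pm.fit(n=10_000, method="advi")
    # A full-rank Gaussian family instead: pm.fit(method="fullrank_advi")

idata = approx.sample(1000)  # approximate posterior samples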

Ecosystem and Integration

Supporting Libraries

PyMC's functionality is extended through a rich ecosystem of supporting libraries that handle backend computations, diagnostics, visualization, and specialized modeling interfaces. These tools integrate seamlessly with PyMC, enabling users to build, fit, and analyze complex Bayesian models more efficiently.

Central to PyMC is PyTensor, an open-source Python library serving as the backend for tensor operations, automatic differentiation, and just-in-time (JIT) compilation of mathematical expressions involving multi-dimensional arrays. PyTensor allows PyMC to optimize and evaluate probabilistic expressions efficiently, supporting dynamic graph construction and multiple execution backends such as JAX and Numba.

ArviZ provides exploratory analysis capabilities for Bayesian models, offering tools for posterior analysis, convergence diagnostics, model comparison, and visualization such as trace plots and posterior summaries. Developed starting in 2018, ArviZ aims to standardize diagnostics and workflows across probabilistic programming libraries like PyMC, PyStan, and TensorFlow Probability, using a unified InferenceData structure based on xarray for labeled, multidimensional inference results.

Bambi offers a high-level interface for specifying and fitting generalized linear models (GLMs) and extensions like mixed-effects models directly on top of PyMC, simplifying the syntax for common statistical tasks while leveraging PyMC's inference engines.

Additional tools in the ecosystem include xarray, which facilitates multidimensional data handling and is integral to ArviZ's data structures for storing and manipulating Bayesian outputs. The pymc-numpyro bridge enables integration with JAX-based inference from NumPyro, allowing users to leverage accelerated computations within PyMC workflows. The PyMC ecosystem has grown to include professional support through PyMC Labs, a consultancy composed of core PyMC developers offering expertise in applied Bayesian modeling for industry applications. PyMC models are commonly developed in Jupyter notebooks for interactive exploration and visualization, enhancing reproducibility in data analysis pipelines.
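
To show how a companion library builds on PyMC, the sketch below uses Bambi's formula interface for a simple linear regression; the column names and simulated data are illustrative assumptions:

python

import bambi as bmb
import numpy as np
import pandas as pd

# Illustrative data frame with one predictor and one outcome
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=100)})
df["y"] = 1.0 + 2.0 * df["x"] + rng.normal(scale=0.5, size=100)

# The formula compiles to a PyMC model; fit() runs PyMC's samplers
model = bmb.Model("y ~ x", df)
idata = model.fit(draws=1000)  # returns an ArviZ InferenceData object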

Tools for Analysis and Diagnostics

PyMC provides a suite of integrated tools for posterior analysis, convergence diagnostics, and model validation, enabling users to assess the reliability and performance of Bayesian models after sampling. These tools leverage the tight integration with ArviZ, a companion library for exploratory analysis of Bayesian models, which has been supported since the PyMC3 era to facilitate seamless workflows from sampling to diagnostics. Key functionalities include posterior predictive checks, divergence detection in sampling traces, summary statistics for trace inspection, leave-one-out cross-validation for model comparison, and methods for prior sensitivity analysis, all designed to identify potential biases, ensure convergence, and evaluate predictive accuracy without requiring extensive custom coding.

Posterior predictive checks (PPCs) are a fundamental validation technique in PyMC, where simulated data generated from the posterior distribution is compared to observed data to verify model fit. The pm.sample_posterior_predictive() function draws from the posterior predictive distribution using samples from the fitted model's trace, allowing users to assess whether the model can reproduce key features of the data, such as distributions or patterns. For instance, PPCs might plot replicated datasets against observations to detect systematic discrepancies, highlighting issues like underdispersion in the model's predictions. This process supports iterative model refinement by quantifying discrepancies through metrics like the root mean squared error between simulated and observed summaries.

Divergence warnings in PyMC's No-U-Turn Sampler (NUTS) flag regions of the posterior where the trajectory encounters numerical instability, often due to funnel-like geometries or high curvature. These warnings arise when the simulated Hamiltonian energy error exceeds a threshold during the integration steps, indicating potentially biased sampling. To diagnose problematic areas, users can generate energy plots via ArviZ's az.plot_energy() function, which visualizes the difference between observed and expected energies across iterations; a mismatch between the two energy distributions in these plots reveals regions where the sampler struggles to explore, such as near parameter boundaries. Mitigating divergences typically involves reparameterization, like centering predictors in hierarchical models, to flatten the posterior landscape and reduce the warning rate below 1-2% of samples.

Trace inspection is facilitated through ArviZ's az.summary() function, which computes essential diagnostics from PyMC traces, including the Gelman-Rubin statistic (R-hat) for chain convergence, effective sample size (ESS) for autocorrelation assessment, and highest density intervals (HDIs) as credible intervals. R-hat values close to 1 (ideally below 1.01) indicate well-mixed chains, while ESS should exceed 100-400 per parameter to ensure reliable estimates. For example, in a hierarchical model, az.summary(trace) might reveal low ESS for variance components, prompting increased tuning or longer chains. HDIs provide probabilistic bounds, such as a 94% interval capturing the range where the true parameter lies with high probability. These summaries are output as DataFrames, enabling quick tabular review of means, standard deviations, and diagnostics across multiple variables.

Model comparison in PyMC employs Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO) via az.loo(), approximating out-of-sample predictive performance without refitting the model for each data point. This method computes the expected log pointwise predictive density (ELPD), where higher values indicate better predictive accuracy, stabilized by Pareto smoothing to handle influential observations. For comparing multiple models, az.compare() ranks them by LOO differences and standard errors, favoring those with substantial ELPD gains (e.g., more than 2 SE difference) while penalizing complexity implicitly through the cross-validation. In practice, for competing hierarchical and pooled models, LOO might show the hierarchical variant improving ELPD by 5-10 nats, signifying superior generalization. Enhancements in PyMC 4.0, including JAX backend optimizations, accelerated LOO computations by enabling faster log-likelihood evaluations on GPU hardware.

Sensitivity analysis assesses the robustness of inferences to prior specifications by systematically varying prior distributions and re-running models to observe changes in posterior summaries. In PyMC workflows, this involves fitting the model with alternative priors, such as shifting from a weakly informative Normal(0, 10) prior to more diffuse or informative alternatives, and comparing posterior means or credible intervals via tools like az.summary(). For example, for regression coefficients, if posteriors remain stable across a range of prior variances, the analysis demonstrates low sensitivity; otherwise, it signals the need for data-driven prior elicitation. Power-scaling priors offer a formal approach, scaling the prior precision to quantify sensitivity without exhaustive refits. This practice ensures conclusions are not overly influenced by subjective prior choices, promoting reproducible Bayesian modeling.
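
A hedged sketch of this workflow, fitting two illustrative models to the same data and comparing them with posterior predictive checks and PSIS-LOO (the model and data choices are assumptions for the example):

python

import arviz as az
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
y = rng.standard_t(df=4, size=200)  # heavy-tailed data

with pm.Model() as normal_model:
    mu = pm.Normal("mu", 0, 5)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    idata_n = pm.sample(idata_kwargs={"log_likelihood": True})
    pm.sample_posterior_predictive(idata_n, extend_inferencedata=True)

with pm.Model() as student_model:
    mu = pm.Normal("mu", 0, 5)
    sigma = pm.HalfNormal("sigma", 5)
    nu = pm.Exponential("nu", 1 / 10)
    pm.StudentT("y", nu=nu, mu=mu, sigma=sigma, observed=y)
    idata_t = pm.sample(idata_kwargs={"log_likelihood": True})

az.plot_ppc(idata_n)  # replicated vs. observed data for the normal model
print(az.compare({"normal": idata_n, "student_t": idata_t}))  # ranked by LOO ELPD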

Applications

Statistical Modeling Examples

PyMC facilitates the implementation of standard Bayesian statistical models through its high-level interface, allowing users to specify priors and likelihoods and to bind observed data in a concise manner. These models provide full posterior distributions, enabling quantification of uncertainty in parameter estimates, which is particularly useful for hypothesis testing and decision-making under uncertainty. Examples from the PyMC documentation and example gallery illustrate these concepts, often emphasizing the role of prior choices, examined via prior predictive checks.

In linear regression, PyMC models the relationship between a predictor x and outcome y using normal priors on the intercept and slope coefficients, with a normal likelihood centered on the linear predictor. For instance, weakly informative priors such as β ~ N(0, 20) for the slope and intercept, and σ ~ HalfCauchy(10) for the noise standard deviation, are specified to reflect plausible ranges without undue influence. The model is defined as:

python

with pm.Model() as linear_model:
    sigma = pm.HalfCauchy("sigma", beta=10)
    intercept = pm.Normal("Intercept", 0, sigma=20)
    slope = pm.Normal("slope", 0, sigma=20)
    mu = intercept + slope * x
    y_obs = pm.Normal("y", mu=mu, sigma=sigma, observed=y)
    trace = pm.sample(3000)

Posterior means of the slope and intercept serve as point estimates, while credible intervals capture estimation uncertainty; for synthetic data with true values (intercept = 1, slope = 2), posteriors concentrate near these with 95% intervals reflecting data informativeness. For binary outcomes, logistic regression in PyMC employs a Bernoulli likelihood with a logit link to model probabilities bounded between 0 and 1. Priors on coefficients are typically normal, such as β_0, β_1 ~ N(0, 1), and the probability is obtained via the inverse logit: p = σ(β_0 + β_1 x), where σ is the sigmoid function. An example with simulated data (30 observations, 20 trials each) uses:

python

with pm.Model() as logistic_model:
    beta0 = pm.Normal("beta0", 0, 1)
    beta1 = pm.Normal("beta1", 0, 1)
    mu = beta0 + beta1 * x
    p = pm.Deterministic("p", pm.math.invlogit(mu))
    y_obs = pm.Binomial("y", n=20, p=p, observed=y)
    trace = pm.sample(1000)

Posteriors for β_1 indicate the log-odds change per unit of x, with credible intervals assessing significance for hypothesis testing, such as whether the effect differs from zero.

Hierarchical models in PyMC address grouped data by allowing parameters to vary across groups while sharing information via hyperpriors, exemplifying partial pooling. The classic eight schools example models SAT coaching effects θ_j for schools j = 1, ..., 8, with observed effects y = [28, 8, -3, 7, -1, 1, 18, 12] and standard deviations σ = [15, 10, 16, 11, 9, 11, 10, 18]. Varying intercepts are specified as θ_j = μ + τ·η_j, where η_j ~ N(0, 1), μ ~ N(0, 10), and τ ~ HalfNormal(10), with likelihood y_j ~ N(θ_j, σ_j). This partial pooling shrinks individual school effects toward the global mean μ, reducing variance compared to separate models, as sampled via NUTS (2000 draws).

A foundational example is the beta-binomial model for coin tosses, extending the simple Bernoulli to account for an unknown bias p ~ Beta(α, β), which introduces overdispersion relative to a fixed-p binomial. For 100 flips with 50 heads, priors like the uniform Beta(1, 1) or the concentrated Beta(30, 30) are compared; the likelihood is Binomial(n, p) or an equivalent Bernoulli sequence. Code for the uniform prior case:

python

with pm.Model() as coin_model:
    p = pm.Beta("p", 1, 1)
    y_obs = pm.Bernoulli("y", p=p, observed=y_heads_tails)
    trace = pm.sample()

The posterior for p updates via conjugacy, with Beta(51, 51) yielding a mean of 0.5 but credible intervals quantifying the remaining uncertainty; concentrated priors demonstrate sensitivity, pulling estimates toward 0.5 and aiding tests of fairness (e.g., P(p > 0.5)). Examples from the PyMC gallery highlight prior sensitivity through prior predictive simulations, ensuring priors generate plausible ranges before conditioning on data. These models underscore PyMC's utility in providing full posterior distributions for parameters, such as credible intervals for testing null hypotheses (e.g., slope = 0 in regression), with posteriors enabling probabilistic statements like the probability of a positive effect.
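
A sketch of the eight schools model described above, using a non-centered parameterization; the variable names and use of named coordinates are illustrative choices:

python

import numpy as np
import pymc as pm

# Observed coaching effects and their standard errors
y = np.array([28, 8, -3, 7, -1, 1, 18, 12])
sigma = np.array([15, 10, 16, 11, 9, 11, 10, 18])

with pm.Model(coords={"school": np.arange(8)}) as eight_schools:
    mu = pm.Normal("mu", 0, 10)                  # global mean effect
    tau = pm.HalfNormal("tau", 10)               # between-school spread
    eta = pm.Normal("eta", 0, 1, dims="school")  # standardized school offsets
    theta = pm.Deterministic("theta", mu + tau * eta, dims="school")
    pm.Normal("obs", mu=theta, sigma=sigma, observed=y, dims="school")

    trace = pm.sample(2000)  # NUTS, 2000 draws as in the text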

Advanced Use Cases

PyMC supports advanced applications in non-parametric regression through its Gaussian process (GP) module, which facilitates modeling of complex spatial and temporal dependencies. The gp submodule provides implementations such as gp.Latent for latent function priors and gp.Marginal for likelihood-based inference, enabling the construction of multivariate normal distributions with customizable mean and covariance functions like the exponential quadratic or Matérn kernels. These features allow users to compose GPs additively or multiplicatively, making them suitable for hierarchical spatial models, such as predicting environmental concentrations across counties using chordal distances on a sphere, where a Matern32 kernel captures smooth spatial variations with a length scale prior of approximately 200 km.

In causal inference, PyMC incorporates the do operator, introduced in September 2023, to perform interventional queries within do-calculus frameworks. This operator transforms a Bayesian model by fixing specified random variables to constant values, effectively simulating interventions while severing confounding edges in the causal graph. For instance, in marketing analysis, applying do(ads=0) isolates the effect of turning off advertising on sales, yielding an estimated effect of 0.06 units after sampling from the intervened posterior.

For time series analysis, PyMC enables robust forecasting via autoregressive (AR) models with heavy-tailed innovations, such as those using the Student T distribution to handle outliers. An AR(2) process can be specified using the scan mechanism for sequential dependencies, where innovations follow a Student T distribution with degrees of freedom greater than 2 to ensure finite variance, improving predictions in noisy data like patient disease progression. Conditional posterior predictives from such models produce tighter credible intervals compared to unconditional ones, enhancing forecast reliability.

PyMC extends to probabilistic machine learning by integrating with PyTensor, allowing hybrid models that combine Bayesian inference with neural network architectures. Users can define network layers using PyTensor operations, such as tanh activations in hidden layers, within PyMC's probabilistic framework, as seen in Bayesian neural networks for classification tasks. This integration supports scalable inference through variational methods like automatic differentiation variational inference (ADVI), which approximates posteriors efficiently on large datasets via mini-batches, achieving over 94% accuracy in under 10 seconds on 30,000 iterations. PyMC thus plays a key role in scalable Bayesian machine learning by leveraging these variational techniques to quantify uncertainty in predictions.

A notable real-world application involves epidemiological modeling of COVID-19 transmission, where PyMC employs hierarchical priors to capture evolving reproduction rates across regions. Thomas Wiecki's model uses a Gaussian prior on the reproduction number to flexibly account for interventions, structuring variability temporally and potentially across locations through hierarchical levels. This approach fits observed case data while enabling policy scenario comparisons, demonstrating PyMC's utility in forecasting during the COVID-19 pandemic.
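
A minimal sketch of an interventional query with the do operator, available in recent PyMC releases; the toy ads-and-sales structure and its coefficients are illustrative assumptions, not the marketing analysis cited above:

python

import pymc as pm

with pm.Model() as m:
    ads = pm.HalfNormal("ads", 1.0)                            # spending on ads
    sales = pm.Normal("sales", mu=0.5 + 2.0 * ads, sigma=1.0)  # sales depend on ads

# Fix ads to zero, severing its dependence on upstream variables
m_do = pm.do(m, {"ads": 0.0})

with m_do:
    idata = pm.sample_prior_predictive()

# Distribution of sales under the intervention do(ads = 0)
print(float(idata.prior["sales"].mean()))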
