Stan (software)
| Stan | |
|---|---|
| Original author | Stan Development Team |
| Initial release | August 30, 2012 |
| Stable release | 2.37.0[1] |
| Written in | C++ |
| Operating system | Unix-like, Microsoft Windows, Mac OS X |
| Platform | Intel x86 - 32-bit, x64 |
| Type | Statistical package |
| License | New BSD License |
| Website | mc-stan.org |
| Repository | github.com/stan-dev/stan |
Stan is a probabilistic programming language for statistical inference written in C++.[2] The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.[2]
Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method.[2]
Stan was created by a development team consisting of 52 members[3] that includes Andrew Gelman, Bob Carpenter, Daniel Lee, Ben Goodrich, and others.
Example
A simple linear regression model can be described as \(y_n = \alpha + \beta x_n + \epsilon_n\), where \(\epsilon_n \sim \operatorname{normal}(0, \sigma)\). This can also be expressed as \(y_n \sim \operatorname{normal}(\alpha + \beta x_n, \sigma)\). The latter form can be written in Stan as the following:
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
parameters {
real alpha;
real beta;
real<lower=0> sigma;
}
model {
y ~ normal(alpha + beta * x, sigma);
}
Interfaces
The Stan language itself can be accessed through several interfaces:
- CmdStan – a command-line executable for the shell,
- CmdStanR and rstan – R software libraries,
- CmdStanPy and PyStan – libraries for the Python programming language,
- CmdStan.rb – library for the Ruby programming language,
- MatlabStan – integration with the MATLAB numerical computing environment,
- Stan.jl – integration with the Julia programming language,
- StataStan – integration with Stata,
- Stan Playground – a browser-based online environment for running Stan.
In addition, higher-level interfaces are provided by packages that use Stan as a backend, primarily in the R language:[4]
- rstanarm provides a drop-in replacement for frequentist models provided by base R and lme4 using the R formula syntax;
- brms[5] provides a wide array of linear and nonlinear models using the R formula syntax;
- prophet provides automated procedures for time series forecasting.
Algorithms
Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.
- MCMC algorithms:
  - Hamiltonian Monte Carlo (HMC)
  - No-U-Turn sampler[2][6] (NUTS), a variant of HMC and Stan's default MCMC engine
- Variational inference algorithms:
  - Automatic Differentiation Variational Inference (ADVI)[7]
  - Pathfinder: parallel quasi-Newton variational inference[8]
- Optimization algorithms:
  - Limited-memory BFGS (L-BFGS), Stan's default optimization algorithm
  - Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS)
  - Laplace's approximation for classical standard error estimates and approximate Bayesian posteriors
Automatic differentiation
Stan implements reverse-mode automatic differentiation to calculate gradients of the model, which is required by HMC, NUTS, L-BFGS, BFGS, and variational inference.[2] The automatic differentiation within Stan can be used outside of the probabilistic programming language.
Usage
Stan is used in fields including social science,[9] pharmaceutical statistics,[10] market research,[11] and medical imaging.[12]
References
[edit]- ^ "Release 2.37.0". 2 September 2025. Retrieved 15 September 2025.
- ^ a b c d e Stan Development Team. 2015. Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0
- ^ "Development Team". stan-dev.github.io. Retrieved 2024-11-21.
- ^ Gabry, Jonah. "The current state of the Stan ecosystem in R". Statistical Modeling, Causal Inference, and Social Science. Retrieved 25 August 2020.
- ^ "BRMS: Bayesian Regression Models using 'Stan'". 23 August 2021.
- ^ Hoffman, Matthew D.; Gelman, Andrew (April 2014). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research. 15: 1593–1623.
- ^ Kucukelbir, Alp; Ranganath, Rajesh; Blei, David M. (June 2015). "Automatic Variational Inference in Stan". arXiv:1506.03431. Bibcode:2015arXiv150603431K.
- ^ Zhang, Lu; Carpenter, Bob; Gelman, Andrew; Vehtari, Aki (2022). "Pathfinder: Parallel quasi-Newton variational inference". Journal of Machine Learning Research. 23 (306): 1–49.
- ^ Goodrich, Benjamin King; Wawro, Gregory; Katznelson, Ira (2012). Designing Quantitative Historical Social Inquiry: An Introduction to Stan. APSA 2012 Annual Meeting Paper. Available at SSRN 2105531.
- ^ Natanegara, Fanni; Neuenschwander, Beat; Seaman, John W.; Kinnersley, Nelson; Heilmann, Cory R.; Ohlssen, David; Rochester, George (2013). "The current state of Bayesian methods in medical product development: survey results and recommendations from the DIA Bayesian Scientific Working Group". Pharmaceutical Statistics. 13 (1): 3–12. doi:10.1002/pst.1595. ISSN 1539-1612. PMID 24027093. S2CID 19738522.
- ^ Feit, Elea (15 May 2017). "Using Stan to Estimate Hierarchical Bayes Models". Retrieved 19 March 2019.
- ^ Gordon, GSD; Joseph, J; Alcolea, MP; Sawyer, T; Macfaden, AJ; Williams, C; Fitzpatrick, CRM; Jones, PH; di Pietro, M; Fitzgerald, RC; Wilkinson, TD; Bohndiek, SE (2019). "Quantitative phase and polarization imaging through an optical fiber applied to detection of early esophageal tumorigenesis". Journal of Biomedical Optics. 24 (12): 1–13. arXiv:1811.03977. Bibcode:2019JBO....24l6004G. doi:10.1117/1.JBO.24.12.126004. PMC 7006047. PMID 31840442.
Further reading
- Carpenter, Bob; Gelman, Andrew; Hoffman, Matthew; Lee, Daniel; Goodrich, Ben; Betancourt, Michael; Brubaker, Marcus; Guo, Jiqiang; Li, Peter; Riddell, Allen (2017). "Stan: A Probabilistic Programming Language". Journal of Statistical Software. 76 (1): 1–32. doi:10.18637/jss.v076.i01. ISSN 1548-7660. PMC 9788645. PMID 36568334.
- Gelman, Andrew, Daniel Lee, and Jiqiang Guo (2015). Stan: A probabilistic programming language for Bayesian inference and optimization, Journal of Educational and Behavioral Statistics.
- Hoffman, Matthew D., Bob Carpenter, and Andrew Gelman (2012). Stan, scalable software for Bayesian modeling Archived 2015-01-21 at the Wayback Machine, Proceedings of the NIPS Workshop on Probabilistic Programming.
External links
- Stan web site
- Stan source, a Git repository hosted on GitHub
Introduction and History
Development and Team
Stan was founded in 2012 by Andrew Gelman and a team of colleagues at Columbia University, driven by the need to overcome limitations in existing Markov chain Monte Carlo (MCMC) tools like BUGS and JAGS, which relied on less efficient sampling methods and restrictive declarative languages.[5] The initial development focused on creating an imperative probabilistic programming language that could support more scalable and flexible Bayesian inference, particularly through advanced techniques like Hamiltonian Monte Carlo.[5] As of 2025, the project has approximately 84 contributors, including statisticians, computer scientists, and experts from various domains, with ongoing contributions coordinated through the Stan Development Team.[6] Since its inception at Columbia, the team has expanded to include collaborators from multiple institutions worldwide, fostering a collaborative environment for maintenance and enhancements.[7]
In 2016, Stan became a fiscally sponsored project of NumFOCUS, a nonprofit organization supporting open-source scientific computing, which provides infrastructural and financial backing to sustain the project's growth.[7] Stan is released under the 3-clause New BSD License, a permissive open-source license that encourages broad collaboration by allowing modification and redistribution with minimal restrictions, while requiring attribution to the original contributors.[8] This licensing choice has facilitated its evolution from early prototypes, initially focused on basic model specification and inference, to a comprehensive probabilistic programming system with interfaces for multiple languages and robust automatic differentiation capabilities.[5]
Key Milestones and Versions
Stan was first released on August 30, 2012, as version 1.0, providing core Markov chain Monte Carlo (MCMC) capabilities for Bayesian inference through its probabilistic programming language.[2] The project advanced significantly with version 2.0, released on February 25, 2014, which introduced enhanced modeling features and improved performance for larger datasets, marking a shift toward more robust statistical applications. Subsequent releases built on this foundation, with version 2.7 in July 2015 adding automatic differentiation variational inference (ADVI) for faster approximate posterior estimation.[9][10]
Key community events began with the inaugural StanCon conference in 2016, held in Aspen, Colorado, fostering collaboration among developers and users; annual conferences have since continued, including StanCon 2024 in Oxford and the planned StanCon 2026 in Uppsala. In the 2020s, Stan saw increased compatibility with other probabilistic programming frameworks like PyMC through shared tools such as ArviZ for posterior analysis.[11][12]
Post-2020 versions emphasized parallel computing: within-chain parallelization arrived in Stan 2.23 in 2020 via the reduce_sum function for multi-threading support, and GPU acceleration was enhanced through OpenCL integration, introduced in version 2.21 in 2019 and refined in later releases for heterogeneous devices. By 2025, developments included support for HistFactory models tailored for high-energy physics experiments, enabling more flexible histogram-based analyses.[13][14][15] The latest stable release, version 2.37.0 on September 2, 2025, included memory efficiency improvements in the Pathfinder tool for variational inference.[16]
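A sketch of the reduce_sum pattern introduced in 2.23, closely following the style of the examples in Stan's user documentation (the model is illustrative and names such as partial_sum are arbitrary; multi-threading additionally requires compiling with STAN_THREADS):
functions {
  // Partial log-likelihood over a slice of y; this signature
  // (slice, start, end, shared arguments...) is required by reduce_sum.
  real partial_sum(array[] real y_slice, int start, int end,
                   real mu, real sigma) {
    return normal_lpdf(y_slice | mu, sigma);
  }
}
data {
  int<lower=0> N;
  array[N] real y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  int grainsize = 1;  // 1 lets the scheduler pick slice sizes automatically
  target += reduce_sum(partial_sum, y, grainsize, mu, sigma);
}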
Core Components
Probabilistic Modeling Language
Stan's probabilistic modeling language is a declarative domain-specific language designed for specifying complex statistical models, particularly those amenable to Bayesian inference. It allows users to define the joint probability distribution of data and parameters in a structured, block-based format, emphasizing model declaration over procedural simulation or control flow. This approach facilitates clear separation of model components, enabling efficient compilation into executable code for inference algorithms.[17]
The language organizes model specifications into six core blocks, each serving a distinct purpose in the modeling process. The data block declares input variables, such as observed data, with constraints like non-negativity (e.g., int<lower=0> N; array[N] real y;), ensuring type safety and dimensionality checks before model execution. These declarations are read once per inference chain and cannot include statements or distributions.[17]
Following the data block, the transformed data block computes deterministic transformations or constants from the input data (e.g., real<lower=0> alpha = 0.1;), executed once per chain to prepare derived quantities like sums or scales, without allowing probabilistic statements. The parameters block then defines the unknown variables to be estimated, such as means or variances (e.g., real mu_y; real<lower=0> tau_y;), which are automatically transformed to unconstrained space for sampling efficiency, with constraints specified at declaration.[17]
The transformed parameters block derives additional parameters from the base parameters and data (e.g., real<lower=0> sigma_y = pow(tau_y, -0.5);), evaluated repeatedly during inference and included in output summaries. These blocks together support constrained parameter spaces, where bounds like <lower=0> or <upper=1> are enforced through Jacobian adjustments during transformation, preventing invalid values without runtime rejection sampling.[17][18]
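As a concrete instance of this mechanism, a parameter declared with <lower=0> is carried internally on the log scale, and the log Jacobian of the inverse transform is added to the target density (a standard change-of-variables sketch, consistent with how the Stan reference manual describes lower bounds):

\[ \theta = e^{u}, \qquad \log p_u(u) = \log p_\theta\!\left(e^{u}\right) + \log\left|\frac{d\theta}{du}\right| = \log p_\theta\!\left(e^{u}\right) + u, \]

so the sampler works with the unconstrained variable \(u\) while the model only ever sees \(\theta > 0\).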
Central to model specification is the model block, where the log posterior density is defined declaratively using sampling statements (e.g., y[n] ~ normal(mu_y, sigma_y);) or explicit increments (e.g., target += normal_lpdf(y | mu_y, sigma_y);). The ~ notation is syntactic sugar that adds the log density to the target function while dropping constant terms for computational efficiency, whereas target += retains all terms, useful for model comparison via log-likelihoods. This block supports loops and local variables for expressing dependencies, such as in hierarchical models where group-level parameters depend on population-level ones (e.g., for (j in 1:J) mu[j] ~ normal(mu_global, sigma_global);). Custom distributions can be defined in a separate functions block and invoked similarly, extending beyond built-in families like normal or beta.[19][17]
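A minimal complete program contrasting the two notations (either model statement alone defines the same posterior; the commented target += form differs only by a constant):
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu_y;
  real<lower=0> sigma_y;
}
model {
  // Sampling-statement form: drops constant terms from the target
  y ~ normal(mu_y, sigma_y);
  // Equivalent explicit form retaining all terms (e.g., for model comparison):
  // target += normal_lpdf(y | mu_y, sigma_y);
}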
Finally, the generated quantities block computes posterior predictive quantities or transformations post-sampling (e.g., real variance_y = sigma_y * sigma_y;), executed for each draw to generate simulations or summaries without affecting the inference process.
Unlike imperative languages such as Python or C++, Stan's declarative syntax avoids specifying execution paths or simulation steps, focusing solely on the mathematical structure of the probability model to ensure portability and optimization during compilation. This design promotes reproducibility and scalability for hierarchical and multivariate models.[17]
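Assembling the fragments quoted above gives one compact program touching all six blocks (a minimal sketch; the gamma prior on tau_y is an illustrative assumption, not part of the quoted fragments):
data {
  int<lower=0> N;
  vector[N] y;
}
transformed data {
  real<lower=0> alpha = 0.1;  // constant computed once per chain
}
parameters {
  real mu_y;
  real<lower=0> tau_y;  // precision; positivity enforced by transform
}
transformed parameters {
  real<lower=0> sigma_y = pow(tau_y, -0.5);  // re-evaluated during inference
}
model {
  tau_y ~ gamma(alpha, alpha);  // assumed prior, for illustration only
  y ~ normal(mu_y, sigma_y);
}
generated quantities {
  real variance_y = sigma_y * sigma_y;  // computed once per posterior draw
}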
Automatic Differentiation Engine
Stan's automatic differentiation engine is implemented in the Stan Math Library, a C++ template library that provides reverse-mode automatic differentiation (AD) for efficient computation of gradients and Hessians. This implementation leverages operator overloading and template metaprogramming to track dependencies during forward evaluation and propagate derivatives backward, enabling the automatic generation of first- and second-order derivatives without manual coding. The library's design ensures that AD computations are fused with the original function evaluation, minimizing overhead and supporting complex expressions involving built-in types like doubles, vectors, and matrices.[20][21]
The core of the engine is the stan::math namespace, which defines custom operators and functions tailored for probabilistic modeling. These include support for dense and sparse vectors, matrices, and a comprehensive set of univariate and multivariate probability distributions, allowing seamless differentiation of statistical functions such as log densities and cumulative distribution functions. For instance, operations like matrix multiplications or element-wise transformations automatically produce the required partial derivatives through templated variadic functions, which handle arbitrary argument counts at compile time for optimal performance. This extensibility permits users to define new differentiable functions by implementing forward and reverse passes, ensuring the library remains adaptable to domain-specific needs.[20][21]
Beyond its integration within Stan's probabilistic programming framework, the Stan Math Library is designed for standalone use in external C++ applications, such as general-purpose optimization or sensitivity analysis. Developers can link against the library to compute gradients for user-defined objectives, exporting AD utilities without relying on Stan's higher-level components; for example, it includes interfaces for numerical integration and differential equation solvers that also support differentiated outputs. This modularity has made it a reusable tool in fields requiring high-precision derivative computations.[20][21]
A key strength of the engine is its performance characteristics, particularly its linear \(O(n)\) scaling in the number of parameters \(n\) for gradient evaluations, which contrasts with the quadratic costs of finite differences or forward-mode AD in high dimensions. This efficiency arises from the reverse-mode approach, where a single backward pass computes all partial derivatives of a scalar output, making it suitable for models with thousands of parameters. Benchmarks demonstrate that gradient computations remain feasible even for large-scale problems, with memory usage proportional to the size of the expression graph. In Stan's inference algorithms, this engine underpins gradient-based samplers like Hamiltonian Monte Carlo by providing accurate derivatives for trajectory simulations.[20][21]
Inference Algorithms
Hamiltonian Monte Carlo Methods
Hamiltonian Monte Carlo (HMC) serves as the foundational Markov chain Monte Carlo (MCMC) algorithm in Stan for exploring Bayesian posterior distributions, enabling efficient sampling by incorporating gradient information from the log posterior density. Unlike random-walk Metropolis methods, HMC simulates Hamiltonian dynamics to propose correlated moves that traverse the parameter space more effectively, reducing autocorrelation and improving exploration in high dimensions. This approach relies on the automatic differentiation engine to compute the necessary gradients of the target density.[22][23]
In HMC, the position variables \(\theta\) (corresponding to model parameters) are augmented with auxiliary momentum variables \(p\), drawn from a Gaussian distribution to ensure ergodicity. The joint system evolves according to the Hamiltonian \(H(\theta, p) = U(\theta) + K(p)\), where \(U(\theta) = -\log p(\theta \mid y)\) represents the potential energy (negative log posterior) and \(K(p) = \tfrac{1}{2} p^{\top} M^{-1} p\) the kinetic energy, with \(M\) as the mass matrix. The continuous-time dynamics are governed by Hamilton's equations:

\[ \frac{d\theta}{dt} = \frac{\partial H}{\partial p} = M^{-1} p, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial \theta} = -\nabla U(\theta). \]

These equations are discretized using the leapfrog integrator for proposed moves, which applies symplectic updates with step size \(\epsilon\): a half-step momentum update \(p \leftarrow p - \tfrac{\epsilon}{2} \nabla U(\theta)\), a full-step position update \(\theta \leftarrow \theta + \epsilon M^{-1} p\), and another half-step momentum update \(p \leftarrow p - \tfrac{\epsilon}{2} \nabla U(\theta)\). The proposal is then accepted or rejected via the Metropolis-Hastings rule to maintain detailed balance, with the leapfrog scheme providing second-order accuracy and volume preservation. A trajectory of \(L\) leapfrog steps has total integration time \(L\epsilon\).[22][23]
The No-U-Turn Sampler (NUTS), Stan's default implementation, extends HMC by adaptively determining the number of leapfrog steps, eliminating the need for manual tuning of \(L\). NUTS constructs a binary tree of possible trajectories from the current position, exploring both directions until a U-turn is detected (the trajectory loops back so that the momenta at the endpoints point away from each other) or a maximum tree depth is reached, preventing inefficient periodic or divergent paths. This recursive exploration efficiently samples diverse trajectory lengths while controlling computational cost, often achieving higher effective sample sizes than fixed-step HMC.[22][24]
Stan supports static HMC with user-specified fixed \(L\), but emphasizes dynamic adaptation during an initial warmup phase to optimize performance. The mass matrix \(M\) is tuned to approximate the posterior covariance, using either a diagonal estimate (the default, for scalability) or a dense estimate with Cholesky factorization for better conditioning in correlated posteriors, which rescales the geometry to improve step efficiency. The step size \(\epsilon\) is adapted via dual averaging, targeting an average Metropolis acceptance rate of 0.8 to balance autocorrelation against rejection rates; the dual-averaging scheme's shrinkage and decay parameters ensure robust initialization for sampling. These adaptations are disabled after warmup to preserve the Markov property.[22][24]
To assess sampling quality, Stan provides built-in convergence diagnostics, including the split-\(\hat{R}\) (R-hat) statistic, which compares within-chain and between-chain variances across multiple parallel chains to detect non-convergence (values above 1.01 indicate potential issues).
Additionally, effective sample size (ESS) quantifies the number of independent draws equivalent to the autocorrelated MCMC output, computed via the MCMC central limit theorem as

\[ \mathrm{ESS} = \frac{N}{1 + 2 \sum_{t=1}^{\infty} \rho_t}, \]

where \(N\) is the number of samples and \(\rho_t\) the autocorrelation at lag \(t\); Stan uses a robust estimator that truncates the sum at half the chain length to avoid bias in finite samples. These metrics, reported per parameter, guide users in evaluating chain mixing and precision.[22]
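For reference, the split-\(\hat{R}\) statistic mentioned above has the standard potential-scale-reduction form (generic notation, with \(W\) and \(B\) the within- and between-chain variance estimates over \(N\) draws per chain):

\[ \widehat{\operatorname{var}}^{+} = \frac{N-1}{N}\,W + \frac{1}{N}\,B, \qquad \hat{R} = \sqrt{\frac{\widehat{\operatorname{var}}^{+}}{W}}, \]

which approaches 1 from above as the chains mix.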
Variational and Optimization Techniques
Stan implements variational Bayesian inference through Automatic Differentiation Variational Inference (ADVI), a black-box variational method that approximates the posterior distribution with a Gaussian in the unconstrained parameter space.[25] ADVI optimizes the evidence lower bound (ELBO) using stochastic gradient ascent, enabling scalable inference for complex models.[26] It supports two approximation families: mean-field ADVI, which assumes a fully factorized Gaussian (diagonal covariance) for faster computation and simpler structure, and full-rank ADVI, which uses a full covariance matrix to capture posterior correlations at the cost of increased computational expense.[27][25]
Stan also supports Pathfinder, a variational inference algorithm that uses parallel quasi-Newton optimization to locate multiple modes of the posterior before generating approximate samples via Laplace approximation around those modes. Pathfinder can provide better approximations than ADVI, especially for multimodal posteriors, and is useful for initialization or as a faster alternative to full MCMC. It includes single-path and multi-path variants, with the latter running multiple independent optimizations. As of Stan 2.37 (September 2025), Pathfinder has seen memory usage improvements.[28][29]
For point estimation, Stan provides optimization routines to compute maximum a posteriori (MAP) estimates by maximizing the log-posterior density; the objective is to solve \(\hat{\theta} = \arg\max_{\theta} \log p(\theta \mid y)\), using gradients obtained via automatic differentiation.[26] The default algorithm is L-BFGS, a limited-memory quasi-Newton method that approximates the Hessian with a history of recent gradients for efficient convergence on large problems, while Newton's method offers exact Hessian-based steps but is less efficient overall.[26]
These techniques are particularly suited for large datasets or real-time applications where Markov chain Monte Carlo methods are computationally prohibitive due to their slower convergence.[25] However, variational approximations like ADVI systematically underestimate posterior variances because they minimize the KL divergence from the approximating distribution to the true posterior, leading to overly narrow credible intervals compared to MCMC results. Similar considerations apply to Pathfinder approximations.[25]
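For reference, the ELBO maximized by ADVI can be written in its standard form (generic notation, not specific to Stan's internals; \(q_\phi\) denotes the Gaussian approximation with variational parameters \(\phi\)):

\[ \mathcal{L}(\phi) = \mathbb{E}_{q_\phi(\theta)}\!\left[\log p(y, \theta)\right] - \mathbb{E}_{q_\phi(\theta)}\!\left[\log q_\phi(\theta)\right] = \log p(y) - \mathrm{KL}\!\left(q_\phi(\theta) \,\|\, p(\theta \mid y)\right), \]

so maximizing the ELBO is equivalent to minimizing the KL divergence from the approximation to the posterior, which is the source of the variance underestimation noted above.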
Interfaces and Integrations
Language-Specific Interfaces
Stan provides several language-specific interfaces that enable users to leverage its probabilistic modeling and inference capabilities directly within popular programming environments. These interfaces compile Stan models into executable code and facilitate interaction with Stan's core engine, allowing seamless integration into workflows without requiring users to write C++ code. The primary interfaces include RStan for R, PyStan and CmdStanPy for Python, and CmdStan for command-line use and other language bindings.[30][1]
RStan serves as the primary interface for the R programming language, built using Rcpp to bridge Stan's C++ library with R's statistical ecosystem. It allows users to specify, compile, and fit Stan models entirely within R scripts or interactive sessions, with the key function stan() handling model compilation, sampling via Hamiltonian Monte Carlo (such as the No-U-Turn Sampler), optimization, and variational inference. The resulting stanfit objects provide methods for posterior extraction, diagnostics like trace plots and Gelman-Rubin statistics, and integration with R packages for visualization and analysis. RStan supports reproducible workflows by caching compiled models and is distributed via CRAN, ensuring compatibility with R's package manager.[31][32]
For Python users, PyStan offers a direct binding to Stan's C++ API, enabling the definition and fitting of Bayesian models in Python code, including support for interactive environments like Jupyter notebooks. It provides classes like StanModel for compilation and sampling, with methods to interface data and extract posterior samples, making it suitable for exploratory data analysis and scripting. Complementing PyStan, CmdStanPy is a lightweight Python wrapper around the CmdStan command-line tool, emphasizing modularity by separating model compilation from inference execution; it supports advanced features like parallel chains and is optimized for production environments with large datasets. Both interfaces facilitate Jupyter-based workflows, where users can iteratively refine models and visualize results using libraries like Matplotlib or ArviZ.[33]
CmdStan provides a standalone command-line interface to Stan, primarily for C++ but extensible to other languages through its executable binaries. Users compile Stan programs into platform-independent executables that perform inference tasks such as MCMC sampling, optimization, or posterior predictive checks via simple command invocations, outputting results in CSV format for easy parsing. This interface underpins bindings for languages like Julia through the Stan.jl package, which invokes CmdStan executables from Julia code to fit models while leveraging Julia's high-performance computing features. CmdStan's minimal dependencies and low memory footprint make it ideal for deployment in constrained environments, such as high-performance computing clusters.[34][35]
A key advantage of these interfaces is their cross-language compatibility: a Stan model is compiled once into an executable via CmdStan, then callable from multiple environments without recompilation, promoting efficiency and portability across R, Python, Julia, and C++ workflows. This design allows teams to share models while using preferred languages for data preparation and post-processing. Specialized packages built on these core interfaces extend functionality for specific use cases.[34][1]
Ecosystem Packages and Tools
rstanarm is an R package that facilitates Bayesian applied regression modeling by emulating familiar R functions like lm() and glm(), while utilizing Stan via the rstan interface for backend estimation of common regression models such as linear, generalized linear, and mixed-effects models.[36] It automates the specification of weakly informative default priors to promote numerical stability and efficient MCMC sampling, and incorporates streamlined post-processing features for generating posterior summaries, intervals, and diagnostics.[36]
brms provides an intuitive R interface for fitting Bayesian generalized (non-)linear multilevel models using Stan, employing formula syntax akin to that in the lme4 package to specify hierarchical structures, response distributions (e.g., Gaussian, binomial, Poisson), and link functions without requiring direct Stan code authoring.[37] This package supports advanced features including autocorrelation, censored data, and flexible prior distributions, enabling users to perform full Bayesian inference on complex multilevel data while leveraging Stan's sampling capabilities.[37]
Prophet, an open-source tool developed by Meta (formerly Facebook), specializes in time-series forecasting by fitting additive models that capture nonlinear trends, seasonalities, and holiday effects, with Stan serving as the core backend for model optimization and uncertainty quantification to produce rapid, interpretable predictions.[38]
Complementing these domain-specific packages are diagnostic and evaluation tools such as ShinyStan, an R package built on the Shiny framework that offers an interactive web interface for examining MCMC outputs from Stan models, including trace plots, posterior density visualizations, Gelman-Rubin diagnostics, and posterior predictive simulations to assess convergence and model fit.[39]
The loo package computes efficient approximations of leave-one-out cross-validation (via Pareto-smoothed importance sampling, PSIS-LOO) and the widely applicable information criterion (WAIC) directly from posterior draws of Stan-fitted Bayesian models, providing model weights, standard errors, and diagnostics to support reliable model comparison and selection.[40]
A notable 2025 addition to the ecosystem is stanhf, a command-line tool that translates HistFactory JSON specifications (a standard schema for defining statistical models in high-energy physics experiments involving event counts and systematic uncertainties) into executable Stan programs, enabling probabilistic inference for collider data analysis with Stan's automatic differentiation and sampling methods.[15]
Applications and Usage
Primary Domains
Stan has found extensive application in the social sciences, particularly for hierarchical modeling of survey data and multilevel structures that account for nested data hierarchies, such as individuals within groups, regions, or time periods. This approach enables robust inference on complex social phenomena by pooling information across levels, reducing variance in estimates for sparse subgroups while preserving heterogeneity. For instance, Bayesian hierarchical weighting adjustments have been employed to improve survey inference by combining poststratification with multilevel regression, enhancing accuracy in national polls and demographic analyses.[41]
In the pharmaceutical domain, Stan facilitates Bayesian modeling of dose-response relationships and pharmacokinetics, supporting sequential updates to prior knowledge with new clinical data for more precise predictions of drug behavior. Tools like Torsten extend Stan to handle ordinary differential equations for population pharmacokinetic models, enabling efficient fitting of compartmental models to longitudinal concentration data. This has been demonstrated in analyses of drugs like somatrogon, where hierarchical Bayesian methods quantify inter-individual variability and support dose optimization in pediatric populations.[42][43]
Stan is prominently used in ecology and epidemiology for time-series and spatial models that capture dynamic processes in populations and disease spread. In ecology, state-space models implemented in Stan analyze ecological time series, such as population trajectories or predator-prey interactions, by separating observation error from process noise through hierarchical structures. For epidemiology, these capabilities extend to spatiotemporal forecasting, including Bayesian hierarchical models for under-reporting and transmission dynamics during the COVID-19 pandemic, where time-varying coefficients model evolving infection rates across regions.[44][45][46]
Within machine learning, Stan supports Gaussian processes for nonparametric regression and uncertainty quantification in functional data, leveraging its automatic differentiation for scalable inference on covariance kernels like squared exponential or Matérn. This is particularly valuable for tasks requiring probabilistic predictions, such as spatial interpolation or time-series forecasting. Additionally, extensions enable probabilistic neural networks by integrating deep learning components into Bayesian frameworks, allowing for variational inference on network parameters to model uncertainty in high-dimensional inputs.[47][48][49]
Recent expansions of Stan include applications in high-energy physics, notably through the stanhf package that implements HistFactory models for statistical analysis of collider data, facilitating blinded inference and hypothesis testing in particle searches.[15] In market research, Stan underpins hierarchical Bayesian models for conjoint analysis and marketing mix optimization, enabling attribution of consumer preferences and media impacts across heterogeneous segments.[50]
Practical Examples and Limitations
A foundational application of Stan is fitting a simple linear regression model, which demonstrates the language's syntax for specifying data, parameters, priors, and likelihoods. In this model, the response variable \(y\) is assumed to follow a normal distribution centered at the linear predictor \(\alpha + \beta x\), with unknown scale \(\sigma\). Normal priors are placed on the intercept \(\alpha\) and slope \(\beta\), while \(\sigma\) receives a half-Cauchy prior for positivity. Sampling is performed using the No-U-Turn Sampler (NUTS), Stan's default Hamiltonian Monte Carlo algorithm, which efficiently explores the posterior distribution.[51][52] The complete Stan program for this example is as follows:
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
parameters {
real alpha;
real beta;
real<lower=0> sigma;
}
model {
alpha ~ normal(0, 10);
beta ~ normal(0, 10);
sigma ~ cauchy(0, 5);
y ~ normal(alpha + beta * x, sigma);
}
generated quantities {
real pred_y = normal_rng(alpha + beta * 1.5, sigma); // Example prediction
}
A second example applies hierarchical modeling to A/B testing of conversion rates: the success probabilities of two variants share a common beta prior whose hyperparameters a and b are estimated from the data, and the generated quantities block reports the posterior difference between the two rates:
data {
int<lower=0> N1; // Trials for variant 1
int<lower=0> C1; // Successes for variant 1
int<lower=0> N2; // Trials for variant 2
int<lower=0> C2; // Successes for variant 2
}
parameters {
real<lower=0> a;
real<lower=0> b;
real<lower=0, upper=1> p1;
real<lower=0, upper=1> p2;
}
model {
// Diffuse prior on the hyperparameters: p(a, b) proportional to (a + b)^(-5/2)
target += -2.5 * log(a + b);
p1 ~ beta(a, b);
p2 ~ beta(a, b);
C1 ~ binomial(N1, p1);
C2 ~ binomial(N2, p2);
}
generated quantities {
real diff = p1 - p2; // Posterior difference in conversion rates
}