Stochastic simulation
A stochastic simulation is a simulation of a system that has variables that can change stochastically (randomly) with individual probabilities.[1]
Realizations of these random variables are generated and inserted into a model of the system. Outputs of the model are recorded, and then the process is repeated with a new set of random values. These steps are repeated until a sufficient amount of data is gathered. In the end, the distribution of the outputs shows the most probable estimates as well as a frame of expectations regarding what ranges of values the variables are more or less likely to fall in.[1]
Often random variables inserted into the model are created on a computer with a random number generator (RNG). The U(0,1) uniform distribution outputs of the random number generator are then transformed into random variables with probability distributions that are used in the system model.[2]
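As a minimal sketch of that transformation step (an illustration, not from the cited sources), the following Python snippet turns U(0,1) RNG outputs into exponentially distributed variates by inverting the exponential CDF:

import math
import random

def exponential_variate(rate):
    # Invert the exponential CDF F(t) = 1 - exp(-rate*t):
    # t = -ln(1 - u)/rate maps u ~ U(0,1) onto Exp(rate).
    u = random.random()
    return -math.log(1.0 - u) / rate

samples = [exponential_variate(2.0) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to the mean 1/rate = 0.5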
Etymology
Stochastic originally meant "pertaining to conjecture"; from Greek stokhastikos "able to guess, conjecturing": from stokhazesthai "guess"; from stokhos "a guess, aim, target, mark". The sense of "randomly determined" was first recorded in 1934, from German Stochastik.[3]
Discrete-event simulation
In order to determine the next event in a stochastic simulation, the rates of all possible changes to the state of the model are computed and then ordered in an array. Next, the cumulative sum of the array is taken, and the final cell contains the number R, the total event rate. This cumulative array is now a discrete cumulative distribution and can be used to choose the next event: pick a random number z ~ U(0,R) and choose the first event whose cumulative rate exceeds z.
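A minimal Python sketch of this selection step (the rate values are made up):

import random
from itertools import accumulate

rates = [0.5, 1.2, 0.3]               # rates of the possible events (example values)
cumulative = list(accumulate(rates))  # discrete cumulative distribution
R = cumulative[-1]                    # total event rate

z = random.uniform(0.0, R)
# First event whose cumulative rate exceeds z (default guards the z == R edge case).
next_event = next((i for i, c in enumerate(cumulative) if z < c), len(rates) - 1)
print(next_event)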
Probability distributions
A probability distribution is used to describe the potential outcome of a random variable.
A discrete probability distribution limits the outcomes to cases where the variable can only take on discrete values.[4]
Bernoulli distribution
A random variable X is Bernoulli-distributed with parameter p if it has two possible outcomes, usually encoded 1 (success or default) or 0 (failure or survival),[5] where the probabilities of success and failure are P(X = 1) = p and P(X = 0) = q = 1 − p, where 0 < p < 1.
To produce a random variable X with a Bernoulli distribution from a U(0,1) uniform distribution made by a random number generator, we define X = 1 if U ≤ p and X = 0 otherwise, so that P(X = 1) = p and P(X = 0) = 1 − p.[2]
Example: Toss of coin
Define X = 1 for heads and X = 0 for tails. For a fair coin, both realizations are equally likely. We can generate realizations of this random variable X from a uniform distribution provided by a random number generator (RNG) by having X = 1 if the RNG outputs a value between 0 and 0.5 and X = 0 if the RNG outputs a value between 0.5 and 1. Of course, the two outcomes may not be equally likely (e.g. success of medical treatment).[6]
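A minimal Python sketch of such a Bernoulli sampler (names and values are illustrative):

import random

def bernoulli(p):
    # Return 1 with probability p and 0 with probability 1 - p,
    # using a single U(0,1) draw from the RNG.
    return 1 if random.random() < p else 0

# Fair coin: roughly half the tosses should come up heads.
tosses = [bernoulli(0.5) for _ in range(10_000)]
print(sum(tosses) / len(tosses))  # close to 0.5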
Binomial distribution
A binomial distributed random variable Y with parameters n and p is obtained as the sum of n independent and identically Bernoulli-distributed random variables X1, X2, ..., Xn:[4]

Y = X1 + X2 + ... + Xn
Example: A coin is tossed three times. Find the probability of getting exactly two heads. This problem can be solved by looking at the sample space. There are three ways to get two heads: HHT, HTH, and THH.
The answer is 3/8 (= 0.375).[7]
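A sketch of the same calculation in Python, checking a Monte Carlo estimate built from summed Bernoulli draws against the exact binomial probability (illustrative only):

import random
from math import comb

def binomial(n, p):
    # Sample Y ~ Binomial(n, p) as a sum of n Bernoulli(p) draws.
    return sum(1 if random.random() < p else 0 for _ in range(n))

# Exact probability of exactly two heads in three fair tosses:
print(comb(3, 2) * 0.5**2 * 0.5**1)  # 0.375

# Monte Carlo estimate of the same probability:
trials = 100_000
hits = sum(1 for _ in range(trials) if binomial(3, 0.5) == 2)
print(hits / trials)  # close to 0.375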
Poisson distribution
A Poisson process is a process where events occur randomly in an interval of time or space.[2][8] The probability distribution for Poisson processes with constant rate λ per time interval is given by the following equation,[4] defining N(t) as the number of events that occur in the time interval [0, t]:

P(N(t) = n) = ((λt)^n / n!) e^(−λt)

It can be shown that the inter-arrival times for events are exponentially distributed, with a cumulative distribution function (CDF) of F(t) = 1 − e^(−λt). The inverse of the exponential CDF is given by t = −(1/λ) ln(u), where u is a uniformly distributed random variable on (0, 1).[2]
Simulating a Poisson process with a constant rate λ for the number of events N that occur in the interval [0, T] can be carried out with the following algorithm,[9] implemented in the sketch after the list.
- Begin with t = 0 and N = 0
- Generate a random variable u from the uniform distribution on (0, 1)
- Update the time with t = t − ln(u)/λ
- If t > T, then stop. Else continue to step 5.
- Set N = N + 1 and continue to step 2
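A minimal Python version of the algorithm above (variable names mirror the steps; the rate and horizon values are illustrative):

import math
import random

def poisson_process_count(lam, T):
    # Count events of a rate-lam Poisson process on [0, T] by accumulating
    # exponentially distributed inter-arrival times (steps 1-5 above).
    t, N = 0.0, 0
    while True:
        u = 1.0 - random.random()   # u in (0, 1], avoids log(0)
        t -= math.log(u) / lam      # t = t - ln(u)/lam
        if t > T:
            return N
        N += 1

# The mean count over many runs should be close to lam * T = 6.
runs = [poisson_process_count(2.0, 3.0) for _ in range(10_000)]
print(sum(runs) / len(runs))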
Methods
Direct and first reaction methods
Published by Dan Gillespie in 1977, the direct method is a linear search on the cumulative array. See Gillespie algorithm.
Gillespie’s Stochastic Simulation Algorithm (SSA) is essentially an exact procedure for numerically simulating the time evolution of a well-stirred chemically reacting system by taking proper account of the randomness inherent in such a system.[10]
It is rigorously based on the same microphysical premise that underlies the chemical master equation and gives a more realistic representation of a system’s evolution than the deterministic reaction rate equation (RRE) represented mathematically by ODEs.[10]
As with the chemical master equation, the SSA converges, in the limit of large numbers of reactants, to the same solution as the law of mass action.
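As an illustration of the direct method, a minimal Python sketch for a toy birth–death network (the reactions, rate constants, and initial counts are invented for the example and are not from the cited sources):

import random

def ssa_direct(x0, k_birth, k_death, T):
    # Gillespie direct method for a toy birth-death system:
    # reaction 1: 0 -> A with propensity a1 = k_birth
    # reaction 2: A -> 0 with propensity a2 = k_death * x
    t, x = 0.0, x0
    while True:
        a1 = k_birth
        a2 = k_death * x
        a0 = a1 + a2                  # total propensity
        if a0 == 0.0:
            return t, x
        t += random.expovariate(a0)   # exponential waiting time to the next event
        if t > T:
            return t, x
        # Linear search on the cumulative array picks the reaction to fire.
        if random.uniform(0.0, a0) < a1:
            x += 1
        else:
            x -= 1

print(ssa_direct(x0=0, k_birth=10.0, k_death=1.0, T=50.0))
# The population fluctuates around k_birth / k_death = 10.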
Next reaction method
Published in 2000 by Gibson and Bruck,[11] the next reaction method improves over the first reaction method by reducing the number of random numbers that need to be generated. To make the sampling of reactions more efficient, an indexed priority queue is used to store the reaction times. To make the computation of reaction propensities more efficient, a dependency graph is also used. This dependency graph tells which reaction propensities to update after a particular reaction has fired. While more efficient, the next reaction method requires more complex data structures than either direct simulation or the first reaction method.
Optimised and sorting direct methods
Published in 2004[12] and 2005, these methods sort the cumulative array to reduce the average search depth of the algorithm. The former runs a presimulation to estimate the firing frequency of reactions, whereas the latter sorts the cumulative array on-the-fly.
Logarithmic direct method
Published in 2006, this is a binary search on the cumulative array, which reduces the worst-case time complexity of reaction sampling to O(log M).
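A sketch of that sampling step in Python using the standard-library bisect module (the propensity values are made up for illustration):

import random
from bisect import bisect_right
from itertools import accumulate

rates = [0.5, 1.2, 0.3, 2.0]          # propensities of the M reactions (example values)
cumulative = list(accumulate(rates))  # cumulative array
R = cumulative[-1]                    # total propensity

z = random.uniform(0.0, R)
# Binary search for the first index whose cumulative propensity exceeds z;
# min() guards the measure-zero z == R edge case.
next_reaction = min(bisect_right(cumulative, z), len(rates) - 1)
print(next_reaction)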
Partial-propensity methods
Published in 2009, 2010, and 2011 (Ramaswamy 2009, 2010, 2011), these methods use factored-out, partial reaction propensities to reduce the computational cost so that it scales with the number of species in the network, rather than the (larger) number of reactions. Four variants exist:
- PDM, the partial-propensity direct method. Has a computational cost that scales linearly with the number of different species in the reaction network, independent of the coupling class of the network (Ramaswamy 2009).
- SPDM, the sorting partial-propensity direct method. Uses dynamic bubble sort to reduce the pre-factor of the computational cost in multi-scale reaction networks where the reaction rates span several orders of magnitude (Ramaswamy 2009).
- PSSA-CR, the partial-propensity SSA with composition-rejection sampling. Reduces the computational cost to constant time (i.e., independent of network size) for weakly coupled networks (Ramaswamy 2010) using composition-rejection sampling (Slepoy 2008).
- dPDM, the delay partial-propensity direct method. Extends PDM to reaction networks that incur time delays (Ramaswamy 2011) by providing a partial-propensity variant of the delay-SSA method (Bratsun 2005, Cai 2007).
The use of partial-propensity methods is limited to elementary chemical reactions, i.e., reactions with at most two different reactants. Every non-elementary chemical reaction can be equivalently decomposed into a set of elementary ones, at the expense of a linear (in the order of the reaction) increase in network size.
Approximate Methods
A general drawback of stochastic simulations is that for big systems, so many events happen that they cannot all be taken into account in a simulation. The following methods can dramatically improve simulation speed through some approximations.
τ leaping method
Since the SSA method keeps track of each transition, it would be impractical to implement for certain applications due to high time complexity. Gillespie proposed an approximation algorithm, the tau-leaping method, which decreases computational time with minimal loss of accuracy.[13] Instead of taking incremental steps in time, keeping track of X(t) at each time step as in the SSA method, the tau-leaping method leaps from one subinterval to the next, approximating how many transitions take place during a given subinterval. It is assumed that the value of the leap, τ, is small enough that there is no significant change in the value of the transition rates along the subinterval [t, t + τ]. This condition is known as the leap condition. The tau-leaping method thus has the advantage of simulating many transitions in one leap while not losing significant accuracy, resulting in a speed-up in computational time.[14]
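A minimal tau-leaping sketch in Python for the same toy birth–death process used above (τ, the rate constants, and the initial state are illustrative, and no leap-condition check is performed):

import math
import random

def poisson_sample(lam):
    # Knuth's method (adequate for small lam): multiply U(0,1) draws
    # until the running product drops below exp(-lam).
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def tau_leap(x0, k_birth, k_death, tau, T):
    # Tau-leaping for 0 -> A, A -> 0: in each leap of length tau,
    # draw a Poisson count of firings for every reaction channel.
    t, x = 0.0, x0
    while t < T:
        births = poisson_sample(k_birth * tau)
        deaths = poisson_sample(k_death * x * tau)
        x = max(x + births - deaths, 0)  # clamp to avoid negative counts
        t += tau
    return x

print(tau_leap(x0=0, k_birth=10.0, k_death=1.0, tau=0.01, T=10.0))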
Conditional Difference Method
This method approximates reversible processes (which include random walk/diffusion processes) by taking into account only the net rates of the opposing events of a reversible process. The main advantage of this method is that it can be implemented with a simple if-statement replacing the previous transition rates of the model with new, effective rates. The model with the replaced transition rates can thus be solved, for instance, with the conventional SSA.[15]
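As a rough sketch of that if-statement, assuming a reversible pair with forward rate r_fwd and backward rate r_bwd (this replacement rule is one plausible reading of the method, not code from the cited paper):

def effective_rates(r_fwd, r_bwd):
    # Replace the opposing rates of a reversible pair with a single
    # net effective rate in the direction of the larger rate.
    if r_fwd >= r_bwd:
        return r_fwd - r_bwd, 0.0  # net forward rate, no backward events
    return 0.0, r_bwd - r_fwd      # no forward events, net backward rate

print(effective_rates(3.0, 1.0))  # (2.0, 0.0): only net forward transitions remain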
Continuous simulation
While a discrete state space clearly distinguishes between particular states (values), in a continuous state space this is not possible because of continuity. The system usually changes over time, and the variables of the model then change continuously as well. Continuous simulation thereby simulates the system over time, given differential equations determining the rates of change of the state variables.[16] Examples of continuous systems are the predator/prey model[17] and cart-pole balancing.[18]
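As an illustration, a minimal forward-Euler sketch of the Lotka–Volterra predator/prey equations in Python (all parameter values and the step size are invented for the example):

def lotka_volterra(prey0, pred0, alpha, beta, gamma, delta, dt, steps):
    # Forward-Euler integration of
    #   d(prey)/dt = alpha*prey - beta*prey*pred
    #   d(pred)/dt = delta*prey*pred - gamma*pred
    prey, pred = prey0, pred0
    for _ in range(steps):
        dprey = alpha * prey - beta * prey * pred
        dpred = delta * prey * pred - gamma * pred
        prey += dprey * dt
        pred += dpred * dt
    return prey, pred

print(lotka_volterra(10.0, 5.0, alpha=1.1, beta=0.4,
                     gamma=0.4, delta=0.1, dt=0.001, steps=20_000))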
Probability distributions
Normal distribution
The random variable X is said to be normally distributed with parameters μ and σ, abbreviated by X ∈ N(μ, σ²), if the density of the random variable is given by the formula[4]

f(x) = (1 / (σ√(2π))) e^(−(x−μ)² / (2σ²))
Many things actually are normally distributed, or very close to it. For example, height and intelligence are approximately normally distributed; measurement errors also often have a normal distribution.[19]
Exponential distribution
Exponential distribution describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate.
The exponential distribution is popular, for example, in queuing theory when we want to model the time we have to wait until a certain event takes place. Examples include the time until the next client enters the store, the time until a certain company defaults or the time until some machine has a defect.[4]
Student's t-distribution
Student's t-distribution is used in finance as a probabilistic model of asset returns. The density function of the t-distribution is given by the following equation:[4]

f(t) = (Γ((n+1)/2) / (√(nπ) Γ(n/2))) (1 + t²/n)^(−(n+1)/2)

where n is the number of degrees of freedom and Γ is the gamma function.
For large values of n, the t-distribution doesn't significantly differ from a standard normal distribution. Usually, for values n > 30, the t-distribution is considered equal to the standard normal distribution.
Other distributions
Combined simulation
It is often possible to model one and the same system by use of completely different world views. Discrete event simulation of a problem, as well as continuous event simulation of it (continuous simulation with the discrete events that disrupt the continuous flow), may eventually lead to the same answers. Sometimes, however, the techniques can answer different questions about a system. If we need to answer all the questions, or if we don't know for what purposes the model is going to be used, it is convenient to apply a combined continuous/discrete methodology.[20] Similar techniques can change from a discrete, stochastic description to a deterministic, continuum description in a time- and space-dependent manner.[21] The use of this technique enables the capturing of noise due to small copy numbers, while being much faster to simulate than the conventional Gillespie algorithm. Furthermore, the use of the deterministic continuum description enables the simulation of arbitrarily large systems.
Monte Carlo simulation
Monte Carlo is an estimation procedure. The main idea is that if it is necessary to know the average value of some random variable and its distribution cannot be stated, and if it is possible to take samples from the distribution, we can estimate it by taking the samples, independently, and averaging them. If there are sufficient samples, then the law of large numbers says the average must be close to the true value. The central limit theorem says that the average has a Gaussian distribution around the true value.[22]
As a simple example, suppose we need to measure the area of a shape with a complicated, irregular outline. The Monte Carlo approach is to draw a square around the shape and measure the square. Then we throw darts into the square, as uniformly as possible. The fraction of darts falling on the shape gives the ratio of the area of the shape to the area of the square. In fact, it is possible to cast almost any integral problem, or any averaging problem, into this form. It is necessary to have a good way to tell if you're inside the outline, and a good way to figure out how many darts to throw. Last but not least, we need to throw the darts uniformly, i.e., using a good random number generator.[22]
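A classic instance of the dart-throwing idea is estimating π with a quarter circle inscribed in the unit square; a minimal Python sketch:

import random

def estimate_pi(darts):
    # Throw darts uniformly into the unit square; the fraction landing
    # inside the quarter circle x^2 + y^2 <= 1 approximates pi/4.
    inside = 0
    for _ in range(darts):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / darts

print(estimate_pi(1_000_000))  # close to 3.14159...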
Application
There are wide possibilities for the use of the Monte Carlo method:[1]
- Statistical experiments using generation of random variables (e.g. dice)
- Sampling methods
- Mathematics (e.g. numerical integration, multiple integrals)
- Reliability engineering
- Project management (Six Sigma)
- Experimental particle physics
- Simulations
- Risk measurement/risk management (e.g. portfolio value estimation)
- Economics (e.g. finding the best fitting demand curve)
- Process simulation
- Operations research
Random number generators
Source:[23]
For simulation experiments (including Monte Carlo) it is necessary to generate random numbers (as values of variables). The problem is that a computer is a highly deterministic machine—basically, behind each process there is always an algorithm, a deterministic computation changing inputs to outputs; therefore it is not easy to generate uniformly spread random numbers over a defined interval or set.[1]
A random number generator is a device capable of producing a sequence of numbers which cannot be "easily" identified with deterministic properties. This sequence is then called a sequence of stochastic numbers.[24]
The algorithms typically rely on pseudorandom numbers, computer-generated numbers mimicking true random numbers, to generate a realization, one possible outcome of a process.[25]
Methods for obtaining random numbers have existed for a long time and are used in many different fields (such as gaming). However, these numbers suffer from a certain bias. Currently the best methods expected to produce truly random sequences are natural methods that take advantage of the random nature of quantum phenomena.[24]
See also
References
[edit]- ^ a b c d DLOUHÝ, M.; FÁBRY, J.; KUNCOVÁ, M.. Simulace pro ekonomy. Praha : VŠE, 2005.
- ^ a b c d Dekking, Frederik Michel (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. Springer. ISBN 1-85233-896-2. OCLC 783259968.
- ^ stochastic. (n.d.). Online Etymology Dictionary. Retrieved January 23, 2014, from Dictionary.com website: http://dictionary.reference.com/browse/stochastic
- ^ a b c d e f Rachev, Svetlozar T. Stoyanov, Stoyan V. Fabozzi, Frank J., "Chapter 1 Concepts of Probability" in Advanced Stochastic Models, Risk Assessment, and Portfolio Optimization : The Ideal Risk, Uncertainty, and Performance Measures, Hoboken, NJ, USA: Wiley, 2008
- ^ Rachev, Svetlozar T.; Stoyanov, Stoyan V.; Fabozzi, Frank J. (2011-04-14). A Probability Metrics Approach to Financial Risk Measures. doi:10.1002/9781444392715. ISBN 9781444392715.
- ^ Bernoulli Distribution, The University of Chicago - Department of Statistics, [online] available at http://galton.uchicago.edu/~eichler/stat22000/Handouts/l12.pdf
- ^ "The Binomial Distribution". Archived from the original on 2014-02-26. Retrieved 2014-01-25.
- ^ Haight, Frank A. (1967). Handbook of the Poisson distribution. Wiley. OCLC 422367440.
- ^ Sigman, Karl. "Poisson processes, and Compound (batch) Poisson processes" (PDF).
- ^ a b Stephen Gilmore, An Introduction to Stochastic Simulation - Stochastic Simulation Algorithms, University of Edinburgh, [online] available at http://www.doc.ic.ac.uk/~jb/conferences/pasta2006/slides/stochastic-simulation-introduction.pdf
- ^ Michael A. Gibson and Jehoshua Bruck, Efficient exact stochastic simulation of chemical systems with many species and many channels, J. Phys. Chem. A, 104:1876–1899, 2000.
- ^ Y. Cao, H. Li, and L. Petzold. Efficient formulation of the stochastic simulation algorithm for chemically reacting systems, J. Chem. Phys, 121(9):4059–4067, 2004.
- ^ Gillespie, D.T. (1976). "A General Method for Numerically Simulating the stochastic time evolution of coupled chemical reactions". Journal of Computational Physics. 22 (4): 403–434. Bibcode:1976JCoPh..22..403G. doi:10.1016/0021-9991(76)90041-3.
- ^ H.T. Banks, Anna Broido, Brandi Canter, Kaitlyn Gayvert, Shuhua Hu, Michele Joyner, Kathryn Link, Simulation Algorithms for Continuous Time Markov Chain Models, [online] available at http://www.ncsu.edu/crsc/reports/ftp/pdf/crsc-tr11-17.pdf
- ^ Spill, F; Maini, PK; Byrne, HM (2016). "Optimisation of simulations of stochastic processes by removal of opposing reactions". Journal of Chemical Physics. 144 (8): 084105. arXiv:1602.02655. Bibcode:2016JChPh.144h4105S. doi:10.1063/1.4942413. PMID 26931679. S2CID 13334842.
- ^ Crespo-Márquez, A., R. R. Usano and R. D. Aznar, 1993, "Continuous and Discrete Simulation in a Production Planning System. A Comparative Study"
- ^ Louis G. Birta, Gilbert Arbez (2007). Modelling and Simulation, p. 255. Springer.
- ^ "Pole Balancing Tutorial".
- ^ University of Notre Dame, Normal Distribution, [online] available at http://www3.nd.edu/~rwilliam/stats1/x21.pdf
- ^ Francois E. Cellier, Combined Continuous/Discrete Simulation Applications, Techniques, and Tools
- ^ Spill, F.; et al. (2015). "Hybrid approaches for multiple-species stochastic reaction–diffusion models". Journal of Computational Physics. 299: 429–445. arXiv:1507.07992. Bibcode:2015JCoPh.299..429S. doi:10.1016/j.jcp.2015.07.002. PMC 4554296. PMID 26478601.
- ^ a b Cosma Rohilla Shalizi, Monte Carlo, and Other Kinds of Stochastic Simulation, [online] available at http://bactra.org/notebooks/monte-carlo.html
- ^ Tanaka, M.; Takahashi, Y.; Shimoda, T.; Yosoi, M.; Takahisa, K.; Plis, Yu. A. (2008-01-31). "RCNP polarized He3 ion source". Review of Scientific Instruments. 79 (2). doi:10.1063/1.2821495. ISSN 0034-6748.
- ^ a b Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms - chapitre 3 : Random Numbers (Addison-Wesley, Boston, 1998).
- ^ Andreas Hellander, Stochastic Simulation and Monte Carlo Methods, [online] available at http://www.it.uu.se/edu/course/homepage/bervet2/MCkompendium/mc.pdf
- (Slepoy 2008): Slepoy, A; Thompson, AP; Plimpton, SJ (2008). "A constant-time kinetic Monte Carlo algorithm for simulation of large biochemical reaction networks". Journal of Chemical Physics. 128 (20): 205101. Bibcode:2008JChPh.128t5101S. doi:10.1063/1.2919546. PMID 18513044.
- (Bratsun 2005): D. Bratsun; D. Volfson; J. Hasty; L. Tsimring (2005). "Delay-induced stochastic oscillations in gene regulation". PNAS. 102 (41): 14593–8. Bibcode:2005PNAS..10214593B. doi:10.1073/pnas.0503858102. PMC 1253555. PMID 16199522.
- (Cai 2007): X. Cai (2007). "Exact stochastic simulation of coupled chemical reactions with delays". J. Chem. Phys. 126 (12): 124108. Bibcode:2007JChPh.126l4108C. doi:10.1063/1.2710253. PMID 17411109.
- Hartmann, A.K. (2009). Practical Guide to Computer Simulations. World Scientific. ISBN 978-981-283-415-7. Archived from the original on 2009-02-11. Retrieved 2012-05-03.
- Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007). "Section 17.7. Stochastic Simulation of Chemical Reaction Networks". Numerical Recipes: The Art of Scientific Computing (3rd ed.). New York: Cambridge University Press. ISBN 978-0-521-88068-8.
- (Ramaswamy 2009): R. Ramaswamy; N. Gonzalez-Segredo; I. F. Sbalzarini (2009). "A new class of highly efficient exact stochastic simulation algorithms for chemical reaction networks". J. Chem. Phys. 130 (24): 244104. arXiv:0906.1992. Bibcode:2009JChPh.130x4104R. doi:10.1063/1.3154624. PMID 19566139. S2CID 4952205.
- (Ramaswamy 2010): R. Ramaswamy; I. F. Sbalzarini (2010). "A partial-propensity variant of the composition-rejection stochastic simulation algorithm for chemical reaction networks" (PDF). J. Chem. Phys. 132 (4): 044102. Bibcode:2010JChPh.132d4102R. doi:10.1063/1.3297948. PMID 20113014.
- (Ramaswamy 2011): R. Ramaswamy; I. F. Sbalzarini (2011). "A partial-propensity formulation of the stochastic simulation algorithm for chemical reaction networks with delays" (PDF). J. Chem. Phys. 134 (1): 014106. Bibcode:2011JChPh.134a4106R. doi:10.1063/1.3521496. PMID 21218996. S2CID 4949530.
External links
- Software
- cayenne - Fast, easy to use Python package for stochastic simulations. Implementations of direct, tau-leaping, and tau-adaptive algorithms.
- StochSS - StochSS: Stochastic Simulation Service - A Cloud Computing Framework for Modeling and Simulation of Stochastic Biochemical Systems.
- ResAssure - Stochastic reservoir simulation software - solves fully implicit, dynamic three-phase fluid flow equations for every geological realisation.
- Cain - Stochastic simulation of chemical kinetics. Direct, next reaction, tau-leaping, hybrid, etc.
- pSSAlib - C++ implementations of all partial-propensity methods.
- StochPy - Stochastic modelling in Python
- STEPS - STochastic Engine for Pathway Simulation using swig to create Python interface to C/C++ code
Overview
Definition and Principles
Stochastic simulation is a computational method that employs random sampling to model and approximate the behavior of systems characterized by inherent randomness or uncertainty, generating multiple realizations or sample paths of stochastic processes to infer expected outcomes.[1] This approach contrasts with deterministic simulation, where inputs produce fixed, repeatable outputs without variability; in stochastic simulation, elements such as arrival times, service durations, or environmental noise introduce probabilistic variability, necessitating statistical analysis to handle the resulting output distributions.[1][5]

Key principles underpinning stochastic simulation include replication, where independent runs of the model using different random seeds are performed to build statistical confidence in estimates, and convergence to true expectations, guaranteed by the law of large numbers.[1] Under the law of large numbers, the sample mean X̄_n = (1/n) Σ X_i of independent and identically distributed random variables X_1, ..., X_n with finite expectation μ converges in probability to μ as the number of replications n approaches infinity. This convergence ensures that, with sufficiently many replications, simulated averages reliably approximate system performance measures like expected throughput or queue lengths.[1][5]

The basic workflow of stochastic simulation begins with model formulation, defining the system's entities, states, and probabilistic rules (e.g., exponential distributions for inter-event times).[1] This is followed by random input generation, producing sequences of pseudo-random numbers to drive stochastic elements, such as simulating customer arrivals in a queueing system.[1] The model is then executed through algorithms like discrete-event simulation, advancing time via event occurrences and updating system states accordingly.[1] Finally, output analysis aggregates results across replications to compute metrics including the sample mean, variance, and confidence intervals, enabling robust inference about the underlying probabilistic system.[1][5]
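A minimal sketch of the output-analysis step in Python, estimating a mean with a normal-approximation confidence interval over independent replications (the replication body here is a stand-in, not a particular model from the sources):

import math
import random

def one_replication():
    # Stand-in for a full simulation run: here, the average of 100
    # exponential service times with mean 2.0.
    return sum(random.expovariate(0.5) for _ in range(100)) / 100

n = 1_000
outputs = [one_replication() for _ in range(n)]
mean = sum(outputs) / n
var = sum((y - mean) ** 2 for y in outputs) / (n - 1)
half_width = 1.96 * math.sqrt(var / n)  # ~95% normal-approximation CI
print(f"{mean:.3f} +/- {half_width:.3f}")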
The term "stochastic" originates from the Greek adjective stokhastikos (στοχαστικός), denoting "skillful in aiming" or relating to conjecture and guesswork. Jacob Bernoulli connected this concept to probability in his 1713 treatise Ars Conjectandi, employing the phrase "Ars Conjectandi sive Stochastice" to frame the study of probabilistic inference as an art of reasoned guessing.[6][7] The practical foundations of stochastic simulation took shape in the 1940s amid World War II efforts, when Stanisław Ulam and John von Neumann developed the Monte Carlo method at Los Alamos National Laboratory as part of the Manhattan Project. This approach used random sampling on early computers to model neutron diffusion in nuclear fission processes, marking the first systematic application of probabilistic simulation to solve intractable physical problems.[8] In the 1950s, stochastic techniques advanced into discrete-event simulation for operations research, with John McLeod playing a pivotal role by outlining core simulation principles in 1952 at the Naval Air Missile Test Center, which spurred the creation of dedicated simulation societies and broader adoption in military and industrial planning.[9] The 1960s saw the emergence of general-purpose simulation languages, such as SIMULA and GPSS, which facilitated more accessible and versatile modeling of stochastic systems.[10] The 1970s brought innovations in stochastic simulation, highlighted by Daniel T. Gillespie's 1977 algorithm for exact stochastic modeling of chemical kinetics in well-mixed systems, enabling precise trajectory simulations of reacting molecules.[11] From the 1980s onward, stochastic simulation evolved alongside computational hardware, incorporating parallel processing in the 1980s and 1990s to distribute discrete-event workloads across multiprocessor systems and tackle expansive models in fields like telecommunications and logistics.[12] By the 2000s and 2010s, graphics processing units (GPUs) revolutionized the domain through massive parallelism, accelerating simulations of complex stochastic processes such as biochemical reaction-diffusion networks by orders of magnitude compared to CPU-based methods.[13] Into the 2020s, advancements have included integration of machine learning for variance reduction and surrogate models, enhancing efficiency in high-dimensional stochastic simulations as of 2025.[14]Fundamentals
Fundamentals
Key Probability Distributions
Stochastic simulations model uncertainty and randomness in systems by drawing from probability distributions that capture different forms of variability, such as discrete counts, binary events, or continuous times. These distributions provide the foundational mechanisms for generating synthetic data that mimics real-world stochastic behavior, enabling analysts to evaluate system performance under uncertainty. The choice of distribution depends on the nature of the randomness being modeled, with parameters tuned to match empirical data or theoretical expectations. Seminal works in stochastic modeling emphasize these as core building blocks for constructing more complex simulations.[15]

The Bernoulli distribution represents the simplest form of discrete randomness, modeling binary outcomes like success or failure in a single trial. It is parameterized by p, the probability of success (where 0 ≤ p ≤ 1), with probability mass function P(X = 1) = p and P(X = 0) = 1 − p. This distribution is essential for simulating individual random events, such as a machine malfunction or a customer arrival decision, forming the basis for more aggregated models in discrete stochastic systems.[15]

The binomial distribution generalizes the Bernoulli to count the number of successes in a fixed number of independent trials. It has parameters n (number of trials, a positive integer) and p (success probability per trial), with probability mass function

P(Y = k) = C(n, k) p^k (1 − p)^(n−k), k = 0, 1, ..., n

This distribution is widely used in simulations to model cumulative outcomes, such as the number of defective items in a batch or successes in repeated experiments, where the total trials are predefined.[15]

The Poisson distribution models the count of independent events occurring in a fixed interval of time or space, particularly when events are rare and random. Parameterized by λ (the expected rate or mean number of events), its probability mass function is

P(N = k) = λ^k e^(−λ) / k!, k = 0, 1, 2, ...

It serves as a limiting approximation to the binomial distribution for large n and small p (with λ = np), making it suitable for simulating arrivals in queues, defects in quality control, or particle emissions in physical processes.[15]

The exponential distribution captures continuous waiting times between successive events in a Poisson process, modeling interarrival or lifetime durations. It is parameterized by rate λ (the inverse of the mean time), with probability density function f(t) = λe^(−λt) for t ≥ 0. A key property is its memorylessness—the probability of an event occurring in the next interval does not depend on time already elapsed—which underpins its use in simulating service times, repair durations, or decay processes in continuous-time stochastic systems.[15]

The normal distribution, also known as Gaussian, describes symmetric, continuous variability around a central value, common for aggregated measurements or errors. Parameterized by mean μ and variance σ², its probability density function is

f(x) = (1 / (σ√(2π))) e^(−(x−μ)² / (2σ²))

Its prevalence in stochastic simulation stems from the central limit theorem, which states that sums or averages of many independent random variables (regardless of their original distributions) converge to a normal distribution as the number of terms increases, justifying its application to model noise, measurement errors, or population traits in large-scale systems.[15]

Additional distributions extend these foundations for specialized randomness. The uniform distribution provides baseline randomness over a continuous interval [a, b], with parameters a and b defining the range, often used to initialize simulations or model equally likely outcomes. The Student's t-distribution, parameterized by degrees of freedom ν, accounts for heavier tails and uncertainty in estimates from small samples, aiding simulations involving statistical inference. The gamma distribution, with shape parameter k and rate parameter λ, models positive continuous variables like total waiting times that sum multiple exponential interarrivals, such as processing durations in multi-stage systems. These are selected based on goodness-of-fit to data, enhancing the fidelity of stochastic models.[15]
Random Number Generation
Stochastic simulations rely on sequences of numbers that mimic true randomness to model uncertainty and variability in systems. Pseudo-random number generators (PRNGs) produce such sequences through deterministic algorithms that start from an initial seed value and apply recurrence relations to generate subsequent numbers, appearing statistically random while ensuring reproducibility for verification and debugging purposes.[16] In contrast, true random number generators (TRNGs) derive entropy from unpredictable physical processes, such as thermal noise or radioactive decay, yielding genuinely unpredictable outputs but lacking the controllability needed for repeatable simulations.[17] PRNGs dominate stochastic simulation due to their computational efficiency and ability to produce long, non-repeating sequences from finite state spaces.[16]

A foundational PRNG is the linear congruential generator (LCG), defined by the recurrence relation

X_{n+1} = (a X_n + c) mod m

where X_n is the current state, a is the multiplier, c the increment, and m the modulus, typically chosen as a large power of 2 like 2^32 or 2^64 to maximize the range.[18] The output uniform variate is then U_n = X_n / m, yielding values in [0, 1). Proper selection of parameters ensures a full period of length m, meaning the sequence cycles through all possible states before repeating, though poor choices can result in short cycles and detectable patterns.[18] Limitations of LCGs include their susceptibility to serial correlation and inadequate performance in high-dimensional applications, often failing advanced statistical tests due to structured lattice artifacts in generated points.[18] For more robust generation in complex stochastic simulations, the Mersenne Twister algorithm, introduced in 1997 by Makoto Matsumoto and Takuji Nishimura, offers a period of 2^19937 − 1, vastly exceeding typical simulation requirements and providing excellent equidistribution in up to 623 dimensions.[19] This twisted generalized feedback shift register design efficiently produces high-quality uniform pseudorandom numbers suitable for Monte Carlo methods and other high-dimensional integrations, with minimal computational overhead after initialization.[19]

The quality of PRNG outputs is evaluated using statistical tests to verify properties like uniformity, independence, and absence of correlations. The chi-squared test assesses uniformity by partitioning the [0, 1) interval into k cells, computing observed frequencies O_i of generated numbers, and comparing them to expected uniform counts E_i via the statistic

χ² = Σ_{i=1}^{k} (O_i − E_i)² / E_i

which follows a chi-squared distribution under the null hypothesis of uniformity.[20] The runs test checks independence by counting sequences of ascending or descending values (runs) in the sequence and applying a chi-squared goodness-of-fit to detect dependencies, with deviations indicating non-random patterns.[20] The spectral test evaluates lattice structure by examining the discrete Fourier transform of binary sequences to identify low-discrepancy hyperplanes, where a high spectral value signals poor multidimensional uniformity.[20]

To sample from non-uniform probability distributions essential in stochastic simulation, PRNG uniforms are transformed using methods like inverse transform sampling. This technique generates a sample from a target cumulative distribution function (CDF) F by drawing U ~ Uniform[0,1] and setting X = F⁻¹(U), ensuring X follows the desired distribution if the inverse exists.[21] For discrete distributions with probabilities p_i, cumulative sums guide selection:

Sample u ~ Uniform[0,1]
Find smallest j such that sum_{i=1 to j} p_i >= u
Return x_j
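Returning to the LCG recurrence above, a minimal Python sketch (the multiplier, increment, and modulus below are commonly cited example constants, not a recommendation):

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    # Linear congruential generator: X_{n+1} = (a*X_n + c) mod m.
    # Yields uniform variates U_n = X_n / m in [0, 1).
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

rng = lcg(seed=42)
print([round(next(rng), 4) for _ in range(5)])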
When the inverse CDF is unavailable or expensive, acceptance-rejection sampling can be used instead: candidates are drawn from a proposal density f̃ that satisfies f(x) ≤ c·f̃(x) for some constant c ≥ 1, and each candidate is accepted with probability f(x)/(c·f̃(x)):

While not accepted:
Draw x ~ f̃(·)
Draw u ~ Uniform[0,1]
If u ≤ f(x) / (c·f̃(x)), accept and return x
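A runnable acceptance-rejection sketch in Python, with a Beta(2, 2) target and a uniform proposal (target, proposal, and bound c are chosen purely for illustration):

import random

def target_density(x):
    # Beta(2, 2) density f(x) = 6x(1 - x) on [0, 1]; its maximum is 1.5 at x = 0.5.
    return 6.0 * x * (1.0 - x)

def rejection_sample(c=1.5):
    # Proposal is Uniform[0, 1] (density 1 on [0, 1]), so f(x) <= c * 1 everywhere.
    while True:
        x = random.random()   # draw candidate from the proposal
        u = random.random()
        if u <= target_density(x) / (c * 1.0):
            return x

samples = [rejection_sample() for _ in range(10_000)]
print(sum(samples) / len(samples))  # mean of Beta(2, 2) is 0.5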