Geometric distribution
from Wikipedia
Geometric
[Figures: probability mass function and cumulative distribution function plots]

Parameters: $0 < p \leq 1$, success probability (real)
Support: $k$ trials where $k \in \{1, 2, 3, \dots\}$; or $k$ failures where $k \in \{0, 1, 2, \dots\}$
PMF: $(1-p)^{k-1}p$; or $(1-p)^k p$
CDF: $1-(1-p)^{\lfloor k \rfloor}$ for $k \geq 1$, $0$ for $k < 1$; or $1-(1-p)^{\lfloor k \rfloor + 1}$ for $k \geq 0$, $0$ for $k < 0$
Mean: $\frac{1}{p}$; or $\frac{1-p}{p}$
Median: $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil$ (not unique if $\frac{-1}{\log_2(1-p)}$ is an integer); or $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$ (not unique if $\frac{-1}{\log_2(1-p)}$ is an integer)
Mode: $1$; or $0$
Variance: $\frac{1-p}{p^2}$
Skewness: $\frac{2-p}{\sqrt{1-p}}$
Excess kurtosis: $6 + \frac{p^2}{1-p}$
Entropy: $\frac{-(1-p)\log(1-p) - p\log p}{p}$
MGF: $\frac{pe^t}{1-(1-p)e^t}$ for $t < -\ln(1-p)$; or $\frac{p}{1-(1-p)e^t}$ for $t < -\ln(1-p)$
CF: $\frac{pe^{it}}{1-(1-p)e^{it}}$; or $\frac{p}{1-(1-p)e^{it}}$
PGF: $\frac{pz}{1-(1-p)z}$; or $\frac{p}{1-(1-p)z}$
Fisher information: $\frac{1}{p^2(1-p)}$

(In each row, the first expression is for the distribution over trials $\{1, 2, 3, \dots\}$ and the second for failures $\{0, 1, 2, \dots\}$.)

In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:

  • The probability distribution of the number $X$ of Bernoulli trials needed to get one success, supported on $\{1, 2, 3, \dots\}$;
  • The probability distribution of the number $Y = X - 1$ of failures before the first success, supported on $\{0, 1, 2, \dots\}$.

These two different geometric distributions should not be confused with each other. Often, the name shifted geometric distribution is adopted for the former one (the distribution of $X$); however, to avoid ambiguity, it is considered wise to indicate which is intended by mentioning the support explicitly.

The geometric distribution gives the probability that the first occurrence of success requires $k$ independent trials, each with success probability $p$. If the probability of success on each trial is $p$, then the probability that the $k$-th trial is the first success is

$\Pr(X = k) = (1-p)^{k-1} p$ for $k = 1, 2, 3, 4, \dots$

The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:

$\Pr(Y = k) = \Pr(X = k + 1) = (1-p)^k p$ for $k = 0, 1, 2, 3, \dots$

The geometric distribution gets its name because its probabilities follow a geometric sequence. It is sometimes called the Furry distribution after Wendell H. Furry.[1]: 210 

Definition


The geometric distribution is the discrete probability distribution that describes when the first success in an infinite sequence of independent and identically distributed Bernoulli trials occurs. Its probability mass function depends on its parameterization and support. When supported on $\{1, 2, 3, \dots\}$, the probability mass function is $\Pr(X = k) = (1-p)^{k-1} p$, where $k$ is the number of trials and $p$ is the probability of success in each trial.[2]: 260–261

The support may also be $\{0, 1, 2, \dots\}$, defining $Y = X - 1$. This alters the probability mass function into $\Pr(Y = k) = (1-p)^k p$, where $k$ is the number of failures before the first success.[3]: 66

An alternative parameterization of the distribution gives the probability mass function $\Pr(Y = k) = (1-q)q^k$, where $q = 1 - p$ and $k \in \{0, 1, 2, \dots\}$.[1]: 208–209

An example of a geometric distribution arises from rolling a six-sided die until a "1" appears. Each roll is independent with a $1/6$ chance of success. The number of rolls needed follows a geometric distribution with $p = 1/6$.
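A minimal simulation sketch of this example (function names are illustrative): roll a fair die until a "1" appears, and compare the empirical mean number of rolls with $1/p = 6$.

```python
import random

def rolls_until_one(rng: random.Random) -> int:
    """Roll a fair die until a '1' appears; return the number of rolls."""
    rolls = 0
    while True:
        rolls += 1
        if rng.randint(1, 6) == 1:  # success with probability p = 1/6
            return rolls

rng = random.Random(0)
samples = [rolls_until_one(rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 1/p = 6
```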

Properties


Memorylessness


The geometric distribution is the only memoryless discrete probability distribution.[4] It is the discrete version of the same property found in the exponential distribution.[1]: 228  The property asserts that the number of previously failed trials does not affect the number of future trials needed for a success.

Because there are two definitions of the geometric distribution, there are also two definitions of memorylessness for discrete random variables.[5] Expressed in terms of conditional probability, the two definitions are

$\Pr(X > m + n \mid X > m) = \Pr(X > n)$

and

$\Pr(Y > m + n \mid Y \geq m) = \Pr(Y > n),$

where $m$ and $n$ are natural numbers, $X$ is a geometrically distributed random variable defined over $\{1, 2, 3, \dots\}$, and $Y$ is a geometrically distributed random variable defined over $\{0, 1, 2, \dots\}$. Note that these definitions are not equivalent for discrete random variables; $Y$ does not satisfy the first equation and $X$ does not satisfy the second.
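Both identities can be checked numerically from the survival functions $\Pr(X > k) = (1-p)^k$ and $\Pr(Y > k) = (1-p)^{k+1}$. A short sketch, with the arbitrary values $p = 0.3$, $m = 4$, $n = 2$:

```python
p, m, n = 0.3, 4, 2
surv_X = lambda k: (1 - p) ** k        # P(X > k), support {1, 2, ...}
surv_Y = lambda k: (1 - p) ** (k + 1)  # P(Y > k), support {0, 1, ...}

# First definition: P(T > m+n | T > m) = P(T > n) -- holds for X, not Y.
print(surv_X(m + n) / surv_X(m), surv_X(n))  # 0.49, 0.49
print(surv_Y(m + n) / surv_Y(m), surv_Y(n))  # 0.49, 0.343

# Second definition: P(T > m+n | T >= m) = P(T > n) -- holds for Y, not X.
print(surv_Y(m + n) / (1 - p) ** m, surv_Y(n))        # 0.343, 0.343
print(surv_X(m + n) / (1 - p) ** (m - 1), surv_X(n))  # 0.343, 0.49
```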

Moments and cumulants


The expected value and variance of a geometrically distributed random variable $X$ defined over $\{1, 2, 3, \dots\}$ are[2]: 261

$\operatorname{E}[X] = \frac{1}{p}, \qquad \operatorname{Var}[X] = \frac{1-p}{p^2}.$

With a geometrically distributed random variable $Y$ defined over $\{0, 1, 2, \dots\}$, the expected value changes into $\operatorname{E}[Y] = \frac{1-p}{p}$ while the variance stays the same.[6]: 114–115

For example, when rolling a six-sided die until landing on a "1", the average number of rolls needed is $\frac{1}{1/6} = 6$ and the average number of failures is $\frac{1 - 1/6}{1/6} = 5$.

The moment generating function of the geometric distribution when defined over $\{1, 2, 3, \dots\}$ and $\{0, 1, 2, \dots\}$ respectively is[7][6]: 114

$M_X(t) = \frac{p e^t}{1 - (1-p)e^t}, \qquad M_Y(t) = \frac{p}{1 - (1-p)e^t}, \qquad t < -\ln(1-p).$

The moments for the number of failures before the first success are given by

$\operatorname{E}[Y^n] = \sum_{k=0}^{\infty} k^n (1-p)^k p = p \operatorname{Li}_{-n}(1-p) \quad (n \neq 0),$

where $\operatorname{Li}_{-n}$ is the polylogarithm function.[8]
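As a sanity check, the polylogarithm identity can be compared against a truncated direct sum, here using mpmath's polylog (a sketch; $p = 0.3$ and $n = 3$ are arbitrary, and the truncation point is chosen loosely):

```python
from mpmath import mpf, polylog

p, n = mpf("0.3"), 3
direct = sum(k**n * (1 - p)**k * p for k in range(1, 2000))  # truncated series
print(direct)                  # E[Y^3] by direct summation
print(p * polylog(-n, 1 - p))  # closed form; agrees closely
```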

The cumulant generating function of the geometric distribution defined over $\{0, 1, 2, \dots\}$ is[1]: 216

$K(t) = \ln p - \ln\!\left(1 - (1-p)e^t\right).$

The cumulants $\kappa_r$ satisfy the recursion

$\kappa_{n+1} = q \frac{d\kappa_n}{dq},$

where $q = 1 - p$, when defined over $\{0, 1, 2, \dots\}$.[1]: 216

Proof of expected value


Consider the expected value $\operatorname{E}[X]$ of $X$ as above, i.e. the average number of trials until a success. The first trial either succeeds with probability $p$, or fails with probability $1-p$. If it fails, the remaining mean number of trials until a success is identical to the original mean; this follows from the fact that all trials are independent.

From this we get the formula:

$\operatorname{E}[X] = p \cdot 1 + (1-p)\left(1 + \operatorname{E}[X]\right),$

which, when solved for $\operatorname{E}[X]$, gives:

$\operatorname{E}[X] = \frac{1}{p}.$

The expected number of failures can be found from the linearity of expectation, $\operatorname{E}[Y] = \operatorname{E}[X - 1] = \operatorname{E}[X] - 1 = \frac{1}{p} - 1 = \frac{1-p}{p}$. It can also be shown in the following way:

$\operatorname{E}[Y] = \sum_{k=0}^{\infty} k\,(1-p)^k p = p(1-p) \sum_{k=1}^{\infty} k (1-p)^{k-1} = p(1-p)\left(-\frac{d}{dp} \sum_{k=0}^{\infty} (1-p)^k\right) = p(1-p)\left(-\frac{d}{dp}\,\frac{1}{p}\right) = \frac{1-p}{p}.$

The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.

Summary statistics


The mean of the geometric distribution is its expected value which is, as previously discussed in § Moments and cumulants, $\frac{1}{p}$ or $\frac{1-p}{p}$ when defined over $\{1, 2, 3, \dots\}$ or $\{0, 1, 2, \dots\}$ respectively.

The median of the geometric distribution is $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil$ when defined over $\{1, 2, 3, \dots\}$[9] and $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$ when defined over $\{0, 1, 2, \dots\}$.[3]: 69

The mode of the geometric distribution is the first value in the support set. This is $1$ when defined over $\{1, 2, 3, \dots\}$ and $0$ when defined over $\{0, 1, 2, \dots\}$.[3]: 69

The skewness of the geometric distribution is $\frac{2-p}{\sqrt{1-p}}$.[6]: 115

The kurtosis of the geometric distribution is $9 + \frac{p^2}{1-p}$.[6]: 115 The excess kurtosis of a distribution is the difference between its kurtosis and the kurtosis of a normal distribution, $3$.[10]: 217 Therefore, the excess kurtosis of the geometric distribution is $6 + \frac{p^2}{1-p}$. Since $\frac{p^2}{1-p} \geq 0$, the excess kurtosis is always positive, so the distribution is leptokurtic.[3]: 69 In other words, the tail of a geometric distribution decays more slowly than a Gaussian tail.[10]: 217
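These summary statistics can be confirmed with SciPy, whose geom is the $\{1, 2, 3, \dots\}$ (trials) version; a sketch for the arbitrary choice $p = 0.2$:

```python
from scipy.stats import geom

p = 0.2
mean, var, skew, kurt = geom.stats(p, moments="mvsk")
print(mean, var)       # 1/p = 5.0 and (1-p)/p**2 = 20.0
print(skew)            # (2-p)/sqrt(1-p) ~ 2.0125
print(kurt)            # excess kurtosis 6 + p**2/(1-p) = 6.05
print(geom.median(p))  # ceil(-1/log2(1-p)) = 4.0
```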

Entropy and Fisher's Information


Entropy (Geometric Distribution, Failures Before Success)


Entropy is a measure of uncertainty in a probability distribution. For the geometric distribution that models the number of failures before the first success, the probability mass function is:

$P(Y = k) = (1-p)^k p, \qquad k = 0, 1, 2, \dots$

The entropy $H(Y)$ for this distribution is defined as:

$H(Y) = -\sum_{k=0}^{\infty} P(Y = k) \ln P(Y = k) = \frac{-(1-p)\ln(1-p) - p \ln p}{p}.$

The entropy increases as the probability $p$ decreases, reflecting greater uncertainty as success becomes rarer.
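A short sketch comparing this closed form (in nats) against a truncated direct sum, for the arbitrary choice $p = 0.25$:

```python
import math

p = 0.25
closed_form = (-(1 - p) * math.log(1 - p) - p * math.log(p)) / p
series = -sum((1 - p)**k * p * math.log((1 - p)**k * p) for k in range(500))
print(closed_form, series)  # agree to many decimal places
```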

Fisher's Information (Geometric Distribution, Failures Before Success)


Fisher information measures the amount of information that an observable random variable $Y$ carries about an unknown parameter $p$. For the geometric distribution (failures before the first success), the Fisher information with respect to $p$ is given by:

$I(p) = \frac{1}{p^2(1-p)}.$

Proof:

  • The Likelihood Function for a geometric random variable $Y$ is: $L(p; k) = (1-p)^k p.$
  • The Log-Likelihood Function is: $\ell(p; k) = k \ln(1-p) + \ln p.$
  • The Score Function (first derivative of the log-likelihood w.r.t. $p$) is: $\frac{\partial \ell}{\partial p} = \frac{1}{p} - \frac{k}{1-p}.$
  • The second derivative of the log-likelihood function is: $\frac{\partial^2 \ell}{\partial p^2} = -\frac{1}{p^2} - \frac{k}{(1-p)^2}.$
  • Fisher Information is calculated as the negative expected value of the second derivative: $I(p) = -\operatorname{E}\!\left[\frac{\partial^2 \ell}{\partial p^2}\right] = \frac{1}{p^2} + \frac{\operatorname{E}[Y]}{(1-p)^2} = \frac{1}{p^2} + \frac{(1-p)/p}{(1-p)^2} = \frac{1}{p^2(1-p)}.$

Fisher information increases as $p$ decreases, indicating that rarer successes provide more information about the parameter $p$.
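The final step of the proof can be sketched numerically by assembling the negative expected second derivative from $\operatorname{E}[Y] = (1-p)/p$, here for the arbitrary value $p = 0.4$:

```python
p = 0.4
mean_Y = (1 - p) / p                   # E[Y] for the failures version
info = 1 / p**2 + mean_Y / (1 - p)**2  # -E[d^2 log L / dp^2]
print(info, 1 / (p**2 * (1 - p)))      # both ~ 10.4167
```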

Entropy (Geometric Distribution, Trials Until Success)


For the geometric distribution modeling the number of trials until the first success, the probability mass function is:

$P(X = k) = (1-p)^{k-1} p, \qquad k = 1, 2, 3, \dots$

The entropy for this distribution is the same as that of the version modeling failures before the first success, since the two differ only by a deterministic shift:

$H(X) = \frac{-(1-p)\ln(1-p) - p \ln p}{p}.$

Fisher's Information (Geometric Distribution, Trials Until Success)


Fisher information for the geometric distribution modeling the number of trials until the first success is given by:

$I(p) = \frac{1}{p^2(1-p)}.$

Proof:

  • The Likelihood Function for a geometric random variable $X$ is: $L(p; k) = (1-p)^{k-1} p.$

  • The Log-Likelihood Function is: $\ell(p; k) = (k-1)\ln(1-p) + \ln p.$

  • The Score Function (first derivative of the log-likelihood w.r.t. $p$) is: $\frac{\partial \ell}{\partial p} = \frac{1}{p} - \frac{k-1}{1-p}.$

  • The second derivative of the log-likelihood function is: $\frac{\partial^2 \ell}{\partial p^2} = -\frac{1}{p^2} - \frac{k-1}{(1-p)^2}.$

  • Fisher Information is calculated as the negative expected value of the second derivative: $I(p) = -\operatorname{E}\!\left[\frac{\partial^2 \ell}{\partial p^2}\right] = \frac{1}{p^2} + \frac{\operatorname{E}[X] - 1}{(1-p)^2} = \frac{1}{p^2} + \frac{(1-p)/p}{(1-p)^2} = \frac{1}{p^2(1-p)}.$

General properties

  • The probability generating functions of geometric random variables $X$ and $Y$ defined over $\{1, 2, 3, \dots\}$ and $\{0, 1, 2, \dots\}$ are, respectively,[6]: 114–115 $G_X(s) = \frac{ps}{1-(1-p)s}$ and $G_Y(s) = \frac{p}{1-(1-p)s}$, for $|s| < (1-p)^{-1}$.
  • The characteristic function is equal to $\varphi(t) = G(e^{it})$, so the geometric distribution's characteristic function, when defined over $\{1, 2, 3, \dots\}$ and $\{0, 1, 2, \dots\}$ respectively, is[11]: 1630 $\varphi_X(t) = \frac{p e^{it}}{1-(1-p)e^{it}}$ and $\varphi_Y(t) = \frac{p}{1-(1-p)e^{it}}$.
  • The entropy of a geometric distribution with parameter $p$ is[12] $\frac{-(1-p)\log_2(1-p) - p \log_2 p}{p}.$
  • Given a mean, the geometric distribution is the maximum entropy probability distribution of all discrete probability distributions with that mean. The corresponding continuous distribution is the exponential distribution.[13]
  • The geometric distribution defined on $\{0, 1, 2, \dots\}$ is infinitely divisible, that is, for any positive integer $n$, there exist $n$ independent identically distributed random variables whose sum has the same geometric distribution. This is because the negative binomial distribution can be derived from a Poisson-stopped sum of logarithmic random variables.[11]: 606–607
  • The decimal digits of the geometrically distributed random variable $Y$ are a sequence of independent (and not identically distributed) random variables.[citation needed] For example, the hundreds digit $D$ has this probability distribution: $\Pr(D = d) = q^{100d}\,\frac{1 - q^{100}}{1 - q^{1000}}$ for $d = 0, 1, \dots, 9$, where $q = 1 - p$; similarly for the other digits and, more generally, for numeral systems with bases other than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
  • Golomb coding is the optimal prefix code[clarification needed] for the geometric discrete distribution.[12]
Related distributions

  • The sum of $n$ independent geometric random variables with parameter $p$ is a negative binomial random variable with parameters $n$ and $p$.[14] The geometric distribution is a special case of the negative binomial distribution, with $r = 1$.
  • The geometric distribution is a special case of discrete compound Poisson distribution.[11]: 606 
  • The minimum of $n$ geometric random variables with parameters $p_1, \dots, p_n$ is also geometrically distributed, with parameter $1 - \prod_{i=1}^{n}(1 - p_i)$ (see the sketch after this list).[15]
  • Suppose $0 < r < 1$, and for $k = 1, 2, 3, \dots$ the random variable $X_k$ has a Poisson distribution with expected value $r^k/k$. Then $\sum_{k=1}^{\infty} k X_k$ has a geometric distribution taking values in $\{0, 1, 2, \dots\}$, with expected value $r/(1-r)$.[citation needed]
  • The exponential distribution is the continuous analogue of the geometric distribution. Applying the floor function to the exponential distribution with parameter $\lambda$ creates a geometric distribution with parameter $p = 1 - e^{-\lambda}$ defined over $\{0, 1, 2, \dots\}$.[3]: 74 This can be used to generate geometrically distributed random numbers as detailed in § Random variate generation.
  • If $p = 1/n$ and $X$ is geometrically distributed with parameter $p$, then the distribution of $X/n$ approaches an exponential distribution with expected value 1 as $n \to \infty$, since $\Pr(X/n > a) = \Pr(X > na) = (1 - 1/n)^{na} \to e^{-a}$ as $n \to \infty$. More generally, if $p = \lambda/n$, where $\lambda$ is a parameter, then as $n \to \infty$ the distribution of $X/n$ approaches an exponential distribution with rate $\lambda$: $\lim_{n \to \infty} \Pr(X/n > a) = \lim_{n \to \infty} (1 - \lambda/n)^{na} = e^{-\lambda a}$; therefore the distribution function of $X/n$ converges to $1 - e^{-\lambda a}$, which is that of an exponential random variable.[citation needed]
  • The index of dispersion of the geometric distribution (defined over $\{0, 1, 2, \dots\}$) is $\frac{1}{p}$ and its coefficient of variation is $\frac{1}{\sqrt{1-p}}$. The distribution is overdispersed.[1]: 216
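A Monte Carlo sketch of the minimum property above (the generator is the inverse-transform method described in § Random variate generation; the parameter values are arbitrary):

```python
import math
import random

rng = random.Random(1)

def geom_trials(p: float) -> int:
    # inverse-transform draw on {1, 2, 3, ...}
    u = 1.0 - rng.random()  # u in (0, 1]
    return max(1, math.ceil(math.log(u) / math.log(1 - p)))

p1, p2 = 0.2, 0.3
p_min = 1 - (1 - p1) * (1 - p2)  # predicted parameter: 0.44
mins = [min(geom_trials(p1), geom_trials(p2)) for _ in range(200_000)]
print(sum(mins) / len(mins), 1 / p_min)  # empirical mean vs 1/0.44 ~ 2.27
```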

Statistical inference


The true parameter $p$ of an unknown geometric distribution can be inferred through estimators and conjugate distributions.

Method of moments


Provided they exist, the first $l$ moments of a probability distribution can be estimated from a sample $x_1, \dots, x_n$ using the formula

$m_r = \frac{1}{n} \sum_{i=1}^{n} x_i^r,$

where $m_r$ is the $r$th sample moment and $1 \leq r \leq l$.[16]: 349–350 Estimating $\operatorname{E}[X]$ with $m_1$ gives the sample mean, denoted $\bar{x}$. Substituting this estimate in the formula for the expected value of a geometric distribution and solving for $p$ gives the estimators $\hat{p} = \frac{1}{\bar{x}}$ and $\hat{p} = \frac{1}{\bar{x} + 1}$ when supported on $\{1, 2, 3, \dots\}$ and $\{0, 1, 2, \dots\}$ respectively. These estimators are biased since $\operatorname{E}\!\left[\frac{1}{\bar{x}}\right] > \frac{1}{\operatorname{E}[\bar{x}]} = p$ as a result of Jensen's inequality.[17]: 53–54
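A minimal sketch of these estimators (the function and flag names are illustrative):

```python
def mom_estimate(samples, support_starts_at_one: bool) -> float:
    """Method-of-moments estimate of p, by inverting the mean formula."""
    xbar = sum(samples) / len(samples)  # first sample moment m_1
    return 1 / xbar if support_starts_at_one else 1 / (xbar + 1)

print(mom_estimate([3, 1, 5, 2], support_starts_at_one=True))  # 4/11 ~ 0.364
```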

Maximum likelihood estimation


The maximum likelihood estimator of $p$ is the value that maximizes the likelihood function given a sample.[16]: 308 By finding the zero of the derivative of the log-likelihood function when the distribution is defined over $\{1, 2, 3, \dots\}$, the maximum likelihood estimator can be found to be $\hat{p} = \frac{1}{\bar{x}}$, where $\bar{x}$ is the sample mean.[18] If the domain is $\{0, 1, 2, \dots\}$, then the estimator shifts to $\hat{p} = \frac{1}{\bar{x} + 1}$. As previously discussed in § Method of moments, these estimators are biased.

Regardless of the domain, the bias is equal to

$b \equiv \operatorname{E}\!\left[\hat{p}_{\mathrm{mle}}\right] - p = \frac{p(1-p)}{n},$

which yields the bias-corrected maximum likelihood estimator,[citation needed]

$\hat{p}^{*}_{\mathrm{mle}} = \hat{p}_{\mathrm{mle}} - \hat{b} = \hat{p}_{\mathrm{mle}} - \frac{\hat{p}_{\mathrm{mle}}\left(1 - \hat{p}_{\mathrm{mle}}\right)}{n}.$
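Assuming the bias expression above, the plug-in correction can be sketched as:

```python
def mle_bias_corrected(samples, support_starts_at_one: bool = True) -> float:
    """MLE of p with the first-order plug-in bias correction p(1-p)/n."""
    n = len(samples)
    xbar = sum(samples) / n
    p_hat = 1 / xbar if support_starts_at_one else 1 / (xbar + 1)
    return p_hat - p_hat * (1 - p_hat) / n

print(mle_bias_corrected([3, 1, 5, 2]))  # slightly below the raw MLE 4/11
```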

Bayesian inference


In Bayesian inference, the parameter $p$ is a random variable from a prior distribution with a posterior distribution calculated using Bayes' theorem after observing samples.[17]: 167 If a beta distribution is chosen as the prior distribution, then the posterior will also be a beta distribution, and it is called the conjugate distribution. In particular, if a $\mathrm{Beta}(\alpha, \beta)$ prior is selected, then the posterior, after observing samples $k_1, \dots, k_n \in \{1, 2, 3, \dots\}$, is[19]

$p \sim \mathrm{Beta}\!\left(\alpha + n,\ \beta + \sum_{i=1}^{n} (k_i - 1)\right).$

Alternatively, if the samples are in $\{0, 1, 2, \dots\}$, the posterior distribution is[20]

$p \sim \mathrm{Beta}\!\left(\alpha + n,\ \beta + \sum_{i=1}^{n} k_i\right).$

Since the expected value of a $\mathrm{Beta}(\alpha, \beta)$ distribution is $\frac{\alpha}{\alpha + \beta}$,[11]: 145 as $\alpha$ and $\beta$ approach zero, the posterior mean approaches its maximum likelihood estimate.
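A sketch of the conjugate update (names are illustrative):

```python
def beta_posterior(alpha, beta, samples, support_starts_at_one=True):
    """Update a Beta(alpha, beta) prior on p with geometric samples."""
    n = len(samples)
    excess = sum(k - 1 for k in samples) if support_starts_at_one else sum(samples)
    return alpha + n, beta + excess

a, b = beta_posterior(1.0, 1.0, [3, 1, 5, 2])  # flat Beta(1, 1) prior
print(a, b, a / (a + b))  # posterior parameters and posterior mean of p
```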

Random variate generation


The geometric distribution can be generated experimentally from i.i.d. standard uniform random variables by finding the first such random variable to be less than or equal to $p$. However, the number of random variables needed is also geometrically distributed and the algorithm slows as $p$ decreases.[21]: 498

Random generation can be done in constant time by truncating exponential random numbers. An exponential random variable $E$ can become geometrically distributed with parameter $p$ through $\lceil E / (-\ln(1-p)) \rceil$, which is supported on $\{1, 2, 3, \dots\}$. In turn, $E$ can be generated from a standard uniform random variable $U$, altering the formula into $\lceil \ln(U) / \ln(1-p) \rceil$.[21]: 499–500 [22]
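Both generators can be sketched as follows; the naive method draws uniforms until a success, while the constant-time method truncates an exponential (a sketch, with $p = 0.3$ arbitrary):

```python
import math
import random

rng = random.Random(42)

def geometric_naive(p: float) -> int:
    # count uniforms until one is <= p; the runtime is itself geometric
    k = 1
    while rng.random() > p:
        k += 1
    return k

def geometric_truncated_exp(p: float) -> int:
    # constant time: ceiling of a scaled exponential, via a uniform draw
    u = 1.0 - rng.random()  # u in (0, 1]
    return max(1, math.ceil(math.log(u) / math.log(1 - p)))

p = 0.3
draws = [geometric_truncated_exp(p) for _ in range(100_000)]
print(sum(draws) / len(draws))  # close to 1/p ~ 3.33
```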

Applications


The geometric distribution is used in many disciplines. In queueing theory, the M/M/1 queue has a steady state following a geometric distribution.[23] In stochastic processes, the Yule–Furry process is geometrically distributed.[24] The distribution also arises when modeling the lifetime of a device in discrete contexts.[25] It has also been used to fit data, including modeling patients spreading COVID-19.[26]

from Grokipedia
The geometric distribution is a discrete probability distribution that models the number of independent Bernoulli trials required to achieve the first success, where each trial has a constant success probability $p$ (with $0 < p \leq 1$). It arises in scenarios involving repeated independent experiments with binary outcomes, such as success or failure, and is fundamental in probability theory for analyzing waiting times until an event occurs. There are two common parameterizations of the geometric distribution, differing in whether the random variable counts the total number of trials until the first success (with support $X = 1, 2, 3, \dots$) or the number of failures preceding the first success (with support $Y = 0, 1, 2, \dots$). For the trials-until-success version, the probability mass function is given by

$P(X = k) = (1-p)^{k-1} p, \quad k = 1, 2, 3, \dots$

while for the failures-before-success version, it is

$P(Y = k) = (1-p)^k p, \quad k = 0, 1, 2, \dots$

The expected value is $E[X] = \frac{1}{p}$ for the former and $E[Y] = \frac{1-p}{p}$ for the latter, with both sharing the variance $\frac{1-p}{p^2}$.

A defining feature of the geometric distribution is its memoryless property, which states that the probability of requiring additional trials beyond a certain point is independent of the trials already conducted: $P(X > s + t \mid X > s) = P(X > t)$ for non-negative integers $s$ and $t$. This property uniquely characterizes the geometric distribution among discrete distributions on the non-negative integers and parallels the exponential distribution in continuous settings. The distribution also serves as the special case $r = 1$ of the negative binomial distribution, which generalizes it to the number of trials until the $r$-th success. Applications of the geometric distribution are widespread in fields requiring modeling of waiting times or trial counts until an event, such as reliability engineering (e.g., time until an engine failure in independent tests), quality control (e.g., inspections until a defective item is found), and telecommunications (e.g., packet retransmissions until successful delivery). It is also used in ecology for modeling runs of species occurrences and in computer science for analyzing algorithm performance in randomized settings.

Definition

Parameterizations

The geometric distribution arises in the context of independent Bernoulli trials, each with success probability $p$ where $0 < p \leq 1$. One standard parameterization defines the random variable $X$ as the number of failures preceding the first success, with possible values in the set $\{0, 1, 2, \dots\}$. This formulation, often considered primary in mathematical probability, directly corresponds to the terms of a geometric series indexed from 0. An alternative parameterization specifies the random variable $Y$ as the total number of trials required to achieve the first success, taking values in $\{1, 2, 3, \dots\}$. Here, $Y = X + 1$, linking the two definitions. Both parameterizations remain in common use due to contextual preferences: the failures version suits derivations involving infinite series and theoretical probability, while the trials version is favored in applied statistics for representing waiting times or experiment counts. The probability of failure on each trial is conventionally denoted $q = 1 - p$.

Probability Mass Function

The geometric distribution can be parameterized in terms of the number of failures $X$ before the first success in a sequence of independent Bernoulli trials, each with success probability $p$ where $0 < p \leq 1$. The probability mass function (PMF) for this parameterization is given by

$P(X = k) = (1-p)^k p, \quad k = 0, 1, 2, \dots$

This PMF is verified to be a valid probability distribution, as the infinite sum over the support equals 1:

$\sum_{k=0}^{\infty} P(X = k) = p \sum_{k=0}^{\infty} (1-p)^k = p \cdot \frac{1}{1 - (1-p)} = 1,$

using the formula for the sum of an infinite geometric series with common ratio $|1-p| < 1$. An alternative parameterization models the number of trials $Y$ until the first success, also based on independent Bernoulli trials with success probability $p$. The corresponding PMF is

$P(Y = k) = (1-p)^{k-1} p, \quad k = 1, 2, 3, \dots$

This PMF similarly sums to 1 over its support:

$\sum_{k=1}^{\infty} P(Y = k) = p \sum_{k=1}^{\infty} (1-p)^{k-1} = p \sum_{j=0}^{\infty} (1-p)^j = p \cdot \frac{1}{1 - (1-p)} = 1,$

again applying the infinite geometric series sum. The random variables $X$ and $Y$ are related by $Y = X + 1$, which induces a probability shift such that $P(Y = k) = P(X = k - 1)$ for $k = 1, 2, \dots$, accounting for the difference in their supports and the exponent adjustment in the PMF.

Cumulative Distribution Function

The cumulative distribution function (CDF) of a geometric random variable provides the probability that the number of failures or trials until the first success is at most a given value. There are two common parameterizations of the geometric distribution, which differ in whether the random variable counts the number of failures before the first success (starting from 0) or the number of trials until the first success (starting from 1). Consider first the parameterization where $X$ denotes the number of failures before the first success in a sequence of independent Bernoulli trials, each with success probability $p$ where $0 < p < 1$. The CDF is given by

$F_X(k) = P(X \leq k) = 1 - (1-p)^{k+1}, \quad k = 0, 1, 2, \dots$

This closed-form expression is obtained by summing the probability mass function (PMF) from 0 to $k$, which forms a finite geometric series. For the alternative parameterization where $Y$ represents the trial number on which the first success occurs, the support begins at 1, and the CDF is

$F_Y(k) = P(Y \leq k) = 1 - (1-p)^k, \quad k = 1, 2, 3, \dots$

Similarly, this follows from summing the corresponding PMF from 1 to $k$. In both cases, the CDF approaches 1 as $k \to \infty$, since $|1-p| < 1$, ensuring that success eventually occurs with probability 1. The survival function, $S(k) = 1 - F(k)$, is then $S_X(k) = (1-p)^{k+1}$ for the failures parameterization and $S_Y(k) = (1-p)^k$ for the trials parameterization, representing the probability that no success has occurred by trial $k$. These CDFs give the cumulative probability of the first success happening by the $k$-th trial (or after at most $k$ failures), which is fundamental for modeling waiting times in discrete processes.
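A short sketch checking both closed forms against cumulative sums of the PMFs, for the arbitrary values $p = 0.35$ and $k = 6$ (note that in this section $X$ counts failures and $Y$ counts trials):

```python
p, K = 0.35, 6
cdf_X = 1 - (1 - p) ** (K + 1)  # F_X(K), X counts failures
cdf_Y = 1 - (1 - p) ** K        # F_Y(K), Y counts trials
print(cdf_X, sum((1 - p) ** k * p for k in range(K + 1)))
print(cdf_Y, sum((1 - p) ** (k - 1) * p for k in range(1, K + 1)))
```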

Properties

Memorylessness

The memoryless property of the geometric distribution states that, for a random variable $X$ representing the number of trials until the first success with success probability $p$ (where $0 < p < 1$), the conditional probability $P(X > s + t \mid X > s) = P(X > t)$ holds for all nonnegative integers $s$ and $t$. This means that the probability of requiring more than $t$ additional trials after already observing $s$ failures remains unchanged from the original probability of exceeding $t$ trials from the start. To prove this, consider the survival function derived from the cumulative distribution function (CDF). For this parameterization, $P(X > k) = (1-p)^k$ for $k = 0, 1, 2, \dots$. Thus,

$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{(1-p)^{s+t}}{(1-p)^s} = (1-p)^t = P(X > t).$

This equality demonstrates that past outcomes do not influence future probabilities. The property implies that the distribution of the remaining number of trials (or failures) is identical to the original distribution, regardless of the number of prior failures observed, reflecting a lack of "aging" or dependence on history in the process. As the sole discrete distribution exhibiting memorylessness, the geometric distribution serves as the discrete analogue to the exponential distribution's continuous memoryless property.

Moments and Cumulants

The expected value of the geometric random variable $X$, representing the number of failures before the first success in a sequence of independent Bernoulli trials with success probability $p$, is given by $E[X] = \frac{1-p}{p}$. This can be derived directly from the probability mass function $P(X = k) = p(1-p)^k$ for $k = 0, 1, 2, \dots$, yielding

$E[X] = \sum_{k=0}^{\infty} k\, p(1-p)^k = p(1-p) \sum_{k=1}^{\infty} k (1-p)^{k-1} = \frac{1-p}{p},$

where the sum is evaluated using the derivative of the geometric series. Alternatively, leveraging the memoryless property of the geometric distribution, the tail sum formula provides $E[X] = \sum_{k=0}^{\infty} P(X > k)$, where $P(X > k) = (1-p)^{k+1}$, so $E[X] = \sum_{k=0}^{\infty} (1-p)^{k+1} = \frac{1-p}{p}$. The variance is $\operatorname{Var}(X) = \frac{1-p}{p^2}$. To derive this, first compute the second factorial moment

$E[X(X-1)] = \sum_{k=0}^{\infty} k(k-1)\, p(1-p)^k = p(1-p)^2 \sum_{k=2}^{\infty} k(k-1)(1-p)^{k-2} = p(1-p)^2 \cdot \frac{2}{p^3} = \frac{2(1-p)^2}{p^2}.$

Then,

$\operatorname{Var}(X) = E[X(X-1)] + E[X] - (E[X])^2 = \frac{2(1-p)^2}{p^2} + \frac{1-p}{p} - \left(\frac{1-p}{p}\right)^2 = \frac{1-p}{p^2}.$

In the alternative parameterization where $Y = X + 1$ denotes the number of trials until the first success, the expected value is $E[Y] = \frac{1}{p}$ and the variance is $\operatorname{Var}(Y) = \frac{1-p}{p^2}$. The skewness of the geometric distribution (in the failures parameterization) is $\frac{2-p}{\sqrt{1-p}}$.