Independence (probability theory)
from Wikipedia

Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent[1] if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

When dealing with collections of more than two events, two notions of independence need to be distinguished. The events are called pairwise independent if any two events in the collection are independent of each other, while mutual independence (or collective independence) of events means, informally speaking, that each event is independent of any combination of other events in the collection. A similar notion exists for collections of random variables. Mutual independence implies pairwise independence, but not the other way around. In the standard literature of probability theory, statistics, and stochastic processes, independence without further qualification usually refers to mutual independence.

Definition

For events

Two events

Two events $A$ and $B$ are independent (often written as $A \perp B$ or $A \perp\!\!\!\perp B$, where the latter symbol often is also used for conditional independence) if and only if their joint probability equals the product of their probabilities:[2]: p. 29 [3]: p. 10 

$$P(A \cap B) = P(A)P(B) \qquad \text{(Eq.1)}$$

$A \cap B \neq \emptyset$ indicates that two independent events $A$ and $B$ have common elements in their sample space so that they are not mutually exclusive (mutually exclusive if and only if (iff) $A \cap B = \emptyset$). Why this defines independence is made clear by rewriting with conditional probabilities $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ as the probability at which the event $A$ occurs provided that the event $B$ has or is assumed to have occurred:

$$P(A \cap B) = P(A)P(B) \iff P(A \mid B) = \frac{P(A \cap B)}{P(B)} = P(A),$$

and similarly

$$P(A \cap B) = P(A)P(B) \iff P(B \mid A) = \frac{P(A \cap B)}{P(A)} = P(B).$$

Thus, the occurrence of $B$ does not affect the probability of $A$, and vice versa. In other words, $A$ and $B$ are independent of each other. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities may be undefined if $P(A)$ or $P(B)$ is 0. Furthermore, the preferred definition makes clear by symmetry that when $A$ is independent of $B$, $B$ is also independent of $A$.
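The difference between the product rule and the conditional form can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the single fair die, the events `even`, `at_most_four`, and `seven`, and the helper names are not part of the source.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Exact probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(a, b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

def cond_prob(a, b):
    """P(A | B), or None when P(B) == 0 and the ratio is undefined."""
    return prob(a & b) / prob(b) if prob(b) > 0 else None

even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
seven = frozenset({7})  # impossible event, probability 0

print(independent(even, at_most_four))              # True
print(cond_prob(even, at_most_four) == prob(even))  # True
print(independent(even, seven))                     # True: both sides are 0
print(cond_prob(even, seven))                       # None: conditional form undefined
```

The last two lines show why the product rule is the preferred definition: it still gives an answer when one event has probability zero, while the conditional form does not.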

Odds

Stated in terms of odds, two events are independent if and only if the odds ratio of $A$ and $B$ is unity (1). Analogously with probability, this is equivalent to the conditional odds being equal to the unconditional odds:

$$O(A \mid B) = O(A) \text{ and } O(B \mid A) = O(B),$$

or to the odds of one event, given the other event, being the same as the odds of the event, given the other event not occurring:

$$O(A \mid B) = O(A \mid \neg B) \text{ and } O(B \mid A) = O(B \mid \neg A).$$

The odds ratio can be defined as

$$O(A \mid B) : O(A \mid \neg B),$$

or symmetrically for odds of $B$ given $A$, and thus is 1 if and only if the events are independent.
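As a rough sketch of the odds formulation, the snippet below computes the odds ratio for events on one roll of a fair die. The die, the chosen events, and the function names are assumptions made for illustration, and the events are picked so that no division by zero occurs.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumed: one roll of a fair die

def prob(event):
    """Exact probability under the uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))

def cond_odds(a, b):
    """Odds of A given B: P(A | B) / (1 - P(A | B))."""
    p = prob(a & b) / prob(b)
    return p / (1 - p)

def odds_ratio(a, b):
    """O(A | B) : O(A | not B)."""
    return cond_odds(a, b) / cond_odds(a, OMEGA - b)

even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
high = frozenset({4, 5, 6})

print(odds_ratio(even, at_most_four))  # 1  (independent events)
print(odds_ratio(even, high))          # 4  (dependent events)
```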

More than two events

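A finite set of events $\{A_i\}_{i=1}^{n}$ is pairwise independent if every pair of events is independent[4], that is, if and only if for all distinct pairs of indices $m, k$,

$$P(A_m \cap A_k) = P(A_m)P(A_k) \qquad \text{(Eq.2)}$$

A finite set of events is mutually independent if every event is independent of any intersection of the other events[4][3]: p. 11, that is, if and only if for every $k \leq n$ and for every $k$ indices $1 \leq i_1 < \dots < i_k \leq n$,

$$P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j}) \qquad \text{(Eq.3)}$$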

This is called the multiplication rule for independent events. It is not a single condition involving only the product of all the probabilities of all single events; it must hold true for all subsets of events.

For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true.[2]: p. 30 
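The gap between the two notions can be checked by brute force on a small sample space. The sketch below assumes two fair coin tosses and three illustrative events (first toss heads, second toss heads, both tosses agree); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

OMEGA = frozenset(product("HT", repeat=2))  # assumed: two fair coin tosses

def prob(event):
    """Exact probability under the uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))

def factorizes(events):
    """True if P(intersection of events) equals the product of their probabilities."""
    return prob(frozenset.intersection(*events)) == prod(prob(e) for e in events)

def pairwise_independent(events):
    return all(factorizes(pair) for pair in combinations(events, 2))

def mutually_independent(events):
    # The product rule must hold for every subcollection of size 2..n.
    return all(factorizes(sub)
               for k in range(2, len(events) + 1)
               for sub in combinations(events, k))

A = frozenset(w for w in OMEGA if w[0] == "H")   # first toss is heads
B = frozenset(w for w in OMEGA if w[1] == "H")   # second toss is heads
C = frozenset(w for w in OMEGA if w[0] == w[1])  # both tosses agree

print(pairwise_independent([A, B, C]))  # True
print(mutually_independent([A, B, C]))  # False
```

Here every pair satisfies the product rule, but the triple intersection has probability 1/4 rather than 1/8.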

Log probability and information content

Stated in terms of log probability, two events are independent if and only if the log probability of the joint event is the sum of the log probability of the individual events:

$$\log P(A \cap B) = \log P(A) + \log P(B)$$

In information theory, negative log probability is interpreted as information content, and thus two events are independent if and only if the information content of the combined event equals the sum of information content of the individual events:

$$I(A \cap B) = I(A) + I(B)$$

See Information content § Additivity of independent events for details.
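A small numerical sketch of this additivity, measuring information content in bits. The probabilities 0.5 and 0.25 and the variable names are assumptions chosen for illustration.

```python
from math import isclose, log2

def information_content(p):
    """Information content in bits of an event with probability p > 0."""
    return -log2(p)

p_a, p_b = 0.5, 0.25          # assumed marginal probabilities
p_joint_indep = p_a * p_b     # joint probability if A and B are independent
p_joint_dep = 0.25            # joint probability if B is contained in A

total = information_content(p_a) + information_content(p_b)
print(isclose(information_content(p_joint_indep), total))  # True
print(isclose(information_content(p_joint_dep), total))    # False
```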

For real valued random variables

Two random variables

Two random variables $X$ and $Y$ are independent if and only if (iff) the elements of the π-system generated by them are independent; that is to say, for every $x$ and $y$, the events $\{X \leq x\}$ and $\{Y \leq y\}$ are independent events (as defined above in Eq.1). That is, $X$ and $Y$ with cumulative distribution functions $F_X(x)$ and $F_Y(y)$, are independent iff the combined random variable $(X, Y)$ has a joint cumulative distribution function[3]: p. 15 

$$F_{X,Y}(x,y) = F_X(x)F_Y(y) \quad \text{for all } x, y,$$

or equivalently, if the probability densities $f_X(x)$ and $f_Y(y)$ and the joint probability density $f_{X,Y}(x,y)$ exist,

$$f_{X,Y}(x,y) = f_X(x)f_Y(y) \quad \text{for all } x, y.$$
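For discrete random variables with finitely many values, the factorization of the joint cumulative distribution function can be checked directly, since it suffices to test it at the support points. The following is a sketch only; the joint tables `indep` and `copied` and the function names are assumed for illustration.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Marginal pmfs of X and Y from a joint pmf {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def cdf_factorizes(joint):
    """Check F_{X,Y}(x, y) == F_X(x) * F_Y(y) at every pair of support points."""
    px, py = marginals(joint)
    for x0, y0 in product(px, py):
        f_xy = sum(p for (x, y), p in joint.items() if x <= x0 and y <= y0)
        f_x = sum(p for x, p in px.items() if x <= x0)
        f_y = sum(p for y, p in py.items() if y <= y0)
        if f_xy != f_x * f_y:
            return False
    return True

half, quarter = Fraction(1, 2), Fraction(1, 4)
indep = {(x, y): quarter for x in (0, 1) for y in (0, 1)}  # two fair bits
copied = {(0, 0): half, (1, 1): half}                      # Y is a copy of X

print(cdf_factorizes(indep))   # True
print(cdf_factorizes(copied))  # False
```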

More than two random variables

A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if for any sequence of numbers $\{x_1, \ldots, x_n\}$, the events $\{X_1 \leq x_1\}, \ldots, \{X_n \leq x_n\}$ are mutually independent events (as defined above in Eq.3). This is equivalent to the following condition on the joint cumulative distribution function $F_{X_1,\ldots,X_n}(x_1,\ldots,x_n)$. A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if[3]: p. 16 

$$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \text{for all } x_1, \ldots, x_n.$$

It is not necessary here to require that the probability distribution factorizes for all possible $k$-element subsets as in the case for $n$ events. This is not required because e.g. $F_{X_1,X_2,X_3}(x_1,x_2,x_3) = F_{X_1}(x_1) \cdot F_{X_2}(x_2) \cdot F_{X_3}(x_3)$ implies $F_{X_1,X_3}(x_1,x_3) = F_{X_1}(x_1) \cdot F_{X_3}(x_3)$ (let $x_2 \to \infty$ on both sides).

The measure-theoretically inclined reader may prefer to substitute events $\{X \in A\}$ for events $\{X \leq x\}$ in the above definition, where $A$ is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed by appropriate σ-algebras).

For real valued random vectors

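Two random vectors $\mathbf{X} = (X_1, \ldots, X_m)^{\mathrm{T}}$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\mathrm{T}}$ are called independent if[5]: p. 187 

$$F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x}, \mathbf{y},$$

where $F_{\mathbf{X}}(\mathbf{x})$ and $F_{\mathbf{Y}}(\mathbf{y})$ denote the cumulative distribution functions of $\mathbf{X}$ and $\mathbf{Y}$ and $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y})$ denotes their joint cumulative distribution function. Independence of $\mathbf{X}$ and $\mathbf{Y}$ is often denoted by $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$. Written component-wise, $\mathbf{X}$ and $\mathbf{Y}$ are called independent if

$$F_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(x_1,\ldots,x_m,y_1,\ldots,y_n) = F_{X_1,\ldots,X_m}(x_1,\ldots,x_m) \cdot F_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_m,y_1,\ldots,y_n.$$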

For stochastic processes

For one stochastic process

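The definition of independence may be extended from random vectors to a stochastic process. Therefore, it is required for an independent stochastic process that the random variables obtained by sampling the process at any $n$ times $t_1, \ldots, t_n$ are independent random variables for any $n$.[6]: p. 163 

Formally, a stochastic process $\{X_t\}_{t \in \mathcal{T}}$ is called independent, if and only if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$

$$F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = F_{X_{t_1}}(x_1) \cdot \ldots \cdot F_{X_{t_n}}(x_n) \quad \text{for all } x_1, \ldots, x_n,$$

where $F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = P(X(t_1) \leq x_1, \ldots, X(t_n) \leq x_n)$. Independence of a stochastic process is a property within a stochastic process, not between two stochastic processes.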

For two stochastic processes

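Independence of two stochastic processes is a property between two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ that are defined on the same probability space $(\Omega, \mathcal{F}, P)$. Formally, two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ are said to be independent if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$, the random vectors $(X(t_1), \ldots, X(t_n))$ and $(Y(t_1), \ldots, Y(t_n))$ are independent,[7]: p. 515  i.e. if

$$F_{X_{t_1},\ldots,X_{t_n},Y_{t_1},\ldots,Y_{t_n}}(x_1,\ldots,x_n,y_1,\ldots,y_n) = F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) \cdot F_{Y_{t_1},\ldots,Y_{t_n}}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_n,y_1,\ldots,y_n.$$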

Independent σ-algebras

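The definitions above (Eq.1 and Eq.2) are both generalized by the following definition of independence for σ-algebras. Let $(\Omega, \Sigma, P)$ be a probability space and let $\mathcal{A}$ and $\mathcal{B}$ be two sub-σ-algebras of $\Sigma$. $\mathcal{A}$ and $\mathcal{B}$ are said to be independent if, whenever $A \in \mathcal{A}$ and $B \in \mathcal{B}$,

$$P(A \cap B) = P(A)P(B).$$

Likewise, a finite family of σ-algebras $(\tau_i)_{i \in I}$, where $I$ is an index set, is said to be independent if and only if

$$\forall (A_i)_{i \in I} \in \prod_{i \in I} \tau_i : \quad P\left(\bigcap_{i \in I} A_i\right) = \prod_{i \in I} P(A_i)$$

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.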

The new definition relates to the previous ones very directly:

  • Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event $E \in \Sigma$ is, by definition, $\sigma(\{E\}) = \{\emptyset, E, \Omega \setminus E, \Omega\}$.
  • Two random variables $X$ and $Y$ defined over $\Omega$ are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable $X$ taking values in some measurable space $S$ consists, by definition, of all subsets of $\Omega$ of the form $X^{-1}(U)$, where $U$ is any measurable subset of $S$.

Using this definition, it is easy to show that if $X$ and $Y$ are random variables and $Y$ is constant, then $X$ and $Y$ are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra $\{\emptyset, \Omega\}$. Probability zero events cannot affect independence so independence also holds if $Y$ is only $P$-almost surely constant.
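On a finite sample space the σ-algebra definition can be tested exhaustively. The sketch below assumes two fair coin tosses and builds the four-element σ-algebra generated by a single event; all names are illustrative rather than taken from the source.

```python
from fractions import Fraction
from itertools import product

OMEGA = frozenset(product("HT", repeat=2))  # assumed: two fair coin tosses

def prob(event):
    """Exact probability under the uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))

def sigma_algebra(event):
    """σ-algebra generated by one event E: empty set, E, complement of E, OMEGA."""
    return {frozenset(), event, OMEGA - event, OMEGA}

def independent_algebras(f, g):
    """Product rule for every pair of sets drawn from the two σ-algebras."""
    return all(prob(a & b) == prob(a) * prob(b) for a in f for b in g)

first_heads = frozenset(w for w in OMEGA if w[0] == "H")
second_heads = frozenset(w for w in OMEGA if w[1] == "H")
trivial = {frozenset(), OMEGA}  # generated by any constant random variable

print(independent_algebras(sigma_algebra(first_heads), sigma_algebra(second_heads)))  # True
print(independent_algebras(sigma_algebra(first_heads), trivial))                      # True
print(independent_algebras(sigma_algebra(first_heads), sigma_algebra(first_heads)))   # False
```

The second check mirrors the remark about constant random variables: the trivial σ-algebra is independent of every other one.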

Properties

Self-independence

Note that an event $A$ is independent of itself if and only if

$$P(A) = P(A \cap A) = P(A) \cdot P(A) \iff P(A) = 0 \text{ or } P(A) = 1.$$

Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.[8]

Expectation and covariance

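If $X$ and $Y$ are statistically independent random variables, then the expectation operator $\operatorname{E}$ has the property

$$\operatorname{E}[X^n Y^m] = \operatorname{E}[X^n] \operatorname{E}[Y^m],$$[9]: p. 10 

and the covariance $\operatorname{cov}[X, Y]$ is zero, as follows from

$$\operatorname{cov}[X, Y] = \operatorname{E}[XY] - \operatorname{E}[X] \operatorname{E}[Y].$$

The converse does not hold: if two random variables have a covariance of 0 they still may be not independent.

Similarly for two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$: If they are independent, then they are uncorrelated.[10]: p. 151 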
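A standard counterexample for the converse can be verified exactly: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. This specific pair is an assumed illustration, not an example given in the source.

```python
from fractions import Fraction

xs = (-1, 0, 1)      # assumed support of X
p = Fraction(1, 3)   # X is uniform on xs; Y is defined as X squared

def expect(f):
    """Expectation of f(X) under the uniform distribution on xs."""
    return sum(p * f(x) for x in xs)

e_x = expect(lambda x: x)
e_y = expect(lambda x: x * x)
e_xy = expect(lambda x: x * x * x)   # E[XY] = E[X^3]
cov = e_xy - e_x * e_y

# P(X = 0, Y = 0) versus P(X = 0) * P(Y = 0)
p_joint = p
p_product = p * expect(lambda x: 1 if x * x == 0 else 0)

print(cov)                    # 0
print(p_joint == p_product)   # False: uncorrelated but not independent
```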

Characteristic function

Two random variables $X$ and $Y$ are independent if and only if the characteristic function of the random vector $(X, Y)$ satisfies

$$\varphi_{(X,Y)}(t, s) = \varphi_X(t) \cdot \varphi_Y(s).$$

In particular the characteristic function of their sum is the product of their marginal characteristic functions:

$$\varphi_{X+Y}(t) = \varphi_X(t) \cdot \varphi_Y(t),$$

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.
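The product formula for the sum can be checked numerically for two independent fair dice. This is a sketch under assumed inputs (the dice, the evaluation point `t = 0.7`, and the helper `char_fn` are illustrative), and it only demonstrates the forward implication.

```python
import cmath
from itertools import product

def char_fn(pmf, t):
    """Characteristic function E[exp(i t X)] of a finite pmf {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

die = {k: 1 / 6 for k in range(1, 7)}  # assumed: fair six-sided die

# pmf of the sum of two independent dice, built from the product of marginals
total = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    total[x + y] = total.get(x + y, 0.0) + px * py

t = 0.7  # arbitrary evaluation point
lhs = char_fn(total, t)
rhs = char_fn(die, t) * char_fn(die, t)
print(cmath.isclose(lhs, rhs, abs_tol=1e-12))  # True
```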

Examples

Rolling dice

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are not independent.
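Both claims can be confirmed by enumerating the 36 equally likely outcomes of two rolls. The sketch assumes a fair six-sided die; the function and event names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two rolls of a fair six-sided die.
OMEGA = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))

def independent(a, b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

def first_six(w):
    return w[0] == 6

def second_six(w):
    return w[1] == 6

def sum_eight(w):
    return w[0] + w[1] == 8

print(independent(first_six, second_six))  # True
print(independent(first_six, sum_eight))   # False: 1/36 versus 5/216
```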

Drawing cards

If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.
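The two drawing schemes can be compared by counting ordered pairs of cards. The sketch assumes a standard deck of 26 red and 26 black cards; the helper `red_red_check` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

DECK = ["red"] * 26 + ["black"] * 26  # assumed standard 52-card deck

def red_red_check(draws):
    """Return (P(first red), P(second red), P(both red)) over equally likely draws."""
    draws = list(draws)
    n = len(draws)
    first = Fraction(sum(a == "red" for a, b in draws), n)
    second = Fraction(sum(b == "red" for a, b in draws), n)
    both = Fraction(sum(a == "red" and b == "red" for a, b in draws), n)
    return first, second, both

with_repl = red_red_check(product(DECK, repeat=2))   # same card may be drawn twice
without_repl = red_red_check(permutations(DECK, 2))  # two distinct cards

for first, second, both in (with_repl, without_repl):
    print(both, first * second, both == first * second)
# with replacement:    1/4    1/4  True
# without replacement: 25/102 1/4  False
```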

Pairwise and mutual independence

[Figures: "Pairwise independent, but not mutually independent, events" and "Mutually independent events"]

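Consider the two probability spaces shown. In both cases, $P(A) = P(B) = 1/2$ and $P(C) = 1/4$. The events in the first space are pairwise independent because $P(A \mid B) = P(A \mid C) = 1/2 = P(A)$, $P(B \mid A) = P(B \mid C) = 1/2 = P(B)$, and $P(C \mid A) = P(C \mid B) = 1/4 = P(C)$; but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

$$P(A \mid BC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \neq P(A)$$
$$P(B \mid AC) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{1}{40}} = \tfrac{4}{5} \neq P(B)$$
$$P(C \mid AB) = \frac{\frac{4}{40}}{\frac{4}{40} + \frac{6}{40}} = \tfrac{2}{5} \neq P(C)$$

In the mutually independent case, however,

$$P(A \mid BC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = P(A)$$
$$P(B \mid AC) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{1}{16}} = \tfrac{1}{2} = P(B)$$
$$P(C \mid AB) = \frac{\frac{1}{16}}{\frac{1}{16} + \frac{3}{16}} = \tfrac{1}{4} = P(C)$$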

Triple-independence but no pairwise-independence

It is possible to create a three-event example in which

$$P(A \cap B \cap C) = P(A)P(B)P(C),$$

and yet no two of the three events are pairwise independent (and hence the set of events is not mutually independent).[11] This example shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just the single events as in this example.
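One way to construct such a case is sketched below on eight equally likely outcomes. The sample space and the three events are assumptions chosen for illustration and are not necessarily the example of the cited reference.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset(range(1, 9))  # assumed: eight equally likely outcomes

def prob(event):
    """Exact probability under the uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))

A = frozenset({1, 2, 3, 4})
B = frozenset({1, 2, 3, 5})
C = frozenset({1, 6, 7, 8})

triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)
pairs_ok = [prob(x & y) == prob(x) * prob(y) for x, y in combinations((A, B, C), 2)]

print(triple_ok)  # True: 1/8 == 1/2 * 1/2 * 1/2
print(pairs_ok)   # [False, False, False]
```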

Conditional independence

For events

The events $A$ and $B$ are conditionally independent given an event $C$ when

$$P(A \cap B \mid C) = P(A \mid C) \cdot P(B \mid C).$$

For random variables

Intuitively, two random variables $X$ and $Y$ are conditionally independent given $Z$ if, once $Z$ is known, the value of $Y$ does not add any additional information about $X$. For instance, two measurements $X$ and $Y$ of the same underlying quantity $Z$ are not independent, but they are conditionally independent given $Z$ (unless the errors in the two measurements are somehow connected).

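The formal definition of conditional independence is based on the idea of conditional distributions. If $X$, $Y$, and $Z$ are discrete random variables, then we define $X$ and $Y$ to be conditionally independent given $Z$ if

$$P(X \leq x, Y \leq y \mid Z = z) = P(X \leq x \mid Z = z) \cdot P(Y \leq y \mid Z = z)$$

for all $x$, $y$ and $z$ such that $P(Z = z) > 0$. On the other hand, if the random variables are continuous and have a joint probability density function $f_{XYZ}(x, y, z)$, then $X$ and $Y$ are conditionally independent given $Z$ if

$$f_{XY \mid Z}(x, y \mid z) = f_{X \mid Z}(x \mid z) \cdot f_{Y \mid Z}(y \mid z)$$

for all real numbers $x$, $y$ and $z$ such that $f_Z(z) > 0$.

If discrete $X$ and $Y$ are conditionally independent given $Z$, then

$$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$$

for any $x$, $y$ and $z$ with $P(Y = y, Z = z) > 0$. That is, the conditional distribution for $X$ given $Y$ and $Z$ is the same as that given $Z$ alone. A similar equation holds for the conditional probability density functions in the continuous case.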
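The measurement example can be made concrete with a small discrete model. The sketch below assumes a fair binary quantity $Z$ and two readings that each match $Z$ with probability 3/4, with independent errors; it uses the probability mass form of the definition, which is equivalent to the cumulative form for discrete variables. The noise model and all names are assumptions.

```python
from fractions import Fraction
from itertools import product

def noise(reading, z):
    """Assumed noise model: a reading equals z with probability 3/4."""
    return Fraction(3, 4) if reading == z else Fraction(1, 4)

# Joint pmf of (X, Y, Z): Z is a fair bit, X and Y are noisy readings of Z.
joint = {(x, y, z): Fraction(1, 2) * noise(x, z) * noise(y, z)
         for x, y, z in product((0, 1), repeat=3)}

def prob(pred):
    """Probability of the set of outcomes (x, y, z) satisfying pred."""
    return sum(p for w, p in joint.items() if pred(*w))

def conditionally_independent():
    for x0, y0, z0 in product((0, 1), repeat=3):
        pz = prob(lambda x, y, z: z == z0)
        pxy = prob(lambda x, y, z: (x, y, z) == (x0, y0, z0)) / pz
        px = prob(lambda x, y, z: x == x0 and z == z0) / pz
        py = prob(lambda x, y, z: y == y0 and z == z0) / pz
        if pxy != px * py:
            return False
    return True

def marginally_independent():
    return all(prob(lambda x, y, z: x == x0 and y == y0)
               == prob(lambda x, y, z: x == x0) * prob(lambda x, y, z: y == y0)
               for x0, y0 in product((0, 1), repeat=2))

print(conditionally_independent())  # True
print(marginally_independent())     # False: P(X=1, Y=1) = 5/16, not 1/4
```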

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.

History

Before 1933, independence, in probability theory, was defined in a verbal manner. For example, de Moivre gave the following definition: “Two events are independent, when they have no connexion one with the other, and that the happening of one neither forwards nor obstructs the happening of the other”.[12] If there are n independent events, the probability of the event that all of them happen was computed as the product of the probabilities of these n events. Apparently, there was the conviction that this formula was a consequence of the above definition. (Sometimes this was called the Multiplication Theorem.) Of course, a proof of this assertion cannot work without further, more formal tacit assumptions.

The definition of independence, given in this article, became the standard definition (now used in all books) after it appeared in 1933 as part of Kolmogorov's axiomatization of probability.[13] Kolmogorov credited it to S.N. Bernstein, and quoted a publication which had appeared in Russian in 1927.[14]

Unfortunately, neither Bernstein nor Kolmogorov had been aware of the work of Georg Bohlmann. Bohlmann had given the same definition for two events in 1901[15] and for n events in 1908.[16] In the latter paper, he studied his notion in detail. For example, he gave the first example showing that pairwise independence does not imply mutual independence. Even today, Bohlmann is rarely quoted. More about his work can be found in On the contributions of Georg Bohlmann to probability theory by Ulrich Krengel.[17]

from Grokipedia
In probability theory, independence is a foundational concept characterizing the lack of influence between events, random variables, or collections thereof, such that the probability of their joint occurrence equals the product of their marginal probabilities. Formally introduced in Andrey Kolmogorov's axiomatic framework in 1933, which grounds probability in measure theory, it enables the modeling of non-interacting random phenomena. For two events $A$ and $B$ in a probability space $(\Omega, \mathcal{F}, P)$, independence holds if $P(A \cap B) = P(A) P(B)$, equivalently $P(B \mid A) = P(B)$ when $P(A) > 0$.

This definition extends naturally to random variables, where two random variables $X$ and $Y$ are independent if their joint cumulative distribution function factors as $F_{X,Y}(x,y) = F_X(x) F_Y(y)$ for all $x, y \in \mathbb{R}$, implying that the joint probability density (if it exists) is the product of the marginal densities. For families of events or random variables, pairwise independence requires the condition for every pair, while mutual independence demands it for every finite subcollection, a stronger property essential for applications like the law of large numbers. Independence also generalizes to $\sigma$-algebras $\mathcal{G}$ and $\mathcal{H}$ within $\mathcal{F}$, defined by $P(G \cap H) = P(G) P(H)$ for all $G \in \mathcal{G}$, $H \in \mathcal{H}$, facilitating the analysis of stochastic processes where information from one subsystem does not affect another.

Key properties include preservation of independence when an event is replaced by its complement or by a countable union of pairwise disjoint events each independent of the other event, preservation of independence under measurable (in particular, monotone) transformations of random variables, and the zero-one law, which asserts that tail events of a sequence of independent $\sigma$-algebras have probability 0 or 1. In statistics and stochastic modeling, independence underpins assumptions in hypothesis testing, Bayesian inference, and Markov chains, allowing simplification of complex joint distributions into tractable products. Violations of independence, such as dependence in financial time series or biological correlations, highlight its role in distinguishing random from structured variability.

Definitions

Events

In a probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra of events, and $P$ is a probability measure, two events $A, B \in \mathcal{F}$ are independent if the probability of their intersection equals the product of their individual probabilities: $P(A \cap B) = P(A) P(B)$. This definition, formalized in Kolmogorov's axiomatization, captures the notion that the occurrence of one event provides no information about the other. Equivalent formulations hold when conditional probabilities are defined: $P(A \mid B) = P(A)$ provided $P(B) > 0$, and $P(B \mid A) = P(B)$ provided $P(A) > 0$. These equivalences follow directly from the definition of conditional probability, $P(A \mid B) = P(A \cap B)/P(B)$.

The concept extends to finite collections of events $\{A_1, \dots, A_n\}$. Pairwise independence requires that every pair $A_i, A_j$ (for $i \neq j$) satisfies the two-event condition, but this does not guarantee independence for larger subsets. Mutual independence, a stronger property, demands that for every nonempty subcollection $\{A_{i_1}, \dots, A_{i_k}\}$, $P\left( \bigcap_{m=1}^k A_{i_m} \right) = \prod_{m=1}^k P(A_{i_m})$. Mutual independence implies pairwise independence but not conversely; for instance, three events can be pairwise independent yet fail mutual independence if the probability of the intersection of all three deviates from the product of their probabilities.

Independence also manifests additively on a logarithmic scale: for events of positive probability, $\log P(A \cap B) = \log P(A) + \log P(B)$. This property links to information theory, where the self-information or surprise of an event $A$ is defined as $I(A) = -\log_2 P(A)$ in bits; for independent events, the total information is additive, $I(A \cap B) = I(A) + I(B)$, reflecting non-overlapping uncertainty. This additivity underpins Shannon's entropy measure for random variables.

Random Variables

Two real-valued random variables $X$ and $Y$ defined on the same probability space are independent if, for all sets $A$ and $B$ in the Borel $\sigma$-algebra on $\mathbb{R}$, the joint probability satisfies $P(X \in A, Y \in B) = P(X \in A) P(Y \in B)$. An equivalent formulation uses cumulative distribution functions (CDFs): the joint CDF $F_{X,Y}(x,y) = P(X \leq x, Y \leq y)$ factors as $F_{X,Y}(x,y) = F_X(x) F_Y(y)$ for all $x, y \in \mathbb{R}$.

For discrete random variables, independence holds if and only if the joint probability mass function (PMF) is the product of the marginal PMFs: $p_{X,Y}(x,y) = p_X(x) p_Y(y)$ for all $x, y$ in the support. Similarly, for continuous random variables with a joint probability density function (PDF), independence is equivalent to the joint PDF factoring as $f_{X,Y}(x,y) = f_X(x) f_Y(y)$ for almost all $x, y \in \mathbb{R}$, where the marginal PDFs are obtained by integrating the joint PDF over the other variable, and the joint PDF integrates to 1 over $\mathbb{R}^2$: $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dx \, dy = 1$.

The definition extends to a collection of $n$ random variables $X_1, \dots, X_n$ by requiring that the joint distribution factors into the product of marginal distributions for every finite subcollection; that is, for any $k \leq n$ and indices $i_1, \dots, i_k$, the joint distribution of $(X_{i_1}, \dots, X_{i_k})$ is the product of the marginals of each $X_{i_j}$. By the uniqueness of measures on the Borel $\sigma$-algebra, if $X_1, \dots, X_n$ are independent, their joint distribution is uniquely determined as the product measure of the marginal distributions. An equivalent condition for the independence of $X$ and $Y$ is that $\mathbb{E}[g(X) h(Y)] = \mathbb{E}[g(X)] \mathbb{E}[h(Y)]$ for all bounded continuous functions $g, h: \mathbb{R} \to \mathbb{R}$.

Random Vectors and Stochastic Processes

Independence extends naturally to random vectors, which are finite-dimensional collections of random variables. Consider two random vectors $\mathbf{X} = (X_1, \dots, X_n)$ and $\mathbf{Y} = (Y_1, \dots, Y_m)$ defined on the same probability space. These vectors are independent if the joint cumulative distribution function (CDF) of $(\mathbf{X}, \mathbf{Y})$ factors as the product of their marginal vector CDFs, that is, $F_{(\mathbf{X},\mathbf{Y})}(\mathbf{x}, \mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) F_{\mathbf{Y}}(\mathbf{y})$ for all $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. This condition ensures that the value of $\mathbf{X}$ provides no information about $\mathbf{Y}$, and vice versa, generalizing the scalar case to multivariate settings where components within each vector may themselves be dependent.

A related but weaker condition involves the covariance structure. If $\mathbf{X}$ and $\mathbf{Y}$ are independent (and have finite second moments), then the covariance matrix of the concatenated vector $(\mathbf{X}^\top, \mathbf{Y}^\top)^\top$ is block-diagonal, with off-diagonal blocks consisting of zero covariances between components of $\mathbf{X}$ and $\mathbf{Y}$. This uncorrelatedness (zero cross-covariances) is necessary for independence but not sufficient in general, as counterexamples exist where vectors are uncorrelated yet their joint distribution does not factor into marginals, for instance, certain non-Gaussian distributions where higher-order dependencies persist despite zero covariances. In the special case of jointly Gaussian vectors, however, uncorrelatedness is equivalent to independence, because a jointly Gaussian distribution is fully characterized by its mean and covariance.

For stochastic processes, independence concepts adapt to infinite collections indexed by time or another parameter. A single stochastic process $\{X_t : t \in T\}$ exhibits independence across disjoint index sets if the sigma-algebras generated by $\{X_t : t \in A\}$ and $\{X_t : t \in B\}$ are independent for any disjoint $A, B \subset T$. A prominent example of a related property is independent increments, where increments $X_t - X_s$ over non-overlapping intervals $(s, t]$ are independent random variables; this holds for standard Brownian motion, a continuous-time process with stationary, normally distributed increments that are independent over disjoint intervals. Such properties underpin the Markovian behavior and lack of memory in these processes.

Independence between two stochastic processes $\{X_t\}$ and $\{Y_t\}$ is defined via the sigma-algebras they generate: the processes are independent if the sigma-algebra $\sigma(\{X_t : t \in T\})$ is independent of $\sigma(\{Y_t : t \in T\})$, meaning that the probability of a joint event factors into the product of the probabilities of its parts from each process. This framework, rooted in measure-theoretic probability, ensures that observations from one process carry no information about the other.

Examples illustrate these notions in applied contexts. Independent Poisson processes, such as two counting processes for separate event streams (e.g., arrivals at distinct queues), have increments that are independent across the processes, and their superposition is another Poisson process whose rate is the sum of the individual rates. Similarly, a white noise sequence $\{\epsilon_t\}$ is, in its strict form, a discrete-time stochastic process where the $\epsilon_t$ are independent and identically distributed (often with mean zero and finite variance), serving as a foundational model for innovations in time series analysis. These cases highlight how independence facilitates decomposition and simulation in stochastic modeling.
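The independent increments property can be illustrated with a small simulation of a discretized Brownian path. This is a sketch under stated assumptions: the step size, path count, and seed are arbitrary; independence of the increments is built in by the construction; and the sample correlation is only a necessary check, not a proof of independence.

```python
import random
from math import sqrt

def correlation(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    var_u = sum((a - mu) ** 2 for a in u) / n
    var_v = sum((b - mv) ** 2 for b in v) / n
    return cov / sqrt(var_u * var_v)

rng = random.Random(0)           # fixed seed so the run is repeatable
steps, dt, paths = 10, 0.1, 5000
w1, w2 = [], []                  # path values at times t = 1 and t = 2
for _ in range(paths):
    w, values = 0.0, []
    for _ in range(2 * steps):
        w += rng.gauss(0.0, sqrt(dt))  # independent Gaussian step with variance dt
        values.append(w)
    w1.append(values[steps - 1])
    w2.append(values[-1])

increments = [b - a for a, b in zip(w1, w2)]  # W_2 - W_1

print(round(correlation(w1, w2), 2))          # close to sqrt(1/2), about 0.71
print(round(correlation(w1, increments), 2))  # close to 0
```

The values of the path at two times are strongly correlated because they share the increment over the first interval, while the increment over the second interval is uncorrelated with the value at the first time.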

Sigma-Algebras

In measure-theoretic probability, the notion of independence is generalized to sigma-algebras, providing a foundational framework that encompasses independence of events, random variables, and more complex structures. Two sub-sigma-algebras $\mathcal{F}$ and $\mathcal{G}$ of the sigma-algebra $\mathcal{A}$ on a probability space $(\Omega, \mathcal{A}, P)$ are independent if, for every $A \in \mathcal{F}$ and $B \in \mathcal{G}$, $P(A \cap B) = P(A) P(B)$. This definition captures the idea that events measurable with respect to $\mathcal{F}$ provide no probabilistic information about events measurable with respect to $\mathcal{G}$, and vice versa.

The concept extends naturally to families of sigma-algebras. A collection $\{\mathcal{F}_i\}_{i \in I}$ of sub-sigma-algebras is mutually independent if, for every finite $J \subseteq I$ and every choice of sets $A_j \in \mathcal{F}_j$ for $j \in J$, $P\left( \bigcap_{j \in J} A_j \right) = \prod_{j \in J} P(A_j)$.