Mutual information

In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected "amount of information" held in a random variable.

Unlike the correlation coefficient, which is limited to real-valued random variables and linear dependence, MI is more general: it measures how different the joint distribution of the pair $(X, Y)$ is from the product of the marginal distributions of $X$ and $Y$. MI is the expected value of the pointwise mutual information (PMI).

The quantity was defined and analyzed by Claude Shannon in his landmark paper "A Mathematical Theory of Communication", although he did not call it "mutual information". The term was coined later by Robert Fano. Mutual information is also known as information gain.

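Let $(X, Y)$ be a pair of random variables with values over the space $\mathcal{X} \times \mathcal{Y}$. If their joint distribution is $P_{(X,Y)}$ and the marginal distributions are $P_X$ and $P_Y$, the mutual information is defined as

$$I(X;Y) = D_{\mathrm{KL}}\left(P_{(X,Y)} \,\|\, P_X \otimes P_Y\right)$$

where $D_{\mathrm{KL}}$ is the Kullback–Leibler divergence, and $P_X \otimes P_Y$ is the outer product distribution which assigns probability $P_X(x) \cdot P_Y(y)$ to each $(x, y)$.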
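For discrete variables, the Kullback–Leibler divergence in this definition reduces to a sum over all pairs $(x, y)$ of $P_{(X,Y)}(x,y) \log \frac{P_{(X,Y)}(x,y)}{P_X(x) P_Y(y)}$. The following is a minimal sketch of that computation, assuming the joint distribution is given as a nested list of probabilities. The function name `mutual_information`, the `base` parameter, and the example table are illustrative choices, not part of the definition.

```python
import math

def mutual_information(joint, base=2.0):
    """Mutual information of a discrete joint distribution.

    Assumes joint[i][j] holds P(X = x_i, Y = y_j), with non-negative
    entries that sum to 1. The unit of the result is set by `base`.
    """
    p_x = [sum(row) for row in joint]          # marginal of X (row sums)
    p_y = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:                     # 0 * log(0) is taken as 0
                total += p_xy * math.log(p_xy / (p_x[i] * p_y[j]), base)
    return total

# Hypothetical example: two binary variables that agree 80% of the time.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
mi_bits = mutual_information(joint)  # roughly 0.278 bits
```

Each term of the sum is the pointwise mutual information of one pair $(x, y)$ weighted by its probability, which is why MI can be read as the expected PMI.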

Expressed in terms of the entropy $H(\cdot)$ and the conditional entropy $H(\cdot \mid \cdot)$ of the random variables $X$ and $Y$, one also has (see relation to conditional and joint entropy):

$$I(X;Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X)$$
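The entropy form can be checked numerically: the entropy of one variable minus its conditional entropy given the other should come out the same whichever variable is taken first. The sketch below assumes a small discrete joint table; the helper names `entropy`, `conditional_entropy`, and `transpose`, and the example probabilities, are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0.0)

def conditional_entropy(joint):
    """H(X|Y) in bits, assuming joint[i][j] = P(X = x_i, Y = y_j)."""
    p_y = [sum(col) for col in zip(*joint)]    # marginal of Y
    h = 0.0
    for row in joint:
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:
                h -= p_xy * math.log2(p_xy / p_y[j])
    return h

def transpose(joint):
    """Swap the roles of X and Y in the joint table."""
    return [list(col) for col in zip(*joint)]

# Hypothetical joint distribution of two binary variables.
joint = [[0.3, 0.2],
         [0.1, 0.4]]
p_x = [sum(row) for row in joint]
p_y = [sum(col) for col in zip(*joint)]

mi_from_x = entropy(p_x) - conditional_entropy(joint)             # H(X) - H(X|Y)
mi_from_y = entropy(p_y) - conditional_entropy(transpose(joint))  # H(Y) - H(Y|X)
```

The two values agree up to floating-point rounding, reflecting the symmetry of mutual information in its two arguments.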

Notice, as a property of the Kullback–Leibler divergence, that $I(X;Y)$ is equal to zero precisely when the joint distribution coincides with the product of the marginals, i.e. when $X$ and $Y$ are independent (and hence observing $Y$ tells you nothing about $X$). $I(X;Y)$ is non-negative; it is a measure of the price for encoding $(X, Y)$ as a pair of independent random variables when in reality they are not.
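The two extremes can be illustrated with a short sketch: a joint table built as the product of its marginals should give zero mutual information, while two binary variables that always take the same value should give one full bit. The function name `mutual_information_bits` and both example tables are assumptions made for illustration.

```python
import math

def mutual_information_bits(joint):
    """Mutual information in bits, assuming joint[i][j] = P(X = x_i, Y = y_j)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:                     # skip zero-probability cells
                total += p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
    return total

# Independent case: the joint table is the outer product of the marginals.
marginal_x = [0.3, 0.7]
marginal_y = [0.6, 0.4]
independent = [[px * py for py in marginal_y] for px in marginal_x]

# Fully dependent case: Y always equals X, and X is a fair coin.
dependent = [[0.5, 0.0],
             [0.0, 0.5]]

mi_independent = mutual_information_bits(independent)  # zero up to rounding
mi_dependent = mutual_information_bits(dependent)      # 1 bit
```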

If the natural logarithm is used, the unit of mutual information is the nat. If the log base 2 is used, the unit of mutual information is the shannon, also known as the bit. If the log base 10 is used, the unit of mutual information is the hartley, also known as the ban or the dit.
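Because the units differ only in the base of the logarithm, a value in one unit converts to another by a constant factor: dividing a value in nats by $\ln 2$ gives shannons, and dividing by $\ln 10$ gives hartleys. A minimal sketch, assuming a small discrete joint table (the function names and example values are hypothetical):

```python
import math

def mutual_information_nats(joint):
    """Mutual information in nats, assuming joint[i][j] = P(X = x_i, Y = y_j)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:
                total += p_xy * math.log(p_xy / (p_x[i] * p_y[j]))  # natural log
    return total

def nats_to_shannons(value):
    """Convert nats to shannons (bits)."""
    return value / math.log(2)

def nats_to_hartleys(value):
    """Convert nats to hartleys (bans, dits)."""
    return value / math.log(10)

# Hypothetical joint distribution of two binary variables.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
mi_nats = mutual_information_nats(joint)
mi_shannons = nats_to_shannons(mi_nats)   # roughly 0.278 shannons
mi_hartleys = nats_to_hartleys(mi_nats)
```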
