Uncertainty coefficient
In statistics, the uncertainty coefficient, also called proficiency, entropy coefficient or Theil's U, is a measure of nominal association. It was first introduced by Henri Theil[citation needed] and is based on the concept of information entropy.
Suppose we have samples of two discrete random variables, X and Y. From the joint distribution, P_{X,Y}(x, y), we can calculate the conditional distributions, P_{X|Y}(x|y) = P_{X,Y}(x, y)/P_Y(y) and P_{Y|X}(y|x) = P_{X,Y}(x, y)/P_X(x); by calculating the various entropies of these distributions, we can determine the degree of association between the two variables.
The entropy of a single distribution is given as:

H(X) = -\sum_x P_X(x) \log P_X(x),
while the conditional entropy is given as:

H(X|Y) = -\sum_{x,y} P_{X,Y}(x, y) \log P_{X|Y}(x|y).
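As a concrete illustration (not part of the original article), the following Python sketch estimates H(X) and H(X|Y) from paired samples via the empirical plug-in distributions; the function name entropies and the variable names are hypothetical.

```python
import math
from collections import Counter

def entropies(xs, ys):
    """Estimate H(X) and H(X|Y) from paired samples of two discrete variables."""
    n = len(xs)
    # Empirical joint and marginal distributions.
    p_xy = {xy: c / n for xy, c in Counter(zip(xs, ys)).items()}  # P_{X,Y}(x, y)
    p_x = {x: c / n for x, c in Counter(xs).items()}              # P_X(x)
    p_y = {y: c / n for y, c in Counter(ys).items()}              # P_Y(y)

    # H(X) = -sum_x P_X(x) log P_X(x)
    h_x = -sum(p * math.log2(p) for p in p_x.values())

    # H(X|Y) = -sum_{x,y} P_{X,Y}(x, y) log P_{X|Y}(x|y),
    # using P_{X|Y}(x|y) = P_{X,Y}(x, y) / P_Y(y).
    h_x_given_y = -sum(p * math.log2(p / p_y[y]) for (x, y), p in p_xy.items())
    return h_x, h_x_given_y
```

Only observed outcomes appear in the empirical distributions, so every probability in the sums is strictly positive and the logarithms are well defined.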
The uncertainty coefficient or proficiency is defined as:

U(X|Y) = \frac{H(X) - H(X|Y)}{H(X)} = \frac{I(X;Y)}{H(X)},
and tells us: given Y, what fraction of the bits of X can we predict? In this case we can think of X as containing the total information, and of Y as allowing one to predict part of that information.
The above expression makes clear that the uncertainty coefficient is a normalised mutual information I(X;Y). In particular, the uncertainty coefficient ranges in [0, 1], since 0 ≤ I(X;Y) ≤ H(X).
Note that the value of U (but not H!) is independent of the base of the logarithm, since logarithms to different bases differ only by a constant factor, which cancels in the ratio.
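Putting the pieces together, here is a minimal sketch (again illustrative, assuming the same empirical estimation as above; the helper uncertainty_coefficient is hypothetical) that computes Theil's U and checks this base independence numerically.

```python
import math
from collections import Counter

def uncertainty_coefficient(xs, ys, log=math.log):
    """Theil's U(X|Y) = (H(X) - H(X|Y)) / H(X); `log` chooses the base."""
    n = len(xs)
    p_xy = {xy: c / n for xy, c in Counter(zip(xs, ys)).items()}
    p_x = {x: c / n for x, c in Counter(xs).items()}
    p_y = {y: c / n for y, c in Counter(ys).items()}
    h_x = -sum(p * log(p) for p in p_x.values())                           # H(X)
    h_x_given_y = -sum(p * log(p / p_y[y]) for (x, y), p in p_xy.items())  # H(X|Y)
    return (h_x - h_x_given_y) / h_x

# The base cancels in the ratio, so bits (log2) and nats (log) agree:
xs = ["a", "a", "b", "b", "b", "c"]
ys = [0, 0, 1, 1, 0, 1]
assert abs(uncertainty_coefficient(xs, ys, math.log2)
           - uncertainty_coefficient(xs, ys, math.log)) < 1e-12
```

Passing the log function as a parameter makes the invariance explicit: every entropy is scaled by the same constant when the base changes, so the ratio defining U is unaffected.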