Mahalanobis distance

The Mahalanobis distance is a measure of the distance between a point $P$ and a probability distribution $D$, introduced by P. C. Mahalanobis in 1936. The mathematical details first appeared in the Journal of The Asiatic Society of Bengal in 1936. Mahalanobis's definition was prompted by the problem of identifying the similarities of skulls based on measurements (the earliest work related to similarities of skulls is from 1922, and a later work is from 1927). R. C. Bose later obtained the sampling distribution of the Mahalanobis distance, under the assumption of equal dispersion.

It is a multivariate generalization of the absolute value of the standard score $z=(x-\mu )/\sigma$: it measures how many standard deviations away $P$ is from the mean of $D$, and its square generalizes $z^{2}$. The distance is zero for $P$ at the mean of $D$ and grows as $P$ moves away from the mean along each principal component axis. If the data are transformed so that these axes have unit variance (a whitening transformation, after which the coordinates are uncorrelated), then the Mahalanobis distance corresponds to standard Euclidean distance in the transformed space. The Mahalanobis distance is thus unitless, scale-invariant, and takes into account the correlations of the data set.

Given a probability distribution $Q$ on $\mathbb {R} ^{N}$, with mean ${\vec {\mu }}=(\mu _{1},\mu _{2},\mu _{3},\dots ,\mu _{N})^{\mathsf {T}}$ and positive definite covariance matrix $\mathbf {\Sigma }$ (so that $\mathbf {\Sigma } ^{-1}$ exists), the Mahalanobis distance of a point ${\vec {x}}=(x_{1},x_{2},x_{3},\dots ,x_{N})^{\mathsf {T}}$ from $Q$ is $d_{M}({\vec {x}},Q)={\sqrt {({\vec {x}}-{\vec {\mu }})^{\mathsf {T}}\mathbf {\Sigma } ^{-1}({\vec {x}}-{\vec {\mu }})}}.$ Given two points ${\vec {x}}$ and ${\vec {y}}$ in $\mathbb {R} ^{N}$, the Mahalanobis distance between them with respect to $Q$ is $d_{M}({\vec {x}},{\vec {y}};Q)={\sqrt {({\vec {x}}-{\vec {y}})^{\mathsf {T}}\mathbf {\Sigma } ^{-1}({\vec {x}}-{\vec {y}})}}.$ In particular, $d_{M}({\vec {x}},Q)=d_{M}({\vec {x}},{\vec {\mu }};Q)$.

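Provided $\mathbf {\Sigma }$ is positive definite (which is what guarantees that its inverse exists), $\mathbf {\Sigma } ^{-1}$ is positive definite as well. The quantities under the square roots are therefore non-negative, and the square roots are always defined.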
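The definition can be evaluated directly. The following is a minimal sketch using NumPy; the function name `mahalanobis` and the numeric values of `mu`, `sigma`, `x`, and `y` are illustrative assumptions, not taken from any particular data set. It solves a linear system instead of forming $\mathbf {\Sigma } ^{-1}$ explicitly, which is one common numerical choice.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Mahalanobis distance between x and y with respect to covariance `cov`."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov @ z = diff rather than computing the explicit inverse of cov.
    z = np.linalg.solve(cov, diff)
    return float(np.sqrt(diff @ z))

# Made-up mean and covariance of a 2-D distribution Q (variances 4 and 1).
mu = np.array([1.0, 2.0])
sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])

x = np.array([3.0, 2.0])  # 2 units from mu along the first axis (std dev 2)
y = np.array([1.0, 4.0])  # 2 units from mu along the second axis (std dev 1)

d_x = mahalanobis(x, mu, sigma)  # one standard deviation from the mean
d_y = mahalanobis(y, mu, sigma)  # two standard deviations from the mean
```

Both points are at Euclidean distance 2 from the mean, yet their Mahalanobis distances differ (1 and 2) because the two axes have different variances.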

The squared Mahalanobis distance admits useful decompositions that help explain why a multivariate observation is outlying and that provide a graphical tool for identifying outliers.
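One such decomposition, shown here only as an illustration since the text does not specify which decompositions it has in mind, splits the squared distance into contributions from the principal component axes: each axis contributes the squared coordinate along that axis divided by the corresponding eigenvalue of $\mathbf {\Sigma }$. The sketch below assumes NumPy and uses made-up values for `mu`, `sigma`, and `x`.

```python
import numpy as np

# Made-up 2-D example with positively correlated coordinates.
mu = np.array([0.0, 0.0])
sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
x = np.array([1.0, -1.0])

# Eigendecomposition of the symmetric covariance: sigma = V @ diag(lam) @ V.T,
# with eigenvalues returned in ascending order.
lam, V = np.linalg.eigh(sigma)

# Coordinates of (x - mu) along each principal axis.
scores = V.T @ (x - mu)

# Each axis contributes score_k**2 / lam_k to the squared distance.
contributions = scores**2 / lam
d_squared = contributions.sum()

# Cross-check against the quadratic form from the definition.
d_squared_direct = (x - mu) @ np.linalg.solve(sigma, x - mu)
```

In this example the point lies entirely along the low-variance axis (eigenvalue 1), so that axis accounts for the whole squared distance of 2. Inspecting `contributions` in this way indicates which directions make an observation outlying.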

By the spectral theorem, $\mathbf {\Sigma }$ can be decomposed as $\mathbf {\Sigma } =\mathbf {S} \mathbf {S} ^{\mathsf {T}}$ for some real invertible $N\times N$ matrix $\mathbf {S}$. One choice for $\mathbf {S}$ is the symmetric square root of $\mathbf {\Sigma }$, which plays the role of a standard deviation matrix. Since $\mathbf {\Sigma } ^{-1}=(\mathbf {S} ^{-1})^{\mathsf {T}}\mathbf {S} ^{-1}$, this gives the equivalent definition $d_{M}({\vec {x}},{\vec {y}};Q)=\|\mathbf {S} ^{-1}({\vec {x}}-{\vec {y}})\|$ where $\|\cdot \|$ is the Euclidean norm. That is, the Mahalanobis distance is the Euclidean distance after a whitening transformation.

The existence of $\mathbf {S}$ is guaranteed by the spectral theorem, but it is not unique. Different choices have different theoretical and practical advantages.
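The non-uniqueness can be checked numerically. The sketch below, assuming NumPy and made-up values for `sigma`, `x`, and `y`, compares two factors that each satisfy $\mathbf {\Sigma } =\mathbf {S} \mathbf {S} ^{\mathsf {T}}$: the lower-triangular Cholesky factor and the symmetric square root. Both whiten the difference vector to the same Euclidean length, which matches the quadratic-form definition.

```python
import numpy as np

# Made-up positive definite covariance and two points.
sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
x = np.array([1.0, 2.0])
y = np.array([-1.0, 0.5])
diff = x - y

# Reference value straight from the quadratic form.
d_ref = np.sqrt(diff @ np.linalg.solve(sigma, diff))

# Choice 1: lower-triangular Cholesky factor, sigma = L @ L.T.
L = np.linalg.cholesky(sigma)
d_chol = np.linalg.norm(np.linalg.solve(L, diff))

# Choice 2: symmetric square root built from the eigendecomposition,
# so that S_sym @ S_sym.T = S_sym @ S_sym = sigma.
lam, V = np.linalg.eigh(sigma)
S_sym = V @ np.diag(np.sqrt(lam)) @ V.T
d_sym = np.linalg.norm(np.linalg.solve(S_sym, diff))
```

The Cholesky factor is cheap to compute and leads to triangular solves, while the symmetric square root yields a whitening map that is itself symmetric. Which one is preferable depends on the application.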

In practice, the distribution $Q$ is usually the empirical distribution of a set of IID samples drawn from an unknown underlying distribution, so ${\vec {\mu }}$ is the sample mean and $\mathbf {\Sigma }$ is the sample covariance matrix.
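A minimal sketch of this sample-based use, assuming NumPy: the data are synthetic (random draws with a fixed seed), and the sizes `n` and `N`, the mixing matrix `A`, and all variable names are illustrative assumptions. The covariance is estimated with the unbiased estimator, which divides by `n - 1`.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, only for reproducibility
n, N = 500, 3                         # n samples in N dimensions (made-up sizes)
A = rng.normal(size=(N, N))           # arbitrary mixing matrix to induce correlation
X = rng.normal(size=(n, N)) @ A.T     # rows are IID samples

mu_hat = X.mean(axis=0)               # sample mean
sigma_hat = np.cov(X, rowvar=False)   # unbiased sample covariance (divides by n - 1)

# Squared Mahalanobis distance of every sample from the sample mean.
centered = X - mu_hat
whitened_part = np.linalg.solve(sigma_hat, centered.T).T  # rows: sigma_hat^{-1} (x_i - mu_hat)
d2 = np.einsum("ij,ij->i", centered, whitened_part)       # row-wise dot products
d = np.sqrt(d2)
```

A useful sanity check for code of this kind: when the same samples supply both the mean and the unbiased covariance, the squared distances sum to exactly `(n - 1) * N` up to rounding, because the sum equals the trace of $\mathbf {\Sigma } ^{-1}$ times `(n - 1)` copies of $\mathbf {\Sigma }$.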
