Robust measures of scale
In statistics, robust measures of scale are methods which quantify the statistical dispersion in a sample of numerical data while resisting outliers. These are contrasted with conventional or non-robust measures of scale, such as sample standard deviation, which are greatly influenced by outliers.
The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD). Alternative robust estimators have also been developed, such as those based on pairwise differences and the biweight midvariance.
These robust statistics are particularly used as estimators of a scale parameter, and have the advantages of both robustness and superior efficiency on contaminated data, at the cost of inferior efficiency on clean data from distributions such as the normal distribution. To illustrate robustness, the standard deviation can be made arbitrarily large by increasing exactly one observation (it has a breakdown point of 0, as it can be contaminated by a single point), a defect that is not shared by robust statistics.
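The breakdown behaviour described above can be illustrated with a small sketch. The data values and the helper names `mad` and `iqr` are hypothetical, chosen only for illustration, and the quartiles follow the default convention of Python's `statistics.quantiles`, which is one of several in use.

```python
import statistics

def mad(xs):
    """Unscaled median absolute deviation from the median."""
    m = statistics.median(xs)
    return statistics.median([abs(x - m) for x in xs])

def iqr(xs):
    """Interquartile range using the default 'exclusive' quantile method."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return q3 - q1

# Hypothetical sample, and a copy in which only the largest value is inflated.
clean = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0]
contaminated = clean[:-1] + [8000.0]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name, statistics.stdev(data), mad(data), iqr(data))
```

In this sample, a single inflated observation changes the standard deviation by orders of magnitude, while the MAD and the IQR are left unchanged because neither the median nor the quartiles move.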
Note that, in domains such as finance, the assumption of normality may lead to excessive risk exposure, and that further parameterization may be needed to mitigate risks presented by abnormal kurtosis.
Robust measures of scale can be used as estimators of properties of the population, either for parameter estimation or as estimators of their own expected value.
For example, robust estimators of scale are used to estimate the population standard deviation, generally by multiplying by a scale factor to make it an unbiased consistent estimator; see scale parameter: estimation. For example, the interquartile range may be rendered an unbiased, consistent estimator for the population standard deviation if the data follow a normal distribution and the measure is divided by $2\sqrt{2}\,\operatorname{erf}^{-1}(1/2) \approx 1.349$, where $\operatorname{erf}^{-1}$ is the inverse error function.
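As a rough check of the normal-distribution scale factor for the IQR, $2\sqrt{2}\,\operatorname{erf}^{-1}(1/2)$, the sketch below computes it through the equivalent expression $2\,\Phi^{-1}(0.75)$, since the Python standard library has no inverse error function. The sample size, seed, mean, and standard deviation are arbitrary assumptions, and `IQR_TO_SIGMA` and `iqr` are illustrative names.

```python
import math
import random
import statistics

# 2*sqrt(2)*erfinv(1/2) equals 2*Phi^{-1}(0.75) for the standard normal CDF Phi.
IQR_TO_SIGMA = 2.0 * statistics.NormalDist().inv_cdf(0.75)

def iqr(xs):
    """Interquartile range using the default 'exclusive' quantile method."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return q3 - q1

# Simulated normal data with an assumed true standard deviation of 3.
rng = random.Random(0)
true_sigma = 3.0
sample = [rng.gauss(10.0, true_sigma) for _ in range(20000)]

sigma_hat = iqr(sample) / IQR_TO_SIGMA
print("scale factor:", IQR_TO_SIGMA)
print("IQR-based estimate of sigma:", sigma_hat)
```

The rescaling is only appropriate under the normality assumption; for other distributions the ratio between the IQR and the standard deviation differs.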
In other situations, it makes more sense to think of a robust measure of scale as an estimator of its own expected value, interpreted as an alternative to the population standard deviation as a measure of scale. For example, the median absolute deviation (MAD) of a sample from a standard Cauchy distribution is an estimator of the population MAD, which in this case is 1, whereas the population variance does not exist.
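The Cauchy example can be explored numerically. The following sketch draws standard Cauchy variates by inverse-transform sampling and computes the sample MAD, which should settle near the population value of 1. The seed and sample size are arbitrary assumptions, and `mad` is an illustrative helper.

```python
import math
import random
import statistics

def mad(xs):
    """Unscaled median absolute deviation from the median."""
    m = statistics.median(xs)
    return statistics.median([abs(x - m) for x in xs])

# Inverse-transform sampling: tan(pi * (U - 1/2)) is standard Cauchy for U ~ Uniform(0, 1).
rng = random.Random(1)
sample = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(20001)]

sample_mad = mad(sample)
print("sample MAD:", sample_mad)
# The sample variance is still computable, but it does not converge as n grows.
print("sample variance:", statistics.variance(sample))
```

Rerunning with different seeds or sample sizes should leave the sample MAD close to 1, while the sample variance fluctuates wildly because it has no population counterpart to converge to.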
Robust estimators typically have inferior statistical efficiency compared to conventional estimators for data drawn from a distribution without outliers, such as a normal distribution. However, they have superior efficiency for data drawn from a mixture distribution or from a heavy-tailed distribution, for which non-robust measures such as the standard deviation should not be used.
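The efficiency trade-off can be sketched with a small Monte Carlo experiment. It compares the standard deviation and the MAD through their standardized variance (variance divided by squared mean across replications), which is one common way to compare scale estimators that target different quantities. The contamination model (10% of points drawn with ten times the scale), sample size, replication count, and all function names are assumptions made for illustration.

```python
import random
import statistics

def mad(xs):
    """Unscaled median absolute deviation from the median."""
    m = statistics.median(xs)
    return statistics.median([abs(x - m) for x in xs])

def squared_cv(values):
    """Standardized variance: variance divided by squared mean."""
    return statistics.variance(values) / statistics.mean(values) ** 2

def simulate(draw, n=50, reps=500, seed=0):
    """Return the standardized variance of the SD and of the MAD over many samples."""
    rng = random.Random(seed)
    sds, mads = [], []
    for _ in range(reps):
        xs = [draw(rng) for _ in range(n)]
        sds.append(statistics.stdev(xs))
        mads.append(mad(xs))
    return squared_cv(sds), squared_cv(mads)

def normal(rng):
    """Clean data: standard normal."""
    return rng.gauss(0.0, 1.0)

def contaminated(rng):
    """Mixture: 90% N(0, 1) and 10% N(0, 10^2)."""
    scale = 10.0 if rng.random() < 0.1 else 1.0
    return rng.gauss(0.0, scale)

cv2_sd_clean, cv2_mad_clean = simulate(normal)
cv2_sd_mix, cv2_mad_mix = simulate(contaminated)
print("clean data:  SD", cv2_sd_clean, " MAD", cv2_mad_clean)
print("mixture:     SD", cv2_sd_mix, " MAD", cv2_mad_mix)
```

A smaller standardized variance indicates a more efficient estimator. Under these assumptions the standard deviation should come out ahead on the clean normal data and the MAD on the mixture.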