Robust measures of scale
from Wikipedia

In statistics, robust measures of scale are methods which quantify the statistical dispersion in a sample of numerical data while resisting outliers. These are contrasted with conventional or non-robust measures of scale, such as sample standard deviation, which are greatly influenced by outliers.

The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD). Alternative robust estimators have also been developed, such as those based on pairwise differences and the biweight midvariance.

These robust statistics are particularly used as estimators of a scale parameter, and have the advantages of both robustness and superior efficiency on contaminated data, at the cost of inferior efficiency on clean data from distributions such as the normal distribution. To illustrate robustness, the standard deviation can be made arbitrarily large by increasing exactly one observation (it has a breakdown point of 0, as it can be contaminated by a single point), a defect that is not shared by robust statistics.

In domains such as finance, the assumption of normality may lead to excessive risk exposure, and further parameterization may be needed to mitigate risks presented by abnormal kurtosis.

Approaches to estimation


Robust measures of scale can be used as estimators of properties of the population, either for parameter estimation or as estimators of their own expected value.

For example, robust estimators of scale are used to estimate the population standard deviation, generally by multiplying by a scale factor to make them unbiased, consistent estimators; see scale parameter: estimation. For instance, the interquartile range may be rendered an unbiased, consistent estimator for the population standard deviation, if the data follow a normal distribution, by dividing it by $2\sqrt{2}\,\operatorname{erf}^{-1}(1/2) \approx 1.349$, where $\operatorname{erf}^{-1}$ is the inverse error function.
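A one-line numerical check of that constant (using scipy.special.erfinv, assuming SciPy is available):

```python
import numpy as np
from scipy.special import erfinv

# 2 * sqrt(2) * erfinv(1/2) equals the normal quantile spacing 2 * Phi^{-1}(0.75)
print(2 * np.sqrt(2) * erfinv(0.5))   # ~1.349, the normal-consistency divisor for the IQR
```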

In other situations, it makes more sense to think of a robust measure of scale as an estimator of its own expected value, interpreted as an alternative to the population standard deviation as a measure of scale. For example, the median absolute deviation (MAD) of a sample from a standard Cauchy distribution is an estimator of the population MAD, which in this case is 1, whereas the population variance does not exist.

Statistical efficiency


Robust estimators typically have inferior statistical efficiency compared to conventional estimators for data drawn from a distribution without outliers, such as a normal distribution. However, they have superior efficiency for data drawn from a mixture distribution or from a heavy-tailed distribution, for which non-robust measures such as the standard deviation should not be used.

For example, for data drawn from the normal distribution, the median absolute deviation is 37% as efficient as the sample standard deviation, while the Rousseeuw–Croux estimator Qn is 88% as efficient as the sample standard deviation.

Common robust estimators


One of the most common robust measures of scale is the interquartile range (IQR), the difference between the 75th percentile and the 25th percentile of a sample; this is the 25% trimmed range, an example of an L-estimator. Other trimmed ranges, such as the interdecile range (the 10% trimmed range), can also be used.

For a Gaussian distribution, the IQR is related to $\sigma$, the standard deviation, as:[1]

$$\mathrm{IQR} = 2\,\Phi^{-1}(0.75)\,\sigma \approx 1.349\,\sigma.$$

Another commonly used robust measure of scale is the median absolute deviation (MAD), the median of the absolute values of the differences between the data values and the overall median of the data set. For a Gaussian distribution, the MAD is related to $\sigma$ as:[2]

$$\mathrm{MAD} = \Phi^{-1}(0.75)\,\sigma \approx 0.6745\,\sigma, \qquad \text{so } \sigma \approx 1.4826\,\mathrm{MAD}.$$

For details, see the section on the relation to the standard deviation in the main article on MAD.
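A minimal numpy sketch comparing the classical and robust estimates of $\sigma$ on a contaminated normal sample (the contamination here is illustrative; constants are those given above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=1000)   # "clean" normal data with sigma = 2
x[:10] = 1e6                                    # contaminate 1% of the points

q25, q75 = np.percentile(x, [25, 75])
sigma_iqr = (q75 - q25) / 1.349                           # IQR rescaled for normal consistency
sigma_mad = 1.4826 * np.median(np.abs(x - np.median(x)))  # MAD rescaled likewise
sigma_sd  = np.std(x, ddof=1)                             # classical sample standard deviation

print(sigma_sd, sigma_iqr, sigma_mad)  # the sd blows up; the robust estimates stay near 2
```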

Sn and Qn


Rousseeuw and Croux[3] proposed two alternatives to the median absolute deviation, motivated by two of its weaknesses:

  1. It is inefficient (37% efficiency) for Gaussian distributions.
  2. It computes a symmetric statistic about a location estimate and thus does not deal with skewness.

They propose two alternative statistics based on pairwise differences: Sn and Qn.

Sn is defined as:

$$S_n = 1.1926\,\operatorname{med}_i \left( \operatorname{med}_j\, |x_i - x_j| \right),$$

and Qn is defined as:[4]

$$Q_n = 2.2219\, \bigl\{ |x_i - x_j| : i < j \bigr\}_{(k)},$$

where:

  • the factor 2.2219 is a consistency constant,
  • the set $\{ |x_i - x_j| : i < j \}$ consists of all pairwise absolute differences between the observations $x_i$ and $x_j$, and
  • the subscript $(k)$ denotes the $k$-th order statistic of that set, with $k = \binom{h}{2}$ and $h = \lfloor n/2 \rfloor + 1$, i.e. approximately the first quartile of the $\binom{n}{2}$ pairwise differences.

These can be computed in O(n log n) time and O(n) space.

Neither of these requires location estimation, as they are based only on differences between values. They are both more efficient than the MAD under a Gaussian distribution: Sn is 58% efficient, while Qn is 82% efficient.

For a sample from a normal distribution, Sn is approximately unbiased for the population standard deviation even down to very modest sample sizes (<1% bias for n = 10).

For a large sample from a normal distribution, 2.22Qn is approximately unbiased for the population standard deviation. For small or moderate samples, the expected value of Qn under a normal distribution depends markedly on the sample size, so finite-sample correction factors (obtained from a table or from simulations) are used to calibrate the scale of Qn.
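As a concrete illustration, here is a direct O(n^2) sketch of both definitions (plain medians, no finite-sample correction factors, so only a rough check for small samples; the efficient O(n log n) algorithms of Croux and Rousseeuw are not reproduced here):

```python
import math
import numpy as np

def s_n(x):
    """Naive Sn: 1.1926 * med_i ( med_j |x_i - x_j| ).
    Simplified: plain medians including the zero self-difference; the original
    uses low/high medians plus finite-sample factors."""
    x = np.asarray(x, dtype=float)
    inner = [np.median(np.abs(xi - x)) for xi in x]   # inner median over j, for each i
    return 1.1926 * np.median(inner)

def q_n(x):
    """Naive Qn: 2.2219 times the k-th order statistic of the pairwise |x_i - x_j|, i < j."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    diffs = np.abs(x[:, None] - x[None, :])[np.triu_indices(n, k=1)]
    h = n // 2 + 1
    k = math.comb(h, 2)                      # k = C(h, 2), roughly 25% of the C(n, 2) pairs
    return 2.2219 * np.sort(diffs)[k - 1]    # k-th smallest (1-based)

rng = np.random.default_rng(1)
x = rng.normal(size=200)
print(s_n(x), q_n(x))                        # both should be close to 1 for N(0, 1) data
```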

The biweight midvariance


Like Sn and Qn, the biweight midvariance is intended to be robust without sacrificing too much efficiency. It is defined as:[5]

$$n\,\frac{\sum_{i=1}^n I(|u_i| < 1)\,(x_i - Q)^2\,(1 - u_i^2)^4}{\left( \sum_{i=1}^n I(|u_i| < 1)\,(1 - u_i^2)\,(1 - 5 u_i^2) \right)^2}$$

where I is the indicator function, Q is the sample median of the $X_i$, and

$$u_i = \frac{x_i - Q}{9 \cdot \mathrm{MAD}}.$$

Its square root is a robust estimator of scale, since data points are downweighted as their distance from the median increases, with points more than 9 MAD units from the median having no influence at all.
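The definition translates directly into code; the following numpy sketch is a minimal implementation of the formula above (the unscaled MAD in the denominator of $u_i$, as written), with the square root taken as the scale estimate:

```python
import numpy as np

def biweight_midvariance(x):
    """Biweight midvariance per the formula above; its square root is a robust scale."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    q = np.median(x)
    mad = np.median(np.abs(x - q))            # unscaled MAD
    u = (x - q) / (9.0 * mad)
    mask = np.abs(u) < 1                      # points beyond 9 MAD units get zero weight
    num = np.sum((x[mask] - q) ** 2 * (1 - u[mask] ** 2) ** 4)
    den = np.sum((1 - u[mask] ** 2) * (1 - 5 * u[mask] ** 2)) ** 2
    return n * num / den

rng = np.random.default_rng(2)
x = rng.normal(scale=3.0, size=500)
print(np.sqrt(biweight_midvariance(x)))       # roughly 3 for N(0, 3^2) data
```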

The biweight's efficiency has been estimated at around 84.7% for sets of 20 samples drawn from synthetically generated distributions with added excess kurtosis ("stretched tails"). For Gaussian distributions, its efficiency has been estimated at 98.2%.[6]

Location-scale depth


Mizera and Müller extended the approach offered by Rousseeuw and Hubert by proposing a robust depth-based estimator for location and scale simultaneously, called location-scale depth; its formal definition involves quantities that depend on a fixed reference density.[7]

They suggest that the most tractable version of location-scale depth is the one based on Student's t-distribution.

Confidence intervals


A robust confidence interval is a robust modification of confidence intervals, meaning that one modifies the non-robust calculations of the confidence interval so that they are not badly affected by outlying or aberrant observations in a data-set.

Example


In the process of making 1,000 weighings, under practical conditions, it is easy to believe that the operator might make a mistake in procedure and so report an incorrect mass (thereby making one type of systematic error). Suppose there were 100 objects and the operator weighed them all, one at a time, and repeated the whole process ten times. Then the operator can calculate a sample standard deviation for each object, and look for outliers. Any object with an unusually large standard deviation probably has an outlier in its data. These can be removed by various non-parametric techniques. If the operator repeated the process only three times, simply taking the median of the three measurements and using σ would give a confidence interval. The 200 extra weighings served only to detect and correct for operator error and did nothing to improve the confidence interval. With more repetitions, one could use a truncated mean, discarding the largest and smallest values and averaging the rest. A bootstrap calculation could be used to determine a confidence interval narrower than that calculated from σ, and so obtain some benefit from a large amount of extra work.

These procedures are robust against procedural errors which are not modeled by the assumption that the balance has a fixed known standard deviation σ. In practical applications where the occasional operator error can occur, or the balance can malfunction, the assumptions behind simple statistical calculations cannot be taken for granted. Before trusting the results of 100 objects weighed just three times each to have confidence intervals calculated from σ, it is necessary to test for and remove a reasonable number of outliers (testing the assumption that the operator is careful and correcting for the fact that he is not perfect), and to test the assumption that the data really have a normal distribution with standard deviation σ.

Computer simulation


The theoretical analysis of such an experiment is complicated, but it is easy to set up a spreadsheet which draws random numbers from a normal distribution with standard deviation σ to simulate the situation; this can be done in Microsoft Excel using =NORMINV(RAND(),0,σ), as discussed in [8], and the same techniques can be used in other spreadsheet programs such as OpenOffice.org Calc and Gnumeric.

After removing obvious outliers, one could subtract the median from the other two values for each object, and examine the distribution of the 200 resulting numbers. It should be normal with mean near zero and standard deviation a little larger than σ. A simple Monte Carlo spreadsheet calculation would reveal typical values for the standard deviation (around 105 to 115% of σ). Or, one could subtract the mean of each triplet from the values, and examine the distribution of 300 values. The mean is identically zero, but the standard deviation should be somewhat smaller (around 75 to 85% of σ).
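For readers who prefer a script to a spreadsheet, here is an equivalent numpy sketch of the same Monte Carlo check (σ and the number of objects are arbitrary choices; the printed ratios should fall roughly in the ranges quoted above):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, n_objects = 1.0, 100
w = rng.normal(0.0, sigma, size=(n_objects, 3))       # three weighings per object

# Subtract each object's median from its other two measurements: 200 numbers.
med = np.median(w, axis=1, keepdims=True)
others = np.sort(w, axis=1)[:, [0, 2]]                # the two non-median values
dev_from_median = (others - med).ravel()
print(dev_from_median.std(ddof=1) / sigma)            # typically around 1.05-1.15

# Or subtract each triplet's mean from all three values: 300 numbers.
dev_from_mean = (w - w.mean(axis=1, keepdims=True)).ravel()
print(dev_from_mean.std(ddof=1) / sigma)              # typically around 0.75-0.85
```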

from Grokipedia
Robust measures of scale are statistical estimators designed to quantify the dispersion or spread of a sample of numerical data while being highly resistant to the effects of outliers and deviations from normality, unlike classical measures such as the sample standard deviation that can be severely distorted by extreme values. These estimators are affine equivariant, meaning they scale appropriately under linear transformations of the data, and are fundamental in robust statistics for providing stable assessments of variability in contaminated or non-normal distributions.

Key properties of robust scale measures include the breakdown point, which represents the maximum proportion of outliers (up to 50% for the most robust estimators) that the measure can tolerate before collapsing, and asymptotic efficiency, which measures relative performance under ideal conditions such as the normal distribution. For instance, the median absolute deviation (MAD), defined as $1.4826 \times \operatorname{med}_i |x_i - \operatorname{med}_j x_j|$ to achieve consistency at the normal distribution, has a 50% breakdown point but only 37% efficiency under normality, making it simple yet limited for clean data. More advanced estimators such as Qn, proposed by Rousseeuw and Croux, use the first quartile of the pairwise absolute differences among observations and offer a higher 82% efficiency while maintaining a 50% breakdown point, computed via efficient algorithms in $O(n \log n)$ time. The interquartile range (IQR), the difference between the 75th and 25th percentiles, is another basic robust option with a 25% breakdown point, focusing on variability in the central portion and proving useful for heavy-tailed distributions like the Cauchy.

The development of robust measures of scale emerged in the mid-20th century as part of broader robust statistics, with Peter J. Huber introducing foundational concepts for location and scale estimation resistant to gross errors in 1964. Frank R. Hampel advanced the field in 1971 with a general qualitative definition of robustness, and in 1974 by introducing the influence function, which quantifies an estimator's sensitivity to infinitesimal contamination. In the 1980s, Peter J. Rousseeuw pioneered high-breakdown-point methods, including multivariate extensions in 1985 that influenced scale estimation by emphasizing maximum contamination resistance. Subsequent work by Rousseeuw and Christophe Croux in 1993 provided explicit, computationally feasible alternatives to the MAD, such as Qn and Sn, which balance robustness and efficiency for practical applications. These measures are widely applied across many fields to handle real-world data imperfections without undue influence from anomalies.

Introduction and Background

Definition and Motivation

A measure of scale in statistics quantifies the dispersion or spread of a dataset, providing an indication of how much the data points vary around a central value. Robust measures of scale are specifically designed to be insensitive to outliers and heavy-tailed distributions, ensuring that the estimate of variability remains reliable even when the data contain anomalies or deviate from assumed normality.

The motivation for robust measures of scale arises from the vulnerabilities of classical dispersion metrics, such as the standard deviation, which can be severely distorted by even a small proportion of outliers or contamination in the data. For instance, a single extreme value can inflate the variance dramatically, leading to misleading inferences about data spread in real-world applications where datasets often include measurement errors or unexpected anomalies. Robust alternatives address this by prioritizing stability and maintaining their properties under such deviations, thereby enhancing the reliability of statistical analyses across applied fields.

The development of robust measures of scale emerged in the 1960s and 1970s as part of the broader field of robust statistics, pioneered by researchers seeking to overcome the limitations of least-squares methods and parametric assumptions in the presence of non-normal errors. John Tukey initiated key ideas in 1960 by demonstrating the advantages of trimmed means and deviations over traditional estimators under slight departures from normality, while Peter Huber advanced M-estimators in 1964. Frank Hampel further formalized the framework in 1968, emphasizing the need for procedures that withstand gross errors commonly found in scientific data.

A fundamental property for evaluating the robustness of scale estimators is the breakdown point, which represents the smallest proportion of contaminated observations that can cause the estimator to produce an arbitrarily large or small value. Introduced by Hampel, this criterion highlights why classical measures like the standard deviation have a breakdown point of zero—they fail completely with even one outlier—whereas robust measures can tolerate up to 50% contamination, making them suitable for practical, impure datasets.

Comparison to Classical Measures of Scale

Classical measures of scale, such as the sample variance $s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2$ and its square root, the sample standard deviation $s$, are based on maximum likelihood estimation assuming normally distributed data. These estimators achieve 100% asymptotic relative efficiency under the normal distribution but possess a breakdown point of 0%, meaning a single outlier can render them arbitrarily large or undefined.

The primary sensitivity of these classical measures arises from their reliance on squared deviations, which amplify the impact of extreme values; for instance, replacing one observation with an arbitrarily large value can dominate the entire sum, inflating the estimate without bound. In contrast, robust measures of scale limit the influence of such outliers, maintaining finite values even in the presence of contamination.

Robust measures are particularly preferable in scenarios involving data contamination, modeled by Huber's $\epsilon$-contamination framework, in which the true distribution is a mixture $(1-\epsilon)F + \epsilon G$, with $F$ representing the ideal model (e.g., normal) and $G$ an arbitrary contaminating distribution. Under this model, classical estimators like the standard deviation lose consistency for any $\epsilon > 0$, whereas robust alternatives preserve consistency and bounded influence.

A key trade-off is that robust scale estimators typically exhibit lower asymptotic efficiency under uncontaminated normality—often around 37% to 88% relative to the standard deviation, depending on the method—due to their downweighting of extreme but legitimate observations. However, in contaminated settings with even small contamination fractions (e.g., 5-10%), their efficiency surpasses that of classical measures, providing superior performance in real-world data prone to outliers.

Common Robust Estimators

Median Absolute Deviation (MAD)

The median absolute deviation (MAD) is a robust estimator of scale that measures the typical deviation of observations from the data's median using the L1 norm. It is defined for a univariate sample $\{x_1, \dots, x_n\}$ as

$$\mathrm{MAD} = c \cdot \operatorname{med}_{i=1,\dots,n} \bigl( |x_i - \operatorname{med}_{j=1,\dots,n} x_j| \bigr),$$

where the constant $c = 1.4826$ ensures consistency with the standard deviation $\sigma$ under the normal distribution, as this value equals $1/\Phi^{-1}(3/4)$ and $\Phi^{-1}(3/4) \approx 0.6745$.

To compute the MAD, first determine the sample median $m = \operatorname{med}(x_i)$, which orders the data and selects the middle value (or the average of the two central values for even $n$). Next, calculate the absolute deviations $d_i = |x_i - m|$ for each $i$. The unscaled MAD is then the median of the $d_i$, and the final value is obtained by multiplying by $c$. This process relies solely on order statistics and avoids squaring, making it less sensitive to extreme values than the sample standard deviation.

The MAD exhibits strong robustness properties, including a breakdown point of 50%, the maximum attainable for affine-equivariant scale estimators, which means it remains bounded even if up to half the observations are arbitrarily far from the bulk of the data. Under the normal distribution, its asymptotic relative efficiency with respect to the sample standard deviation is approximately 37%, reflecting a trade-off between efficiency under ideal conditions and resilience to departures from normality such as outliers or heavy tails. It is also location-scale equivariant: if the data are transformed to $a + b x_i$ with $a \in \mathbb{R}$ and $b > 0$, then the MAD transforms to $|b|$ times the original.

Key advantages of the MAD include its straightforward computation, which requires only $O(n \log n)$ time due to the sorting needed for the medians, and its suitability for distribution-free inference in non-parametric settings, such as sign or Wilcoxon procedures, where its behavior under the null does not depend on the underlying error distribution.
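In practice the computation above is usually a one-liner; for example, the following sketch assumes a recent SciPy (where scipy.stats.median_abs_deviation and its scale="normal" option are available):

```python
import numpy as np
from scipy.stats import median_abs_deviation

x = np.array([2.1, 2.3, 2.2, 2.4, 2.2, 9.9])          # one gross outlier
mad_raw    = median_abs_deviation(x)                   # unscaled: median |x_i - median(x)|
mad_normal = median_abs_deviation(x, scale="normal")   # scaled by ~1.4826 for normal consistency
print(mad_raw, mad_normal, np.std(x, ddof=1))          # the std is dominated by the outlier
```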

Interquartile Range (IQR)

The interquartile range (IQR) is a non-parametric robust measure of scale that quantifies the spread of the middle 50% of a dataset by subtracting the first quartile from the third quartile. It provides a stable estimate of variability that is less sensitive to outliers than the full range or standard deviation, as it ignores the lowest 25% and highest 25% of the data. Introduced in the context of exploratory data analysis, the IQR is particularly useful for visualizing a distribution in box plots, where it forms the length of the box to highlight central spread without distortion from extreme values. The IQR is formally defined as

$$\mathrm{IQR} = Q_3 - Q_1,$$

where $Q_1$ is the 25th percentile (first quartile) and $Q_3$ is the 75th percentile (third quartile) of the ordered sample. Unlike some scale estimators, the IQR requires no scaling factor for direct interpretation as a measure of dispersion, though it can be adjusted under normality assumptions for comparability to the standard deviation.

To compute the IQR, sort the dataset in ascending order to obtain the ordered values $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$, where $n$ is the sample size. The position for $Q_1$ is $(n+1)/4$, and for $Q_3$ it is $3(n+1)/4$; if these positions fall between integers, linear interpolation is applied between the adjacent ordered values. This method ensures a consistent estimate even for moderate sample sizes, focusing solely on quartile positions without additional transformations.

The IQR exhibits a breakdown point of 25%, meaning it remains bounded and reliable as long as fewer than 25% of the observations are outliers, since contamination in the outer quartiles does not affect the inner ones until that threshold is exceeded. This property makes it a simple yet effective tool for detecting outliers and understanding data spread amid potential anomalies.

Variants of the IQR address challenges in small samples or further enhance robustness. For small $n$, adjusted computations use alternative quantile definitions, such as those based on inverse cumulative distribution functions or modified rules, to avoid bias in quartile estimates; nine standard quantile methods have been compared in the literature, with types 6–8 often preferred for their balance of simplicity and accuracy in finite samples.
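A brief numpy illustration (numpy's default quantile uses linear interpolation; the method argument, available in recent numpy versions, selects among alternative quantile definitions such as those mentioned above):

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0])

q1, q3 = np.quantile(x, [0.25, 0.75])            # default: linear interpolation
print(q3 - q1)                                   # the IQR

# Alternative quantile definitions can matter for small n, e.g. the
# "median_unbiased" method (Hyndman-Fan type 8):
q1_mu, q3_mu = np.quantile(x, [0.25, 0.75], method="median_unbiased")
print(q3_mu - q1_mu)
```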

Sn and Qn Estimators

The Sn and Qn estimators are two prominent robust measures of scale introduced by Peter J. Rousseeuw and Christophe Croux as alternatives to the median absolute deviation (MAD), offering maximal breakdown robustness while improving statistical efficiency under the normal distribution. These estimators are based on order statistics derived from all pairwise absolute differences among the observations, making them location-invariant and particularly effective against outliers. Unlike simpler quartile-based methods such as the interquartile range, Sn and Qn leverage the full structure of pairwise comparisons to achieve a breakdown point of 50%, the theoretical maximum for location-scale equivariant estimators.

The Sn estimator is defined as the scaled nested median of pairwise absolute differences:

$$S_n = 1.1926 \cdot \operatorname{med}_i \left( \operatorname{med}_j\, |X_i - X_j| \right),$$

where the outer median is over $i = 1, \dots, n$ and the inner over $j = 1, \dots, n$, and the constant 1.1926 ensures that $S_n$ is a consistent estimator of the scale parameter $\sigma$ for the standard normal distribution. This nested structure effectively captures the central tendency of the differences, providing a robust summary of dispersion. For computational efficiency, avoiding the $O(n^2)$ enumeration of pairs, $S_n$ can be computed with an algorithm that requires sorting the data once and runs in $O(n \log n)$ time. The estimator's influence function is bounded but discontinuous at zero, reflecting its high robustness to gross errors.

The Qn estimator, in contrast, uses a lower-order statistic of the pairwise differences to enhance efficiency:

$$Q_n = 2.2219 \cdot \bigl( \{ |X_i - X_j| : 1 \leq i < j \leq n \} \bigr)_{(k)},$$

where $(\cdot)_{(k)}$ denotes the $k$-th order statistic (with $k = \lfloor n/2 \rfloor (\lfloor n/2 \rfloor + 1)/2$), and the constant 2.2219 provides asymptotic consistency under normality. This $k$ corresponds approximately to the first quartile position among the $\binom{n}{2}$ pairwise differences, selecting a value that resists contamination from the upper tail. Like Sn, Qn admits an $O(n \log n)$-time algorithm based on sorting, but its structure—focusing on the lower half of the ordered differences—makes it simpler and faster in practice, often requiring less memory. Finite-sample bias corrections $d_n$ can be applied for small $n$ to improve unbiasedness, though they are typically near 1 for $n > 20$.

Both estimators possess a 50% breakdown point, meaning arbitrary contamination of up to $\lfloor (n-1)/2 \rfloor$ observations cannot cause $S_n$ or $Q_n$ to diverge to infinity or collapse to zero. Under the normal distribution, Sn attains an asymptotic relative efficiency of 58% relative to the sample standard deviation, while Qn reaches 82%, outperforming MAD's 37% efficiency without sacrificing robustness. Qn's influence function is continuous and redescending, contributing to its superior finite-sample performance in contaminated settings. These properties were derived analytically in the original proposal, with empirical validations confirming their behavior even for moderate sample sizes.

Rousseeuw and Croux developed Sn and Qn in 1993, motivated by the need for high-breakdown estimators suitable for extension to multivariate robust estimation, such as the minimum covariance determinant method. The accompanying 1992 work provided the efficient algorithms essential for practical use, enabling their adoption in statistical software such as R's robustbase package. These estimators have since become staples in robust statistics for applications requiring resistance to outliers, such as regression diagnostics.

Advanced Robust Measures

Biweight Midvariance

The biweight midvariance is a tuned robust estimator of scale that employs Tukey's biweight weighting function to downweight the influence of outliers while maintaining high statistical efficiency under normality. Developed by John W. Tukey in 1977 as part of techniques for resistant line fitting, it addresses the limitations of classical variance by iteratively applying weights that smoothly reduce the contribution of extreme observations. This estimator is particularly valued in applications requiring both robustness and efficiency, such as analyzing residuals in robust regression.

The biweight midvariance is defined using the sample median $m$ as the location estimate and the median absolute deviation (MAD) as an initial scale measure. Let $u_i = \frac{x_i - m}{9 \cdot \mathrm{MAD}}$ for each observation $x_i$, with the biweight function $\psi(u) = u(1 - u^2)$ for $|u| < 1$ and 0 otherwise. The estimator is then given by

$$\sigma^2 = \frac{\sum u_i^2 (x_i - m)^2 (1 - u_i^2)^2}{\sum u_i^2 (1 - u_i^2)(1 - 5 u_i^2)} \cdot \frac{n}{\,n - (n+1) \sum u_i^2 (1 - u_i^2)\,},$$

where the sums are over indices $i$ with $|u_i| < 1$, and $n$ is the sample size. This formula provides a consistent estimate of the squared scale, incorporating the biweight influence to emphasize central data points.

Computation of the biweight midvariance is iterative and begins with an initial robust scale estimate from the MAD. The median $m$ is calculated first, followed by the MAD to define the $u_i$. Weights derived from $\psi(u_i)$ are then applied to trim the influence of extremes beyond $|u_i| = 1$, effectively rejecting observations that lie more than 9 MAD units (roughly six standard deviations under normality) from the median, due to the choice of tuning constant 9. Subsequent iterations refine the location and scale until convergence, though a one-step approximation starting from the median and MAD often suffices for practical purposes.

Key properties of the biweight midvariance include a breakdown point of approximately 50%, indicating it can withstand up to 50% contaminated observations before the estimate can become arbitrarily large. It achieves an asymptotic relative efficiency of approximately 85% relative to the sample standard deviation under the normal distribution, balancing robustness with precision on uncontaminated data. These attributes make it suitable for estimating the scale of residuals in robust regression models, where outliers from model misspecification are common. In multivariate settings, it relates to projection-based approaches like location-scale depth but remains primarily univariate with fixed tuning.

Location-Scale Depth

Location-scale depth provides a multivariate robust measure that simultaneously assesses the centrality of both location and scale parameters in a data cloud, extending univariate notions to higher dimensions through depth functions such as projection or halfspace depths. In the projection-based approach, the depth for a scale parameter is defined as the infimum over all unit vectors $\mathbf{u}$ of a robust univariate scale measure (e.g., the median absolute deviation) applied to the projections $\mathbf{u}^{\mathsf{T}} \mathbf{X}$ of the data points $\mathbf{X}$, capturing the minimum "spread" across directions. Similarly, in the halfspace framework, it involves the minimum robust scale (such as the interquartile range or MAD) computed over all halfspaces containing at least half the data points, ensuring robustness against directional outliers. This combined location-scale perspective, as formalized in the work of Mizera and Müller, treats the pair $(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ as a point in an extended space, with depth quantifying its admissibility relative to the empirical distribution.

Computation of location-scale depth typically relies on approximations due to the optimization over infinitely many directions or halfspaces. For projection depth, one evaluates the robust scale on a finite grid of directions (e.g., randomly sampled unit vectors or spherical designs) and takes the minimum, with exact computation feasible in low dimensions but requiring Monte Carlo methods in higher ones; Zuo and Serfling outline properties enabling such approximations while preserving robustness. In the halfspace case, algorithms enumerate supporting halfspaces or use linear programming to identify the minimizing halfspace's scale measure, achieving polynomial time complexity for the Student depth variant, a tractable form of halfspace depth in the location-scale model. These methods scale to moderate dimensions but become intensive beyond $p > 10$, which is often mitigated by subsampling.

Key properties include affine invariance, ensuring the depth remains unchanged under nonsingular linear transformations, which is inherited from the underlying univariate robust scales and depth notions. Breakdown points up to 50% are attainable, meaning the estimator resists contamination by up to half the sample, making it suitable for outlier-heavy data; for instance, the projection-based scale depth achieves this when paired with high-breakdown univariate scales like Qn. Additionally, it facilitates shape analysis in high dimensions by providing contour regions that highlight central variability structures, aiding anomaly detection and covariance estimation without assuming ellipticity.

The concept builds on general statistical depth functions introduced by Zuo and Serfling, who extended univariate robust measures to multivariate settings via projections, laying the groundwork for scale depths as infima of univariate scales. Mizera and Müller further developed the halfspace-based location-scale depth, integrating likelihood principles. Extensions to functional data have been pursued by applying projection depths to infinite-dimensional spaces, enabling robust analysis of curves while maintaining affine-like invariance under transformations.
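A hedged sketch of the projection-based approximation described above (random directions, with the scaled MAD as the univariate robust scale; the number of directions and the example covariance are arbitrary choices):

```python
import numpy as np

def min_projected_scale(X, n_dirs=500, seed=0):
    """Approximate the directional minimum of a robust univariate scale (the scaled MAD)
    over random unit vectors, in the spirit of projection-based scale depth."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    dirs = rng.normal(size=(n_dirs, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)    # unit vectors on the sphere
    proj = X @ dirs.T                                      # projections u^T x, shape (n, n_dirs)
    med = np.median(proj, axis=0)
    mads = 1.4826 * np.median(np.abs(proj - med), axis=0)  # robust scale per direction
    return mads.min()

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0, 0], [[4, 0], [0, 1]], size=1000)
print(min_projected_scale(X))   # roughly 1, the smallest directional spread
```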

Estimation and Inference

Approaches to Estimation

Robust measures of scale can be estimated using a variety of computational approaches, each balancing efficiency, robustness, and applicability to different sample sizes and data structures. These methods generally fall into direct, iterative, and resampling-based categories, with the choice depending on the specific estimator and the desired accuracy. Direct methods are particularly advantageous for their simplicity and speed on large datasets, while iterative and bootstrap techniques offer flexibility for more complex or adaptive estimation.

Direct methods compute scale estimators without iteration, typically leveraging order statistics from sorted data or pairwise absolute differences. The interquartile range (IQR), for instance, is obtained by sorting the sample and subtracting the first quartile from the third, providing a straightforward robust scale measure resistant to up to 25% outliers. Similarly, the Qn estimator, proposed by Rousseeuw and Croux, selects a consistent multiple of the first quartile of all pairwise absolute deviations, achieving a 50% breakdown point through this non-iterative process based on order statistics. The Sn estimator follows a comparable direct approach using medians of pairwise deviations, also attaining maximal breakdown robustness. These methods avoid the convergence issues inherent in iterative procedures, making them suitable for initial screening or high-dimensional applications.

Iterative methods, such as those for M-estimators, solve estimating equations to find a scale estimate that limits the influence of outliers through a bounded loss function. For a robust scale $\sigma$ given a location estimate $\hat{\mu}$, one common formulation seeks to satisfy

$$\frac{1}{n} \sum_{i=1}^n \rho\left( \frac{|x_i - \hat{\mu}|}{\sigma} \right) = \kappa,$$

where $\rho$ is a robust loss function (e.g., Huber's) and $\kappa = E[\rho(Z)]$ for standardization under the model distribution of $Z$. This is often implemented via iteratively reweighted least squares (IRLS), which alternates between updating weights based on current residuals and solving weighted least squares problems until convergence. IRLS enhances efficiency for M-estimators by reformulating the problem as a sequence of linear regressions, though it requires careful initialization (e.g., with a direct estimator like the IQR) to avoid local minima.

Bootstrap approaches provide a resampling-based alternative, particularly useful for estimating robust scale in small samples or assessing variability without strong parametric assumptions. By repeatedly drawing bootstrap samples from the original data and recomputing the scale estimator on each, one can approximate the sampling distribution of the estimator, yielding bias-corrected estimates or standard errors. For robust scale measures, adapted bootstrap methods, such as those reweighting samples to mimic the estimator's robustness, ensure consistency even with contaminants, as demonstrated in extensions of standard Efron bootstrapping to M-estimators and regression contexts.

Computational considerations are crucial for practical implementation, especially with large datasets. Sorting-based direct methods like the IQR and the efficient algorithms for Qn and Sn achieve O(n log n) time complexity and O(n) space, enabling scalability to millions of observations. In contrast, naive pairwise computations for estimators like Qn require O(n^2) operations, which become prohibitive for n > 10,000, though the optimized algorithms reduce this to linearithmic performance. Iterative methods like IRLS typically cost O(n) per iteration but may require 10-50 iterations, while bootstrap variants scale with the number of resamples B (often 1,000-10,000), adding O(B × T) overhead where T is the base estimator's running time.
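A hedged sketch of the fixed-point iteration for an M-estimator of scale. Instead of the Huber ρ mentioned above, it uses Tukey's bisquare ρ with tuning constant c ≈ 1.547 and κ = 0.5, a standard 50%-breakdown choice; the MAD serves as the starting value, and the data are illustrative:

```python
import numpy as np

def rho_bisquare(u, c=1.547):
    """Tukey's bisquare rho, scaled so that rho(u) = 1 for |u| >= c."""
    w = np.clip(u / c, -1.0, 1.0)
    return 1.0 - (1.0 - w**2) ** 3

def m_scale(x, kappa=0.5, c=1.547, tol=1e-8, max_iter=100):
    """Fixed-point iteration solving mean(rho(r / sigma)) = kappa for sigma."""
    x = np.asarray(x, dtype=float)
    r = x - np.median(x)                         # residuals about a robust location
    sigma = np.median(np.abs(r)) / 0.6745        # MAD-based starting value
    for _ in range(max_iter):
        new = sigma * np.sqrt(np.mean(rho_bisquare(r / sigma, c)) / kappa)
        if abs(new - sigma) <= tol * sigma:
            return new
        sigma = new
    return sigma

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(scale=2.0, size=950), rng.normal(50, 1, size=50)])
print(m_scale(x), np.std(x, ddof=1))             # the M-scale stays near 2; the std does not
```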

Confidence Intervals for Scale

Confidence intervals for robust measures of scale quantify the uncertainty in estimates of dispersion, particularly when the data may contain outliers or deviate from normality. These intervals can be constructed using asymptotic approximations, resampling techniques such as the bootstrap, or exact methods in specific distributional cases. Asymptotic approaches rely on the limiting normal distribution of the estimators, bootstrap methods are versatile for non-normal data, and exact methods are available in particular scenarios, such as under uniform distributions or via adaptations of sign tests for scale parameters.

For the median absolute deviation (MAD), asymptotic confidence intervals are derived from its limiting normal distribution. Specifically,

$$\sqrt{n}\,\bigl(\widehat{\mathrm{MAD}} - \mathrm{MAD}\bigr) \xrightarrow{d} N(0, V)$$
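A hedged percentile-bootstrap sketch for a confidence interval on a robust scale estimate (here the normal-consistent MAD; the number of resamples, the 95% level, and the heavy-tailed example data are arbitrary choices):

```python
import numpy as np

def mad_sigma(x):
    """Normal-consistent MAD, used here as an estimate of sigma."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def bootstrap_ci(x, stat=mad_sigma, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a scale statistic."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    reps = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(n_boot)])
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(6)
x = rng.standard_t(df=3, size=300)               # heavy-tailed sample
print(mad_sigma(x), bootstrap_ci(x))             # point estimate and 95% interval
```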