Efficiency (statistics)
from Wikipedia

In statistics, efficiency is a measure of quality of an estimator, of an experimental design,[1] or of a hypothesis testing procedure.[2] Essentially, a more efficient estimator needs fewer observations than a less efficient one to achieve a given performance. An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviation between the estimated value and the "true" value in the L2 norm sense.[1]

The relative efficiency of two procedures is the ratio of their efficiencies, although often this concept is used where the comparison is made between a given procedure and a notional "best possible" procedure. The efficiencies and the relative efficiency of two procedures theoretically depend on the sample size available for the given procedure, but it is often possible to use the asymptotic relative efficiency (defined as the limit of the relative efficiencies as the sample size grows) as the principal comparison measure.

Estimators

The efficiency of an unbiased estimator, T, of a parameter θ is defined as [3]

e(T) = [1 / I(θ)] / Var(T)

where I(θ) is the Fisher information of the sample. Thus e(T) is the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér–Rao bound can be used to prove that e(T) ≤ 1.
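
As a quick illustration (not part of the source article; the model and parameter values are chosen for the example), the following Python sketch estimates e(T) by Monte Carlo for two unbiased estimators of a normal mean with known variance, where the Fisher information of the sample is I(θ) = n/σ² and hence the minimum possible variance is σ²/n:

import numpy as np

rng = np.random.default_rng(0)
n, theta, sigma = 25, 2.0, 3.0
reps = 200_000

samples = rng.normal(theta, sigma, size=(reps, n))
crlb = sigma**2 / n                       # 1 / I(theta), since I(theta) = n / sigma^2

for name, estimates in [("sample mean", samples.mean(axis=1)),
                        ("single observation", samples[:, 0])]:
    var = estimates.var()                 # Monte Carlo estimate of Var(T)
    print(f"{name}: e(T) = CRLB / Var(T) ~ {crlb / var:.3f}")

# Expected: e ~ 1.0 for the sample mean and e ~ 1/n = 0.04 for a single observation.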

Efficient estimators

An efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors of different magnitudes. The most common choice of the loss function is quadratic, resulting in the mean squared error criterion of optimality.[4]

In general, the spread of an estimator around the parameter θ is a measure of estimator efficiency and performance. This performance can be calculated by finding the mean squared error. More formally, let T be an estimator for the parameter θ. The mean squared error of T is the value MSE(T) = E[ (T − θ)² ], which can be decomposed as a sum of its variance and squared bias:

MSE(T) = E[ (T − θ)² ] = Var(T) + ( E[T] − θ )²

An estimator T1 performs better than an estimator T2 if MSE(T1) < MSE(T2).[5] For a more specific case, if T1 and T2 are two unbiased estimators for the same parameter θ, then the variance can be compared to determine performance. In this case, T2 is more efficient than T1 if the variance of T2 is smaller than the variance of T1, i.e. Var(T2) < Var(T1) for all values of θ. This relationship can be determined by simplifying the more general case above for mean squared error; since the expected value of an unbiased estimator is equal to the parameter value, E[T] = θ. Therefore, for an unbiased estimator, MSE(T) = Var(T), as the ( E[T] − θ )² term drops out for being equal to 0.[5]
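
The decomposition can be checked numerically; the following sketch (illustrative parameters, not from the article) uses a deliberately biased shrinkage estimator T = 0.8·X̄ of a normal mean:

import numpy as np

rng = np.random.default_rng(1)
theta, sigma, n, reps = 5.0, 2.0, 10, 500_000

xbar = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
t = 0.8 * xbar                          # biased estimator: E[T] = 0.8 * theta

mse  = np.mean((t - theta) ** 2)        # E[(T - theta)^2]
var  = t.var()                          # Var(T)
bias = t.mean() - theta                 # E[T] - theta
print(mse, var + bias ** 2)             # the two values agree up to Monte Carlo error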

If an unbiased estimator of a parameter θ attains e(T) = 1 for all values of the parameter, then the estimator is called efficient.[3]

Equivalently, the estimator achieves equality in the Cramér–Rao inequality for all θ. The Cramér–Rao lower bound is a lower bound of the variance of an unbiased estimator, representing the "best" an unbiased estimator can be.

An efficient estimator is also the minimum variance unbiased estimator (MVUE). This is because an efficient estimator attains equality in the Cramér–Rao inequality for all parameter values, which means it attains the minimum variance for all parameters (the definition of the MVUE). An MVUE, even if it exists, is not necessarily efficient, because "minimum variance" does not require that equality hold in the Cramér–Rao inequality.

Thus an efficient estimator need not exist, but if it does, it is the MVUE.

Finite-sample efficiency

Suppose { Pθ | θ ∈ Θ } is a parametric model and X = (X1, …, Xn) are the data sampled from this model. Let T = T(X) be an estimator for the parameter θ. If this estimator is unbiased (that is, E[ T ] = θ), then the Cramér–Rao inequality states the variance of this estimator is bounded from below:

Var[ T ] ≥ I(θ)⁻¹

where I(θ) is the Fisher information matrix of the model at point θ. Generally, the variance measures the degree of dispersion of a random variable around its mean. Thus estimators with small variances are more concentrated: they estimate the parameters more precisely. We say that the estimator is a finite-sample efficient estimator (in the class of unbiased estimators) if it reaches the lower bound in the Cramér–Rao inequality above, for all θ ∈ Θ. Efficient estimators are always minimum variance unbiased estimators. However the converse is false: there exist point-estimation problems for which the minimum-variance mean-unbiased estimator is inefficient.[6]

Historically, finite-sample efficiency was an early optimality criterion. However this criterion has some limitations:

  • Finite-sample efficient estimators are extremely rare. In fact, it was proved that efficient estimation is possible only in an exponential family, and only for the natural parameters of that family.[7]
  • This notion of efficiency is sometimes restricted to the class of unbiased estimators. (Often it is not.[8]) Since there are no good theoretical reasons to require that estimators are unbiased, this restriction is inconvenient. In fact, if we use mean squared error as a selection criterion, many biased estimators will slightly outperform the “best” unbiased ones. For example, in multivariate statistics for dimension three or more, the mean-unbiased estimator, the sample mean, is inadmissible: regardless of the outcome, its performance is worse than that of, for example, the James–Stein estimator.[citation needed]
  • Finite-sample efficiency is based on the variance as the criterion according to which estimators are judged. A more general approach is to use loss functions other than quadratic ones, in which case finite-sample efficiency can no longer be formulated.[citation needed]

As an example, among the models encountered in practice, efficient estimators exist for: the mean μ of the normal distribution (but not the variance σ2), the parameter λ of the Poisson distribution, and the probability p in the binomial or multinomial distribution.
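
The following sketch (distributions and parameter values chosen here for illustration) checks empirically that the natural estimator in each of these models attains the Cramér–Rao bound:

import numpy as np

rng = np.random.default_rng(2)
n, reps = 40, 200_000

# name: (sampler returning a (reps, n) array, CRLB = 1 / (n * per-observation Fisher information))
cases = {
    "normal mean, sigma = 1": (lambda: rng.normal(1.5, 1.0, (reps, n)), 1.0 / n),
    "Poisson lambda = 3":     (lambda: rng.poisson(3.0, (reps, n)),     3.0 / n),
    "Bernoulli p = 0.3":      (lambda: rng.binomial(1, 0.3, (reps, n)), 0.3 * 0.7 / n),
}

for name, (draw, crlb) in cases.items():
    estimates = draw().mean(axis=1)     # the sample mean is the natural estimator in each case
    print(f"{name}: Var ~ {estimates.var():.5f}, CRLB = {crlb:.5f}")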

Consider the model of a normal distribution with unknown mean but known variance: { Pθ = N(θ, σ2) | θ ∈ R }. The data consist of n independent and identically distributed observations from this model: X = (x1, …, xn). We estimate the parameter θ using the sample mean of all observations:

T(X) = (x1 + … + xn) / n

This estimator has mean θ and variance of σ2 / n, which is equal to the reciprocal of the Fisher information from the sample. Thus, the sample mean is a finite-sample efficient estimator for the mean of the normal distribution.

Asymptotic efficiency

Asymptotic efficiency requires consistency, an asymptotically normal distribution of the estimator, and an asymptotic variance-covariance matrix no worse than that of any other estimator.[9]

Example: Median

Consider a sample of size N drawn from a normal distribution of mean μ and unit variance, i.e., Xn ~ N(μ, 1).

The sample mean, X̄, of the sample X1, X2, …, XN, is defined as X̄ = (X1 + X2 + … + XN) / N.

The variance of the mean, 1/N (the square of the standard error) is equal to the reciprocal of the Fisher information from the sample and thus, by the Cramér–Rao inequality, the sample mean is efficient in the sense that its efficiency is unity (100%).

Now consider the sample median, X̃. This is an unbiased and consistent estimator for μ. For large N the sample median is approximately normally distributed with mean μ and variance π/(2N).[10]

The efficiency of the median for large N is thus e(X̃) = (1/N) / (π/(2N)) = 2/π ≈ 0.64.

In other words, the relative variance of the median will be π/2 ≈ 1.57, or 57% greater than the variance of the mean – the standard error of the median will be about 25% greater than that of the mean.[11]

Note that this is the asymptotic efficiency — that is, the efficiency in the limit as the sample size N tends to infinity. For finite values of N the efficiency is higher than this (for example, a sample size of 3 gives an efficiency of about 74%).[citation needed]
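
A Monte Carlo sketch of this comparison (the sample sizes and replication count are illustrative choices; odd sample sizes keep the median unique):

import numpy as np

rng = np.random.default_rng(3)
reps = 200_000

for n in (3, 101):
    x = rng.normal(0.0, 1.0, size=(reps, n))
    var_mean   = x.mean(axis=1).var()          # ~ 1/n
    var_median = np.median(x, axis=1).var()
    print(f"n = {n}: efficiency of the median ~ {var_mean / var_median:.3f}")

# Roughly 0.74 for n = 3, and approaching 2/pi ~ 0.637 as n grows.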

The sample mean is thus more efficient than the sample median in this example. However, there may be measures by which the median performs better. For example, the median is far more robust to outliers, so that if the Gaussian model is questionable or approximate, there may be advantages to using the median (see Robust statistics).

Dominant estimators

If T1 and T2 are estimators for the parameter θ, then T1 is said to dominate T2 if:

  1. its mean squared error (MSE) is smaller for at least some value of θ
  2. the MSE does not exceed that of T2 for any value of θ.

Formally, T1 dominates T2 if

E[ (T1 − θ)² ] ≤ E[ (T2 − θ)² ]

holds for all θ, with strict inequality holding somewhere.
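
For instance (an illustration assuming normal sampling, not taken from the article), the sample mean of n > 1 observations dominates the single-observation estimator T2 = X1: both are unbiased, and MSE(X̄) = σ²/n < σ² = MSE(X1) for every θ. A short sketch confirming this over a few parameter values:

import numpy as np

rng = np.random.default_rng(4)
n, sigma, reps = 10, 1.0, 100_000

for theta in (-2.0, 0.0, 3.0):                            # compare MSE at several parameter values
    x = rng.normal(theta, sigma, size=(reps, n))
    mse_mean  = np.mean((x.mean(axis=1) - theta) ** 2)    # ~ sigma^2 / n
    mse_first = np.mean((x[:, 0] - theta) ** 2)           # ~ sigma^2
    print(f"theta = {theta:+.1f}: MSE(mean) = {mse_mean:.3f}, MSE(X1) = {mse_first:.3f}")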

Relative efficiency

The relative efficiency of two unbiased estimators T1 and T2 of a parameter θ is defined as[12]

e(T1, T2) = E[ (T2 − θ)² ] / E[ (T1 − θ)² ] = Var(T2) / Var(T1)

Although e is in general a function of θ, in many cases the dependence drops out; if this is so, e being greater than one would indicate that T1 is preferable, regardless of the true value of θ.

An alternative to relative efficiency for comparing estimators is the Pitman closeness criterion. This replaces the comparison of mean squared errors with a comparison of how often one estimator produces estimates closer to the true value than another estimator.

Estimators of the mean of u.i.d. variables

In estimating the mean of uncorrelated, identically distributed variables we can take advantage of the fact that the variance of the sum is the sum of the variances. In this case efficiency can be defined as the square of the coefficient of variation, i.e.,[13]

e ≡ (σ / μ)²

Relative efficiency of two such estimators can thus be interpreted as the relative sample size of one required to achieve the certainty of the other. Proof:

e1 / e2 = (σ1² / μ²) / (σ2² / μ²) = σ1² / σ2²

Now because σ1² = σ² / n1 and σ2² = σ² / n2, we have e1 / e2 = n2 / n1, so the relative efficiency expresses the relative sample size of the first estimator needed to match the variance of the second.
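
A small worked example of this sample-size interpretation (the numbers are illustrative and both estimators are taken to be sample means of the same variable):

sigma = 2.0
n1, n2 = 50, 100                        # sample sizes used by the two estimators

var1, var2 = sigma**2 / n1, sigma**2 / n2
relative_efficiency = var1 / var2       # = n2 / n1 = 2: the second estimator is twice as efficient
n1_to_match = n1 * relative_efficiency  # observations the first estimator needs to match the second
print(relative_efficiency, n1_to_match) # 2.0 100.0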

Robustness

Efficiency of an estimator may change significantly if the distribution changes, often dropping. This is one of the motivations of robust statistics – an estimator such as the sample mean is an efficient estimator of the population mean of a normal distribution, for example, but can be an inefficient estimator of a mixture distribution of two normal distributions with the same mean and different variances. For example, if a distribution is a combination of 98% N(μ, σ) and 2% N(μ, 10σ), the presence of extreme values from the latter distribution (often "contaminating outliers") significantly reduces the efficiency of the sample mean as an estimator of μ. By contrast, the trimmed mean is less efficient for a normal distribution, but is more robust to (i.e., less affected by) changes in the distribution, and thus may be more efficient for a mixture distribution. Similarly, the shape of a distribution, such as skewness or heavy tails, can significantly reduce the efficiency of estimators that assume a symmetric distribution or thin tails.
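
A simulation sketch of this effect (the 98%/2% mixture mirrors the example above; the 10% trimming fraction and sample size are arbitrary choices):

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, reps = 100, 100_000
mu, sigma = 0.0, 1.0

def sample(contaminated):
    scale = sigma
    if contaminated:
        # each observation independently comes from the wide component with probability 0.02
        scale = np.where(rng.random((reps, n)) < 0.02, 10 * sigma, sigma)
    return rng.normal(mu, scale, size=(reps, n))

for label, contaminated in [("clean normal", False), ("2% contaminated", True)]:
    x = sample(contaminated)
    var_mean = x.mean(axis=1).var()
    var_trim = stats.trim_mean(x, 0.1, axis=1).var()   # mean after trimming 10% from each tail
    print(f"{label}: Var(mean) = {var_mean:.4f}, Var(trimmed mean) = {var_trim:.4f}")

# The mean is slightly better on clean data but much worse under contamination.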

Efficiency in statistics

Efficiency in statistics is important because it allows the performance of various estimators to be compared. Although an unbiased estimator is usually favored over a biased one, a more efficient biased estimator can sometimes be more valuable than a less efficient unbiased estimator. For example, this can occur when the values of the biased estimator gather around a number closer to the true value. Thus, estimator performance can be assessed directly by comparing mean squared errors or variances.
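
A minimal sketch of such a case (an assumed shrinkage estimator of a normal mean; the shrinkage factor and true value are chosen for illustration):

import numpy as np

rng = np.random.default_rng(6)
theta, sigma, n, reps = 0.5, 2.0, 10, 300_000

xbar = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
shrunk = 0.5 * xbar                     # biased toward zero, but with a quarter of the variance

for name, est in [("unbiased sample mean", xbar), ("biased shrinkage estimator", shrunk)]:
    print(name, np.mean((est - theta) ** 2))

# Near theta = 0.5 the biased estimator has the smaller MSE; far from zero it would not.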

Uses of inefficient estimators

While efficiency is a desirable quality of an estimator, it must be weighed against other considerations, and an estimator that is efficient for certain distributions may well be inefficient for other distributions. Most significantly, estimators that are efficient for clean data from a simple distribution, such as the normal distribution (which is symmetric, unimodal, and has thin tails) may not be robust to contamination by outliers, and may be inefficient for more complicated distributions. In robust statistics, more importance is placed on robustness and applicability to a wide variety of distributions, rather than efficiency on a single distribution. M-estimators are a general class of estimators motivated by these concerns. They can be designed to yield both robustness and high relative efficiency, though possibly lower efficiency than traditional estimators for some cases. They can be very computationally complicated, however.

A more traditional alternative is the class of L-estimators, which are very simple statistics that are easy to compute and interpret, in many cases robust, and often sufficiently efficient for initial estimates. See applications of L-estimators for further discussion. Inefficient statistics in this sense are discussed in detail in The Atomic Nucleus by R. D. Evans, written before the advent of computers, when efficiently estimating even the arithmetic mean of a sorted series of measurements was laborious.[14]

Hypothesis tests

For comparing significance tests, a meaningful measure of efficiency can be defined based on the sample size required for the test to achieve a given power.[15]
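
For a one-sided one-sample z-test with known σ, the required sample size has the closed form n = ((z_{1−α} + z_{1−β}) σ / δ)², where δ is the effect to be detected; a sketch with illustrative numbers follows (the relative efficiency of a second test would then be the ratio of the two required sample sizes):

from scipy.stats import norm

def n_required(delta, sigma=1.0, alpha=0.05, power=0.8):
    """Sample size for a one-sided z-test of mu = 0 against mu = delta to reach the given power."""
    z_alpha, z_beta = norm.ppf(1 - alpha), norm.ppf(power)
    return ((z_alpha + z_beta) * sigma / delta) ** 2

print(n_required(delta=0.5))   # about 24.7, so 25 observations suffice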

Pitman efficiency[16] and Bahadur efficiency (or Hodges–Lehmann efficiency)[17][18][19] relate to the comparison of the performance of statistical hypothesis testing procedures.

Experimental design

For experimental designs, efficiency relates to the ability of a design to achieve the objective of the study with minimal expenditure of resources such as time and money. In simple cases, the relative efficiency of designs can be expressed as the ratio of the sample sizes required to achieve a given objective.[20]

from Grokipedia
In statistics, efficiency refers to the optimality of an estimator in terms of achieving the minimum possible variance for estimating a parameter, particularly among unbiased estimators, as quantified by the Cramér–Rao lower bound. This bound establishes that the variance of any unbiased estimator θ̂ of a parameter θ cannot be smaller than the reciprocal of the Fisher information, formally stated as Var(θ̂) ≥ 1 / (n I(θ)) for a sample of size n, where I(θ) = E[ (∂/∂θ log f(X|θ))² ] and f(X|θ) is the likelihood of a single observation. An estimator is deemed efficient if its variance exactly equals this bound, thereby attaining the smallest possible mean squared error among unbiased estimators and representing the best possible precision under the given model.

The Cramér–Rao bound, a cornerstone of estimation theory, was independently derived by Calyampudi Radhakrishna Rao in 1945 and Harald Cramér in 1946, building on earlier work by Maurice Fréchet in 1943. Rao's formulation appeared in his paper "Information and the Accuracy Attainable in the Estimation of Statistical Parameters" in the Bulletin of the Calcutta Mathematical Society, while Cramér's contribution appeared in his 1946 book Mathematical Methods of Statistics (Princeton University Press). The bound relies on regularity conditions, such as the differentiability of the log-likelihood and the ability to interchange differentiation and integration, ensuring its applicability to parametric models whose support does not depend on the parameter. When these conditions hold, efficient estimators, such as the maximum likelihood estimator under certain asymptotics, attain the bound and are thus minimum variance unbiased estimators (MVUE).

Beyond absolute efficiency, the concept extends to relative efficiency, which compares the performance of two estimators by the ratio of their variances or, asymptotically, by the ratio of sample sizes required to achieve equivalent precision. For instance, the asymptotic relative efficiency (ARE) of estimator T1 relative to T2 is lim n→∞ Var(T2) / Var(T1), often used to evaluate methods like the sample mean versus the sample median in normal distributions, where the former is fully efficient but the latter has an ARE of approximately 0.637. In biased or asymptotic contexts, efficiency may be assessed via the attainment of the bound in the limit, highlighting trade-offs in robustness, computational feasibility, and model assumptions. These notions underpin broader statistical inference, influencing the design of tests, confidence intervals, and large-sample approximations.

Fundamentals

Definition

In statistics, efficiency serves as a measure of performance for estimators, hypothesis tests, and experimental designs, focusing on achieving optimal precision or power relative to theoretical limits. For point estimation, an efficient estimator is one that attains the minimal possible variance among all unbiased estimators of a parameter, thereby providing the most precise estimate from the available data. This concept presupposes unbiasedness, where the expected value of the estimator equals the true parameter, or at least consistency, where the estimator converges in probability to the true value as the sample size increases. Efficiency is often evaluated relative to benchmarks such as the maximum likelihood estimator (MLE), which achieves the variance bound under suitable regularity conditions.

The foundational limit on estimator variance is given by the Cramér–Rao lower bound (CRLB), which quantifies the theoretical minimum variance for any unbiased estimator. For a parameter θ estimated from a sample of n independent and identically distributed observations with density f(x; θ), the CRLB states:

Var(θ̂) ≥ 1 / (n I(θ))

where I(θ) = E[ (∂/∂θ log f(X; θ))² ] is the Fisher information, measuring the amount of information about θ carried by the data. An estimator is fully efficient if its variance equals this bound. This inequality, derived independently by Cramér and Rao, establishes efficiency as a property of variance attainment rather than mere low variance in isolation.

Efficiency extends beyond estimation to hypothesis testing and experimental design, and these contexts are distinguished by their optimization criteria. In testing, an efficient test maximizes the power (the probability of correctly rejecting a false null hypothesis) for a fixed significance level (the Type I error rate), ensuring the strongest detection capability under that constraint. In experimental design, efficiency pertains to maximizing the precision of inferences (e.g., reducing the variance of estimates or increasing test power) per unit of resources, such as sample size or cost, often through balanced allocation of treatments. These types share the goal of optimal performance but apply distinct benchmarks tailored to estimation, detection, or planning.
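
A small numerical sketch of the Fisher-information formula above (an assumed Bernoulli model; everything here is illustrative): estimate I(p) = E[(∂/∂p log f(X|p))²] by averaging the squared score over simulated data and compare it with the closed form 1/(p(1 − p)):

import numpy as np

rng = np.random.default_rng(7)
p, draws = 0.3, 1_000_000

x = rng.binomial(1, p, size=draws)       # Bernoulli(p) observations
score = x / p - (1 - x) / (1 - p)        # d/dp log f(x|p) = x/p - (1 - x)/(1 - p)
print(np.mean(score ** 2), 1 / (p * (1 - p)))   # both are close to 4.76

# The CRLB for n observations is then p (1 - p) / n, attained by the sample proportion.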

Historical Development

The concept of efficiency in statistical estimation traces its roots to early 19th-century work on linear models, where Carl Friedrich Gauss derived what is now known as the Gauss–Markov theorem in his 1823 treatise Theoria combinationis observationum erroribus minimis obnoxiae (Theory of the Combination of Observations Least Subject to Errors). This theorem established the ordinary least squares estimator as the best linear unbiased estimator for parameters in linear regression models under assumptions of homoscedasticity and no correlation in errors, laying foundational principles for efficiency in parametric linear settings. The theorem's implications for estimator variance minimization were later rediscovered and popularized in the 1920s and integrated into broader inference frameworks.

In the 1920s, Ronald Fisher advanced the theory of efficiency through his development of maximum likelihood estimation, emphasizing estimators that achieve minimal variance relative to the Fisher information. In his seminal 1922 paper "On the Mathematical Foundations of Theoretical Statistics," Fisher introduced concepts of estimator efficiency tied to likelihood principles, arguing that maximum likelihood estimators are asymptotically efficient under regularity conditions, thus shifting focus from ad hoc methods to information-theoretic bounds. This work formalized efficiency as a criterion for estimator optimality, influencing subsequent parametric theory. Building on this, Edwin J. G. Pitman in 1937 developed the notion of relative efficiency to compare estimators across populations, defining it asymptotically as the ratio of sample sizes needed for two estimators to achieve equivalent performance, particularly in nonparametric contexts like tests for means.

Key milestones in the mid-20th century included the independent derivations of the Cramér–Rao lower bound (CRLB), which provides a fundamental limit on the variance of unbiased estimators and defines efficiency as attaining this bound. Calyampudi R. Rao derived the bound in 1945 in his paper "Information and the Accuracy Attainable in the Estimation of Statistical Parameters," linking it directly to the Fisher information, including for multiparameter cases. Harald Cramér independently presented a similar result in 1946 in Mathematical Methods of Statistics, extending it to general unbiased estimation and solidifying the CRLB as a cornerstone for assessing efficiency. Lucien Le Cam further refined asymptotic efficiency in the 1950s, particularly through his 1953 work "On Some Asymptotic Properties of Maximum Likelihood Estimates and Related Bayes' Estimates," where he explored conditions under which estimators achieve optimal asymptotic performance beyond strict parametric assumptions.

Modern extensions of efficiency concepts emerged in the 1960s, notably in robust and nonparametric statistics. Peter J. Huber, in his 1964 paper "Robust Estimation of a Location Parameter," introduced efficiency considerations for estimators resilient to distributional outliers, balancing high efficiency under nominal models with robustness, as measured by asymptotic variance under contaminated distributions. In nonparametric settings, efficiency evolved through comparisons of rank-based tests to their parametric counterparts, with foundational work by Jacob Wolfowitz and later asymptotic analyses showing that nonparametric procedures can achieve relative efficiencies close to 1 under normality, as detailed in historical reviews of the field's development.

Estimator Efficiency

Efficient Estimators

In statistics, an efficient estimator is defined as an unbiased estimator that attains the Cramér–Rao lower bound (CRLB), which provides the theoretical minimum variance for any unbiased estimator of a parameter. This attainment signifies that the estimator uses all of the information about the parameter available in the sample, achieving the highest possible precision under the given model. The CRLB, derived independently by Harald Cramér and Calyampudi Radhakrishna Rao, states that for an unbiased estimator θ̂ of a scalar parameter θ, the variance satisfies Var(θ̂) ≥ 1 / (n I(θ)), where n is the sample size and I(θ) is the Fisher information.

A key property of unbiased efficient estimators is that they are minimum variance unbiased estimators (MVUE), meaning no other unbiased estimator can have lower variance for all parameter values. For instance, in the case of independent and identically distributed observations from a N(μ, σ²) distribution with known σ², the sample mean X̄ is an efficient estimator of the mean μ, as its variance σ²/n exactly equals the CRLB. Similarly, under certain conditions, the maximum likelihood estimator (MLE) achieves efficiency by reaching the CRLB.

Efficiency requires specific regularity conditions on the model to ensure the CRLB is applicable, including the differentiability of the log-likelihood with respect to the parameter and the existence of the necessary moments for interchanging differentiation and integration. These conditions guarantee that the Fisher information is well-defined and positive, allowing the bound to hold.

The efficiency of an estimator θ̂ can be quantified by the ratio

e(θ̂) = CRLB / Var(θ̂)

where e(θ̂) = 1 indicates full efficiency. This ratio measures how closely the estimator approaches the theoretical limit, with values less than 1 signaling room for improvement in variance.

Relative Efficiency

Relative efficiency provides a means to compare the performance of two unbiased estimators θ̂1 and θ̂2 of a parameter θ, typically through the ratio of their variances:

eff(θ̂1, θ̂2) = Var(θ̂2) / Var(θ̂1)

If this value exceeds 1, θ̂1 is more efficient, as it achieves the same precision with less variability for a given sample size. This metric is particularly useful in finite samples where the Cramér–Rao lower bound may not be attained by either estimator.

For asymptotic comparisons, especially in translation families where the parameter θ represents a location shift, Pitman relative efficiency extends this idea by evaluating the limiting ratio of sample sizes required for equivalent precision between two procedures. Under regularity conditions, this efficiency is determined by comparing the integrals ∫ [f′(x)/f(x)]² f(x) dx for the respective densities f1 and f2, which correspond to the Fisher informations for location parameters; the relative efficiency is the ratio of these informations (or their square roots, depending on whether estimators or tests are being compared). This measure, introduced by Pitman, facilitates robust comparisons without relying on exact variance calculations in large samples.

A classic illustration is the comparison between the sample mean and the sample median as estimators of the mean of a normal distribution. The sample mean attains full efficiency (100% relative to the Cramér–Rao bound), while the asymptotic relative efficiency of the sample median with respect to the mean is 2/π ≈ 0.637, or about 64%, meaning the median requires roughly 57% more observations to match the mean's precision. In practice, this guides the selection of estimators in scenarios where robustness to outliers is traded against efficiency under normality, such as in preliminary data analysis.

Relative efficiency metrics like these are applied when no estimator achieves the theoretical bound, enabling practitioners to choose the most suitable option based on distributional assumptions and computational constraints; for instance, one may opt for the mean in Gaussian settings or the median in contaminated data to balance efficiency with practical utility.

Asymptotic Efficiency

Asymptotic efficiency evaluates the performance of an estimator as the sample size n tends to infinity, focusing on whether it achieves the lowest possible asymptotic variance among consistent estimators. An estimator θ̂n is asymptotically efficient if it is consistent and the normalized error √n (θ̂n − θ) converges in distribution to a normal distribution with mean 0 and variance equal to the inverse Fisher information 1/I(θ), the asymptotic analogue of the Cramér–Rao bound.