Index of dispersion
In probability theory and statistics, the index of dispersion,[1] dispersion index, coefficient of dispersion, relative variance, or variance-to-mean ratio (VMR), like the coefficient of variation, is a normalized measure of the dispersion of a probability distribution: it is a measure used to quantify whether a set of observed occurrences is clustered or dispersed compared to a standard statistical model.
It is defined as the ratio of the variance σ² to the mean μ:

D = σ² / μ.
It is also known as the Fano factor, though this term is sometimes reserved for windowed data (where the mean and variance are computed over a subpopulation), with "index of dispersion" used in the special case of an infinite window. Windowing is common: the VMR is often computed over various intervals in time or small regions in space, which may be called "windows", and the resulting statistic is called the Fano factor.
It is only defined when the mean is non-zero, and is generally only used for positive statistics, such as count data or time between events, or where the underlying distribution is assumed to be the exponential distribution or Poisson distribution.
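The ratio can be sketched in a few lines of Python using only the standard library; the function name and sample values are illustrative, and the unbiased (n − 1) sample variance is one common convention:

```python
from statistics import mean, variance

def index_of_dispersion(counts):
    """Variance-to-mean ratio (VMR) of a sample of counts.

    Uses the unbiased sample variance (n - 1 denominator); the ratio
    is undefined when the sample mean is zero.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("VMR is undefined for a zero mean")
    return variance(counts) / m

# A constant sample has zero variance, hence VMR = 0.
print(index_of_dispersion([3, 3, 3, 3]))  # 0.0
```

Because variance and mean share the same units for count data, the ratio is dimensionless.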
Terminology
In this context, the observed dataset may consist of the times of occurrence of predefined events, such as earthquakes in a given region over a given magnitude, or of the locations in geographical space of plants of a given species. Details of such occurrences are first converted into counts of the numbers of events or occurrences in each of a set of equal-sized time- or space-regions.
The above defines a dispersion index for counts.[2] A different definition applies for a dispersion index for intervals,[3] where the quantities treated are the lengths of the time-intervals between the events. Common usage is that "index of dispersion" means the dispersion index for counts.
Interpretation
Some distributions, most notably the Poisson distribution, have equal variance and mean, giving them a VMR = 1. The geometric distribution and the negative binomial distribution have VMR > 1, while the binomial distribution has VMR < 1, and the constant random variable has VMR = 0. This yields the following table:
| Distribution | VMR | Dispersion |
|---|---|---|
| constant random variable | VMR = 0 | not dispersed |
| binomial distribution | 0 < VMR < 1 | under-dispersed |
| Poisson distribution | VMR = 1 | equidispersed |
| negative binomial distribution | VMR > 1 | over-dispersed |
This can be considered analogous to the classification of conic sections by eccentricity; see Cumulants of particular probability distributions for details.
The relevance of the index of dispersion is that it has a value of 1 when the probability distribution of the number of occurrences in an interval is a Poisson distribution. Thus the measure can be used to assess whether observed data can be modeled using a Poisson process. When the coefficient of dispersion is less than 1, a dataset is said to be "under-dispersed": this condition can relate to patterns of occurrence that are more regular than the randomness associated with a Poisson process. For instance, regular, periodic events will be under-dispersed. If the index of dispersion is larger than 1, a dataset is said to be over-dispersed.
A sample-based estimate of the dispersion index can be used to construct a formal statistical hypothesis test for the adequacy of the model that a series of counts follow a Poisson distribution.[4][5] In terms of the interval-counts, over-dispersion corresponds to there being more intervals with low counts and more intervals with high counts, compared to a Poisson distribution: in contrast, under-dispersion is characterised by there being more intervals having counts close to the mean count, compared to a Poisson distribution.
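The contrast between under- and over-dispersion can be made concrete with two made-up count series that share the same mean (the numbers below are illustrative only):

```python
from statistics import mean, variance

def vmr(counts):
    """Sample variance-to-mean ratio."""
    return variance(counts) / mean(counts)

regular = [5, 5, 4, 5, 5, 4, 5, 4, 5, 5]       # evenly spread events
clustered = [0, 0, 12, 0, 1, 0, 15, 0, 0, 19]  # bursts and empty bins

# Both series have mean 4.7, but the regular one sits far below
# VMR = 1 while the bursty one sits far above it.
print(vmr(regular))    # < 1: under-dispersed
print(vmr(clustered))  # > 1: over-dispersed
```

Holding the mean fixed isolates the effect described above: the regular series has most counts near the mean, while the clustered series has many empty intervals plus a few very high counts.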
The VMR is also a good measure of the degree of randomness of a given phenomenon. For example, this technique is commonly used in currency management.
Example
For randomly diffusing particles (Brownian motion), the distribution of the number of particles inside a given volume is Poissonian, i.e. VMR = 1. Therefore, to assess whether a given spatial pattern (assuming there is a way to measure it) is due purely to diffusion or whether some particle–particle interaction is involved: divide the space into patches, quadrats or sample units (SU), count the number of individuals in each patch or SU, and compute the VMR. VMRs significantly higher than 1 denote a clustered distribution, where random motion alone is not enough to overcome the attractive inter-particle potential.
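The non-interacting baseline can be checked by simulation; a minimal sketch with assumed grid and particle numbers, placing independent "particles" uniformly into quadrats:

```python
import random
from statistics import mean, variance

random.seed(0)

# Scatter 10_000 independent, non-interacting "particles" uniformly
# over a 20 x 20 grid of quadrats and count occupants per quadrat.
# With no inter-particle interaction the counts are close to Poisson.
counts = [0] * 400
for _ in range(10_000):
    counts[random.randrange(400)] += 1

vmr = variance(counts) / mean(counts)
print(round(vmr, 2))  # close to 1: consistent with pure diffusion
```

Injecting attraction (e.g., placing particles near already-occupied quadrats) would push the computed VMR well above 1, reproducing the clustered case described above.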
History
The first to discuss the use of a test to detect deviations from a Poisson or binomial distribution appears to have been Lexis in 1877. One of the tests he developed was the Lexis ratio.
This index was first used in botany by Clapham in 1936.
Hoel studied the first four moments of its distribution.[6] He found that the χ² approximation is reasonable if μ > 5.
Skewed distributions
For highly skewed distributions, it may be more appropriate to use a linear loss function, as opposed to a quadratic one. The analogous coefficient of dispersion in this case is the ratio of the average absolute deviation from the median to the median of the data,[7] or, in symbols:

CD = (1/n) Σ |x_i − m| / m,

where n is the sample size, m is the sample median and the sum is taken over the whole sample. Iowa, New York and South Dakota use this linear coefficient of dispersion to estimate taxes due.[8][9][10]
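A short sketch of this median-based coefficient, with illustrative numbers; the function name is mine:

```python
from statistics import median

def linear_coefficient_of_dispersion(xs):
    """Average absolute deviation from the median, divided by the median."""
    m = median(xs)
    return sum(abs(x - m) for x in xs) / (len(xs) * m)

# The outlier 10 would inflate a variance-based measure far more than
# it moves this median-based one.
print(linear_coefficient_of_dispersion([1, 2, 3, 4, 10]))  # ≈ 0.733
```

The median and absolute deviations correspond to the linear loss function mentioned above, just as the mean and squared deviations correspond to the quadratic one.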
For a two-sample test in which the sample sizes are large, both samples have the same median, and they differ only in the dispersion around it, Bonett and Seier give a lower confidence bound for the linear coefficient of dispersion in terms of tj, the mean absolute deviation of the jth sample, and zα, the two-sided standard normal critical value for confidence level α (e.g., for α = 0.05, zα = 1.96).[7]
See also
Similar ratios
- Coefficient of variation
- Standardized moment
- Fano factor (windowed VMR)
- Signal-to-noise ratio (in signal processing)
Notes
- ^ Cox & Lewis (1966)
- ^ Cox & Lewis (1966), p. 72
- ^ Cox & Lewis (1966), p. 71
- ^ Cox & Lewis (1966), p. 158
- ^ Upton & Cook (2006), under index of dispersion
- ^ Hoel, P. G. (1943). "On Indices of Dispersion". Annals of Mathematical Statistics. 14 (2): 155–162. doi:10.1214/aoms/1177731457. JSTOR 2235818.
- ^ a b Bonett, DG; Seier, E (2006). "Confidence interval for a coefficient of dispersion in non-normal distributions". Biometrical Journal. 48 (1): 144–148. doi:10.1002/bimj.200410148. PMID 16544819. S2CID 33665632.
- ^ "Statistical Calculation Definitions for Mass Appraisal" (PDF). Iowa.gov. Archived from the original (PDF) on 11 November 2010.
Median Ratio: The ratio located midway between the highest ratio and the lowest ratio when individual ratios for a class of realty are ranked in ascending or descending order. The median ratio is most frequently used to determine the level of assessment for a given class of real estate.
- ^ "Assessment equity in New York: Results from the 2010 market value survey". Archived from the original on 6 November 2012.
- ^ "Summary of the Assessment Process" (PDF). state.sd.us. South Dakota Department of Revenue - Property/Special Taxes Division. Archived from the original (PDF) on 10 May 2009.
References
- Cox, D. R.; Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. London: Methuen.
- Upton, G.; Cook, I. (2006). Oxford Dictionary of Statistics (2nd ed.). Oxford University Press. ISBN 978-0-19-954145-4.
Fundamentals
Definition
Count data, also known as event counts, consist of non-negative integer values that represent the number of discrete events occurring within a fixed interval of time, space, or another unit of observation.[5] For such data, the sample mean provides the average count across observations, while the sample variance measures the average squared deviation from this mean, capturing the spread or variability in the counts.[1] The index of dispersion D, also referred to as the variance-to-mean ratio (VMR), is a dimensionless statistic defined as the ratio of the sample variance s² to the sample mean x̄:

D = s² / x̄.[1][6]

This VMR quantifies the relative dispersion in non-negative integer data by comparing the observed variability to the central tendency, with values around 1 indicating Poisson-like randomness where variance equals the mean.[1]

Terminology
The index of dispersion is commonly referred to by several synonymous terms in statistical literature, including the dispersion index, the variance-to-mean ratio (VMR), and sometimes the coefficient of dispersion. These names emphasize its role as a simple ratio-based measure for assessing variability in discrete data distributions. The term "index of dispersion" itself was introduced by Ronald A. Fisher in his 1925 work on statistical methods for research workers, where it was applied to evaluate deviations from expected patterns in count-based observations.[4] It is frequently abbreviated as D in both theoretical and applied contexts. Note that "coefficient of dispersion" can also refer to other measures of relative dispersion, such as the mean absolute deviation from the median divided by the median, which applies more broadly to various data types.[7]

A key distinction in usage arises between count data—such as the number of events in fixed spatial areas or time intervals—and interval data, like the durations between successive events; the index of dispersion typically pertains to the former, while the latter may involve related metrics such as the index of dispersion for intervals (IDI).[8] This focus on counts aligns with its origins in Poisson process analysis, where it helps diagnose distributional assumptions.[4] The variance-to-mean ratio designation, in particular, highlights its unnormalized nature, distinguishing it from percentage-based coefficients that scale relative to the mean or range.[9]

Interpretation
Poisson Process Context
In a homogeneous Poisson process, events occur continuously and independently at a constant average rate λ, such that the number of events within any fixed interval of length t follows a Poisson distribution with both mean and variance equal to λt. This equidispersion property—where variance equals the mean—implies that the index of dispersion D, defined as the ratio of variance to mean, equals 1 under the Poisson assumption.[10] Such processes model phenomena like random arrivals in queueing systems or point occurrences in space, assuming no underlying patterns beyond the constant rate.[10]

The index of dispersion serves as a diagnostic tool to assess whether observed counts in temporal or spatial bins align with the expectations of a homogeneous Poisson process. When D = 1, the data are consistent with random, independent events at a steady rate; deviations from this value signal potential non-homogeneity, such as temporal bunching or spatial aggregation that violates the independence assumption.[11] This evaluation is particularly relevant in fields like ecology and reliability engineering, where testing for Poisson conformity helps distinguish true randomness from structured variability.[12]

Application of the index of dispersion in this context requires familiarity with the core properties of the Poisson distribution, including its equidispersion (mean = variance) and the fact that inter-event times are exponentially distributed under homogeneity.[1] However, the test's reliability depends on sufficient data volume; it assumes large sample sizes for asymptotic validity, as small samples can lead to unstable estimates.[11]

Over- and Under-dispersion
Overdispersion arises when the index of dispersion exceeds 1 (D > 1), signifying greater variability in count data than anticipated under a Poisson model, often reflecting clustering or contagion where occurrences are positively dependent or aggregated. This pattern is prevalent in biological contexts, such as the spatial aggregation of organisms due to habitat preferences or social behaviors.[13] In contrast, underdispersion occurs when D < 1, indicating reduced variability and a tendency toward regularity or inhibition, where observations are more evenly spaced than random, as seen in sampling schemes with fixed allocations or territorial inhibition among individuals.[13]

Extreme boundary cases delineate the full spectrum: D = 0 corresponds to complete uniformity, with zero variance across all counts implying identical outcomes in every unit; as D approaches infinity, variability becomes unbounded, characteristic of highly concentrated events in few units amid widespread zeros. Such deviations from the Poisson expectation of D = 1 are typically driven by heterogeneity in underlying event rates, which introduces extra variance through mechanisms like finite mixtures of Poisson processes, or by positive dependence between observations, such as spatial or temporal correlations that amplify clustering.[13][14] Diagnostic thresholds provide informal benchmarks for interpretation; for instance, in large samples, D > 1.5 often signals notable overdispersion warranting alternative modeling, though these are supplementary to rigorous hypothesis testing.[15]

Computation
Formula and Estimation
The index of dispersion, also known as the variance-to-mean ratio, for a set of count data drawn from a Poisson process is given by the population formula

D = σ² / μ,

where μ is the population mean and σ² is the population variance.[1] Under the Poisson assumption, D = 1 since σ² = μ.[1]

For estimation from sample data, the standard point estimator uses the unbiased sample variance:

D̂ = s² / x̄, where s² = (1/(n − 1)) Σ (x_i − x̄)²,

x̄ is the sample mean and n is the sample size.[3] This adjustment with the n − 1 denominator ensures s² is unbiased for σ², though the ratio itself remains only approximately unbiased, improving for large n due to the consistency of both components.[16]

An adaptation exists for waiting times or interarrival intervals in a renewal process, such as those from a Poisson process, where the index is the variance of the interarrival times divided by the square of their mean, Var(T) / (E[T])². For exponential interarrivals (as in a Poisson process), this yields an index of 1.[17] More generally, the index of dispersion for intervals over blocks of consecutive intervals is the variance of the block sum divided by the square of its mean.[17]

Computational considerations include ensuring the sample mean x̄ > 0 to avoid division by zero, with zero counts permissible as they are inherent to count data like Poisson realizations.[1] However, the estimator exhibits bias when x̄ is small (e.g., near zero), as the discrete nature of counts amplifies variability relative to the continuous approximation. For large samples under the null Poisson hypothesis, the test statistic (n − 1)D̂ approximately follows a chi-squared distribution with n − 1 degrees of freedom, providing a basis for inference on dispersion.[3][18]

Hypothesis Testing
The chi-squared test for the index of dispersion provides a standard approach to assess whether observed count data conform to the Poisson distribution's equidispersion assumption, where the variance equals the mean. Under the null hypothesis of a Poisson process, the test statistic (n − 1)D̂, with D̂ denoting the estimated index of dispersion and n the number of observations, is asymptotically distributed as a chi-squared random variable with n − 1 degrees of freedom.[19] Rejection of the null occurs if the statistic exceeds the critical value from the chi-squared distribution at a chosen significance level (e.g., 5%), indicating overdispersion, or if it falls below the lower critical value, suggesting underdispersion; however, the test is typically applied to detect overdispersion in practice.[19]

For small samples where the chi-squared approximation may lack accuracy, exact tests based on the conditional distribution of the counts are recommended, particularly when the expected mean is low. These include binomial exact tests or conditional approaches that enumerate the probability under the null, avoiding reliance on asymptotic distributions.[20] Significance tables for critical values in such overdispersion tests have been derived to facilitate computation without approximation errors.

The power of the chi-squared dispersion test increases with sample size n, as larger n enhances sensitivity to deviations from the Poisson null, but it requires sufficiently large expected counts per cell (typically at least 5) for the approximation to hold reliably.
In overdispersed scenarios, the test exhibits controlled Type I error rates near the nominal level but may suffer elevated Type II errors if the overdispersion is mild or clustered, necessitating careful consideration of effect size in study design.[21]

As an alternative to the dispersion-based chi-squared test, likelihood ratio tests compare the Poisson model directly to overdispersion alternatives like the negative binomial distribution, where the null hypothesis posits no overdispersion (dispersion parameter equal to zero). These tests leverage the nested structure of the models and are asymptotically chi-squared distributed with one degree of freedom under the null, offering greater power against specific overdispersion forms such as those induced by unobserved heterogeneity. Implementations of these tests are available in statistical software, such as the goodfit function in R's vcd package for chi-squared and exact dispersion tests, with likelihood ratio comparisons supported in packages like MASS for negative binomial models.
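The one-sided overdispersion test can be sketched in a few lines; the function name is mine, and the caller supplies the chi-squared critical value (here 16.92, the standard table value for the 95th percentile with 9 degrees of freedom):

```python
from statistics import mean, variance

def poisson_dispersion_test(counts, upper_crit):
    """One-sided overdispersion test: reject the Poisson null when
    (n - 1) * D-hat exceeds the chi-squared critical value supplied
    by the caller (from a table or a stats library)."""
    n = len(counts)
    stat = (n - 1) * variance(counts) / mean(counts)
    return stat, stat > upper_crit

# Counts from 10 hypothetical quadrats; with 9 degrees of freedom the
# 5% one-sided critical value is 16.92 (chi-squared table).
stat, reject = poisson_dispersion_test([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 16.92)
print(round(stat, 2), reject)  # 18.33 True
```

Note that (n − 1)s²/x̄ simplifies to Σ(x_i − x̄)²/x̄, the classical Pearson-style dispersion statistic.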
Examples and Applications
Illustrative Example
Consider a hypothetical scenario where the number of insects observed in 10 equal-sized quadrats is recorded as the dataset {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. This setup allows for a straightforward demonstration of the index of dispersion in count data analysis.[22]

To compute the index, first calculate the mean:

x̄ = (0 + 1 + ⋯ + 9) / 10 = 4.5.

Next, determine the sample variance (dividing by n − 1 = 9):

s² = Σ (x_i − 4.5)² / 9 = 82.5 / 9 ≈ 9.17.

The index of dispersion is then the ratio of sample variance to mean:

D = s² / x̄ ≈ 9.17 / 4.5 ≈ 2.04.

This calculation follows the standard definition for assessing dispersion in count data, as applied in ecological sampling.[1][22] The value D ≈ 2.04 indicates overdispersion relative to a Poisson distribution, suggesting some clustering of insects within the quadrats rather than complete randomness.

To test this formally, compute the chi-squared dispersion statistic:

(n − 1)D = 9 × 2.04 ≈ 18.33,

which follows an approximate chi-squared distribution with n − 1 = 9 degrees of freedom under the null hypothesis of a Poisson process. The p-value for this statistic is approximately 0.03, providing evidence against the null at the 5% significance level.[22]

For visualization, plot the observed counts as a bar chart or histogram, where the x-axis represents quadrat number and the y-axis shows insect counts. This reveals the spread from 0 to 9, visually highlighting the overdispersion through asymmetry and outliers at the extremes, consistent with clustering patterns in spatial data.[22]

Real-World Applications
In ecology, the index of dispersion is widely applied to assess spatial clustering of species distributions, particularly in quadrat sampling of plant populations, where it quantifies deviations from random spatial patterns by comparing the variance to the mean count of individuals per sampling unit. For instance, post-2010 studies have utilized variants like the Morisita index of dispersion to evaluate intraspecific aggregation in forest ecosystems, revealing clustered patterns driven by environmental heterogeneity that influence biodiversity assessments.[23] In a 2019 analysis of insect counts across time-based and quadrat samples, the index confirmed overdispersion indicative of non-random aggregation, aiding in the design of more effective ecological monitoring protocols.[24] These applications highlight its role in identifying habitat preferences and conservation priorities, such as in fragmented landscapes where clustering signals vulnerability to habitat loss.[25]

In genetics, the index of dispersion measures overdispersion in allele counts and mutation rates, providing insights into evolutionary processes and data variability in next-generation sequencing (NGS) analyses. A 2021 study on mutation burden in cellular lineages employed the index to quantify rate volatility, showing values exceeding 3.4 that reflect asymmetric segregation and increased heterogeneity in cancer-related mutations.[26] Similarly, a 2025 investigation into viral mutation rates during influenza A replication used the index to demonstrate overdispersion in viable genome counts, with ratios indicating non-Poisson variability that impacts estimates of lethal mutagenesis thresholds in NGS datasets.[27] These metrics are crucial for adjusting models in population genetics, where overdispersion signals factors like selection pressures or sequencing errors, enhancing the accuracy of variant calling in large-scale genomic studies.
In quality control for manufacturing, the index of dispersion evaluates defect counts in production batches, detecting overdispersion that suggests process instability or clustering of faults. A 2023 nonparametric control chart approach applied the index to monitor count data dispersion, identifying shifts in variance-to-mean ratios above 1 that prompt interventions in assembly lines for items like electronics.[28] For underdispersion, observed in tightly regulated processes such as pharmaceutical packaging where defect variability is minimized below Poisson expectations (index < 1), the metric guides adjustments to ensure consistent quality, as seen in analyses of bounded count data from automated inspection systems.[29] This dual application supports robust statistical process control, reducing waste by flagging deviations early in high-volume manufacturing.

Modern computational tools facilitate the index's application through accessible software for estimation and simulations. In R, the vegan package computes spatial variants like the Morisita index of dispersion for ecological count data, enabling power analysis via bootstrapped simulations to assess clustering significance.[30] For general count data, base R functions such as var() and mean() allow straightforward calculation, often integrated with packages like good for modeling under- or overdispersed processes in simulations. In Python, the index is computed using NumPy's var function (with ddof=1 for sample variance) divided by the sample mean, for applications in genetics and quality control, supporting simulations for hypothesis testing on NGS datasets or defect rates without built-in specialized functions.[31]

Recent developments in epidemiology have leveraged the index of dispersion to analyze disease clustering, particularly in spatial assessments of COVID-19 transmission from 2020 to 2025.
A 2022 school-based study calculated the index for cluster sizes, yielding values around 2.29 in Texas outbreaks, indicating overdispersion that informed targeted interventions like ventilation improvements.[32] In a 2023 modeling effort for incidence forecasting, the index captured overdispersion in daily cases (ω > 1), enhancing predictions of epidemic peaks across regions and highlighting superspreading events in urban clusters.[33] By 2025, extensions to spatiotemporal data used the index to quantify variability in vaccination-era outbreaks, aiding public health strategies for emerging variants.[34]

Historical Development
Origins
The conceptual foundations of the index of dispersion trace back to mid-19th-century advancements in probability theory, particularly the work of Siméon Denis Poisson. In his 1837 treatise Recherches sur la probabilité des jugements en matière criminelle et en matière civile, Poisson developed key ideas on the distribution of rare events and the law of small numbers, which posited that under certain conditions, the frequency of events approximates a Poisson distribution where variance equals the mean. These principles provided an early framework for analyzing count data and testing deviations from expected randomness, influencing later statistical measures of variability in discrete processes.[35]

Wilhelm Lexis, a German economist and statistician, built upon this probabilistic foundation in his seminal 1877 publication Zur Theorie der Massenerscheinungen in der menschlichen Gesellschaft. Lexis introduced early ratio-based measures to quantify deviations in biometric and population data, focusing on the stability of statistical series derived from counts such as birth and death rates. He proposed comparing observed variability to theoretical expectations under binomial or Poisson-like assumptions, categorizing series as exhibiting normal, super-, or subnormal stability based on whether the observed dispersion aligned with, fell below, or exceeded the predicted value.[36] This approach marked an initial formulation of dispersion indices as tools for assessing homogeneity in grouped data.[37]

Lexis' motivations stemmed from practical needs in actuarial science and population statistics, where testing randomness in count data was essential for reliable predictions in insurance and demographic analysis.
Using Prussian monthly data on sex ratios at birth across districts, he demonstrated how variability ratios could reveal underlying non-random patterns, such as age-specific mortality fluctuations, thereby challenging simplistic probabilistic models.[37] His work emphasized the ratio of empirical variance to mean as a diagnostic for deviations, laying groundwork for later refinements while highlighting the limitations of constant probability assumptions in real-world counts.[36]

Building on these foundations, Ronald A. Fisher introduced the index of dispersion in 1925 in his book Statistical Methods for Research Workers. Fisher defined it as the variance-to-mean ratio for testing departures from the Poisson distribution in small samples of count data, such as bacterial colony counts in biological experiments. He applied it in the context of dilution methods and parallel sampling, developing a chi-squared test statistic to compare observed variability against Poisson expectations.[4]

Key Advancements
In 1936, A. R. Clapham advanced the application of the index of dispersion by employing it to study over-dispersion in grassland plant communities through quadrat sampling methods. He formalized the variance-to-mean ratio (VMR) as a practical statistic for detecting non-random spatial patterns, such as clumping, in ecological data, thereby extending its utility from demographic contexts to botanical analysis. This work highlighted the index's sensitivity to deviations from Poisson expectations in natural populations, influencing subsequent ecological surveys.

Building on these applications, Paul G. Hoel provided a rigorous theoretical refinement in 1943 through a moments-based analysis of the index's sampling distribution under the null Poisson hypothesis. He computed the first four moments using Fisher's k-statistics and established that, for mean counts μ exceeding 5, the index closely approximates a chi-squared distribution with n − 1 degrees of freedom, enabling reliable large-sample approximations for hypothesis testing. Hoel also derived exact distribution formulas for smaller samples, addressing limitations in earlier approximations and improving precision for low-count scenarios.

Post-2000 developments have emphasized robust estimation techniques to handle real-world data imperfections. Bonett and Seier (2006) introduced a method for constructing confidence intervals for a robust coefficient of dispersion—defined as the mean absolute deviation from the median divided by the median—applicable to non-normal distributions, which mitigates bias from outliers in dispersion assessment. This approach enhances the index's reliability in empirical settings where Poisson assumptions may be mildly violated.

The index has been increasingly integrated into generalized linear models (GLMs) for count data, where it informs the estimation of a dispersion parameter to account for over-dispersion beyond the Poisson variance.
In quasi-Poisson and negative binomial GLMs, the index guides model selection and parameter scaling, allowing variance to exceed the mean while maintaining computational tractability. Addressing historical gaps in manual computation, the transition to digital tools since the late 20th century has enabled automated estimation via software packages, reducing errors and scaling analyses to large datasets. Recent Bayesian frameworks, such as those modeling spatially varying dispersion parameters in negative binomial regressions, offer hierarchical inference for heterogeneous count data, with applications demonstrated through Markov chain Monte Carlo simulations as of 2022. These methods incorporate prior distributions on the dispersion parameter, improving posterior estimates in complex, spatially structured scenarios up to 2025.

Extensions
Skewed Distributions
The standard index of dispersion, which computes the ratio of variance to mean (VMR), can be misleading in skewed distributions because it relies on the mean as the central tendency measure, and skewness—particularly positive skewness in count data—inflates the variance due to outliers in the tail, exaggerating apparent overdispersion.[38] This assumption of symmetry aligns with the Poisson model underlying the index, but real-world data often deviate, leading to inaccurate inferences about clustering or regularity.[39]

To address this, the linear coefficient of dispersion serves as a robust alternative, employing the median instead of the mean to mitigate the impact of skewness. One common formulation is the coefficient of dispersion (COD), calculated as the average absolute deviation from the median ratio divided by the median ratio, multiplied by 100 to express it as a percentage:

COD = (100/n) Σ |R_i − R_m| / R_m,

where R_i are individual ratios, R_m is the median ratio, and n is the number of observations.[38] Another variant, the quartile coefficient of dispersion, uses the interquartile range relative to the sum of the outer quartiles:

QCD = (Q₃ − Q₁) / (Q₃ + Q₁),
where Q₃ and Q₁ are the third and first quartiles; this focuses on the central 50% of the data, further reducing outlier influence in skewed cases.[40] These median-based approaches provide a more stable estimate of relative spread when distributions are asymmetric.

In property tax assessments, the COD is a key tool for measuring appraisal uniformity across skewed ratio distributions, where high-value outliers can distort mean-based metrics. Iowa applies COD in its annual sales ratio studies to adjust assessments if values deviate by more than 5% from market levels, targeting COD thresholds under 15% for residential properties to ensure equity.[41] Similarly, New York uses COD in market value surveys to evaluate county-level performance, aiming for values below 20% for all property types and under 15% for larger jurisdictions, triggering reappraisals if exceeded.[42] South Dakota codifies COD in its statutes, requiring it to stay under 25% for real property to confirm compliance and uniformity in tax levies.[43]

For ecological count data featuring excess zeros—which induce positive skewness and overdispersion—the linear coefficient of dispersion offers a practical adaptation by centering on the median to better capture typical variability without tail dominance. Such data, common in species counts or event occurrences, benefit from this measure to assess spatial dispersion patterns reliably, avoiding the inflated VMR that zeros exacerbate.[44]

When selecting between median-based and mean-based dispersion indices, the former is recommended for skewed data where the mean exceeds the median by more than 20-30% of the standard deviation, indicating asymmetry that could bias the VMR; symmetric distributions (skewness near zero) favor the standard index for its alignment with Poisson assumptions.
Switching thresholds vary by field—e.g., COD below 15% signals adequate uniformity in tax contexts—but generally prioritize median metrics if kurtosis or outlier presence suggests non-normality.[38][39]
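Both skew-robust coefficients can be sketched with the standard library; the assessment ratios below are made-up, the function names are mine, and the quartile rule used is one of several conventions:

```python
from statistics import median

def cod_percent(ratios):
    """Coefficient of dispersion (COD): average absolute deviation from
    the median ratio, as a percentage of the median ratio."""
    m = median(ratios)
    return 100 * sum(abs(r - m) for r in ratios) / (len(ratios) * m)

def quartile_coefficient_of_dispersion(xs):
    """(Q3 - Q1) / (Q3 + Q1), using a simple median-of-halves quartile
    rule (quartile conventions differ between texts and libraries)."""
    s = sorted(xs)
    half = len(s) // 2
    q1, q3 = median(s[:half]), median(s[-half:])
    return (q3 - q1) / (q3 + q1)

# Hypothetical assessment ratios for eight properties; the 1.40 outlier
# barely moves either median-based statistic.
ratios = [0.82, 0.91, 0.95, 0.98, 1.02, 1.05, 1.13, 1.40]
print(round(cod_percent(ratios), 1))
print(round(quartile_coefficient_of_dispersion(ratios), 3))
```

In a tax-assessment setting, the printed COD would be compared against jurisdiction thresholds like the 15% and 25% limits mentioned above.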
