Studentized range distribution
from Wikipedia
Parameters: k > 1, the number of groups; ν > 0, the degrees of freedom
Support: q ∈ (0, +∞)

In probability and statistics, the studentized range distribution is the continuous probability distribution of the studentized range of an i.i.d. sample from a normally distributed population.

Suppose that we take a sample of size n from each of k populations with the same normal distribution N(μ, σ²), and suppose that $\bar{y}_{\min}$ is the smallest of these sample means, $\bar{y}_{\max}$ is the largest of these sample means, and s² is the pooled sample variance from these samples. Then the following statistic has a studentized range distribution:

$$q = \frac{\bar{y}_{\max} - \bar{y}_{\min}}{s / \sqrt{n}}$$
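As a minimal sketch of this statistic, assuming a balanced design (the same n in every group), the snippet below computes q from raw samples. The function name `studentized_range_statistic` and the data are made up for illustration.

```python
import math
import statistics

def studentized_range_statistic(groups):
    """Return q = (largest mean - smallest mean) / (s / sqrt(n)) for equal-sized groups."""
    sizes = {len(g) for g in groups}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes the same sample size in every group")
    n = sizes.pop()
    means = [statistics.mean(g) for g in groups]
    # With equal group sizes, the pooled variance is the plain average
    # of the per-group sample variances.
    pooled_var = statistics.mean(statistics.variance(g) for g in groups)
    return (max(means) - min(means)) / math.sqrt(pooled_var / n)

# Illustrative data only: k = 3 groups, n = 5 observations each.
data = [
    [10, 12, 11, 13, 14],
    [15, 17, 16, 18, 19],
    [11, 13, 12, 14, 15],
]
q = studentized_range_statistic(data)
print(round(q, 4))  # → 7.0711
```

Here the group means are 12, 17 and 13 and every group has sample variance 2.5, so q = 5 / sqrt(2.5 / 5).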

Definition


Probability density function


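Differentiating the cumulative distribution function with respect to q gives the probability density function:

$$f_R(q; k, \nu) = \frac{\sqrt{2\pi}\, k (k-1)\, \nu^{\nu/2}}{\Gamma(\nu/2)\, 2^{\nu/2 - 1}} \int_0^\infty s^{\nu}\, \varphi(\sqrt{\nu}\, s) \left[ \int_{-\infty}^{\infty} \varphi(u + q s)\, \varphi(u) \left[ \Phi(u + q s) - \Phi(u) \right]^{k-2} du \right] ds$$

Here φ and Φ denote the standard normal probability density function and cumulative distribution function. Note that in the outer part of the integral, the equation

$$\varphi(\sqrt{\nu}\, s)\, \sqrt{2\pi} = e^{-\nu s^2 / 2}$$

was used to replace an exponential factor.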

Cumulative distribution function


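The cumulative distribution function is given by[1]

$$F_R(q; k, \nu) = \frac{\sqrt{2\pi}\, k\, \nu^{\nu/2}}{\Gamma(\nu/2)\, 2^{\nu/2 - 1}} \int_0^\infty s^{\nu - 1}\, \varphi(\sqrt{\nu}\, s) \left[ \int_{-\infty}^{\infty} \varphi(u) \left[ \Phi(u + q s) - \Phi(u) \right]^{k-1} du \right] ds$$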
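The double integral can be evaluated numerically. The sketch below is one way to do it with the standard library only, not a reference implementation: it averages the range CDF of k standard normal variates over the density of s = sqrt(χ²_ν / ν). The helper names, the Simpson grid sizes and the integration cutoffs are all assumptions chosen for illustration.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def scaled_chi_pdf(s, nu):
    """Density of s = sqrt(chi2_nu / nu), evaluated through logs for stability."""
    if s <= 0.0:
        return 0.0  # harmless here: the range CDF is 0 at s = 0 anyway
    log_f = (0.5 * nu * math.log(nu) + (nu - 1) * math.log(s) - 0.5 * nu * s * s
             - (0.5 * nu - 1.0) * math.log(2.0) - math.lgamma(0.5 * nu))
    return math.exp(log_f)

def range_cdf(r, k, n=120):
    """CDF of the range of k standard normals: k * int phi(u) [Phi(u+r) - Phi(u)]^(k-1) du."""
    return simpson(lambda u: k * phi(u) * (Phi(u + r) - Phi(u)) ** (k - 1), -8.0, 8.0, n)

def studentized_range_cdf(q, k, nu, n=120):
    """Outer integral over s of (density of s) * (range CDF at q * s)."""
    s_max = 1.0 + 10.0 / math.sqrt(nu)  # heuristic cutoff for the upper limit
    return simpson(lambda s: scaled_chi_pdf(s, nu) * range_cdf(q * s, k), 0.0, s_max, n)

# For k = 2 and nu = 1, q / sqrt(2) is the absolute value of a Cauchy variate,
# so the CDF at q = sqrt(2) is (2 / pi) * atan(1) = 0.5.
print(studentized_range_cdf(math.sqrt(2.0), 2, 1))
# Standard tables list about 3.88 as the upper 5% point for k = 3, nu = 10.
print(studentized_range_cdf(3.877, 3, 10))
```

The fixed grids keep the example short. Production code would use adaptive quadrature and tighter error control.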

Special cases


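If k is 2 or 3,[2] the probability density function can be evaluated directly in the limiting case of infinite degrees of freedom (that is, for the range of standard normal variates, with no dependence on ν), where φ is the standard normal probability density function and Φ is the standard normal cumulative distribution function:

$$f_R(q; k = 2) = \sqrt{2}\, \varphi\!\left( q / \sqrt{2} \right)$$

$$f_R(q; k = 3) = 6 \sqrt{2}\, \varphi\!\left( q / \sqrt{2} \right) \left[ \Phi\!\left( q / \sqrt{6} \right) - \tfrac{1}{2} \right]$$

When the degrees of freedom approaches infinity the studentized range cumulative distribution can be calculated for any k using the standard normal distribution:

$$F_R(q; k) = k \int_{-\infty}^{\infty} \varphi(u) \left[ \Phi(u + q) - \Phi(u) \right]^{k-1} du$$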
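A short numerical check of the infinite-degrees-of-freedom case, written as a sketch: it evaluates k ∫ φ(u) [Φ(u + q) − Φ(u)]^(k−1) du with Simpson's rule and, for k = 2, compares the result with 2Φ(q/√2) − 1, which is the integral of the density √2 φ(q/√2) from 0 to q. The function names, grid size and cutoff are illustrative assumptions.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def range_cdf_infinite_df(q, k, n=400, lim=8.0):
    """Simpson approximation of k * int phi(u) [Phi(u+q) - Phi(u)]^(k-1) du on [-lim, lim]."""
    h = 2.0 * lim / n
    total = 0.0
    for i in range(n + 1):
        u = -lim + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        spread = std_normal_cdf(u + q) - std_normal_cdf(u)
        total += weight * std_normal_pdf(u) * spread ** (k - 1)
    return k * total * h / 3.0

# k = 2: compare against the closed form 2 * Phi(q / sqrt(2)) - 1.
closed_form_k2 = 2.0 * std_normal_cdf(2.0 / math.sqrt(2.0)) - 1.0
print(range_cdf_infinite_df(2.0, 2), closed_form_k2)

# k = 3: standard tables list about 3.31 as the upper 5% point for infinite df.
print(range_cdf_infinite_df(3.314, 3))
```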

Applications


Critical values of the studentized range distribution are used in Tukey's range test.[3]

The studentized range is used to calculate significance levels for results obtained by data mining, where one selectively seeks extreme differences in sample data, rather than only sampling randomly.

The studentized range distribution has applications to hypothesis testing and multiple comparisons procedures. For example, Tukey's range test and Duncan's new multiple range test (MRT), in which the sample x1, ..., xn is a sample of means and q is the basic test statistic, can be used as post-hoc analyses to test which pairs of group means differ significantly (pairwise comparisons) after the standard analysis of variance has rejected the null hypothesis that all groups are from the same population (i.e. all means are equal).[4]
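As a rough sketch of how such a post-hoc comparison uses a critical value, assuming a balanced design and a critical value supplied by the caller from a table or from software: every pair of means whose absolute difference exceeds q_crit · sqrt(MSE / n) is flagged. The function name `tukey_hsd_pairs`, the data, and the value 3.77 (the 5% point commonly tabulated for k = 3 and ν = 12) are assumptions for illustration and should be checked against a table before real use.

```python
import math
import statistics
from itertools import combinations

def tukey_hsd_pairs(groups, q_crit):
    """groups: dict mapping a label to an equal-length list of observations.
    q_crit: studentized range critical value for k groups and the error
    degrees of freedom, supplied by the caller."""
    n = len(next(iter(groups.values())))
    if any(len(values) != n for values in groups.values()):
        raise ValueError("this sketch assumes equal group sizes")
    means = {label: statistics.mean(values) for label, values in groups.items()}
    # Pooled (within-group) variance; equals the ANOVA mean squared error here.
    mse = statistics.mean(statistics.variance(values) for values in groups.values())
    hsd = q_crit * math.sqrt(mse / n)
    significant = [
        (a, b) for a, b in combinations(groups, 2)
        if abs(means[a] - means[b]) > hsd
    ]
    return hsd, significant

# Illustrative data: k = 3 groups of n = 5, so nu = k * (n - 1) = 12.
groups = {
    "A": [10, 12, 11, 13, 14],
    "B": [15, 17, 16, 18, 19],
    "C": [11, 13, 12, 14, 15],
}
hsd, significant = tukey_hsd_pairs(groups, q_crit=3.77)
print(round(hsd, 3), significant)  # → 2.666 [('A', 'B'), ('B', 'C')]
```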

Related distributions

When only the equality of two group means is in question (i.e. whether μ1 = μ2), the studentized range distribution is similar to Student's t distribution; it differs only in that it takes into account the number of means under consideration, and the critical value is adjusted accordingly. The more means under consideration, the larger the critical value. This makes sense because the more means there are, the greater the probability that at least some differences between pairs of means will be large due to chance alone.
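For k = 2 the link to the t distribution can be written out explicitly. The following worked equation is a sketch under the balanced-design assumption used earlier (two groups of size n with pooled standard deviation s); the quantile notation $q_{\alpha;2,\nu}$ and $t_{\alpha/2;\nu}$ is introduced here only for illustration.

```latex
t = \frac{\bar{y}_1 - \bar{y}_2}{s\sqrt{2/n}},
\qquad
q = \frac{\lvert \bar{y}_1 - \bar{y}_2 \rvert}{s/\sqrt{n}} = \sqrt{2}\,\lvert t \rvert
\quad\Longrightarrow\quad
q_{\alpha;2,\nu} = \sqrt{2}\; t_{\alpha/2;\nu}
```

For example, as ν → ∞ and α = 0.05 the two-sided t critical value is about 1.960, giving a studentized range critical value of about 2.772.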

Derivation


The studentized range distribution function arises from re-scaling the sample range R by the sample standard deviation s, since the studentized range is customarily tabulated in units of standard deviations, with the variable q = R/s. The derivation begins with a perfectly general form of the distribution function of the sample range, which applies to any sample data distribution.

In order to obtain the distribution in terms of the "studentized" range q, we will change variable from R to s and q. Assuming the sample data is normally distributed, the standard deviation s will be chi distributed. By further integrating over s we can remove s as a parameter and obtain the re-scaled distribution in terms of q alone.

General form


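For any probability density function f_X, the range probability density f_R is:[2]

$$f_R(r; k) = k (k-1) \int_{-\infty}^{\infty} f_X\!\left(t + \tfrac{r}{2}\right) f_X\!\left(t - \tfrac{r}{2}\right) \left[ \int_{t - r/2}^{t + r/2} f_X(x)\, dx \right]^{k-2} dt$$

What this means is that we are adding up the probabilities that, given k draws from a distribution, two of them differ by r, and the remaining k − 2 draws all fall between the two extreme values. If we change variables to u, where u = t − r/2 is the low end of the range, and define F_X as the cumulative distribution function of f_X, then the equation can be simplified:

$$f_R(r; k) = k (k-1) \int_{-\infty}^{\infty} f_X(u + r)\, f_X(u) \left[ F_X(u + r) - F_X(u) \right]^{k-2} du$$

We introduce a similar integral, and notice that differentiating under the integral sign gives

$$\frac{\partial}{\partial r} \left[ k \int_{-\infty}^{\infty} f_X(u) \left[ F_X(u + r) - F_X(u) \right]^{k-1} du \right] = k (k-1) \int_{-\infty}^{\infty} f_X(u + r)\, f_X(u) \left[ F_X(u + r) - F_X(u) \right]^{k-2} du$$

which recovers the integral above,[a] so that last relation confirms

$$F_R(r; k) = k \int_{-\infty}^{\infty} f_X(u) \left[ F_X(u + r) - F_X(u) \right]^{k-1} du$$

because for any continuous cdf

$$\frac{\partial F_R(r; k)}{\partial r} = f_R(r; k)$$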
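Because the general form holds for any continuous distribution, it can be sanity-checked on a case with a known answer. The sketch below uses the unit-rate exponential distribution, chosen only because substituting it into k ∫ f_X(u) [F_X(u + r) − F_X(u)]^(k−1) du gives the closed form (1 − e^(−r))^(k−1). The function names, the upper cutoff of 40 and the grid size are illustrative assumptions.

```python
import math

def range_cdf(pdf, cdf, r, k, lower, upper, n=2000):
    """Simpson approximation of k * int pdf(u) [cdf(u+r) - cdf(u)]^(k-1) du on [lower, upper]."""
    h = (upper - lower) / n
    total = 0.0
    for i in range(n + 1):
        u = lower + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * pdf(u) * (cdf(u + r) - cdf(u)) ** (k - 1)
    return k * total * h / 3.0

def exp_pdf(x):
    """Density of the unit-rate exponential distribution."""
    return math.exp(-x) if x >= 0.0 else 0.0

def exp_cdf(x):
    """CDF of the unit-rate exponential distribution."""
    return 1.0 - math.exp(-x) if x >= 0.0 else 0.0

k, r = 4, 1.0
numeric = range_cdf(exp_pdf, exp_cdf, r, k, lower=0.0, upper=40.0)
exact = (1.0 - math.exp(-r)) ** (k - 1)
print(numeric, exact)
```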

Special form for normal data


The range distribution is most often used for confidence intervals around sample averages, which are asymptotically normally distributed by the central limit theorem.

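In order to create the studentized range distribution for normal data, we first switch from the generic f_X and F_X to the distribution functions φ and Φ for the standard normal distribution, and change the variable r to s·q, where q is a fixed factor that re-scales r by scaling factor s:

$$f_R(q; k) = s\, k (k-1) \int_{-\infty}^{\infty} \varphi(u + q s)\, \varphi(u) \left[ \Phi(u + q s) - \Phi(u) \right]^{k-2} du$$

Choose the scaling factor s to be the sample standard deviation, so that q becomes the number of standard deviations wide that the range is. For normal data s is chi distributed[b] and the distribution function f_S of the chi distribution is given by:

$$f_S(s; \nu) = \frac{\nu^{\nu/2}\, s^{\nu - 1}\, e^{-\nu s^2 / 2}}{2^{\nu/2 - 1}\, \Gamma(\nu/2)} \quad \text{for } 0 < s < \infty, \text{ and } 0 \text{ otherwise}$$

Multiplying the distributions f_R and f_S and integrating to remove the dependence on the standard deviation s gives the studentized range distribution function for normal data:

$$f_R(q; k, \nu) = \frac{\nu^{\nu/2}}{\Gamma(\nu/2)\, 2^{\nu/2 - 1}} \int_0^\infty s^{\nu}\, e^{-\nu s^2 / 2}\; k (k-1) \int_{-\infty}^{\infty} \varphi(u + q s)\, \varphi(u) \left[ \Phi(u + q s) - \Phi(u) \right]^{k-2} du\, ds$$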

where

q is the width of the data range measured in standard deviations,
ν is the number of degrees of freedom for determining the sample standard deviation,[c] and
k is the number of separate averages that form the points within the range.
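The final density can also be evaluated numerically. The sketch below integrates (density of s) · s · k(k − 1) ∫ φ(u + qs) φ(u) [Φ(u + qs) − Φ(u)]^(k−2) du over s with Simpson's rule. For k = 2 and ν = 1, q/√2 is the absolute value of a Cauchy variate, which gives the exact density √2 / (π (1 + q²/2)) to compare against. Helper names, grid sizes and cutoffs are illustrative assumptions.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def scaled_chi_pdf(s, nu):
    """Density of s = sqrt(chi2_nu / nu), evaluated through logs for stability."""
    if s <= 0.0:
        return 0.0  # harmless here: the integrand carries an extra factor s
    log_f = (0.5 * nu * math.log(nu) + (nu - 1) * math.log(s) - 0.5 * nu * s * s
             - (0.5 * nu - 1.0) * math.log(2.0) - math.lgamma(0.5 * nu))
    return math.exp(log_f)

def studentized_range_pdf(q, k, nu, n=120):
    def inner(s):
        r = q * s
        return simpson(
            lambda u: phi(u + r) * phi(u) * (Phi(u + r) - Phi(u)) ** (k - 2),
            -8.0, 8.0, n)
    s_max = 1.0 + 10.0 / math.sqrt(nu)  # heuristic cutoff for the upper limit
    return simpson(
        lambda s: scaled_chi_pdf(s, nu) * s * k * (k - 1) * inner(s),
        0.0, s_max, n)

q = math.sqrt(2.0)
numeric = studentized_range_pdf(q, 2, 1)
exact = math.sqrt(2.0) / (math.pi * (1.0 + q * q / 2.0))
print(numeric, exact)
```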

The equation for the pdf shown in the sections above comes from using

$$e^{-\nu s^2 / 2} = \sqrt{2\pi}\, \varphi(\sqrt{\nu}\, s)$$

to replace the exponential expression in the outer integral.

from Grokipedia
The studentized range distribution is the continuous probability distribution of the studentized range $Q$, defined as the range (maximum minus minimum) of $k$ sample means from independent normal populations with equal variances and equal sample sizes, divided by the estimated standard error of a single mean, where the degrees of freedom for the error estimate is $\nu$. This arises in the context of comparing multiple means following an analysis of variance (ANOVA), providing critical values for determining significant differences while controlling the family-wise error rate.

The concept of the studentized range was first introduced by D. Newman in 1939, who derived its distribution for samples from a normal population and expressed it in terms of an independent estimate of the standard deviation, building on earlier suggestions by "Student" (William Sealy Gosset). In 1952, M. Keuls extended its application to multiple comparisons in ANOVA, proposing a step-down procedure that uses quantiles from this distribution to test ordered means. John W. Tukey further developed the framework in 1949 for comparing individual means post-ANOVA, and comprehensive tables of the distribution's percentage points were published by H. Leon Harter in 1960, facilitating practical computations.

The distribution is central to several multiple comparison procedures, including Tukey's Honestly Significant Difference (HSD) test, which employs a single-step approach using the studentized range to compare all pairwise means while maintaining the overall Type I error rate at a specified level $\alpha$. It is also integral to the Newman-Keuls test, a more powerful and less conservative step-down method for ordered comparisons. Quantiles from the studentized range distribution depend on the number of groups $k$, the error degrees of freedom $\nu$ (which in a balanced one-way design with $n$ observations per group equals $k(n - 1)$), and the significance level $\alpha$; they are tabulated or computed numerically for use in these tests under the assumption of normality and homogeneity of variances.

For unbalanced designs, modified procedures such as the Tukey-Kramer method adjust the standard error of each pairwise difference while still using critical values from this distribution. Modern statistical software implements these calculations, enabling robust post-hoc analyses in experimental designs.

Definition

Probability density function

The studentized range $Q$ is defined as the range of $k$ independent standard normal random variables divided by an independent estimate of the standard deviation, specifically $Q = \frac{R}{S}$, where $R$ is the range among the $k$ normals and $S = \sqrt{\chi^2_\nu / \nu}$.
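This definition translates directly into a simulation. The sketch below, which assumes an integer ν so that the chi-squared variate can be built as a sum of ν squared standard normals, draws Q = R / S repeatedly and estimates P(Q ≤ 3.88) for k = 3 and ν = 10. Standard tables list about 3.88 as the upper 5% point for that case, so the estimate should land near 0.95. The function name, seed and number of draws are arbitrary choices for illustration.

```python
import math
import random

def simulate_studentized_range(k, nu, rng):
    """One draw of Q = R / S with R the range of k standard normals
    and S = sqrt(chi2_nu / nu), for integer nu."""
    normals = [rng.gauss(0.0, 1.0) for _ in range(k)]
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    s = math.sqrt(chi2 / nu)
    return (max(normals) - min(normals)) / s

rng = random.Random(12345)  # fixed seed so the run is reproducible
k, nu, draws = 3, 10, 20000
hits = sum(simulate_studentized_range(k, nu, rng) <= 3.88 for _ in range(draws))
fraction = hits / draws
print(fraction)
```

With 20,000 draws the Monte Carlo standard error is roughly 0.0015, so agreement to about two decimal places is all that should be expected.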