Mantel test
The Mantel test, named after Nathan Mantel, is a statistical test of the correlation between two matrices. The matrices must be of the same dimension; in most applications, they are matrices of interrelations between the same vectors of objects. The test was first published by Nathan Mantel, a biostatistician at the National Institutes of Health, in 1967.[1] Accounts of it can be found in advanced statistics books (e.g., Sokal & Rohlf 1995[2]).
Usage
The test is commonly used in ecology, where the data are usually estimates of the "distance" between objects such as species of organisms. For example, one matrix might contain estimates of the genetic distances (i.e., the amount of difference between two different genomes) between all possible pairs of species in the study, obtained by the methods of molecular systematics; while the other might contain estimates of the geographical distance between the ranges of each species to every other species. In this case, the hypothesis being tested is whether the variation in genetics for these organisms is correlated with the variation in geographical distance.
Method
If there are $n$ objects, and the matrix is symmetrical (so the distance from object $a$ to object $b$ is the same as the distance from $b$ to $a$), such a matrix contains $n(n-1)/2$ distances. These distances are not independent of each other: changing the "position" of one object would change $n-1$ of them (the distance from that object to each of the others). We therefore cannot assess the relationship between the two matrices by simply evaluating the correlation coefficient between the two sets of distances and testing its statistical significance. The Mantel test deals with this problem.
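The counting argument above can be checked on a toy example. The following is a minimal sketch using only the Python standard library; the point coordinates and the helper name `pairwise_distances` are illustrative assumptions, not part of any published implementation.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for every unordered pair i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

# Five hypothetical objects placed in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (1.0, 1.0), (5.0, 2.0)]
n = len(points)
before = pairwise_distances(points)

# Move a single object (index 0) and recompute every distance.
moved = list(points)
moved[0] = (10.0, 10.0)
after = pairwise_distances(moved)

# Pairs whose distance changed: exactly those that involve object 0.
changed = [pair for pair in before if before[pair] != after[pair]]

print(len(before))   # number of distinct distances, n(n-1)/2
print(len(changed))  # number of distances affected by moving one object, n-1
```

Because a single object contributes to $n-1$ entries at once, the entries cannot be treated as independent observations in an ordinary significance test.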
The procedure adopted is a kind of randomization or permutation test. The correlation between the two sets of distances is calculated, and this is both the measure of correlation reported and the test statistic on which the test is based. In principle, any correlation coefficient could be used, but normally the Pearson product-moment correlation coefficient is used.
Significance is not assessed in the way it ordinarily is for a correlation coefficient. Instead, to assess the significance of any apparent departure from a zero correlation, the rows and columns of one of the matrices are subjected to random permutations many times, the same permutation being applied to the rows and to the columns, with the correlation being recalculated after each permutation. The significance of the observed correlation is the proportion of such permutations that lead to a higher correlation coefficient.
The reasoning is that if the null hypothesis of there being no relation between the two matrices is true, then permuting the rows and columns of the matrix should be equally likely to produce a larger or a smaller coefficient. In addition to overcoming the problems arising from the statistical dependence of elements within each of the two matrices, use of the permutation test means that no reliance is being placed on assumptions about the statistical distributions of elements in the matrices.
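Permuting rows and columns together amounts to relabeling the objects, which is why the internal structure of the matrix survives. The sketch below illustrates this on a small hypothetical matrix; the helper names `permute_rows_and_columns` and `off_diagonal` and the matrix values are assumptions made for illustration.

```python
import random

def permute_rows_and_columns(matrix, order):
    """Relabel the objects: entry (i, j) of the result is matrix[order[i]][order[j]]."""
    return [[matrix[a][b] for b in order] for a in order]

def off_diagonal(matrix):
    """Sorted list of all off-diagonal entries."""
    n = len(matrix)
    return sorted(matrix[i][j] for i in range(n) for j in range(n) if i != j)

# Hypothetical symmetric distance matrix for four objects.
distances = [
    [0, 2, 5, 9],
    [2, 0, 4, 7],
    [5, 4, 0, 3],
    [9, 7, 3, 0],
]

rng = random.Random(1)            # seeded only to make the sketch repeatable
order = list(range(len(distances)))
rng.shuffle(order)                # one random relabeling of the objects
permuted = permute_rows_and_columns(distances, order)

n = len(permuted)
symmetric = all(permuted[i][j] == permuted[j][i] for i in range(n) for j in range(n))
zero_diagonal = all(permuted[i][i] == 0 for i in range(n))
same_values = off_diagonal(permuted) == off_diagonal(distances)

print(order, symmetric, zero_diagonal, same_values)
```

The permuted matrix is still a valid distance matrix with the same set of values; only the correspondence between its entries and the entries of the other matrix is broken, which is what the null hypothesis requires.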
Many statistical packages include routines for carrying out the Mantel test.
Criticism
The various papers introducing the Mantel test (and its extension, the partial Mantel test) lack a clear statistical framework specifying fully the null and alternative hypotheses. This may convey the wrong idea that these tests are universal. For example, the Mantel and partial Mantel tests can be flawed in the presence of spatial autocorrelation and return erroneously low p-values. See, e.g., Guillot and Rousset (2013).[3]
References
- Mantel, N. (1967). "The detection of disease clustering and a generalized regression approach". Cancer Research. 27 (2): 209–220. PMID 6018555.
- Sokal RR, Rohlf FJ (1995). Biometry (3rd ed.). New York: Freeman. pp. 813–819. ISBN 0-7167-2411-1.
- Guillot G, Rousset F (2013). "Dismantling the Mantel tests". Methods in Ecology and Evolution. 4 (4): 336–344. arXiv:1112.0651. Bibcode:2013MEcEv...4..336G. doi:10.1111/2041-210x.12018. S2CID 2108402.
Mantel test
Background and Overview
Definition and Purpose
The Mantel test is a non-parametric statistical method designed to evaluate the correlation between two symmetric distance matrices, each capturing pairwise dissimilarities among the same set of objects for different variables, such as genetic distances and geographic distances.[1][2] These matrices typically represent multivariate data where direct variable-by-variable comparisons are impractical, allowing the test to assess overall associations without requiring the data to be in Euclidean space.[5]

The primary purpose of the Mantel test is to test the null hypothesis that there is no association between the two distance matrices, providing a robust approach for detecting relationships in datasets that may involve non-linear patterns or non-metric dissimilarities.[2] This makes it particularly valuable in multivariate analysis, where traditional parametric methods might fail due to violations of assumptions like linearity or normality.[5] For instance, it can conceptually examine whether an environmental distance matrix correlates with a species composition dissimilarity matrix, revealing potential ecological linkages without assuming specific distributional forms.[2]

A key advantage of the Mantel test lies in its ability to handle non-metric data and avoid parametric assumptions, such as multivariate normality, thereby enabling reliable inference in complex, real-world datasets where such conditions are rarely met.[2] Significance under the null hypothesis is assessed via a permutation-based procedure, which resamples the data to generate an empirical distribution of test statistics.[5]
Historical Development
The Mantel test was introduced by Nathan Mantel in 1967 as a statistical method for detecting disease clustering and testing associations between incidence matrices in epidemiological contexts, such as spatiotemporal patterns of leukemia.[6] Originally framed as a generalized regression approach to matrix correspondence, it provided a non-parametric way to assess linear relationships while accounting for interdependencies in pairwise data.[7]

During the 1970s and 1980s, the test gained traction in ecology for spatial analysis, with Robert R. Sokal applying it first in biology in 1979 to examine geographic variation in taxonomic data.[2] Ecologists like Pierre Legendre further popularized its use in the 1980s, integrating it into studies of community structure and environmental gradients through distance matrix comparisons.[8] A key milestone came in 1986 with the development of the partial Mantel test by Peter E. Smouse, Jeffrey C. Long, and Robert R. Sokal, which extended the original method to control for confounding variables via multiple regression on matrices.[9]

By the 1990s, the Mantel test saw widespread adoption in population genetics, becoming a standard tool for evaluating isolation by distance and spatial genetic structure.[4] Advancements in computing power during the 2000s made permutation-based significance testing viable for larger matrices, broadening the test's applicability to more complex datasets. In recent years up to 2025, the Mantel test has integrated with high-throughput genomic data, as evidenced by 2023 benchmarking studies evaluating its performance against alternatives for matrix associations in evolutionary and genetic analyses, alongside 2022 efforts to address criticisms of its extensions.[10][11]
Mathematical Foundations
Core Test Statistic
The Mantel test requires two symmetric $n \times n$ distance matrices, $\mathbf{X}$ and $\mathbf{Y}$, where the diagonal elements are zeros and the off-diagonal elements $x_{ij}$ and $y_{ij}$ (for $i \neq j$) represent pairwise distances or dissimilarities between $n$ objects or locations.[1][2] The core test statistic, denoted $r_M$, is formulated as the Pearson product-moment correlation coefficient applied to the corresponding off-diagonal elements of $\mathbf{X}$ and $\mathbf{Y}$. It is obtained by vectorizing the upper (or lower) triangular portions of the matrices into vectors of length $n(n-1)/2$, excluding the diagonals, and computing their correlation. The explicit formula is

$$r_M = \frac{\sum_{i<j} (x_{ij} - \bar{x})(y_{ij} - \bar{y})}{\sqrt{\sum_{i<j} (x_{ij} - \bar{x})^2 \, \sum_{i<j} (y_{ij} - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the means of the off-diagonal elements in $\mathbf{X}$ and $\mathbf{Y}$, respectively. This statistic originates from the original sum-of-products form proposed by Mantel, which was later standardized to the correlation form for comparability with Pearson's $r$.[1]

The normalization ensures $r_M$ ranges from -1 (perfect negative linear association between distances) to +1 (perfect positive linear association), with 0 indicating no linear association; values near ±1 suggest strong linear relationships between the two sets of distances, interpretable in context, such as isolation by distance in spatial data.[2]

The test assumes that the distance matrices derive from Euclidean configurations or can be converted to such (e.g., via principal coordinates analysis to ensure non-negative eigenvalues); non-Euclidean dissimilarities may distort interpretations if underlying data violate metric properties. Additionally, the null model assumes no inherent spatial structure or association between the matrices, with exchangeability among off-diagonal elements under permutation.[2]
Permutation Procedure for Significance
The null hypothesis of the Mantel test posits no association between the two matrices, implying that the correspondence between their elements is random while preserving the internal structure of each matrix. To generate the null distribution, rows and columns of one matrix (typically the second) are simultaneously permuted to disrupt the pairwise associations without altering the matrix's symmetry or marginal properties.

The procedure begins by computing the Mantel correlation coefficient $r_M$ for the original matrices. Then, $N$ random permutations are generated (commonly 999 to 9999 for adequate precision), and $r_M$ is recalculated for each permuted version. The one-sided p-value is obtained by ranking the observed $r_M$ against the permuted values, specifically as $p = (k + 1)/(N + 1)$, where $k$ is the number of permuted $r_M$ values greater than or equal to the observed value; this conservative adjustment avoids p-values of exactly zero.[12]

Computationally, simultaneous row and column permutations ensure the matrix remains symmetric, which is essential for distance or similarity matrices with zero diagonals. For small sample sizes (e.g., fewer than 7 objects), an exact test can enumerate all possible permutations; otherwise, Monte Carlo sampling suffices, though larger $N$ improves accuracy at higher computational cost.

Under the assumption of independent and identically distributed data, the permutation procedure controls the Type I error rate at nominal levels (e.g., 5%). However, it is sensitive to autocorrelation within the matrices, which can inflate Type I error rates significantly (e.g., up to 40% or more in moderate autocorrelation scenarios).[13]

Implementations are available in R packages such as vegan (using the mantel function with default 999 permutations) and ade4 (via mantel.rtest), both relying on permutations for significance; some software also offers asymptotic approximations, such as a pseudo-F test, for large samples.[14][15][16]
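As a rough sketch of the procedure described above, the following standard-library Python example computes the Pearson-based statistic on the upper triangles, permutes rows and columns of the second matrix together, and reports the one-sided p-value as (k + 1)/(N + 1). The function names (`upper_triangle`, `pearson`, `mantel_test`), the toy positions, and the small floating-point tolerance are illustrative assumptions; this is not the implementation used by vegan, ade4, or any other package.

```python
import math
import random
from itertools import combinations

def upper_triangle(matrix):
    """Vectorize the entries above the diagonal (length n(n-1)/2)."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(a, b, permutations=999, seed=None):
    """Return (observed r, one-sided p-value) for symmetric matrices a and b."""
    rng = random.Random(seed)
    n = len(a)
    vec_a = upper_triangle(a)
    observed = pearson(vec_a, upper_triangle(b))
    order = list(range(n))
    k = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Same reordering for rows and columns keeps b symmetric.
        permuted_b = [[b[i][j] for j in order] for i in order]
        r_perm = pearson(vec_a, upper_triangle(permuted_b))
        # Tolerance (an assumption) guards against rounding in ties.
        if r_perm >= observed - 1e-12:
            k += 1
    return observed, (k + 1) / (permutations + 1)

# Toy data: objects on a line; the second matrix holds squared distances.
positions = [0, 1, 3, 6, 10, 15, 21, 28]
geo = [[abs(p - q) for q in positions] for p in positions]
other = [[(p - q) ** 2 for q in positions] for p in positions]

r_obs, p_value = mantel_test(geo, other, permutations=999, seed=42)
print(round(r_obs, 3), p_value)
```

With N = 999 permutations the smallest attainable p-value is 1/1000, which is the practical consequence of the (k + 1)/(N + 1) adjustment. The sketch is one-sided and tests only for positive association; a two-sided variant would compare absolute values instead.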
