Procrustes analysis
In statistics, Procrustes analysis is a form of statistical shape analysis used to analyse the distribution of a set of shapes. The name Procrustes (Greek: Προκρούστης) refers to a bandit from Greek mythology who made his victims fit his bed either by stretching their limbs or cutting them off.
In mathematics:
- an orthogonal Procrustes problem is a method which can be used to find out the optimal rotation and/or reflection (i.e., the optimal orthogonal linear transformation) for the Procrustes Superimposition (PS) of an object with respect to another.
- a constrained orthogonal Procrustes problem, subject to det(R) = 1 (where R is an orthogonal matrix), is a method which can be used to determine the optimal rotation for the PS of an object with respect to another (reflection is not allowed). In some contexts, this method is called the Kabsch algorithm.
When a shape is compared to another, or a set of shapes is compared to an arbitrarily selected reference shape, Procrustes analysis is sometimes further qualified as classical or ordinary, as opposed to generalized Procrustes analysis (GPA), which compares three or more shapes to an optimally determined "mean shape".
Introduction
To compare the shapes of two or more objects, the objects must be first optimally "superimposed". Procrustes superimposition (PS) is performed by optimally translating, rotating and uniformly scaling the objects. In other words, both the placement in space and the size of the objects are freely adjusted. The aim is to obtain a similar placement and size, by minimizing a measure of shape difference called the Procrustes distance between the objects. This is sometimes called full, as opposed to partial PS, in which scaling is not performed (i.e. the size of the objects is preserved). Notice that, after full PS, the objects will exactly coincide if their shape is identical. For instance, with full PS two spheres with different radii will always coincide, because they have exactly the same shape. Conversely, with partial PS they will never coincide. This implies that, by the strict definition of the term shape in geometry, shape analysis should be performed using full PS. A statistical analysis based on partial PS is not a pure shape analysis as it is not only sensitive to shape differences, but also to size differences. Neither full nor partial PS will ever manage to perfectly match two objects with different shape, such as a cube and a sphere, or a right hand and a left hand.
In some cases, both full and partial PS may also include reflection. Reflection allows, for instance, a successful (possibly perfect) superimposition of a right hand to a left hand. Thus, partial PS with reflection enabled preserves size but allows translation, rotation and reflection, while full PS with reflection enabled allows translation, rotation, scaling and reflection.
Optimal translation and scaling are determined with much simpler operations than optimal rotation (see below).
Ordinary Procrustes analysis
Here we just consider objects made up from a finite number k of points in n dimensions. Often, these points are selected on the continuous surface of complex objects, such as a human bone, and in this case they are called landmark points.
The shape of an object can be considered as a member of an equivalence class formed by removing the translational, rotational and uniform scaling components.
Translation
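For example, translational components can be removed from an object by translating the object so that the mean of all the object's points (i.e. its centroid) lies at the origin.
Mathematically: take $k$ points in two dimensions, say
- $((x_1, y_1), (x_2, y_2), \dots, (x_k, y_k))$.
The mean of these points is $(\bar{x}, \bar{y})$, where
- $\bar{x} = \frac{x_1 + x_2 + \cdots + x_k}{k}, \quad \bar{y} = \frac{y_1 + y_2 + \cdots + y_k}{k}$.
Now translate these points so that their mean is translated to the origin, $(x, y) \to (x - \bar{x}, y - \bar{y})$, giving the points $(x_1 - \bar{x}, y_1 - \bar{y}), \dots, (x_k - \bar{x}, y_k - \bar{y})$.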
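As a minimal sketch of this centring step, the following Python function subtracts the centroid from every point. The function name `center` and the sample triangle are illustrative assumptions, not part of any library.

```python
def center(points):
    """Translate 2D points so that their centroid lies at the origin."""
    k = len(points)
    x_bar = sum(x for x, _ in points) / k   # mean of the x coordinates
    y_bar = sum(y for _, y in points) / k   # mean of the y coordinates
    return [(x - x_bar, y - y_bar) for x, y in points]


# Hypothetical example: a right triangle away from the origin.
triangle = [(1.0, 1.0), (4.0, 1.0), (1.0, 5.0)]
centered = center(triangle)
```

After the call, the coordinates in `centered` sum to zero along each axis, which is the same as saying that the centroid sits at the origin.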
Uniform scaling
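Likewise, the scale component can be removed by scaling the object so that the root mean square distance (RMSD) from the points to the translated origin is 1. This RMSD is a statistical measure of the object's scale or size:
- $s = \sqrt{\frac{(x_1 - \bar{x})^2 + (y_1 - \bar{y})^2 + \cdots + (x_k - \bar{x})^2 + (y_k - \bar{y})^2}{k}}$
The scale becomes 1 when the point coordinates are divided by the object's initial scale:
- $((x_1 - \bar{x})/s, (y_1 - \bar{y})/s), \dots, ((x_k - \bar{x})/s, (y_k - \bar{y})/s)$.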
Notice that other methods for defining and removing the scale are sometimes used in the literature.
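The RMSD normalisation described above can be sketched as follows. The sketch assumes the points have already been centred; the helper names `rmsd_scale` and `normalize_scale` are hypothetical.

```python
import math


def rmsd_scale(centered):
    """Root mean square distance of already-centred 2D points from the origin."""
    k = len(centered)
    return math.sqrt(sum(x * x + y * y for x, y in centered) / k)


def normalize_scale(centered):
    """Divide every coordinate by the RMSD so that the resulting scale is 1."""
    s = rmsd_scale(centered)
    if s == 0:
        raise ValueError("all points coincide; the scale is undefined")
    return [(x / s, y / s) for x, y in centered]


# Hypothetical example: a square of side 4 centred on the origin.
square = [(-2.0, -2.0), (2.0, -2.0), (2.0, 2.0), (-2.0, 2.0)]
unit = normalize_scale(square)
```

Other size measures, such as the centroid size (which omits the division by k), would change only the body of `rmsd_scale`.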
Rotation
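Removing the rotational component is more complex, as a standard reference orientation is not always available. Consider two objects composed of the same number of points with scale and translation removed. Let the points of these be $((x_1, y_1), \dots, (x_k, y_k))$ and $((w_1, z_1), \dots, (w_k, z_k))$. One of these objects can be used to provide a reference orientation. Fix the reference object and rotate the other around the origin, until you find an optimum angle of rotation $\theta$ such that the sum of the squared distances (SSD) between the corresponding points is minimised (an example of the least squares technique).
A rotation by angle $\theta$ gives
- $(u_i, v_i) = (\cos\theta\, w_i - \sin\theta\, z_i,\; \sin\theta\, w_i + \cos\theta\, z_i)$,
where $(u, v)$ are the coordinates of a rotated point. Taking the derivative of $(u_1 - x_1)^2 + (v_1 - y_1)^2 + \cdots + (u_k - x_k)^2 + (v_k - y_k)^2$ with respect to $\theta$ and solving for $\theta$ when the derivative is zero gives
- $\theta = \tan^{-1}\left(\frac{\sum_{i=1}^{k} (w_i y_i - z_i x_i)}{\sum_{i=1}^{k} (w_i x_i + z_i y_i)}\right)$.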
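A minimal sketch of the two-dimensional case, assuming both point lists are already centred and scaled and are given in corresponding order. It uses `math.atan2` rather than a plain arctangent, because setting the derivative to zero yields two candidate angles half a turn apart and `atan2` picks the one that minimises the SSD. The function names are illustrative.

```python
import math


def optimal_angle(reference, moving):
    """Angle by which `moving` must be rotated to best match `reference`."""
    num = sum(w * y - z * x for (x, y), (w, z) in zip(reference, moving))
    den = sum(w * x + z * y for (x, y), (w, z) in zip(reference, moving))
    return math.atan2(num, den)


def rotate(points, theta):
    """Rotate 2D points counter-clockwise about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * w - s * z, s * w + c * z) for w, z in points]


# Hypothetical example: the moving object is the reference turned by -0.7 rad.
reference = [(1.0, 0.0), (0.0, 2.0), (-1.0, -2.0)]
moving = rotate(reference, -0.7)
theta = optimal_angle(reference, moving)   # recovers 0.7 up to rounding
aligned = rotate(moving, theta)
```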
When the object is three-dimensional, the optimum rotation is represented by a 3-by-3 rotation matrix R, rather than a simple angle, and in this case singular value decomposition can be used to find the optimum value for R (see the solution for the constrained orthogonal Procrustes problem, subject to det(R) = 1).
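The SVD-based solution can be sketched with NumPy as below. This is an illustration under stated assumptions, not a reference implementation: points are stored as rows, both objects are already centred and scaled, and the function name `optimal_rotation` is hypothetical. The sign correction enforces det(R) = 1 so that reflections are excluded.

```python
import numpy as np


def optimal_rotation(moving, reference):
    """Rotation R (det = +1) minimising ||moving @ R - reference|| in the Frobenius norm."""
    H = moving.T @ reference                      # n x n cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))            # -1 if the unconstrained optimum is a reflection
    D = np.diag([1.0] * (H.shape[0] - 1) + [d])   # flips the direction of the smallest singular value
    return U @ D @ Vt


# Hypothetical example: four centred 3D points and a copy turned about the z axis.
angle = 0.5
Q = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle),  np.cos(angle), 0.0],
              [0.0,            0.0,           1.0]])
reference = np.array([[1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0],
                      [0.0, 0.0, 3.0],
                      [-1.0, -2.0, -3.0]])
moving = reference @ Q
R = optimal_rotation(moving, reference)
aligned = moving @ R
```

In this example `R` is the inverse of `Q`, so `aligned` reproduces `reference` up to rounding error.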
Shape comparison
The difference between the shape of two objects can be evaluated only after "superimposing" the two objects by translating, scaling and optimally rotating them as explained above. The square root of the above mentioned SSD between corresponding points can be used as a statistical measure of this difference in shape:
- $d = \sqrt{(u_1 - x_1)^2 + (v_1 - y_1)^2 + \cdots + (u_k - x_k)^2 + (v_k - y_k)^2}$
This measure is often called Procrustes distance. Notice that other more complex definitions of Procrustes distance, and other measures of "shape difference" are sometimes used in the literature.
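The whole two-object pipeline (centre, scale to RMSD 1, rotate, take the square root of the SSD) can be sketched compactly by storing each 2D point as a Python complex number, so that a rotation becomes a multiplication by a unit complex number. The representation and the function names are assumptions made for brevity.

```python
import cmath
import math


def normalise(points):
    """Centre 2D points (given as complex numbers) and scale them to RMSD 1."""
    k = len(points)
    mean = sum(points) / k
    centred = [p - mean for p in points]
    s = math.sqrt(sum(abs(p) ** 2 for p in centred) / k)
    return [p / s for p in centred]


def procrustes_distance(reference, moving):
    """Square root of the SSD after full superimposition (no reflection)."""
    ref = normalise(reference)
    mov = normalise(moving)
    # The phase of this sum equals atan2(sum(w*y - z*x), sum(w*x + z*y)).
    theta = cmath.phase(sum(m.conjugate() * r for m, r in zip(mov, ref)))
    rot = cmath.exp(1j * theta)
    return math.sqrt(sum(abs(m * rot - r) ** 2 for m, r in zip(mov, ref)))


# Hypothetical example shapes.
triangle = [0 + 0j, 4 + 0j, 0 + 3j]
# Same triangle turned by 90 degrees, doubled in size and shifted.
copy = [(p * 1j) * 2 + (5 - 7j) for p in triangle]
# A genuinely different triangle.
squashed = [0 + 0j, 4 + 0j, 0 + 1j]

d_same = procrustes_distance(triangle, copy)       # zero up to rounding
d_diff = procrustes_distance(triangle, squashed)   # clearly positive
```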
Superimposing a set of shapes
We showed how to superimpose two shapes. The same method can be applied to superimpose a set of three or more shapes, as long as the same reference orientation is used for all of them. However, generalized Procrustes analysis provides a better method to achieve this goal.
Generalized Procrustes analysis (GPA)
GPA applies the Procrustes analysis method to optimally superimpose a set of objects, instead of superimposing them to an arbitrarily selected shape.
Generalized and ordinary Procrustes analysis differ only in their determination of a reference orientation for the objects, which in the former technique is optimally determined, and in the latter one is arbitrarily selected. Scaling and translation are performed the same way by both techniques. When only two shapes are compared, GPA is equivalent to ordinary Procrustes analysis.
The algorithm outline is the following:
1. arbitrarily choose a reference shape (typically by selecting it among the available instances)
2. superimpose all instances to the current reference shape
3. compute the mean shape of the current set of superimposed shapes
4. if the Procrustes distance between the mean and the reference shape is above a threshold, set the reference to the mean shape and return to step 2.
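The outline above can be sketched with NumPy as follows. Several details are assumptions rather than part of the outline: shapes are arrays with one point per row, scale is removed with the RMSD measure, the mean is rescaled to unit size after averaging, reflections are excluded, and the names `gpa`, `preshape` and `rotate_onto` are hypothetical.

```python
import numpy as np


def preshape(X):
    """Centre a (k, n) array of points and scale it to RMSD 1."""
    X = X - X.mean(axis=0)
    return X / np.sqrt((X ** 2).sum() / len(X))


def rotate_onto(X, ref):
    """Rotate X (no reflection) to minimise the SSD to ref, via SVD."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return X @ (U @ D @ Vt)


def gpa(shapes, tol=1e-10, max_iter=100):
    shapes = [preshape(np.asarray(S, dtype=float)) for S in shapes]
    reference = shapes[0]                                      # step 1
    for _ in range(max_iter):
        shapes = [rotate_onto(S, reference) for S in shapes]   # step 2
        mean = preshape(np.mean(shapes, axis=0))               # step 3
        # Step 4: Procrustes distance between mean and reference.
        if np.linalg.norm(rotate_onto(mean, reference) - reference) <= tol:
            break
        reference = mean
    return mean, shapes


def similarity(X, angle, scale, shift):
    """Apply a rotation, a uniform scaling and a translation to X."""
    c, s = np.cos(angle), np.sin(angle)
    return scale * X @ np.array([[c, -s], [s, c]]) + shift


# Hypothetical example: three transformed copies of one triangle.
base = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
shapes = [base,
          similarity(base, 0.8, 2.0, [5, -1]),
          similarity(base, -2.1, 0.5, [0, 3])]
mean, aligned = gpa(shapes)
```

Because the three inputs share one shape, every aligned copy coincides with the mean. With noisy data the loop would typically run for a few iterations before the mean stabilises.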
Variations
There are many ways of representing the shape of an object. The shape of an object can be considered as a member of an equivalence class formed by taking the set of all sets of k points in n dimensions, that is $\mathbb{R}^{kn}$, and factoring out the set of all translations, rotations and scalings. A particular representation of shape is found by choosing a particular representative of the equivalence class. For planar configurations (n = 2) this gives a manifold of dimension kn − 4 = 2k − 4. Procrustes is one method of doing this with particular statistical justification.
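The dimension can be checked by counting degrees of freedom. As a sketch, assuming k points in general position in n dimensions, translations remove n parameters, uniform scaling removes 1, and rotations remove n(n − 1)/2:

$$\dim = kn - n - 1 - \frac{n(n-1)}{2}$$

For n = 2 this is 2k − 4, in agreement with the count above; for n = 3 it is 3k − 7.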
Bookstein obtains a representation of shape by fixing the position of two points, called the baseline. One point is fixed at the origin and the other at (1,0); the coordinates of the remaining points form the Bookstein coordinates.
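A minimal sketch of this construction for planar landmarks, again storing each point as a complex number (an assumption made for brevity). The map z → (z − a)/(b − a) is a translation, rotation and uniform scaling that sends baseline point a to the origin and baseline point b to (1,0). The function name and the `base` parameter are hypothetical.

```python
def bookstein_coordinates(points, base=(0, 1)):
    """Register 2D landmarks (complex numbers) on the baseline given by two indices."""
    a, b = points[base[0]], points[base[1]]
    if a == b:
        raise ValueError("the two baseline points must be distinct")
    return [(p - a) / (b - a) for p in points]


# Hypothetical example: the first two landmarks form the baseline.
triangle = [2 + 1j, 4 + 1j, 2 + 4j]
coords = bookstein_coordinates(triangle)
# The baseline maps to 0 and 1; the third landmark maps to 1.5j, i.e. (0, 1.5).
```

The result depends on which two landmarks are chosen as the baseline, which is one reason the Procrustes approach is often preferred for statistical work.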
It is also common to consider shape and scale together, that is, with only the translational and rotational components removed.
Examples
Shape analysis is used in biological data to identify the variations of anatomical features characterised by landmark data, for example in considering the shape of jaw bones.[1]
One study by David George Kendall examined the triangles formed by standing stones to deduce if these were often arranged in straight lines. The shape of a triangle can be represented as a point on the sphere, and the distribution of all shapes can be thought of as a distribution over the sphere. The sample distribution from the standing stones was compared with the theoretical distribution to show that the occurrence of straight lines was no more than average.[2]
References
- [1] Nancy Marie Brown, "Exploring Space Shape", Research/Penn State, Vol. 15, No. 1, March 1994 (archived 2006-09-01 at the Wayback Machine).
- [2] David G. Kendall, "A Survey of the Statistical Theory of Shape", Statistical Science, Vol. 4, No. 2 (May 1989), pp. 87–99.
- F.L. Bookstein, Morphometric Tools for Landmark Data, Cambridge University Press (1991).
- J.C. Gower, G.B. Dijksterhuis, Procrustes Problems, Oxford University Press (2004).
- I.L. Dryden, K.V. Mardia, Statistical Shape Analysis, Wiley, Chichester (1998).
External links
- Extensions to continuum of points and distributions: Procrustes Methods, Shape Recognition, Similarity and Docking, by Michel Petitjean.
Procrustes analysis
Overview
Definition and Purpose
Procrustes analysis is a statistical technique for superimposing configurations of points, such as landmark coordinates from biological specimens, by removing variations due to translation, rotation, and uniform scaling to isolate underlying shape information.[6] This method, rooted in geometric morphometrics, standardizes disparate point sets into a common framework, allowing for the quantification of shape differences without confounding effects from position, orientation, or size.
The purpose of Procrustes analysis is to facilitate direct comparisons of shapes in disciplines like morphometrics, anthropology, and evolutionary biology, where raw landmark data from different individuals or species often differ systematically due to non-shape factors.[7] By aligning configurations, it enables subsequent statistical analyses, such as principal component analysis of shape variation or tests for group differences, providing insights into evolutionary patterns, developmental processes, or functional adaptations.
A key prerequisite for Procrustes analysis is the concept of shape as a geometric property invariant to similarity transformations, specifically translation (location), rotation (orientation), and isotropic scaling (size), which ensures that only intrinsic form is compared across configurations. This invariance allows the method to focus on homologous landmarks that are biologically meaningful and consistently identifiable.[7]
In its basic workflow, Procrustes analysis takes as input matrices of landmark coordinates (typically k landmarks in m dimensions for multiple specimens) and applies transformations to produce aligned configurations, or Procrustes coordinates, which serve as the basis for residual analysis and shape metric computations like Procrustes distance. The foundational approach, Ordinary Procrustes Analysis, performs this superimposition on pairs of configurations to establish optimal alignment.
Historical Development
The term "Procrustes analysis" draws its name from the figure in Greek mythology who forced travelers to conform to the length of his bed by either stretching their limbs or amputating them, symbolizing the imposition of uniformity on diverse forms.[8] This metaphorical resonance later inspired statistical methods for aligning configurations to assess underlying similarities.
The statistical origins of Procrustes analysis trace back to the orthogonal Procrustes problem, introduced by Peter H. Schönemann in 1966 as a technique for optimally rotating one matrix to match another via an orthogonal transformation, originally applied in factor analysis to align loading matrices.[9] This was extended by John C. Gower in 1975 with generalized Procrustes analysis, which simultaneously aligns multiple configurations through translation, rotation, reflection, and scaling to minimize discrepancies, broadening its utility in multivariate comparisons.[10]
A pivotal advancement occurred in the 1980s through David G. Kendall's foundational work on shape theory, where he formalized Procrustes metrics for analyzing configurations modulo similarity transformations, notably in his 1984 paper on shape manifolds and complex projective spaces.[11] In the 1990s, Fred L. Bookstein adopted and refined these methods within geometric morphometrics, emphasizing landmark-based alignments in his 1991 book Morphometric Tools for Landmark Data, which established Procrustes superimposition as a core tool for biological shape studies.[12]
The approach evolved from two-dimensional applications to higher-dimensional data, with software implementations facilitating widespread use; for instance, the R package geomorph, introduced in 2013, provides tools for Procrustes analysis of landmarks, curves, and surfaces in 2D and 3D contexts.[13] In the 2020s, Procrustes methods have seen integration with machine learning, particularly for aligning neural network representations, as in analyses of representational similarity and functional gradients to compare model architectures.[14]
Mathematical Foundations
Configuration Spaces
In Procrustes analysis, a configuration of $k$ landmarks in $m$-dimensional Euclidean space is represented by a $k \times m$ matrix $X$, where each row holds the coordinates of one landmark point. The landmarks are assumed to be in general position, meaning the configuration has full rank and the points span the $m$-dimensional space without degeneracy, such as collinearity in 2D. The configuration space is the ambient Euclidean space $\mathbb{R}^{k \times m}$ comprising all possible such matrices, serving as the starting point for shape comparisons.
To isolate shape from location effects, configurations are preprocessed by centering, which translates the landmarks so their centroid is at the origin. The centered configuration is given by $X_c = X - \mathbf{1}_k \bar{x}^\top$, where $\bar{x}$ is the centroid vector (the average of the row vectors of $X$) and $\mathbf{1}_k$ is a column vector of $k$ ones. Equivalently, this can be expressed using the centering matrix $C = I_k - \frac{1}{k}\mathbf{1}_k\mathbf{1}_k^\top$, yielding $X_c = CX$. Centering removes the translational degrees of freedom, reducing the effective dimensionality while preserving relative positions.
The shape space in Procrustes analysis is the manifold of configurations modulo Euclidean similarity transformations, which include translations, rotations, and uniform scalings, thereby focusing solely on intrinsic form. In Kendall's framework, after centering and scaling to unit norm (forming the preshape space as a hypersphere of unit radius in $m(k-1)$ dimensions), the shape space emerges as the quotient under rotations, a Riemannian manifold known as Kendall's shape space. For $k$ points in 2D, this space has dimension $2k - 4$, accounting for the removal of 2 translational, 1 scaling, and 1 rotational parameter. The Procrustes distance provides a natural metric on this space for quantifying shape differences.
Procrustes Distance Measures
The Procrustes distance quantifies the dissimilarity between two shapes represented as landmark configurations after optimal alignment under rigid transformations. For centered configurations $X$ and $Y$ of size $k \times m$ (with $k$ landmarks in $m$ dimensions), the partial Procrustes distance is defined as the minimum Frobenius norm over rotations $R \in SO(m)$:
$$d_P(X, Y) = \min_{R \in SO(m)} \|XR - Y\|_F = \sqrt{2 - 2\sum_{i=1}^{m} \sigma_i},$$
where $\sigma_i$ are the singular values of $X^\top Y$ (the smallest counted negatively when $\det(X^\top Y) < 0$), assuming unit scaling $\|X\|_F = \|Y\|_F = 1$.[15] This measure is invariant to translation and rotation but holds scale constant, making it suitable for comparing shapes normalized to the same size.[15]
A variant, the full Procrustes distance, extends this by also optimizing over isotropic scaling $\beta > 0$:
$$d_F(X, Y) = \min_{\beta > 0,\; R \in SO(m)} \left\| \frac{CY}{\|CY\|_F} - \beta\, \frac{CX}{\|CX\|_F}\, R \right\|_F = \sqrt{1 - \Big(\sum_{i=1}^{m} \sigma_i\Big)^2},$$
where $C$ is the centering matrix.[15] This distance is invariant to the full group of similarity transformations (translation, rotation, and uniform scaling), providing a metric on the shape space that captures pure form differences independent of size.[15] Both distances range from 0 (identical shapes) to a maximum value depending on the variant ($\sqrt{2}$ for partial, 1 for full when normalized), and they satisfy the properties of a metric in the pre-shape space.[15]
Related measures include the Riemannian metric on Kendall's shape space, which interprets the partial Procrustes distance geometrically as a chord length on the unit hypersphere of pre-shapes, with the intrinsic geodesic distance given by $\rho = \arccos\big(\sum_i \sigma_i\big)$.[15] This arc-length formulation, ranging from 0 to $\pi/2$, better reflects the curved geometry of the shape manifold for larger dissimilarities, approximating the Euclidean distances for small variances.[15]
Statistically, Procrustes distances serve as measures of shape dissimilarity in variance decomposition and hypothesis testing; for instance, Goodall's F-test uses the ratio of between-group to within-group Procrustes sums of squares to assess significant shape differences, following an approximate F-distribution under normality assumptions in the tangent space.[16]
Recent extensions address distributional shapes, such as unlabeled point clouds, via the Procrustes-Wasserstein distance, which combines optimal transport with rigid alignment to minimize
$$\min_{R,\, P} \sum_{i,j} P_{ij}\, \|x_i R - y_j\|^2,$$
where $P$ is a coupling matrix with marginals uniform on the points. This barycenter-invariant metric enables comparison of empirical distributions without fixed correspondences, with applications in aligning high-dimensional embeddings and shape populations, as developed in the late 2010s and refined in subsequent works.
Ordinary Procrustes Analysis
Removing Translation
In ordinary Procrustes analysis, the initial step addresses differences in location by removing the effects of translation, which manifest as variations in the centroid positions of landmark configurations. This process centers each configuration to achieve translation invariance, allowing shape comparisons to focus solely on relative positions rather than absolute placement in space.
The centering procedure involves calculating the centroid $\bar{x}$ of a configuration with $k$ landmarks as the arithmetic mean of their coordinates: $\bar{x} = \frac{1}{k}\sum_{i=1}^{k} x_i$. Each landmark point is then translated by subtracting this centroid: $x_i' = x_i - \bar{x}$, resulting in a configuration where the centroid coincides with the origin. For a configuration represented as a $k \times m$ matrix $X$ (with $m$ dimensions), the centered matrix is given by $X_c = \left(I_k - \frac{1}{k}\mathbf{1}\mathbf{1}^\top\right)X$, where $I_k$ is the identity matrix and $\mathbf{1}$ is a column vector of ones.
This centering operation is mathematically justified as it minimizes the contribution of translation to the least-squares criterion used in Procrustes alignment, effectively isolating variance due to position and ensuring the sum of squared distances between configurations is reduced without bias from location shifts. By aligning centroids to the origin, the method renders the configurations invariant under rigid translations, which is essential for equitable shape assessment.
The impact of removing translation is to simplify the overall alignment problem, reducing it to adjustments for rotation and uniform scaling in subsequent steps of ordinary Procrustes analysis. As a foundational prerequisite, centering ensures that later optimizations operate on standardized forms, enhancing the accuracy of shape residuals.
For illustration, consider two configurations of 2D points representing similar shapes but displaced by a translation vector $t$; after centering, both sets have their centroids at the origin, enabling their forms to overlap precisely in position for further comparison.
Uniform Scaling Adjustment
In ordinary Procrustes analysis, the uniform scaling adjustment follows the removal of translation and normalizes the centered configurations by applying an isotropic scaling factor, thereby eliminating differences in overall size while preserving relative landmark positions and shapes. This step addresses variations in magnitude that could otherwise confound shape comparisons, ensuring that subsequent alignments focus solely on rotational and reflective differences.[16][1]
The centroid size serves as the key measure of overall size in this context, defined as the square root of the sum of squared Euclidean distances from each landmark to the configuration's centroid. Mathematically, for a centered configuration matrix $X_c$ with $k$ landmarks, the centroid size is given by
$$CS(X_c) = \sqrt{\sum_{i=1}^{k} \|x_i\|^2} = \|X_c\|_F,$$
where $x_i$ denotes the $i$-th row of $X_c$. This metric captures the isotropic scale of the configuration without regard to orientation or position.[7][1]
To perform the scaling, the scale factor $s$ is computed as the centroid size, and the normalized configuration is then obtained by $X_s = X_c / s$, which sets the centroid size to 1. This procedure is applied independently to each configuration before alignment.[1][16]
The justification for this uniform scaling lies in its ability to standardize configurations for direct shape comparison, as it removes size variability while maintaining the integrity of inter-landmark distances scaled proportionally.[7] By resulting in unit-norm matrices (where $\|X_s\|_F = 1$), it facilitates numerical stability and comparability across datasets in shape analysis.[1] Unlike methods allowing anisotropic scaling, which permit differential adjustments along coordinate axes, ordinary Procrustes analysis restricts scaling to be uniform (isotropic) to ensure that only overall size is neutralized without distorting shape aspects related to aspect ratios.[16] This adjustment precedes the optimal rotation step to achieve full size-and-rotation normalization.[1]
Optimal Rotation Alignment
In ordinary Procrustes analysis, after centering the configurations to remove translation effects and adjusting for uniform scaling, the remaining discrepancies often arise from differences in orientation. The optimal rotation alignment addresses this by finding an orthogonal matrix that minimizes the Frobenius norm of the residual between the transformed source configuration and the target configuration, effectively superimposing them while preserving distances within each set.
The procedure solves the orthogonal Procrustes problem: given centered and scaled matrices $X$ and $Y$ (both $k \times m$), determine the rotation matrix $R$ that minimizes $\|XR - Y\|_F$ subject to $R^\top R = I$. The closed-form solution is obtained via the singular value decomposition (SVD) of the matrix $X^\top Y = U \Sigma V^\top$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is diagonal with non-negative singular values. The optimal $R$ is then $R = UV^\top$. This aligns $X$ to $Y$ in the least-squares sense.
This minimization is equivalent to maximizing the trace $\operatorname{tr}(R^\top X^\top Y)$ under the orthogonality constraint, which quantifies the alignment quality through the inner product of the configurations after rotation. The solution is unique provided that $X^\top Y$ has full rank or that the singular values of $X^\top Y$ are distinct, ensuring a single optimal orientation. To restrict to proper rotations (excluding reflections), one verifies $\det(R) = 1$; if $\det(R) = -1$, the sign of the direction paired with the smallest singular value is flipped, giving $R = U \operatorname{diag}(1, \dots, 1, -1)\, V^\top$, though whether this correction is applied depends on whether reflections are permissible in the analysis.
Shape Residuals and Comparison
In ordinary Procrustes analysis, the shape residuals represent the differences between the superimposed configurations after alignment, defined as $E = \tilde{X} - \tilde{Y}$, where $\tilde{X}$ and $\tilde{Y}$ are the translated, scaled, and rotated forms of the original landmark configurations $X$ and $Y$.[17] These residuals isolate the components of shape variation that remain after removing the effects of location, scale, and orientation.[17]
To compare shapes using these residuals, Procrustes analysis of variance (Procrustes ANOVA) decomposes the total variance into components attributable to individual effects, such as group differences or measurement error, with the residuals forming the error term.[17] The sum of squares for the residuals is computed as $SS = \|E\|_F^2 = \operatorname{tr}(E^\top E)$, which quantifies the squared Euclidean distance between the aligned configurations and serves as a basis for statistical tests like F-ratios in the ANOVA framework.[17]
The residuals embody pure shape differences, free from non-shape influences, and are central to assessing the goodness-of-fit in shape comparisons, where smaller Procrustes sums of squares indicate closer alignment and less residual shape variation.[17] This metric enables hypothesis testing on shape variability, such as evaluating whether observed differences exceed expected error levels under isotropic Gaussian assumptions.[17]
Visualization of residuals typically involves plotting the aligned landmark configurations to highlight positional deviations, often using thin-plate spline deformation grids to illustrate localized shape changes.[18] For high-dimensional data, residuals are projected onto the tangent space (a linear approximation to the curved shape space) via orthogonal projection of Procrustes coordinates, allowing principal component analysis (PCA) to reduce dimensions and produce interpretable scatter plots of shape variation.[19]
Ordinary Procrustes methods assume no reflection in the alignment, restricting transformations to proper rotations to preserve chirality, though improper rotations can be considered separately.[17] Additionally, the approach is sensitive to outliers, which can disproportionately influence the least-squares minimization, though this is mitigated in robust variants that downweight anomalous landmarks.[17]
Generalized Procrustes Analysis
Iterative Alignment of Configurations
Generalized Procrustes analysis (GPA) extends ordinary Procrustes analysis to align more than two configurations ($n > 2$) by iteratively normalizing them toward a consensus form that minimizes the sum of squared Procrustes distances to the mean across all configurations. This process achieves superposition without favoring any single configuration as a fixed reference, enabling the comparison of multiple shapes or data matrices in fields like morphometrics and sensory analysis.[10]
The iterative alignment procedure in GPA proceeds in successive steps. First, an initial reference configuration is selected, typically the average of the unrotated configurations or one derived from principal components of the dataset, to mitigate potential bias from an arbitrary choice. All configurations are then independently centered by subtracting their centroids and scaled to unit centroid size (i.e., Frobenius norm of 1), ensuring comparability under translation and size. A provisional mean shape is computed as the element-wise average of these pre-aligned configurations.[10]
Next, each configuration is realigned to the provisional mean using ordinary Procrustes analysis, which applies the optimal rotation/reflection ($R_i$) to minimize the residual sum of squares for that pair (translation is already handled by centering, and scaling by pre-normalization). The mean shape is then updated as the average of these realigned configurations:
$$M^{(t+1)} = \frac{1}{n}\sum_{i=1}^{n} Z_i^{(t)},$$
where $Z_i^{(t)}$ denotes the centered and unit-scaled configuration $i$ as realigned at iteration $t$, followed by normalization of $M^{(t+1)}$ to unit size. This step is repeated, with the updated mean serving as the new reference for the next round of alignments.[10]
Convergence is declared when the change in the mean shape between iterations falls below a predefined threshold, measured for example in the relative Frobenius norm or by a residual sum-of-squares criterion, ensuring that further shifts in the configurations are negligible. The algorithm converges monotonically and rapidly in practice because the objective function is non-increasing and bounded below. The iterative nature avoids reliance on a fixed reference, promoting an unbiased consensus across all configurations.[10]
Computation of Mean Shape
In Generalized Procrustes Analysis (GPA), the mean shape, also referred to as the consensus or centroid shape, is the configuration that minimizes the pooled within-group sum of squared Procrustes distances across a set of input configurations, after removing the effects of translation, rotation, and uniform scaling. This mean serves as a central reference for subsequent shape comparisons and statistical analysis of shape variation. The computation is inherently iterative, as the optimal alignments depend on the current estimate of the mean, which in turn is derived from those alignments.[10]
The process begins with preprocessing: each configuration $X_i$ (a $k \times m$ matrix representing $k$ landmarks in $m$-dimensional space, for $i = 1, \dots, n$) is centered by subtracting its centroid to eliminate translation, yielding a centered matrix $X_i^c$. Each is then scaled by its centroid size (the Euclidean norm of the centered configuration) to unit size, producing preshape matrices $Z_i$ with $\|Z_i\|_F = 1$, where $\|\cdot\|_F$ denotes the Frobenius norm. An initial estimate of the mean preshape $M^{(0)}$ is selected, often as the unrotated average $M^{(0)} = \frac{1}{n}\sum_{i=1}^{n} Z_i$.[10]
Subsequent iterations proceed as follows. For the current mean preshape $M^{(t)}$ at iteration $t$:
- For each $i$, compute the optimal rotation matrix $R_i^{(t)}$ that aligns $Z_i$ to $M^{(t)}$ by minimizing $\|Z_i R - M^{(t)}\|_F$. This is achieved via the singular value decomposition (SVD) of the cross-covariance matrix $Z_i^\top M^{(t)} = U \Sigma V^\top$, setting $R_i^{(t)} = U V^\top$ (with adjustment for reflections if needed to ensure proper rotations, i.e., determinant 1).[10]
- Align each configuration: $Z_i^{(t)} = Z_i R_i^{(t)}$.
- Update the mean preshape:
