Vectorization (mathematics)
In mathematics, especially in linear algebra and matrix theory, the vectorization of a matrix is a linear transformation which converts the matrix into a vector. Specifically, the vectorization of an m × n matrix A, denoted vec(A), is the mn × 1 column vector obtained by stacking the columns of the matrix A on top of one another:

$$\operatorname{vec}(A) = [a_{1,1}, \ldots, a_{m,1}, a_{1,2}, \ldots, a_{m,2}, \ldots, a_{1,n}, \ldots, a_{m,n}]^{\mathrm{T}}$$

Here, $a_{i,j}$ represents the element in the i-th row and j-th column of A, and the superscript $\mathrm{T}$ denotes the transpose. Vectorization expresses, through coordinates, the isomorphism $\mathbf{R}^{m \times n} \cong \mathbf{R}^m \otimes \mathbf{R}^n \cong \mathbf{R}^{mn}$ between these (i.e., of matrices and vectors) as vector spaces.
For example, for the 2×2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the vectorization is $\operatorname{vec}(A) = \begin{bmatrix} a \\ c \\ b \\ d \end{bmatrix}$.
The connection between the vectorization of A and the vectorization of its transpose is given by the commutation matrix.
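The column-stacking operation and the commutation-matrix relation vec(Aᵀ) = K vec(A) can be illustrated with a short NumPy sketch (the helpers vec and commutation_matrix below are illustrative, not part of NumPy):

import numpy as np

def vec(A):
    # Column-stacking vectorization: flatten in Fortran (column-major) order.
    return A.flatten(order='F')

def commutation_matrix(m, n):
    # Permutation matrix K of size mn x mn with K @ vec(A) == vec(A.T)
    # for any m x n matrix A (naive construction, for illustration only).
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[j + i * n, i + j * m] = 1.0
    return K

A = np.arange(1, 7).reshape(2, 3)            # [[1, 2, 3], [4, 5, 6]]
K = commutation_matrix(2, 3)
print(vec(A))                                # [1 4 2 5 3 6]
print(np.allclose(K @ vec(A), vec(A.T)))     # True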
Compatibility with Kronecker products
The vectorization is frequently used together with the Kronecker product to express matrix multiplication as a linear transformation on matrices. In particular,

$$\operatorname{vec}(ABC) = (C^{\mathrm{T}} \otimes A)\operatorname{vec}(B)$$

for matrices A, B, and C of dimensions k×l, l×m, and m×n.[note 1] For example, if $\operatorname{ad}_A(X) = AX - XA$ (the adjoint endomorphism of the Lie algebra gl(n, C) of all n×n matrices with complex entries), then $\operatorname{vec}(\operatorname{ad}_A(X)) = (I_n \otimes A - A^{\mathrm{T}} \otimes I_n)\operatorname{vec}(X)$, where $I_n$ is the n×n identity matrix.
There are two other useful formulations:

$$\operatorname{vec}(ABC) = (I_n \otimes AB)\operatorname{vec}(C) = (C^{\mathrm{T}}B^{\mathrm{T}} \otimes I_k)\operatorname{vec}(A)$$
If B is a diagonal matrix (i.e., $B = \operatorname{diag}(b_1, \ldots, b_m)$ with $l = m$), the vectorization can be written using the column-wise Kronecker product (see Khatri–Rao product) of $C^{\mathrm{T}}$ and A and the main diagonal of B:

$$\operatorname{vec}(ABC) = (C^{\mathrm{T}} \ast A)\,\mathbf{d}_B,$$

where $\ast$ denotes the column-wise Kronecker product and $\mathbf{d}_B$ is the vector of the main-diagonal entries of B.
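A short NumPy check of these identities (vec is just column-major flattening here, and the column-wise Kronecker product is assembled by hand for clarity):

import numpy as np

vec = lambda M: M.flatten(order='F')
A = np.array([[1., 2.], [3., 4.]])                 # 2 x 2
B = np.diag([5., 6.])                              # 2 x 2 diagonal
C = np.array([[7., 8., 9.], [10., 11., 12.]])      # 2 x 3

lhs = vec(A @ B @ C)
# Main identity and the two equivalent formulations
print(np.allclose(lhs, np.kron(C.T, A) @ vec(B)))                    # True
print(np.allclose(lhs, np.kron(np.eye(3), A @ B) @ vec(C)))          # True
print(np.allclose(lhs, np.kron(C.T @ B.T, np.eye(2)) @ vec(A)))      # True
# Diagonal B: column-wise Kronecker (Khatri-Rao) product of C.T and A, times diag(B)
KR = np.hstack([np.kron(C.T[:, [j]], A[:, [j]]) for j in range(A.shape[1])])
print(np.allclose(lhs, KR @ np.diag(B)))                             # True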
More generally, it has been shown that vectorization is a self-adjunction in the monoidal closed structure of any category of matrices.[1]
Compatibility with Hadamard products
Vectorization is an algebra homomorphism from the space of n × n matrices with the Hadamard (entrywise) product to $\mathbf{C}^{n^2}$ with its Hadamard product:

$$\operatorname{vec}(A \circ B) = \operatorname{vec}(A) \circ \operatorname{vec}(B).$$
Compatibility with inner products
Vectorization is a unitary transformation from the space of n×n matrices with the Frobenius (or Hilbert–Schmidt) inner product to $\mathbf{C}^{n^2}$:

$$\langle A, B\rangle_F = \operatorname{tr}(A^{\dagger} B) = \operatorname{vec}(A)^{\dagger}\operatorname{vec}(B),$$

where the superscript † denotes the conjugate transpose.
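Both the Hadamard and Frobenius identities above are easy to verify numerically; a minimal NumPy sketch (real matrices, so the conjugate transpose reduces to the ordinary transpose):

import numpy as np

vec = lambda M: M.flatten(order='F')
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
# Hadamard compatibility: vec(A o B) == vec(A) o vec(B)
print(np.allclose(vec(A * B), vec(A) * vec(B)))          # True
# Frobenius inner product: tr(A^T B) == vec(A) . vec(B)
print(np.isclose(np.trace(A.T @ B), vec(A) @ vec(B)))    # True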
Vectorization as a linear sum
The matrix vectorization operation can be written in terms of a linear sum. Let X be an m × n matrix that we want to vectorize, and let $e_i$ be the i-th canonical basis vector for the n-dimensional space, that is $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]^{\mathrm{T}}$ with the 1 in the i-th position. Let $B_i$ be an (mn) × m block matrix defined as follows:

$$B_i = \begin{bmatrix} \mathbf{0} \\ \vdots \\ \mathbf{0} \\ I_m \\ \mathbf{0} \\ \vdots \\ \mathbf{0} \end{bmatrix} = e_i \otimes I_m$$
$B_i$ consists of n block matrices of size m × m, stacked column-wise; all of these blocks are zero except for the i-th one, which is the m × m identity matrix $I_m$.
Then the vectorized version of X can be expressed as follows:

$$\operatorname{vec}(X) = \sum_{i=1}^{n} B_i X e_i$$
Multiplication of X by $e_i$ extracts the i-th column, while multiplication by $B_i$ puts it into the desired position in the final vector.
Alternatively, the linear sum can be expressed using the Kronecker product:

$$\operatorname{vec}(X) = \sum_{i=1}^{n} (e_i \otimes I_m) X e_i$$
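The linear-sum construction can be reproduced directly in NumPy; a minimal sketch of the $B_i = e_i \otimes I_m$ construction:

import numpy as np

X = np.arange(1., 7.).reshape(2, 3)        # 2 x 3 example matrix [[1, 2, 3], [4, 5, 6]]
m, n = X.shape
acc = np.zeros(m * n)
for i in range(n):
    e_i = np.zeros(n)
    e_i[i] = 1.0                           # i-th canonical basis vector of R^n
    B_i = np.kron(e_i.reshape(-1, 1), np.eye(m))   # (mn) x m block matrix e_i kron I_m
    acc += B_i @ (X @ e_i)                 # X e_i extracts column i; B_i places it
print(acc)                                      # [1. 4. 2. 5. 3. 6.]
print(np.allclose(acc, X.flatten(order='F')))   # True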
Half-vectorization
For a symmetric matrix A, the vector vec(A) contains more information than is strictly necessary, since the matrix is completely determined by the symmetry together with the lower triangular portion, that is, the n(n + 1)/2 entries on and below the main diagonal. For such matrices, the half-vectorization is sometimes more useful than the vectorization. The half-vectorization, vech(A), of a symmetric n × n matrix A is the n(n + 1)/2 × 1 column vector obtained by vectorizing only the lower triangular part of A:

$$\operatorname{vech}(A) = [a_{1,1}, \ldots, a_{n,1}, a_{2,2}, \ldots, a_{n,2}, \ldots, a_{n-1,n-1}, a_{n,n-1}, a_{n,n}]^{\mathrm{T}}$$
For example, for the 2×2 matrix $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$, the half-vectorization is $\operatorname{vech}(A) = \begin{bmatrix} a \\ b \\ d \end{bmatrix}$.
There exist unique matrices transforming the half-vectorization of a matrix to its vectorization and vice versa, called, respectively, the duplication matrix and the elimination matrix.
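A minimal NumPy sketch of half-vectorization (NumPy has no built-in vech; the helper below stacks the lower-triangular columns by hand):

import numpy as np

def vech(A):
    # Stack, column by column, the entries on and below the main diagonal.
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

A = np.array([[1., 2., 3.],
              [2., 4., 5.],
              [3., 5., 6.]])               # symmetric 3 x 3
print(vech(A))                             # [1. 2. 3. 4. 5. 6.]
print(A.flatten(order='F'))                # [1. 2. 3. 2. 4. 5. 3. 5. 6.]  (vec repeats off-diagonal entries)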
Programming language
Programming languages that implement matrices may have easy means for vectorization.
In MATLAB/GNU Octave, a matrix A can be vectorized by A(:).
GNU Octave also allows vectorization and half-vectorization with vec(A) and vech(A) respectively. Julia has the vec(A) function as well.
In Python, NumPy arrays implement the flatten method,[note 1] while in R the desired effect can be achieved via the c() or as.vector() functions or, more efficiently, by removing the dimensions attribute of a matrix A with dim(A) <- NULL. In R, the function vec() of the package 'ks' allows vectorization, and the function vech(), implemented in both packages 'ks' and 'sn', allows half-vectorization.[2][3][4]
Applications
Vectorization is used in matrix calculus and its applications, for example in establishing moments of random vectors and matrices, asymptotics, and Jacobian and Hessian matrices.[5] It is also used in local sensitivity and statistical diagnostics.[6]
Notes
See also
References
- ^ Macedo, H. D.; Oliveira, J. N. (2013). "Typing Linear Algebra: A Biproduct-oriented Approach". Science of Computer Programming. 78 (11): 2160–2191. arXiv:1312.4818. doi:10.1016/j.scico.2012.07.012. S2CID 9846072.
- ^ Duong, Tarn (2018). "ks: Kernel Smoothing". R package version 1.11.0.
- ^ Azzalini, Adelchi (2017). "The R package 'sn': The Skew-Normal and Related Distributions such as the Skew-t". R package version 1.5.1.
- ^ Vinod, Hrishikesh D. (2011). "Simultaneous Reduction and Vec Stacking". Hands-on Matrix Algebra Using R: Active and Motivated Learning with Applications. Singapore: World Scientific. pp. 233–248. ISBN 978-981-4313-69-8 – via Google Books.
- ^ Magnus, Jan; Neudecker, Heinz (2019). Matrix differential calculus with applications in statistics and econometrics. New York: John Wiley. ISBN 978-1-119-54120-2.
- ^ Liu, Shuangzhe; Leiva, Victor; Zhuang, Dan; Ma, Tiefeng; Figueroa-Zúñiga, Jorge I. (March 2022). "Matrix differential calculus with applications in the multivariate linear model and its diagnostics". Journal of Multivariate Analysis. 188: 104849. doi:10.1016/j.jmva.2021.104849.
Vectorization (mathematics)
Definition and Notation
Basic Definition
In linear algebra, vectorization is an operation that transforms an m × n matrix A into a column vector of dimension mn by stacking the columns of A into a single vector while preserving all entries of the original matrix. This process, often denoted by the $\operatorname{vec}$ operator, rearranges the elements of A such that the j-th column becomes the segment from the ((j − 1)m + 1)-th to the (jm)-th position in the resulting vector. The vectorization operation establishes a bijective mapping, or isomorphism, between the vector space of all m × n matrices over the real numbers (denoted $\mathbb{R}^{m \times n}$) and the Euclidean space $\mathbb{R}^{mn}$, allowing matrices to be treated as elements of a standard vector space for algebraic manipulations. This isomorphism preserves the underlying linear structure, enabling the application of vector space theorems and operations directly to matrices. A simple example illustrates the process: for a 2 × 2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the vectorization yields $\operatorname{vec}(A) = [a, c, b, d]^{\mathrm{T}}$, following the column-stacking convention where the first column $[a, c]^{\mathrm{T}}$ precedes the second column $[b, d]^{\mathrm{T}}$. This convention, rooted in the need for consistency in matrix calculus and linear transformations, ensures that the order of elements aligns with standard indexing in tensor products and Kronecker products. Vectorization is particularly useful because it facilitates the conversion of matrix equations into vector equations, simplifying the analysis of systems involving multiple matrices and enabling the use of efficient vector-based algorithms in numerical linear algebra.

Stacking Procedure and Notation
The stacking procedure for the vectorization of an m × n matrix A in column-major order begins by extracting the columns sequentially from left to right and concatenating them into a single column vector of length mn. The first column, consisting of entries $a_{11}, a_{21}, \dots, a_{m1}$, forms the initial m elements of $\operatorname{vec}(A)$. This is followed by the second column, occupying positions m + 1 to 2m, and so on, until the last column $a_{1n}, a_{2n}, \dots, a_{mn}$ occupies positions (n − 1)m + 1 to nm. This column-wise concatenation ensures that the resulting vector preserves the vertical structure of each column while linearizing the entire matrix.[9] The standard mathematical notation for this operation is $\operatorname{vec}(A)$, which denotes the column vector obtained via the described stacking. This notation is widely adopted in linear algebra and has been formalized in foundational treatments of matrix calculus.[10] Alternative terms, such as the inverse of matricization from tensor analysis, occasionally appear but are less emphasized in core matrix theory contexts.[11] The precise positioning of entries within $\operatorname{vec}(A)$ follows an index formula: for $1 \le i \le m$ and $1 \le j \le n$, the k-th entry with $k = (j-1)m + i$ equals $a_{ij}$:

$$[\operatorname{vec}(A)]_{(j-1)m+i} = a_{ij}.$$

This mapping systematically assigns matrix indices to vector positions, facilitating algebraic manipulations.[9] Column-major stacking is the conventional procedure in mathematical literature for the $\operatorname{vec}$ operator, corresponding to the default storage format in numerical software like Fortran. In contrast, row-major order, where rows are stacked sequentially, is common in computing environments influenced by C, such as NumPy in Python; consistency in the choice is essential to prevent discrepancies in index-based operations or implementations.[10]

Core Properties
Kronecker Product Compatibility
One of the fundamental properties of the vectorization operator is its compatibility with the Kronecker product in the context of matrix multiplication. For conformable matrices A, B, and C, the identity states that

$$\operatorname{vec}(ABC) = (C^{\mathrm{T}} \otimes A)\operatorname{vec}(B).$$

This relation, first systematically explored in the context of system theory and matrix calculus, transforms the product of three matrices into a linear operation on the vectorized form of the middle matrix. To derive this identity, consider the column-wise structure of matrix multiplication. The j-th column of ABC is given by $ABc_j$, where $c_j$ is the j-th column of C. Writing $c_j = \sum_k c_{kj} e_k$ with $e_k$ the standard basis vectors and $b_k$ the columns of B, it follows that $ABc_j = \sum_k c_{kj} A b_k$. Vectorizing this column yields $(c_j^{\mathrm{T}} \otimes A)\operatorname{vec}(B)$. Stacking over all j produces $\operatorname{vec}(ABC) = (C^{\mathrm{T}} \otimes A)\operatorname{vec}(B)$.[12] This identity is particularly useful for reformulating matrix equations into vectorized linear systems, which can be solved using standard techniques for vectors. For instance, equations of the form $AXB = C$ can be rewritten as $(B^{\mathrm{T}} \otimes A)\operatorname{vec}(X) = \operatorname{vec}(C)$, enabling efficient computation via Kronecker-structured matrices, especially in applications like least squares problems or Sylvester equations in control theory.[13] As a concrete example, consider the two-matrix case, which is a special instance with C equal to the identity. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$. Then $AB = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$, so $\operatorname{vec}(AB) = [19, 43, 22, 50]^{\mathrm{T}}$. Using the identity with $C = I_2$, $\operatorname{vec}(AB) = (I_2 \otimes A)\operatorname{vec}(B)$, where $\operatorname{vec}(B) = [5, 7, 6, 8]^{\mathrm{T}}$. Computing this gives $A[5, 7]^{\mathrm{T}} = [19, 43]^{\mathrm{T}}$ and $A[6, 8]^{\mathrm{T}} = [22, 50]^{\mathrm{T}}$, matching upon stacking.[14] The three-matrix form generalizes naturally to chains of multiplications, such as ABCD, by grouping BC as the middle term, preserving the structure for longer products while avoiding explicit computation of intermediates. This compatibility underscores vectorization's role in linear algebra, bridging matrix operations with vector spaces through the Kronecker product.

Hadamard Product Compatibility
The vectorization operator exhibits compatibility with the Hadamard (elementwise) product, serving as an algebra homomorphism between the space of matrices equipped with the Hadamard product and the corresponding vector space under elementwise multiplication. Specifically, for matrices A and B of compatible dimensions (i.e., the same size), the identity

$$\operatorname{vec}(A \circ B) = \operatorname{vec}(A) \circ \operatorname{vec}(B)$$

holds, where $\circ$ denotes the Hadamard product on both sides.[15] This relation underscores the structural alignment between matrix and vector representations under entrywise operations. A proof sketch follows directly from the definition of vectorization: the operator stacks matrix columns into a single column vector while preserving the relative positions of all entries. Since the Hadamard product multiplies entries at identical positions in A and B, the resulting matrix has entries that, when stacked, match the elementwise product of the already-stacked vectors $\operatorname{vec}(A)$ and $\operatorname{vec}(B)$.[16] Thus, the reshaping inherent to vectorization commutes with the entrywise multiplication. This homomorphism property has implications for componentwise functions applied to matrices of equal dimensions, enabling such operations to be performed equivalently in vectorized form. For instance, if a scalar function f is applied elementwise to a matrix A, then $\operatorname{vec}(f(A)) = f(\operatorname{vec}(A))$, where f on the vector is also applied elementwise; this facilitates computations in vector spaces while maintaining consistency with matrix algebra. To illustrate, consider the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}.$$

The Hadamard product is

$$A \circ B = \begin{bmatrix} 5 & 12 \\ 21 & 32 \end{bmatrix},$$

so $\operatorname{vec}(A \circ B) = [5, 21, 12, 32]^{\mathrm{T}}$. Meanwhile, $\operatorname{vec}(A) = [1, 3, 2, 4]^{\mathrm{T}}$ and $\operatorname{vec}(B) = [5, 7, 6, 8]^{\mathrm{T}}$, and their Hadamard product yields $[5, 21, 12, 32]^{\mathrm{T}}$, confirming the identity.[15]

Frobenius Inner Product Preservation
The vectorization operator preserves the structure of the Frobenius inner product defined on matrices. For any two real m × n matrices A and B, the Frobenius inner product is given by $\langle A, B\rangle_F = \operatorname{tr}(A^{\mathrm{T}} B)$, which equals the standard Euclidean inner product of their vectorizations: $\operatorname{vec}(A)^{\mathrm{T}}\operatorname{vec}(B)$.[17] This identity establishes a direct correspondence between the bilinear form on the matrix space and the one on the vector space induced by vectorization. To see why this holds, expand the trace expression. The trace $\operatorname{tr}(A^{\mathrm{T}} B) = \sum_{i,j} a_{ij} b_{ij}$, which is the sum of the products of corresponding entries in A and B. The vectorization stacks the columns of each matrix sequentially, so $\operatorname{vec}(A)^{\mathrm{T}}\operatorname{vec}(B)$ computes exactly the same sum of entry-wise products, confirming the equality.[17] This preservation implies that vectorization is an isometry between the matrix space equipped with the Frobenius inner product and the Euclidean vector space. In particular, it is a unitary linear transformation that maintains distances and angles under these metrics, ensuring $\|A\|_F = \|\operatorname{vec}(A)\|_2$, where $\|\cdot\|_F$ denotes the Frobenius norm and $\|\cdot\|_2$ the Euclidean norm. For illustration, consider the matrices

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}.$$

The Frobenius inner product is $1 \cdot 5 + 2 \cdot 6 + 3 \cdot 7 + 4 \cdot 8 = 70$. The vectorizations are $[1, 3, 2, 4]^{\mathrm{T}}$ and $[5, 7, 6, 8]^{\mathrm{T}}$, with dot product $5 + 21 + 12 + 32 = 70$, verifying the preservation.

Linear and Structural Aspects
Representation as a Linear Transformation
The vectorization operator, denoted $\operatorname{vec}$, maps an m × n matrix A to a column vector in $\mathbb{R}^{mn}$ by stacking the columns of A. This operator is linear, meaning that for any m × n matrices A, B and scalars $\alpha, \beta$,

$$\operatorname{vec}(\alpha A + \beta B) = \alpha \operatorname{vec}(A) + \beta \operatorname{vec}(B).$$

The proof follows directly from the stacking definition: the ((j − 1)m + i)-th entry of $\operatorname{vec}(\alpha A + \beta B)$ is $\alpha a_{ij} + \beta b_{ij}$, where $a_{ij}$ and $b_{ij}$ are the corresponding entries of A and B, matching the ((j − 1)m + i)-th entry of $\alpha \operatorname{vec}(A) + \beta \operatorname{vec}(B)$.[17] An explicit representation of vectorization as a linear combination arises from the standard basis of $\mathbb{R}^n$. For an m × n matrix A,

$$\operatorname{vec}(A) = \sum_{j=1}^{n} e_j \otimes a_j,$$

where $\otimes$ denotes the Kronecker product, $e_j$ is the j-th standard basis vector of $\mathbb{R}^n$, and $a_j$ is the j-th column of A. This sum stacks the columns in order to reconstruct the vectorized form, underscoring the operator's linearity over the matrix space. The vectorization operator itself can be viewed, once the entries of the matrix are listed in some fixed order, as multiplication by an mn × mn permutation matrix that reorders the entries to achieve column stacking, though the basis sum provides a constructive linear expression without explicit enumeration of the permutation.[17] As a linear map from the space of m × n matrices (isomorphic to $\mathbb{R}^{mn}$) to $\mathbb{R}^{mn}$, it preserves dimension and is bijective, with the inverse given by reshaping the vector into the original matrix layout.[17]

Isomorphism Between Matrix and Vector Spaces
The vectorization operator defines a linear isomorphism between the space of m × n real matrices, denoted $\mathbb{R}^{m \times n}$, and the Euclidean vector space $\mathbb{R}^{mn}$. This mapping stacks the columns of a matrix into a single column vector, establishing a bijective correspondence that preserves the underlying vector space structure. As a linear transformation, it maintains addition and scalar multiplication: $\operatorname{vec}(A + B) = \operatorname{vec}(A) + \operatorname{vec}(B)$ and $\operatorname{vec}(cA) = c\operatorname{vec}(A)$ for any matrices A, B and scalar c.[18] The bijectivity of $\operatorname{vec}$ ensures it is both injective and surjective. Injectivity holds because distinct matrices differ in at least one entry, which corresponds to a differing component in their vectorized forms, so $\operatorname{vec}(A) = \operatorname{vec}(B)$ implies $A = B$. Surjectivity follows from the explicit inverse, unvectorization (or reshaping), which partitions any vector in $\mathbb{R}^{mn}$ into n segments of length m to form the columns of a unique matrix in $\mathbb{R}^{m \times n}$. This invertibility confirms that $\operatorname{vec}$ is a one-to-one correspondence, rendering the two spaces structurally equivalent.[18] Through this isomorphism, $\mathbb{R}^{m \times n}$ inherits all vector space properties from $\mathbb{R}^{mn}$, including dimension mn, linear independence, and spanning sets, with operations defined via the inverse map. For instance, matrix addition and scalar multiplication align directly with those in the vector space under $\operatorname{vec}$. The standard basis $\{E_{ij}\}$, $1 \le i \le m$, $1 \le j \le n$, for $\mathbb{R}^{m \times n}$, where $E_{ij}$ has a 1 in position (i, j) and zeros elsewhere, maps to the basis $\{e_j \otimes e_i\}$ in $\mathbb{R}^{mn}$, with $e_j \in \mathbb{R}^n$ and $e_i \in \mathbb{R}^m$ (noting the column-stacking convention). This basis correspondence underscores the equivalence, allowing matrices to be treated as vectors in abstract linear algebra contexts.[18] This isomorphism enables the application of vector-based algorithms to matrix problems by transforming them into equivalent vector formulations. For example, the matrix equation $AXB = C$ (with $A \in \mathbb{R}^{p \times m}$, $X \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times q}$) vectorizes to $(B^{\mathrm{T}} \otimes A)\operatorname{vec}(X) = \operatorname{vec}(C)$, which can then be solved using standard vector least squares methods when overdetermined, as illustrated in the sketch below. Such transformations simplify computations in optimization and statistics, leveraging efficient vector space solvers without altering the problem's structure.[18]
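As a sketch of this use of the isomorphism (assuming, for simplicity, a square and invertible Kronecker system), the matrix equation AXB = C can be solved in NumPy by vectorizing both sides:

import numpy as np

vec = lambda M: M.flatten(order='F')
A = np.array([[2., 1.], [0., 3.]])
B = np.array([[1., 4.], [2., 1.]])
X_true = np.array([[1., 2.], [3., 4.]])
C = A @ X_true @ B                          # right-hand side generated from a known X

# vec(AXB) = (B^T kron A) vec(X), so solve the linear system for vec(X)
x = np.linalg.solve(np.kron(B.T, A), vec(C))
X = x.reshape(A.shape[1], B.shape[0], order='F')   # un-vectorize (column-major reshape)
print(np.allclose(X, X_true))               # True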
Variants
Half-Vectorization
In the context of symmetric matrices, half-vectorization provides a compact representation by focusing on the unique elements. For an n × n symmetric matrix A, the half-vectorization operator, denoted $\operatorname{vech}(A)$, produces an $n(n+1)/2 \times 1$ column vector by stacking the elements of the lower triangular part of A, including those on the main diagonal.[19] This approach leverages the symmetry to exclude redundant upper triangular entries.[19] For instance, consider the symmetric matrix

$$A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}.$$

Here, $\operatorname{vech}(A) = [a, b, d]^{\mathrm{T}}$.[19] The half-vectorization relates directly to the full vectorization operator via $\operatorname{vech}(A) = L_n \operatorname{vec}(A)$, where $L_n$ is the elimination matrix, a selection matrix with 1's placed in the positions corresponding to the lower triangular entries of A, and 0's elsewhere.[19] This matrix effectively discards the duplicated elements above the diagonal in the full vectorized form.[19] A key advantage of half-vectorization lies in its dimensionality reduction from $n^2$ to $n(n+1)/2$ elements, which is essential for efficient parameterizations of symmetric matrices in applications requiring uniqueness without redundancy.[19] To illustrate for a larger matrix, take the 3 × 3 symmetric matrix

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}.$$

The half-vectorization yields $\operatorname{vech}(A) = [1, 2, 3, 4, 5, 6]^{\mathrm{T}}$, a 6-dimensional vector.[19] In contrast, the full vectorization produces $\operatorname{vec}(A) = [1, 2, 3, 2, 4, 5, 3, 5, 6]^{\mathrm{T}}$, a 9-dimensional vector that repeats the off-diagonal elements 2, 3, and 5.[19] This comparison highlights how half-vectorization streamlines the representation while preserving all distinct information.[19]

Duplication and Elimination Matrices
In the context of vectorizing symmetric matrices, the elimination matrix $L_n$ and the duplication matrix $D_n$ provide the linear algebraic framework for converting between the full vectorization $\operatorname{vec}(A)$ and the half-vectorization $\operatorname{vech}(A)$.[20] The elimination matrix $L_n$ is a $\tfrac{n(n+1)}{2} \times n^2$ matrix such that $\operatorname{vech}(A) = L_n \operatorname{vec}(A)$ for any symmetric n × n matrix A. Its entries consist solely of 0s and 1s, with a 1 in row r and column s exactly when the s-th position of $\operatorname{vec}(A)$ corresponds to the r-th lower triangular entry of A (including the diagonal), and 0 otherwise; this structure selects only the unique elements while discarding the redundant upper triangular entries.[20] The duplication matrix $D_n$ is the $n^2 \times \tfrac{n(n+1)}{2}$ matrix satisfying $D_n \operatorname{vech}(A) = \operatorname{vec}(A)$ for symmetric A. The entries of $D_n$ are also 0s and 1s: for each column corresponding to a diagonal element $a_{ii}$, there is a single 1 in the row matching the (i, i) position of $\operatorname{vec}(A)$; for each off-diagonal element $a_{ij}$ (i > j) in $\operatorname{vech}(A)$, the column has 1s in both the rows corresponding to the (i, j) and (j, i) positions of $\operatorname{vec}(A)$, thereby duplicating the value to enforce symmetry.[20] These matrices exhibit complementary properties: $D_n$ has full column rank $n(n+1)/2$, while $L_n$ has full row rank $n(n+1)/2$; $L_n$ is a partial isometry, satisfying $L_n L_n^{\mathrm{T}} = I_{n(n+1)/2}$. The composition $L_n D_n = I_{n(n+1)/2}$ holds unconditionally, reflecting that applying $L_n$ after $D_n$ recovers the unique elements exactly. In contrast, $D_n L_n$ copies the lower triangle of a matrix onto its upper triangle; for symmetric A, $D_n L_n \operatorname{vec}(A) = \operatorname{vec}(A)$, so $D_n L_n$ acts as the identity on the subspace of vectorized symmetric matrices.[20] For an explicit example with n = 2, consider the symmetric matrix

$$A = \begin{bmatrix} a & b \\ b & d \end{bmatrix},$$

where $\operatorname{vec}(A) = [a, b, b, d]^{\mathrm{T}}$ and $\operatorname{vech}(A) = [a, b, d]^{\mathrm{T}}$. The elimination matrix is

$$L_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

so $L_2 \operatorname{vec}(A) = [a, b, d]^{\mathrm{T}} = \operatorname{vech}(A)$. The duplication matrix is

$$D_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$

so $D_2 \operatorname{vech}(A) = [a, b, b, d]^{\mathrm{T}} = \operatorname{vec}(A)$. This construction verifies the relation and illustrates how $D_2$ duplicates the off-diagonal entry b across symmetric positions, as demonstrated numerically in the sketch below.[20]
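A minimal NumPy sketch that builds $L_n$ and $D_n$ from their defining 0/1 patterns (illustrative helper functions, not library routines) and checks the relations above:

import numpy as np

def elimination_matrix(n):
    # Rows select the lower-triangular (column-stacked) positions of vec(A).
    pairs = [(i, j) for j in range(n) for i in range(j, n)]   # vech ordering
    L = np.zeros((len(pairs), n * n))
    for r, (i, j) in enumerate(pairs):
        L[r, j * n + i] = 1.0
    return L

def duplication_matrix(n):
    # Columns copy each vech entry into one or two positions of vec(A).
    pairs = [(i, j) for j in range(n) for i in range(j, n)]
    D = np.zeros((n * n, len(pairs)))
    for c, (i, j) in enumerate(pairs):
        D[j * n + i, c] = 1.0
        D[i * n + j, c] = 1.0          # same row when i == j
    return D

n = 3
L, D = elimination_matrix(n), duplication_matrix(n)
A = np.array([[1., 2., 3.], [2., 4., 5.], [3., 5., 6.]])   # symmetric
vec_A = A.flatten(order='F')
print(np.allclose(L @ vec_A, [1, 2, 3, 4, 5, 6]))          # vech(A): True
print(np.allclose(D @ (L @ vec_A), vec_A))                 # D L acts as the identity here: True
print(np.allclose(L @ D, np.eye(n * (n + 1) // 2)))        # L D = I: True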
Applications
In Matrix Calculus
In matrix calculus, vectorization facilitates the computation of derivatives for functions defined on matrix arguments by transforming matrix operations into vector operations, particularly through the use of Kronecker products for Jacobians and Hessians. For a matrix-valued function $F \colon \mathbb{R}^{n \times q} \to \mathbb{R}^{m \times p}$, the Jacobian matrix at a point X is defined as

$$\mathrm{D}F(X) = \frac{\partial \operatorname{vec} F(X)}{\partial (\operatorname{vec} X)^{\mathrm{T}}},$$

which is an $mp \times nq$ matrix. This formulation leverages the Kronecker product identity $\operatorname{vec}(ABC) = (C^{\mathrm{T}} \otimes A)\operatorname{vec}(B)$ to express chain rules in a compact vectorized form, enabling systematic differentiation of composite functions.[21] For scalar-valued functions $\varphi \colon \mathbb{R}^{n \times q} \to \mathbb{R}$, the Hessian is represented in vectorized form as

$$\mathrm{H}\varphi(X) = \frac{\partial^2 \varphi(X)}{\partial \operatorname{vec} X \, \partial (\operatorname{vec} X)^{\mathrm{T}}},$$

obtained from second-order differentials using the same Kronecker product identities to handle products and compositions. The second differential is $\mathrm{d}^2\varphi = (\mathrm{d}\operatorname{vec} X)^{\mathrm{T}} \, \mathrm{H}\varphi(X) \, \mathrm{d}\operatorname{vec} X$, where the symmetric Hessian matrix ensures compatibility with quadratic forms in optimization and statistical applications. This approach simplifies the application of chain rules by vectorizing higher-order terms.[21] A concrete example is the quadratic trace function $\varphi(X) = \operatorname{tr}(X^{\mathrm{T}} A X)$, where A is fixed. The first differential is $\mathrm{d}\varphi = \operatorname{tr}\big(X^{\mathrm{T}}(A + A^{\mathrm{T}})\,\mathrm{d}X\big) = \operatorname{vec}\big((A + A^{\mathrm{T}})X\big)^{\mathrm{T}} \mathrm{d}\operatorname{vec} X$, so the vectorized gradient with respect to X is $\operatorname{vec}\big((A + A^{\mathrm{T}})X\big)$; the Kronecker structure $\operatorname{vec}(AX) = (I \otimes A)\operatorname{vec}(X)$ handles such left-multiplication terms.[17] The systematic use of vectorization in matrix differentiation was popularized in the seminal text by Magnus and Neudecker, which established these rules for handling Jacobians and Hessians in a vec-based framework to avoid index-heavy computations.[22]
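A numerical spot-check of the gradient formula for the quadratic trace function above (a sketch; the random matrices and finite-difference step are illustrative):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 2))
vec = lambda M: M.flatten(order='F')

phi = lambda X: np.trace(X.T @ A @ X)
grad_analytic = vec((A + A.T) @ X)          # vectorized gradient derived above

# Central finite-difference approximation of d(phi)/d(vec X)
eps = 1e-6
grad_fd = np.zeros(X.size)
for k in range(X.size):
    dX = np.zeros(X.size)
    dX[k] = eps
    dX = dX.reshape(X.shape, order='F')
    grad_fd[k] = (phi(X + dX) - phi(X - dX)) / (2 * eps)
print(np.allclose(grad_analytic, grad_fd, atol=1e-5))   # True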
In Multivariate Statistics and Optimization

In multivariate statistics, vectorization facilitates the computation of moments for random matrices derived from multivariate normal distributions. For a random vector $x \sim N(\mu, \Sigma)$ in $\mathbb{R}^n$, the expected value of the vectorized outer product is $E[\operatorname{vec}(x x^{\mathrm{T}})] = \operatorname{vec}(\Sigma + \mu\mu^{\mathrm{T}})$, which follows directly from the linearity of the expectation operator and the known second-moment formula for the multivariate normal. This expression is fundamental for deriving higher-order moments and cumulants of functions involving random matrices, such as those arising in sample covariance structures, where stacking columns allows for compact representation using Kronecker products. For instance, the covariance of $\operatorname{vec}(x x^{\mathrm{T}})$ involves the commutation matrix $K_{nn}$ and takes the form $(I_{n^2} + K_{nn})(\Sigma \otimes \Sigma)$ when $\mu = 0$, enabling efficient calculation of variability in matrix-valued statistics. Vectorization also plays a key role in asymptotic theory and sensitivity analysis for estimators in multivariate settings, particularly for covariance matrix estimation. In large-sample approximations, the vectorized sample covariance $\sqrt{N}\,\operatorname{vec}(S - \Sigma)$, where $S = \tfrac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})^{\mathrm{T}}$, converges in distribution to a normal with mean zero and asymptotic covariance $(I_{n^2} + K_{nn})(\Sigma \otimes \Sigma)$ under multivariate normality, providing the basis for Wald-type tests and confidence regions. For parameter sensitivity, such as in maximum likelihood estimation of $\Sigma$, the influence function of the estimator can be expressed in vectorized form as $\operatorname{vec}\big((x - \mu)(x - \mu)^{\mathrm{T}} - \Sigma\big)$, allowing assessment of robustness to outliers in high-dimensional data.[23] This vectorized sensitivity measure quantifies how perturbations in individual observations affect the entire covariance structure, aiding in diagnostic checks for model adequacy. In optimization contexts, vectorization transforms matrix constraints into vector spaces, simplifying semidefinite programming (SDP) formulations involving positive semidefinite matrices. For an SDP of the form $\min_X \langle C, X\rangle$ subject to $\langle A_i, X\rangle = b_i$ and $X \succeq 0$, vectorizing yields $\min_x \operatorname{vec}(C)^{\mathrm{T}} x$ subject to $\operatorname{vec}(A_i)^{\mathrm{T}} x = b_i$, with the constraint that the matrix reshaped from $x = \operatorname{vec}(X)$ be positive semidefinite; for symmetric X, half-vectorization reduces redundancy by mapping to the lower triangular part, roughly halving the dimension while preserving the cone structure via duplication matrices. This reformulation is crucial for efficient interior-point solvers in applications like portfolio optimization, where covariance constraints are imposed on vectorized risk measures. Half-vectorization is particularly useful here for symmetric positive definite constraints in covariance modeling. Vectorization extends to statistical diagnostics in multivariate regression, where influence functions and leverage scores are computed in stacked form to evaluate observation impact. In multivariate linear models, the influence function for the coefficient matrix can be written in vectorized coordinates, revealing how single cases affect parameter estimates across responses.[24] Leverage scores, generalizing univariate diagnostics, are derived from the diagonal of the hat matrix $H = X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}$ acting in the vectorized space, to identify influential designs in joint response modeling. A representative application is the vectorized least squares estimator in matrix-variate regression, where observations $Y = XB + E$, with an $N \times q$ response matrix Y, an $N \times p$ design matrix X, and an error matrix E following a matrix normal distribution with row and column covariance structure, yield the vectorized model $\operatorname{vec}(Y) = (I_q \otimes X)\operatorname{vec}(B) + \operatorname{vec}(E)$. The ordinary least squares solution is then $\operatorname{vec}(\hat{B}) = \big(I_q \otimes (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}\big)\operatorname{vec}(Y)$, simplifying inference under matrix normal errors and enabling tests for structured coefficients in multi-response settings like econometrics, as sketched in the code example below.[25]
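A small NumPy sketch of the vectorized least squares estimator just described (simulated data; the matrix names and sizes are illustrative), showing that it coincides with column-by-column OLS:

import numpy as np

rng = np.random.default_rng(1)
N, p, q = 50, 3, 2
X = rng.standard_normal((N, p))             # design matrix
B_true = rng.standard_normal((p, q))        # coefficient matrix
Y = X @ B_true + 0.1 * rng.standard_normal((N, q))
vec = lambda M: M.flatten(order='F')

# Vectorized estimator: vec(B_hat) = (I_q kron (X^T X)^{-1} X^T) vec(Y)
P = np.linalg.solve(X.T @ X, X.T)           # (X^T X)^{-1} X^T
vec_B_hat = np.kron(np.eye(q), P) @ vec(Y)
B_hat = vec_B_hat.reshape(p, q, order='F')

# Equivalent column-by-column OLS
B_cols = np.column_stack([np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(q)])
print(np.allclose(B_hat, B_cols))           # True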
Computational Implementation
Support in Programming Languages
In MATLAB and GNU Octave, the standard way to vectorize a matrix A is the syntax A(:), which stacks the elements column by column to form a column vector in column-major order. This operation is fundamental to MATLAB's array indexing system and is compatible across both environments. GNU Octave extends this with a built-in vech function that performs half-vectorization by stacking the lower triangular portion (including the diagonal) of a symmetric square matrix. In MATLAB, half-vectorization lacks a core built-in function but is supported through dedicated routines in toolboxes like the Econometrics Toolbox or via contributed functions that extract lower triangular elements without including zeros (using tril alone would include zeros when vectorized).
Python's NumPy library provides vectorization through methods like A.flatten('F') or A.ravel('F'), where the 'F' argument enforces Fortran-style column-major ordering to stack columns, aligning with the mathematical vec operator; the default 'C' order is row-major. NumPy does not include a native half-vectorization function, but the lower triangular elements of an n × n matrix can be selected with np.tril_indices(n); note that np.tril_indices enumerates the lower triangle row by row, so the result contains the vech elements but in a different order than the column-wise stacking of the mathematical vech operator.
Julia offers the vec(A) function in its base library, which reshapes the matrix into a column vector by stacking columns in column-major order, or alternatively reshape(A, :, 1) for the same effect. Julia does not have a built-in half-vectorization function in the standard library; it can be implemented manually by collecting the lower triangular elements column-wise, for example, using a comprehension like [A[i, j] for j in 1:n for i in j:n], or via packages such as VectorizationTransformations.jl which provides a vech function.[26]
In the R programming language, vectorization of a matrix uses as.vector(A) or the concatenation operator c(A), both of which arrange elements in column-major order by default, consistent with R's internal storage convention. Half-vectorization is not built into the base language but is readily available through the vech function in the matrixcalc package, which stacks the lower triangular elements of a square matrix into a single column.
The following table compares the syntax for vectorization and half-vectorization across these languages, highlighting default memory orders:
| Language/Library | Vectorization Syntax | Default Order | Half-Vectorization Syntax |
|---|---|---|---|
| MATLAB/Octave | A(:) | Column-major | Octave: vech(A); MATLAB: toolbox/contributed functions or manual extraction (tril alone includes zeros when vectorized) |
| Python (NumPy) | A.flatten('F') or A.ravel('F') | Row-major ('C') by default; 'F' gives column-major | A[np.tril_indices(n)] (elements in row-wise order) |
| Julia | vec(A) or reshape(A, :, 1) | Column-major | Manual (e.g., [A[i,j] for j=1:n for i=j:n]) or vech(A) (VectorizationTransformations.jl package) |
| R | as.vector(A) or c(A) | Column-major | vech(A) (matrixcalc package) |
Practical Examples and Considerations
In Python, using NumPy, the mathematical vectorization operator vec(A), which stacks the columns of a matrix A into a single column vector, can be implemented via the flatten method with Fortran order ('F') to ensure column-major stacking, as NumPy defaults to row-major ('C') order. For example, consider a 2×2 matrix:
import numpy as np
A = np.array([[1, 2], [3, 4]])
vec_A = A.flatten(order='F')
print(vec_A) # Output: [1 3 2 4]
# Manual column stacking for comparison (equivalent to the vec operator)
manual_vec = np.concatenate([A[:, j] for j in range(A.shape[1])])
print(np.allclose(vec_A, manual_vec))  # Output: True
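The same Kronecker identity, vec(ABC) = (Cᵀ ⊗ A) vec(B), can be checked in MATLAB or GNU Octave; a minimal sketch with small integer matrices follows.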
A = [1 2; 3 4];
B = [5 6; 7 8];
C = [9 10; 11 12];
M = A * B * C;
vec_ABC_direct = M(:);             % Direct matrix product, then vectorize
kron_vec = kron(C', A) * B(:);     % Using the identity vec(A*B*C) = kron(C', A) * vec(B)
isequal(vec_ABC_direct, kron_vec)  % Output: logical 1 (true)
