Incomplete Cholesky factorization
In numerical analysis, an incomplete Cholesky factorization of a symmetric positive definite matrix is a sparse approximation of the Cholesky factorization. An incomplete Cholesky factorization is often used as a preconditioner for algorithms like the conjugate gradient method.
The Cholesky factorization of a positive definite matrix A is A = LL* where L is a lower triangular matrix. An incomplete Cholesky factorization is given by a sparse lower triangular matrix K that is in some sense close to L. The corresponding preconditioner is KK*.
One popular way to find such a matrix K is to use the algorithm for finding the exact Cholesky decomposition in which K has the same sparsity pattern as A (any entry of K is set to zero if the corresponding entry in A is also zero). This gives an incomplete Cholesky factorization which is as sparse as the matrix A.
Motivation
Consider the following matrix as an example:

        [  5  -2   0  -2  -2 ]
        [ -2   5  -2   0   0 ]
    A = [  0  -2   5  -2   0 ]
        [ -2   0  -2   5  -2 ]
        [ -2   0   0  -2   5 ]
If we apply the full regular Cholesky decomposition, it yields:
And, by definition, multiplying L by its conjugate transpose recovers the original matrix: LL* = A.
However, by applying Cholesky decomposition, we observe that some zero elements in the original matrix end up being non-zero elements in the decomposed matrix, like elements (4,2), (5,2) and (5,3) in this example. These elements are known as "fill-ins".
This is not a problem in itself, but it is highly undesirable when working with sparse matrices, since fill-in generation is largely unpredictable and reduces the sparsity of the matrix, degrading the efficiency of sparse matrix algorithms.
Therefore, given the importance of the Cholesky decomposition in matrix calculations, it is worthwhile to modify the regular method so that it does not generate fill-ins. The incomplete Cholesky factorization does exactly that: it produces a matrix L similar to the one obtained by the regular method, but preserves the zero elements of the original matrix.
Naturally, multiplying the matrix L produced by the incomplete Cholesky factorization by its transpose does not reproduce the original matrix, but one close to it: LL* ≈ A.
Algorithm
For i from 1 to N:

    L_{ii} = ( a_{ii} − Σ_{k=1}^{i−1} L_{ik}² )^{1/2}

    For j from i+1 to N:

        L_{ji} = ( a_{ji} − Σ_{k=1}^{i−1} L_{jk} L_{ik} ) / L_{ii}

In the incomplete variant, these entries are computed only at positions where a_{ji} ≠ 0, so that the sparsity pattern of A is preserved.
Implementation
Implementation of the incomplete Cholesky factorization in the GNU Octave language. The factorization is stored as a lower triangular matrix, with the elements in the upper triangle set to zero.
function a = ichol(a)
    n = size(a,1);

    for k = 1:n
        a(k,k) = sqrt(a(k,k));
        % Scale the remaining entries of column k, but only where A is nonzero.
        for i = (k+1):n
            if (a(i,k) != 0)
                a(i,k) = a(i,k)/a(k,k);
            endif
        endfor
        % Update the trailing submatrix, again only at positions that are
        % already nonzero, so that no fill-in is created.
        for j = (k+1):n
            for i = j:n
                if (a(i,j) != 0)
                    a(i,j) = a(i,j) - a(i,k)*a(j,k);
                endif
            endfor
        endfor
    endfor

    % Zero out the strict upper triangle so that only L is returned.
    for i = 1:n
        for j = i+1:n
            a(i,j) = 0;
        endfor
    endfor
endfunction
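As a usage sketch, and assuming the function above is saved as ichol.m in the current directory (where it shadows any built-in of the same name), it can be applied to the example matrix from the Motivation section:

A = [ 5 -2  0 -2 -2;
     -2  5 -2  0  0;
      0 -2  5 -2  0;
     -2  0 -2  5 -2;
     -2  0  0 -2  5];
L = ichol(A)            % incomplete factor; keeps the zero pattern of A
norm(A - L*L', 'fro')   % nonzero, since L*L' only approximates A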
Sparse example
Consider again the matrix displayed at the beginning of this article. Since it is symmetric and the method only uses the lower triangular elements, we can represent it by its lower triangular part alone.
More specifically, in its sparse form as a coordinate list, sweeping rows first:
Value 5 -2 -2 -2 5 -2 5 -2 5 -2 5
Row 1 2 4 5 2 3 3 4 4 5 5
Col 1 1 1 1 2 2 3 3 4 4 5
Then, we take the square root of (1,1) and divide the other (i,1) elements by the result:
Value 2.24 -0.89 -0.89 -0.89 | 5 -2 5 -2 5 -2 5
Row 1 2 4 5 | 2 3 3 4 4 5 5
Col 1 1 1 1 | 2 2 3 3 4 4 5
After that, for all the other elements with column greater than 1, calculate (i,j)=(i,j)-(i,1)*(j,1) if (i,1) and (j,1) exist. For example: (5,4) = (5,4)-(5,1)*(4,1) = -2 -(-0.89*-0.89) = -2.8.
Value 2.24 -0.89 -0.89 -0.89 | 4.2 -2 5 -2 4.2 -2.8 4.2
Row 1 2 4 5 | 2 3 3 4 4 5 5
Col 1 1 1 1 | 2 2 3 3 4 4 5
The elements (2,2), (4,4), (5,4) and (5,5) have been recalculated, since they obey this rule. On the other hand, the elements (3,2), (3,3) and (4,3) won't be recalculated, since the element (3,1) doesn't exist, even though the elements (2,1) and (4,1) exist.
Now, repeat the process, but for (i,2). Take the square root of (2,2) and divide the other (i,2) elements by the result:
Value 2.24 -0.89 -0.89 -0.89 | 2.05 -0.98 | 5 -2 4.2 -2.8 4.2
Row 1 2 4 5 | 2 3 | 3 4 4 5 5
Col 1 1 1 1 | 2 2 | 3 3 4 4 5
Again, for elements with column greater than 2, calculate (i,j)=(i,j)-(i,2)*(j,2) if (i,2) and (j,2) exist:
Value 2.24 -0.89 -0.89 -0.89 | 2.05 -0.98 | 4.05 -2 4.2 -2.8 4.2
Row 1 2 4 5 | 2 3 | 3 4 4 5 5
Col 1 1 1 1 | 2 2 | 3 3 4 4 5
Repeat for (i,3). Take the square root of (3,3) and divide the other (i,3):
Value 2.24 -0.89 -0.89 -0.89 2.05 -0.98 | 2.01 -0.99 | 4.2 -2.8 4.2
Row 1 2 4 5 2 3 | 3 4 | 4 5 5
Col 1 1 1 1 2 2 | 3 3 | 4 4 5
For elements with column greater than 3, calculate (i,j)=(i,j)-(i,3)*(j,3) if (i,3) and (j,3) exist:
Value 2.24 -0.89 -0.89 -0.89 2.05 -0.98 | 2.01 -0.99 | 3.21 -2.8 4.2
Row 1 2 4 5 2 3 | 3 4 | 4 5 5
Col 1 1 1 1 2 2 | 3 3 | 4 4 5
Repeat for (i,4). Take the square root of (4,4) and divide the other (i,4):
Value 2.24 -0.89 -0.89 -0.89 2.05 -0.98 2.01 -0.99 | 1.79 -1.56 | 4.2
Row 1 2 4 5 2 3 3 4 | 4 5 | 5
Col 1 1 1 1 2 2 3 3 | 4 4 | 5
For elements with column greater than 4, calculate (i,j)=(i,j)-(i,4)*(j,4) if (i,4) and (j,4) exist:
Value 2.24 -0.89 -0.89 -0.89 2.05 -0.98 2.01 -0.99 | 1.79 -1.56 | 1.76
Row 1 2 4 5 2 3 3 4 | 4 5 | 5
Col 1 1 1 1 2 2 3 3 | 4 4 | 5
Finally take the square root of (5,5):
Value 2.24 -0.89 -0.89 -0.89 2.05 -0.98 2.01 -0.99 1.79 -1.56 | 1.33
Row 1 2 4 5 2 3 3 4 4 5 | 5
Col 1 1 1 1 2 2 3 3 4 4 | 5
Expanding the matrix to its full form:

        [  2.24   0      0      0      0    ]
        [ -0.89   2.05   0      0      0    ]
    L = [  0     -0.98   2.01   0      0    ]
        [ -0.89   0     -0.99   1.79   0    ]
        [ -0.89   0      0     -1.56   1.33 ]
Note that in this case no fill-ins were generated compared to the original matrix. The elements (4,2), (5,2) and (5,3) are still zero.
However, if we multiply L by its transpose, we get a matrix slightly different from the original one, since the decomposition did not take all the elements into account, in order to eliminate the fill-ins.
Sparse implementation
The sparse version of the incomplete Cholesky factorization (the same procedure presented above), implemented in MATLAB, can be seen below. MATLAB has its own facilities for dealing with sparse matrices, but the code below is written out explicitly for pedagogical purposes. The algorithm is efficient because it treats the matrix as a sequential one-dimensional list of nonzeros, automatically skipping the zero elements.
function A = Sp_ichol(A)
    n = size(A,1);
    ncols = A(n).col;
    c_end = 0;
    for col = 1:ncols
        is_next_col = 0;
        c_start = c_end + 1;
        for i = c_start:n
            if A(i).col == col % in the current column (col):
                if A(i).col == A(i).row
                    A(i).val = sqrt(A(i).val); % take the square root of the current column's diagonal element
                    div = A(i).val;
                else
                    A(i).val = A(i).val/div; % divide the other current column's elements by the square root of the diagonal element
                end
            end
            if A(i).col > col % in the next columns (col+1 ... ncols):
                if is_next_col == 0
                    c_end = i - 1;
                    is_next_col = 1;
                end
                v1 = 0;
                v2 = 0;
                for j = c_start:c_end
                    if A(j).col == col
                        if A(j).row == A(i).row % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) row index
                            v1 = A(j).val;
                        end
                        if A(j).row == A(i).col % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) column index
                            v2 = A(j).val;
                        end
                        if v1 ~= 0 && v2 ~= 0 % if these elements exist in the current column (col), recalculate the current element A(i):
                            A(i).val = A(i).val - v1*v2;
                            break;
                        end
                    end
                end
            end
        end
    end
end
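As a usage sketch (the driver below is not part of the original code), the coordinate list from the sparse example above can be assembled into the struct array that Sp_ichol expects, sorted by column and then by row:

vals = [ 5 -2 -2 -2  5 -2  5 -2  5 -2  5];
rows = [ 1  2  4  5  2  3  3  4  4  5  5];
cols = [ 1  1  1  1  2  2  3  3  4  4  5];
for k = 1:numel(vals)
    A(k).val = vals(k);
    A(k).row = rows(k);
    A(k).col = cols(k);
end
A = Sp_ichol(A(:));   % pass a column of structs so that size(A,1) equals the number of nonzeros
[A.val]               % approximately 2.24 -0.89 -0.89 -0.89 2.05 -0.98 2.01 -0.99 1.79 -1.56 1.33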
Incomplete Cholesky factorization
Introduction
Definition
Incomplete Cholesky factorization is an approximation to the complete Cholesky decomposition for symmetric positive definite (SPD) matrices, in which a sparse lower triangular matrix L is computed such that A ≈ LL^T.[3] This sparsity in L is enforced by dropping certain elements—either those outside a predefined sparsity pattern (such as the pattern of A itself) or those below a specified threshold—during the factorization process, thereby limiting fill-in compared to the exact factorization.[3] The method was originally developed for matrices arising from finite difference discretizations, particularly symmetric M-matrices, to ensure numerical stability. In mathematical terms, the factorization often incorporates a permutation matrix P to minimize fill-in, satisfying P^T A P = LL^T + R, where L is a sparse lower triangular matrix and R represents the matrix of dropped terms, which is typically neglected in the approximation to yield P^T A P ≈ LL^T.[3] This formulation preserves the positive definiteness of the original matrix under certain conditions on the dropping strategy, as the approximate factor remains SPD. The primary purpose of incomplete Cholesky factorization is to serve as a preconditioner in iterative solvers, such as the conjugate gradient method, for large sparse linear systems Ax = b. By approximating the exact Cholesky factors while significantly reducing computational and storage costs, it accelerates convergence without requiring the full dense factorization.[3] This makes it particularly valuable for applications in scientific computing, such as solving partial differential equations on sparse grids.

Historical Development
The development of incomplete Cholesky factorization emerged in the context of early iterative methods for solving large-scale linear systems arising in scientific computing, particularly following the introduction of the conjugate gradient method by Hestenes and Stiefel in 1952, which highlighted the need for effective preconditioners to handle sparse symmetric positive definite matrices. Initial ideas for incomplete factorizations appeared in the late 1950s and early 1960s as approximations to facilitate convergence in finite difference schemes for elliptic partial differential equations. Richard S. Varga formalized one of the earliest variants in 1960, proposing factorization-based splittings for normalized iterative methods applied to boundary value problems, motivated by the computational limitations of direct methods on emerging digital computers. Similar contributions from Buleev in 1960 and Oliphant in 1961 extended these techniques to improve the spectral properties of iteration matrices for difference operators. The 1970s marked a pivotal formalization of incomplete Cholesky factorization, driven by applications in plasma physics and structural analysis where full factorizations were infeasible due to sparsity and scale. In 1977, Meijerink and van der Vorst introduced the IC(0) method, an incomplete Cholesky preconditioner that preserves the sparsity pattern of the lower triangular factor, proving its existence and positive definiteness for symmetric M-matrices and demonstrating its efficacy as a preconditioner for conjugate gradient iterations on five-point finite difference discretizations. Building on this, Kershaw proposed in 1978 a shifted incomplete Cholesky variant to mitigate breakdown issues in plasma simulation problems, where negative pivots could arise, achieving up to 200 times faster convergence than alternating direction implicit methods on challenging systems. These works established incomplete Cholesky as a robust algebraic preconditioner, emphasizing drop tolerance and pivot modifications to balance accuracy and sparsity. Extensions in the 1980s focused on enhancing robustness for general sparse matrices beyond M-matrices, incorporating numerical dropping and ordering strategies to control fill-in in preconditioners for conjugate gradient solvers. Munksgaard's 1980 approach introduced threshold-based dropping during factorization, allowing controlled fill to improve conditioning while limiting storage, particularly for irregular sparse grids in fluid dynamics. Ajiz and Jennings further advanced breakdown prevention in 1984 with a modified incomplete Cholesky scheme that ensures positive definiteness through diagonal adjustments, widely adopted for elliptic PDEs on unstructured meshes. These innovations addressed the sensitivity of early methods to matrix ordering, paving the way for broader use in sparse linear algebra libraries. By the 1990s and 2000s, incomplete Cholesky factorization was integrated into major scientific computing software packages, reflecting its maturity as a standard preconditioner for high-performance simulations. The Portable, Extensible Toolkit for Scientific Computation (PETSc), initiated in the early 1990s, incorporated IC variants including threshold dropping, enabling scalable implementations for parallel architectures in applications like geophysics and climate modeling. 
Similarly, the hypre library, developed from the late 1990s, included incomplete Cholesky options within its BoomerAMG framework for hybrid preconditioning in large-scale multiphysics problems. Post-2010 developments have emphasized acceleration on modern hardware, with GPU-optimized algorithms emerging around 2012 to parallelize the factorization and solves, achieving speedups of approximately 2 to 3 times over CPU versions for sparse systems in machine learning and computational fluid dynamics.[4] Hybrid variants combining incomplete Cholesky with multigrid or low-rank updates have further extended its applicability to exascale computing challenges. More recent advances, as of 2024, include robust implementations in half-precision arithmetic for improved energy efficiency in large-scale simulations.[5]

Mathematical Background
Complete Cholesky Factorization
The Cholesky factorization provides an exact decomposition for symmetric positive definite matrices. Specifically, for an n × n symmetric positive definite matrix A, it expresses A = LL^T, where L is a lower triangular matrix with positive diagonal entries.[6] This decomposition leverages the symmetry and positive definiteness of A to halve the storage and computational requirements compared to general LU factorizations. The algorithm derives from equating entries in the matrix equation A = LL^T. It proceeds column by column, computing the elements of L recursively. For the diagonal entry at position (i, i),

    l_{ii} = ( a_{ii} − Σ_{k=1}^{i−1} l_{ik}² )^{1/2}

and for each off-diagonal entry l_{ji} with j > i,

    l_{ji} = ( a_{ji} − Σ_{k=1}^{i−1} l_{jk} l_{ik} ) / l_{ii}

These formulas ensure that the factorization matches A exactly, with the square root operation guaranteed to be real and positive due to the positive definiteness of A.[6] Key properties of the Cholesky factorization include its computational complexity of approximately n³/3 floating-point operations for dense matrices, making it efficient for moderate-sized problems.[7] It preserves the positive definiteness of A, whose eigenvalues are all positive, and provides an exact representation for dense matrices, though it introduces fill-in (additional nonzero elements) when applied to sparse matrices. Numerically, the factorization is backward stable: the computed factor L̂ satisfies L̂L̂^T = A + ΔA, where ‖ΔA‖ is bounded by a modest multiple of the unit roundoff times ‖A‖, ensuring reliable results in finite precision arithmetic without pivoting for positive definite matrices.[8]
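A minimal column-by-column sketch of these formulas in MATLAB/Octave follows (for illustration only; the helper name chol_by_columns is illustrative, and in practice the built-in chol would be used):

function L = chol_by_columns(A)
    % Dense Cholesky factorization computed directly from the recursive formulas above.
    n = size(A, 1);
    L = zeros(n);
    for i = 1:n
        L(i, i) = sqrt(A(i, i) - sum(L(i, 1:i-1).^2));                  % diagonal entry
        for j = i+1:n
            L(j, i) = (A(j, i) - L(j, 1:i-1) * L(i, 1:i-1)') / L(i, i); % off-diagonal entries
        end
    end
end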
Challenges with Sparse Matrices

One of the primary challenges in applying complete Cholesky factorization to sparse symmetric positive definite matrices arises from the fill-in problem, where the elimination process introduces new nonzero entries in the lower triangular factor L at positions that were zero in the original matrix A. This fill-in expands the sparsity pattern of L beyond that of A, potentially transforming a matrix with O(n) nonzeros into one requiring O(n²) storage and O(n³) operations in the worst case, thereby undermining the efficiency gains expected from sparsity.[9] The extent of fill-in is highly sensitive to the ordering of the matrix rows and columns, as poor vertex ordering in the associated graph can lead to excessive new connections during elimination, amplifying storage and computational demands. Techniques such as nested dissection address this by recursively partitioning the graph into separators to minimize fill-in, achieving near-optimal reductions for structured problems like regular finite element meshes, where fill-in scales as O(n log n) under suitable ordering. However, determining an optimal ordering remains computationally intensive, often approximating NP-hard minimization problems.[10][9] For large-scale applications, such as finite element methods (FEM) in structural analysis or circuit simulations involving matrices with millions of unknowns, complete Cholesky factorization becomes prohibitive due to its superlinear growth in memory and floating-point operations; for two-dimensional FEM grids, factorization requires on the order of n^{3/2} operations and n log n storage, escalating to infeasible levels for three-dimensional problems with O(n²) costs.[9]
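The fill-in phenomenon can be observed numerically; the following MATLAB/Octave sketch (the 2-D Laplacian construction is an assumed test case) compares the nonzero counts of a sparse matrix and its complete Cholesky factor:

n = 20; e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);          % 1-D Laplacian block
A = kron(speye(n), T) + kron(T, speye(n));     % 2-D Laplacian on an n-by-n grid (SPD)
R = chol(A);                                   % complete upper triangular factor, A = R'*R
fprintf('nnz(A) = %d, nnz(chol(A)) = %d\n', nnz(A), nnz(R));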
Incomplete Factorization Variants

IC(0) Method
The IC(0) method, introduced as a basic form of incomplete Cholesky factorization, computes a sparse lower triangular matrix L such that the nonzero entries of L match exactly the sparsity pattern of the lower triangle of the symmetric positive definite matrix A, with all fill-in elements explicitly set to zero during the factorization process.[3] This approach approximates the complete Cholesky decomposition by preserving the original matrix sparsity, thereby reducing both storage requirements and computational complexity, particularly for large sparse systems arising in practical applications like partial differential equation discretizations.[3] The computation follows a modified version of the standard Cholesky algorithm, where elimination steps are restricted to the predefined sparsity pattern, ignoring any contributions from positions outside this pattern. For the diagonal elements, at step i, the value is determined by

    l_{ii}² = a_{ii} − Σ_{k < i, (i,k) ∈ S} l_{ik}²

where S denotes the sparsity pattern of A, ensuring only existing nonzero terms contribute to the sum; l_{ii} is then taken as the positive square root.[3] For off-diagonal elements with j > i and (j, i) ∈ S, the formula is

    l_{ji} = ( a_{ji} − Σ_{k < i, (j,k) ∈ S, (i,k) ∈ S} l_{jk} l_{ik} ) / l_{ii}

again limiting the summation to positions allowed by the pattern, which excludes fill-in interactions.[3] These restricted updates ensure the factorization remains within the original sparsity structure but may introduce approximation errors compared to the full decomposition. A key property of the IC(0) method is its strict preservation of sparsity, which facilitates efficient implementation in sparse matrix formats and makes it suitable as a preconditioner for iterative solvers like the conjugate gradient method.[3] However, the dropping of fill-in can lead to a loss of positive definiteness, particularly if a computed diagonal pivot becomes negative due to aggressive sparsity enforcement; in such cases, robustness is often achieved by clamping the negative value to a small positive constant (e.g., machine epsilon) or applying a diagonal shift to A prior to factorization.[3] This potential breakdown highlights a trade-off between sparsity efficiency and numerical stability, with empirical studies showing that IC(0) performs well for well-structured matrices like those from finite difference discretizations but may require modifications for more irregular sparsity patterns.[3]
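For reference, MATLAB and recent GNU Octave expose a built-in ichol whose default 'nofill' option corresponds to IC(0); the sketch below (with an assumed 2-D Laplacian test matrix) shows that the factor keeps the pattern of the lower triangle of A:

n = 20; e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(speye(n), T) + kron(T, speye(n));     % sparse SPD test matrix
L0 = ichol(A);                                 % default type 'nofill', i.e. IC(0)
fprintf('nnz(tril(A)) = %d, nnz(L0) = %d\n', nnz(tril(A)), nnz(L0));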
Higher-Level Incomplete Methods

Higher-level incomplete methods, particularly the IC(k) variants for k ≥ 1, extend the basic incomplete Cholesky factorization by allowing controlled fill-in in the lower triangular factor L to improve approximation quality while preserving sparsity. In these methods, a fill-in entry in L is retained if its generation during Gaussian elimination corresponds to a graph distance of at most k in the matrix's sparsity graph or, equivalently, a path length of at most k in the elimination tree.[11] This distance measures the minimum number of elimination steps required to produce the entry from the original nonzero pattern.[12] From a graph-theoretic viewpoint, the symmetric matrix A is associated with an undirected graph where vertices represent indices and edges indicate nonzero entries. During Cholesky elimination, fill-ins occur along paths connecting previously eliminated nodes; the level of a potential fill-in at position (i, j) with i > j is defined recursively as the smallest integer ℓ such that there exists a sequence of indices linking i and j through at most ℓ intermediate eliminations. IC(k) includes all such positions where ℓ ≤ k, with level-1 fill-ins limited to direct extensions of the original adjacency (e.g., neighbors of existing nonzeros).[11] The retention rule is implemented symbolically prior to numerical factorization. The level for a fill-in position (i, j) is computed iteratively via

    lev(i, j) = min( lev(i, j), min_{m < min(i, j)} [ lev(i, m) + lev(j, m) + 1 ] )

starting with lev(i, j) = 0 if a_{ij} ≠ 0 and lev(i, j) = ∞ otherwise, converging after finitely many steps; entries with final lev(i, j) ≤ k are kept.[11] During the numerical phase, for row j and column i, the entry is computed as l_{ji} = ( a_{ji} − Σ_m l_{jm} l_{im} ) / l_{ii}, provided the denominator is positive; for k = 1, the sum includes only immediate predecessors adjacent to both i and j in the level-0 graph.[11] These approaches yield superior preconditioners compared to the IC(0) baseline (where k = 0 enforces no fill-in), especially for ill-conditioned systems from discretized PDEs, by clustering eigenvalues more effectively and reducing iteration counts in methods like preconditioned conjugate gradients—e.g., IC(1) often halves iterations relative to IC(0) on 2D Poisson problems at the expense of roughly doubling the nonzero count in L.[11] The trade-off balances enhanced accuracy with manageable storage growth, making IC(k) suitable for large-scale applications where full factorization is prohibitive.[12]
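A minimal dense sketch of the symbolic level-of-fill computation described above is given below (the function name ic_fill_levels and the dense level matrix are illustrative simplifications; a practical implementation would work on sparse row structures):

function lev = ic_fill_levels(A, kmax)
    % Level of each potential entry: 0 for original nonzeros, infinity otherwise.
    n = size(A, 1);
    lev = inf(n);
    lev(A ~= 0) = 0;
    for k = 1:n-1                    % simulate elimination of column k
        for i = k+1:n
            for j = k+1:i
                % fill at (i,j) created through pivot k gets level lev(i,k)+lev(j,k)+1
                lev(i, j) = min(lev(i, j), lev(i, k) + lev(j, k) + 1);
                lev(j, i) = lev(i, j);   % keep the level matrix symmetric
            end
        end
    end
    lev(lev > kmax) = inf;           % IC(kmax) keeps only positions with level <= kmax
end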
Threshold Incomplete Factorization

Threshold incomplete factorization, often denoted as IC(τ), is a magnitude-based variant of incomplete Cholesky factorization designed for sparse symmetric positive definite matrices. In this approach, the factorization proceeds by computing potential fill-in elements across the entire lower triangular factor L as in the complete Cholesky decomposition, but discards those whose absolute values fall below a user-specified threshold τ during the process. This dropping strategy allows fill-in in any position but controls sparsity numerically rather than structurally, making it suitable for matrices where fill patterns are not easily predictable from graph distance.[3][13] The core computation for an off-diagonal element l_{ji} (with j > i) follows the standard Cholesky update formula

    l_{ji} = ( a_{ji} − Σ_{k=1}^{i−1} l_{jk} l_{ik} ) / l_{ii}

This value is then evaluated against the threshold: if |l_{ji}| ≥ τ, it is retained; otherwise, it is set to zero. To enhance robustness—particularly to prevent negative pivots that could cause breakdown—a modified form clamps the computed value before dropping, keeping it only if it exceeds τ and setting it to zero otherwise. Diagonal elements l_{ii} are always retained and computed as the square root of the remaining Schur complement to maintain positive definiteness.[3] Variants of this method include absolute thresholding, where τ is a fixed small positive scalar (e.g., 10^{-3}), and relative thresholding, where the drop criterion is |l_{ji}| < τ ⋅ ‖row norm of the current Schur complement‖ or τ ⋅ |l_{ii}|, adapting the tolerance to local matrix scaling. These are frequently combined with diagonal scaling of the original matrix A to normalize row sums or diagonals, improving the preconditioner's effectiveness for ill-conditioned systems. A robust extension, known as the Ajiz-Jennings modification, compensates for dropped elements by adding their contributions to the diagonals, guaranteeing the existence of a positive definite factorization under mild assumptions.[3][14] This factorization is particularly adaptive to matrix conditioning, as larger elements in poorly conditioned regions are more likely to be retained, yielding a better spectral approximation of A ≈ LL^T. Compared to level-based methods, it offers greater flexibility in controlling the trade-off between sparsity and accuracy via τ, but the resulting fill-in pattern is harder to predict a priori since it depends on numerical magnitudes rather than predefined sparsity levels. As τ decreases, the factor becomes denser and closer to the complete factorization, often reducing the number of iterations in preconditioned conjugate gradient methods by 20-50% for typical sparse problems, though at higher computational cost.[3][13]
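In MATLAB/Octave this variant is available through the built-in ichol with the 'ict' option and a drop tolerance; a small sketch (with an assumed 2-D Laplacian test matrix) illustrates how the factor densifies as τ decreases:

n = 20; e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(speye(n), T) + kron(T, speye(n));
for tau = [1e-1 1e-2 1e-3]
    Lt = ichol(A, struct('type', 'ict', 'droptol', tau));
    fprintf('tau = %g: nnz(L) = %d\n', tau, nnz(Lt));
end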
Algorithms and Computation

Basic Algorithm
The basic algorithm for incomplete Cholesky factorization of a symmetric positive definite matrix A proceeds in two distinct phases: a symbolic phase and a numeric phase. In the symbolic phase, the sparsity pattern of the lower triangular factor L is predetermined, often through graph analysis of A (such as identifying the nonzero structure via the adjacency graph) or by executing a dummy factorization that simulates fill-in while adhering to the chosen dropping criterion. This phase ensures that the positions where L may have nonzeros are fixed in advance, limiting computational overhead in the subsequent numeric phase.[3][15] The numeric phase computes the values of the nonzero entries in L such that A ≈ LL^T, by performing a sparse variant of Gaussian elimination where elements outside the symbolic pattern are dropped (set to zero). For the standard IC(0) variant, which retains the sparsity pattern of the lower triangle of A without additional fill-in, the process updates only existing positions. The following pseudocode outlines the core sequential steps for this phase, assuming row-major storage and that the sparsity pattern is precomputed:

for i = 1 to n
    sum_diag = 0
    for k < i with (i, k) in the pattern           // entries of row i to the left of the diagonal
        sum_diag += l_{i k}^2
    l_{i i} = sqrt(a_{i i} - sum_diag)
    if l_{i i} <= 0                                // breakdown handling; see below
        l_{i i} = sqrt(τ) where τ is a small positive tolerance (e.g., machine epsilon)
        // Adjust subsequent sums if needed for stability
    for j = i+1 to n (only if (j, i) is in the pattern)
        sum_off = 0
        for k < i with (j, k) and (i, k) both in the pattern
            sum_off += l_{j k} * l_{i k}
        l_{j i} = (a_{j i} - sum_off) / l_{i i}
        // Drop l_{j i} if |l_{j i}| < δ (threshold, if applicable beyond IC(0))
Advanced Algorithmic Modifications
Advanced algorithmic modifications to incomplete Cholesky factorization address limitations in scalability, numerical stability, and adaptability, particularly for large-scale sparse systems arising in scientific computing. Task-based parallelism leverages the elimination tree structure of the matrix to enable concurrent processing during factorization. For instance, a task-parallel approach using a 2D partitioned-block layout decomposes the sparse matrix into blocks, allowing independent tasks for computing Schur complements and updates, which can be scheduled dynamically to minimize idle time on multicore architectures. This method extends to distributed-memory settings through left-looking or right-looking supernodal techniques, where supernodes—groups of columns with identical sparsity patterns—are factored in parallel, reducing communication overhead by processing dense blocks within supernodes before scattering updates.[17] To enhance robustness, especially in ensuring positive definiteness for preconditioning symmetric positive definite matrices, the modified incomplete Cholesky (MIC) variant adjusts dropped fill-in elements by incorporating them into the diagonal entries of the factor. In MIC, when a potential fill-in term is discarded during factorization due to a dropping criterion, its contribution is instead added to the pivot diagonal to preserve row sums and maintain the spectrum's positive definiteness, preventing factorization breakdown in ill-conditioned cases.[18] This diagonal compensation, originally proposed in schemes like Jennings-Malik, ensures the approximate factor remains positive definite even for matrices close to singularity, improving convergence in iterative solvers like conjugate gradients.[12] Regarding computational complexity, parallel implementations of these modifications offer theoretical speedup up to O(p) on p processors by exploiting the elimination tree's height for task granularity, though sparsity-induced load imbalance often limits scalability, as irregular nonzero distributions lead to uneven work across processors. Techniques like dynamic scheduling and supernodal partitioning mitigate this by balancing computational loads, achieving near-linear speedups on up to hundreds of cores for matrices with favorable sparsity patterns, such as those from finite element discretizations.[19]
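A minimal sketch of the diagonal-compensation idea follows; for brevity it applies a Jennings-Malik-style compensation to small entries dropped from A before a (complete) factorization, rather than to fill-in generated during elimination, so it only illustrates the principle rather than a production MIC code (the name ic_compensated and the drop rule are assumptions):

function L = ic_compensated(A, tau)
    % Drop small off-diagonal entries of A, moving their magnitude onto both
    % diagonals; each 2x2 modification [|a|, -a; -a, |a|] is positive semidefinite,
    % so the modified matrix stays SPD and its Cholesky factorization exists.
    n = size(A, 1);
    for i = 1:n
        for j = 1:i-1
            if A(i, j) ~= 0 && abs(A(i, j)) < tau
                A(i, i) = A(i, i) + abs(A(i, j));
                A(j, j) = A(j, j) + abs(A(i, j));
                A(i, j) = 0;
                A(j, i) = 0;
            end
        end
    end
    L = chol(A, 'lower');   % factorization of the compensated (still SPD) matrix
end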
Implementations

Dense Matrix Implementation
For dense matrices, where no inherent sparsity pattern exists, the incomplete Cholesky factorization is typically implemented using full matrix storage in languages like MATLAB or Fortran, adapting the standard Cholesky algorithm with an artificial incompleteness mechanism such as threshold dropping. This approach treats the matrix as fully dense, looping over all elements without sparsity checks, and introduces dropping to control fill-in and approximate the factorization A ≈ LL^T, where A is symmetric positive definite and L is lower triangular. The IC(0) variant, which retains only the original pattern, degenerates to the complete Cholesky for dense A, so threshold-based methods like ICT(τ) are preferred, where elements below a tolerance τ (often relative to the diagonal or column norm) are discarded during computation. Banded storage may be applied if the matrix exhibits a known bandwidth, reducing memory for semi-dense cases, but full two-dimensional arrays are standard for general dense implementations. In practice, the algorithm modifies the classical Cholesky process by computing Schur complements and applying the dropping rule inline, ensuring numerical stability for positive definite matrices. This avoids the need for sparse data structures like compressed sparse row (CSR) formats, simplifying the code but increasing computational cost to O(n³) in the worst case before dropping. For threshold dropping, the tolerance τ is chosen empirically, often starting small (e.g., 10^{-3}) to balance approximation quality and efficiency; global or column-relative thresholds can be used, with the latter scaling drops per column to handle varying matrix densities. Such implementations are straightforward in high-level languages, leveraging built-in linear algebra routines for inner products, though custom loops are needed for dropping. A representative MATLAB-like pseudocode for threshold incomplete Cholesky (ICT) on a dense matrix A of size n × n illustrates the process:

n = size(A, 1);   % A is the dense SPD input matrix (assumed given)
L = zeros(n, n);
tau = 1e-3;       % Drop tolerance
for i = 1:n
    % Compute diagonal entry
    sum_sq = 0;
    for k = 1:i-1
        sum_sq = sum_sq + L(i, k)^2;
    end
    L(i, i) = sqrt(A(i, i) - sum_sq);
    for j = i+1:n
        % Compute off-diagonal entry
        sum_prod = 0;
        for k = 1:i-1
            sum_prod = sum_prod + L(j, k) * L(i, k);
        end
        temp = (A(j, i) - sum_prod) / L(i, i);
        % Apply threshold dropping
        if abs(temp) > tau * abs(L(i, i))
            L(j, i) = temp;
        else
            L(j, i) = 0;
        end
    end
end
Sparse Matrix Implementation
Implementing incomplete Cholesky factorization for sparse matrices requires specialized data structures to exploit the sparsity pattern and minimize storage overhead. The input matrix A and the resulting lower triangular factor L are typically stored in compressed sparse row (CSR) format, which efficiently represents the non-zero entries using three arrays: values, column indices, and row pointers. This format allows for fast access during the factorization process, as updates to the Schur complement can be performed by traversing only the relevant non-zeros.[20][21] In the symbolic phase, the sparsity pattern of L is determined prior to numerical computation to allocate storage and predict fill-in. This phase models the matrix as an undirected graph, where non-zero entries correspond to edges, and uses adjacency lists to represent the graph structure for efficient traversal and elimination ordering analysis. Adjacency lists facilitate the identification of potential fill positions without storing the full dense graph, enabling scalable symbolic factorization even for large matrices.[22] Several established libraries provide robust implementations of incomplete Cholesky for sparse matrices, integrating seamlessly with high-performance computing environments. PETSc's PCICC preconditioner supports level-based incomplete Cholesky (ICC(k)) for sequential sparse matrices in AIJ format, allowing users to specify fill levels via options like -pc_factor_levels. HSL_MI28 offers an efficient limited-memory incomplete Cholesky with support for level-k fill control, suitable for general sparse symmetric matrices and emphasizing robustness against breakdowns.[23][24] To optimize performance and reduce fill-in during factorization, pre-ordering techniques are essential. Approximate minimum degree (AMD) ordering minimizes the number of non-zeros in L by selecting pivots that eliminate vertices with the fewest connections, significantly reducing computational cost for IC(0) and higher levels. Similarly, METIS can be used for multilevel ordering to balance load in parallel settings while controlling fill. For higher-level IC(k) methods, dynamic allocation strategies are employed, where the sparsity pattern of L is grown incrementally using linked lists or vector resizing to accommodate additional fill without excessive over-allocation.[25][26] Key challenges in sparse implementations include memory management for variable fill patterns, particularly in higher-level or threshold-based methods where the exact number of non-zeros is unpredictable. Limited-memory approaches, such as those in HSL_MI28, bound storage by dropping small entries or limiting column counts, but require careful tuning to avoid excessive sparsity that degrades preconditioner quality. Additionally, debugging indefiniteness—arising from negative or zero pivots during factorization—can be addressed by monitoring diagonal entries in the Schur complement; if a pivot falls below a small threshold (e.g., machine epsilon), a small diagonal shift or regularization can be applied to restore positive definiteness.[24][16][27]
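The effect of a fill-reducing ordering can be sketched in MATLAB/Octave, where amd and ichol are built-ins (the 2-D Laplacian test matrix and the drop tolerance are assumptions):

n = 20; e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(speye(n), T) + kron(T, speye(n));
opts = struct('type', 'ict', 'droptol', 1e-3);
p = amd(A);                                  % approximate minimum degree permutation
fprintf('nnz natural order: %d, nnz with AMD: %d\n', ...
        nnz(ichol(A, opts)), nnz(ichol(A(p, p), opts)));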
Applications and Examples

Use in Preconditioning
Incomplete Cholesky factorization serves as an effective preconditioner for iterative methods solving large sparse symmetric positive definite (SPD) linear systems Ax = b. In the left-preconditioned form, it transforms the system to M^{-1}Ax = M^{-1}b, where M = LL^T approximates A, with L being a lower triangular sparse matrix from the incomplete factorization. Solving systems involving M requires only forward and backward substitutions on L and L^T, which are computationally efficient due to the sparsity of L. This approach is particularly useful in methods like the preconditioned conjugate gradient (PCG) for SPD matrices and preconditioned GMRES for more general cases, as introduced in early applications such as the incomplete Cholesky-CG method.[3] The primary benefit of incomplete Cholesky preconditioning lies in its impact on convergence rates of iterative solvers. By approximating A, it clusters the eigenvalues of the preconditioned matrix M^{-1}A, reducing the effective condition number and spectral radius, which accelerates convergence in PCG and GMRES. For instance, higher-level methods like IC(k) generally provide better conditioning than IC(0) by retaining more fill-in, leading to tighter eigenvalue clustering, especially for matrices with eigenvalues already clustered around unity. This improvement is evident in PCG, where the convergence bound depends on the maximum and minimum eigenvalues of M^{-1}A, and in GMRES, where it minimizes residuals more effectively through reduced iteration counts.[3] Theoretically, for an SPD matrix A, the incomplete Cholesky factor ensures that M = LL^T remains SPD provided no negative pivots (diagonal entries) occur during factorization, preserving the positive definiteness essential for stable preconditioning. Approximation quality is quantified by the dropped terms forming a remainder R = A − LL^T, with error bounds derived from the norm ‖R‖; a small ‖R‖ (controlled by dropping strategies) guarantees that the condition number of M^{-1}A is bounded, ensuring robust convergence. The factorization computation, typically via sparse LU-like algorithms, supports this by enforcing sparsity patterns or thresholds to manage ‖R‖.[3][28] Selection of the incomplete Cholesky variant depends on the eigenvalue spectrum of A: IC(0) suffices for well-structured matrices with minimal fill, while IC(k) or threshold-based methods are preferred for spectra requiring more accurate approximation to avoid pivot issues. For added robustness, especially in cases prone to breakdown, it can be combined with symmetric successive over-relaxation (SSOR), yielding a hybrid preconditioner that enhances stability without excessive fill.[3][29]
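A short MATLAB/Octave sketch of this use is given below, passing the incomplete factor L to the built-in pcg so that the preconditioner M = LL^T is applied through two triangular solves (the test matrix, right-hand side, and tolerances are assumptions):

n = 30; e = ones(n, 1);
T = spdiags([-e 2*e -e], -1:1, n, n);
A = kron(speye(n), T) + kron(T, speye(n));     % sparse SPD system matrix
b = ones(size(A, 1), 1);
L = ichol(A, struct('type', 'ict', 'droptol', 1e-3));
[x, flag, relres, iters] = pcg(A, b, 1e-8, 500, L, L');   % preconditioned CG
[~, ~, ~, iters0] = pcg(A, b, 1e-8, 500);                 % unpreconditioned, for comparison
fprintf('iterations with IC preconditioner: %d, without: %d\n', iters, iters0);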
Numerical and Sparse Examples

To illustrate the incomplete Cholesky factorization, consider a small dense symmetric positive definite (SPD) matrix A. The IC(0) factorization, which retains the full sparsity pattern without dropping fill-ins, coincides with the complete Cholesky factorization for this matrix since no additional fill-ins occur beyond the original pattern. Using the standard algorithm, the lower triangular factor L is obtained, and the approximation error ‖A − LL^T‖ is zero, indicating an exact factorization. For the threshold-based variant with τ = 0.1, entries in L below this tolerance in absolute value are dropped during computation. In this case, all computed entries exceed 0.1, yielding the same L as IC(0) and an identical error of zero.

Now consider a sparse example with the 5×5 tridiagonal SPD matrix from the discrete 1D Laplacian, A = tridiag(−1, 2, −1), which has 13 nonzeros (nnz(A) = 13). The IC(0) factorization preserves the original tridiagonal pattern in L, dropping any potential fill-ins outside this structure, resulting in a bidiagonal L. The step-by-step computation for IC(0) proceeds as follows, using the formula l_{ji} = (a_{ji} − Σ_k l_{jk} l_{ik}) / l_{ii} for entries in the pattern (dropped if outside it) and l_{ii} = (a_{ii} − Σ_k l_{ik}²)^{1/2}, with only subdiagonal and diagonal terms retained:

- Row 1: l_{11} = sqrt(2) ≈ 1.414.
- Row 2: l_{21} = −1/1.414 ≈ −0.707, l_{22} = sqrt(2 − 0.707²) ≈ 1.225.
- Row 3: l_{32} = −1/1.225 ≈ −0.816, l_{33} = sqrt(2 − 0.816²) ≈ 1.155.
- Row 4: l_{43} = −1/1.155 ≈ −0.866, l_{44} = sqrt(2 − 0.866²) ≈ 1.118.
- Row 5: l_{54} = −1/1.118 ≈ −0.894, l_{55} = sqrt(2 − 0.894²) ≈ 1.095.
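The example can be checked with a few lines of MATLAB/Octave (assuming, as above, the 1-D Laplacian A = tridiag(−1, 2, −1)); because a tridiagonal matrix incurs no fill-in, IC(0) coincides with the complete factorization here:

n = 5; e = ones(n, 1);
A = spdiags([-e 2*e -e], -1:1, n, n);   % 5x5 discrete 1-D Laplacian
L = chol(A, 'lower');                   % bidiagonal factor: diagonals sqrt(2), sqrt(3/2), ...
full(L)
norm(full(A - L*L'), 'fro')             % essentially zero (rounding error only)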
