Kernel (linear algebra)
from Wikipedia
An example of a kernel: the linear operator maps every point on the line to the zero vector, so these points form the kernel of the linear operator.

In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the part of the domain which is mapped to the zero vector of the co-domain; the kernel is always a linear subspace of the domain.[1] That is, given a linear map L : V → W between two vector spaces V and W, the kernel of L is the vector space of all elements v of V such that L(v) = 0, where 0 denotes the zero vector in W,[2] or more symbolically:
$$\ker(L) = \{\mathbf{v} \in V \mid L(\mathbf{v}) = \mathbf{0}\} = L^{-1}(\mathbf{0}).$$

Properties

Kernel and image of a linear map L from V to W

The kernel of L is a linear subspace of the domain V.[3][2] In the linear map L : V → W, two elements of V have the same image in W if and only if their difference lies in the kernel of L, that is,
$$L(\mathbf{v}_1) = L(\mathbf{v}_2) \iff L(\mathbf{v}_1 - \mathbf{v}_2) = \mathbf{0}.$$

From this, it follows by the first isomorphism theorem that the image of L is isomorphic to the quotient of V by the kernel:
$$\operatorname{im}(L) \cong V / \ker(L).$$
In the case where V is finite-dimensional, this implies the rank–nullity theorem:
$$\dim(\ker L) + \dim(\operatorname{im} L) = \dim(V),$$
where the term rank refers to the dimension of the image of L, while nullity refers to the dimension of the kernel of L.[4] That is,
$$\operatorname{rank}(L) = \dim(\operatorname{im} L) \quad \text{and} \quad \operatorname{nullity}(L) = \dim(\ker L),$$
so that the rank–nullity theorem can be restated as
$$\operatorname{rank}(L) + \operatorname{nullity}(L) = \dim(V).$$

When V is an inner product space, the quotient $V / \ker(L)$ can be identified with the orthogonal complement in V of $\ker(L)$. This is the generalization to linear operators of the row space, or coimage, of a matrix.

Generalization to modules


The notion of kernel also makes sense for homomorphisms of modules, which are generalizations of vector spaces where the scalars are elements of a ring, rather than a field. The domain of the mapping is a module, with the kernel constituting a submodule. Here, the concepts of rank and nullity do not necessarily apply.

In functional analysis


If V and W are topological vector spaces such that W is finite-dimensional, then a linear operator L : V → W is continuous if and only if the kernel of L is a closed subspace of V.

Representation as matrix multiplication


Consider a linear map represented as an m × n matrix A with coefficients in a field K (typically the field of real or complex numbers), operating on column vectors x with n components over K. The kernel of this linear map is the set of solutions to the equation Ax = 0, where 0 is understood as the zero vector. The dimension of the kernel of A is called the nullity of A. In set-builder notation,
$$\operatorname{Null}(A) = \ker(A) = \{\mathbf{x} \in K^n \mid A\mathbf{x} = \mathbf{0}\}.$$
The matrix equation is equivalent to a homogeneous system of linear equations:
$$A\mathbf{x} = \mathbf{0} \;\Leftrightarrow\; \begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0 \\ \qquad\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0. \end{cases}$$
Thus the kernel of A is the same as the solution set to the above homogeneous equations.
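As a concrete aid, here is a minimal sketch (assuming SymPy is available; the matrix is an arbitrary example, not one taken from the text) of computing this solution set in exact arithmetic:

```python
# Minimal sketch (assumes SymPy): the kernel of A is the solution set of A x = 0,
# which SymPy returns as a list of basis vectors.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6]])              # rank 1, so the nullity is 3 - 1 = 2

basis = A.nullspace()                # basis of {x : A x = 0}
for v in basis:
    assert A * v == Matrix([0, 0])   # every basis vector is mapped to the zero vector
print(len(basis))                    # 2: two basis vectors, (-2, 1, 0) and (-3, 0, 1)
```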

Subspace properties


The kernel of an m × n matrix A over a field K is a linear subspace of $K^n$. That is, the kernel of A, the set Null(A), has the following three properties:

  1. Null(A) always contains the zero vector, since A0 = 0.
  2. If x ∈ Null(A) and y ∈ Null(A), then x + y ∈ Null(A), since A(x + y) = Ax + Ay = 0 + 0 = 0. This follows from the distributivity of matrix multiplication over addition.
  3. If x ∈ Null(A) and c is a scalar cK, then cx ∈ Null(A), since A(cx) = c(Ax) = c0 = 0.

The row space of a matrix


The product Ax can be written in terms of the dot product of vectors as follows:
$$A\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 \cdot \mathbf{x} \\ \mathbf{a}_2 \cdot \mathbf{x} \\ \vdots \\ \mathbf{a}_m \cdot \mathbf{x} \end{bmatrix}.$$

Here, a1, ... , am denote the rows of the matrix A. It follows that x is in the kernel of A, if and only if x is orthogonal (or perpendicular) to each of the row vectors of A (since orthogonality is defined as having a dot product of 0).

The row space, or coimage, of a matrix A is the span of the row vectors of A. By the above reasoning, the kernel of A is the orthogonal complement to the row space. That is, a vector x lies in the kernel of A, if and only if it is perpendicular to every vector in the row space of A.

The dimension of the row space of A is called the rank of A, and the dimension of the kernel of A is called the nullity of A. These quantities are related by the rank–nullity theorem[4]
$$\operatorname{rank}(A) + \operatorname{nullity}(A) = n.$$
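A minimal numerical sketch of this relation (assuming NumPy; the matrix is an arbitrary rank-2 example):

```python
# Minimal sketch (assumes NumPy): rank(A) + nullity(A) = n for an m x n matrix.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])       # third row = 2 * second - first, so rank 2

rank = np.linalg.matrix_rank(A)       # dimension of the row (or column) space
nullity = A.shape[1] - rank           # dimension of the kernel
print(rank, nullity)                  # 2 1
```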

Left null space


The left null space, or cokernel, of a matrix A consists of all column vectors x such that x^T A = 0^T, where ^T denotes the transpose of a matrix. The left null space of A is the same as the kernel of A^T. The left null space of A is the orthogonal complement to the column space of A, and is dual to the cokernel of the associated linear transformation. The kernel, the row space, the column space, and the left null space of A are the four fundamental subspaces associated with the matrix A.

Nonhomogeneous systems of linear equations


The kernel also plays a role in the solution to a nonhomogeneous system of linear equations:
$$A\mathbf{x} = \mathbf{b}.$$
If u and v are two possible solutions to the above equation, then
$$A(\mathbf{u} - \mathbf{v}) = A\mathbf{u} - A\mathbf{v} = \mathbf{b} - \mathbf{b} = \mathbf{0}.$$
Thus, the difference of any two solutions to the equation Ax = b lies in the kernel of A.

It follows that any solution to the equation Ax = b can be expressed as the sum of a fixed solution v and an arbitrary element of the kernel. That is, the solution set to the equation Ax = b is
$$\{\mathbf{v} + \mathbf{x} \mid A\mathbf{v} = \mathbf{b} \;\text{and}\; \mathbf{x} \in \operatorname{Null}(A)\}.$$
Geometrically, this says that the solution set to Ax = b is the translation of the kernel of A by the vector v. See also Fredholm alternative and flat (geometry).
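A minimal sketch of this structure of the solution set (assuming NumPy and SciPy; the particular system is hypothetical):

```python
# Minimal sketch (assumes NumPy/SciPy): every solution of A x = b is a fixed
# particular solution plus an element of the kernel of A.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
b = np.array([6.0, 2.0])

v, *_ = np.linalg.lstsq(A, b, rcond=None)   # one particular solution of A v = b
N = null_space(A)                           # orthonormal basis of ker(A), shape (3, 1)

x = v + 5.0 * N[:, 0]                       # translate by an arbitrary kernel element
print(np.allclose(A @ x, b))                # True: x still solves A x = b
```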

Illustration


The following is a simple illustration of the computation of the kernel of a matrix (see § Computation by Gaussian elimination, below for methods better suited to more complex calculations). The illustration also touches on the row space and its relation to the kernel.

Consider the matrix
$$A = \begin{bmatrix} 2 & 3 & 5 \\ -4 & 2 & 3 \end{bmatrix}.$$
The kernel of this matrix consists of all vectors (x, y, z) ∈ $\mathbb{R}^3$ for which
$$\begin{bmatrix} 2 & 3 & 5 \\ -4 & 2 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
which can be expressed as a homogeneous system of linear equations involving x, y, and z:
$$\begin{aligned} 2x + 3y + 5z &= 0, \\ -4x + 2y + 3z &= 0. \end{aligned}$$

The same linear equations can also be written in matrix form as:
$$\left[\begin{array}{ccc|c} 2 & 3 & 5 & 0 \\ -4 & 2 & 3 & 0 \end{array}\right].$$

Through Gauss–Jordan elimination, the matrix can be reduced to:
$$\left[\begin{array}{ccc|c} 1 & 0 & 1/16 & 0 \\ 0 & 1 & 13/8 & 0 \end{array}\right].$$

Rewriting the matrix in equation form yields:
$$x = -\tfrac{1}{16}z, \qquad y = -\tfrac{13}{8}z.$$

The elements of the kernel can be further expressed in parametric vector form, as follows:
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = c \begin{bmatrix} -1/16 \\ -13/8 \\ 1 \end{bmatrix}.$$

Since c is a free variable ranging over all real numbers, this can be expressed equally well as:
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = c \begin{bmatrix} -1 \\ -26 \\ 16 \end{bmatrix}.$$
The kernel of A is precisely the solution set to these equations (in this case, a line through the origin in $\mathbb{R}^3$). Here, the vector (−1, −26, 16)^T constitutes a basis of the kernel of A. The nullity of A is therefore 1, as the kernel is spanned by a single vector.

The following dot products are zero:
$$\begin{bmatrix} 2 & 3 & 5 \end{bmatrix} \begin{bmatrix} -1 \\ -26 \\ 16 \end{bmatrix} = 0, \qquad \begin{bmatrix} -4 & 2 & 3 \end{bmatrix} \begin{bmatrix} -1 \\ -26 \\ 16 \end{bmatrix} = 0,$$
which illustrates that vectors in the kernel of A are orthogonal to each of the row vectors of A.

These two (linearly independent) row vectors span the row space of A—a plane orthogonal to the vector (−1, −26, 16)^T.

With the rank of A equal to 2, the nullity of A equal to 1, and the number of columns of A equal to 3, we have an illustration of the rank–nullity theorem.
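Assuming SymPy is available, a minimal check of the computations in this illustration in exact arithmetic:

```python
# Minimal sketch (assumes SymPy): rank 2, nullity 1, and a kernel basis
# proportional to (-1, -26, 16) for the matrix of the illustration above.
from sympy import Matrix

A = Matrix([[2, 3, 5],
            [-4, 2, 3]])

print(A.rank())                      # 2
print(A.nullspace())                 # one basis vector, proportional to (-1, -26, 16)
v = Matrix([-1, -26, 16])
print(A * v)                         # the zero vector: v lies in the kernel
```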

Examples

  • If $L : \mathbb{R}^m \to \mathbb{R}^n$, then the kernel of L is the solution set to a homogeneous system of linear equations. As in the above illustration, if L is the operator $L(x_1, x_2, x_3) = (2x_1 + 3x_2 + 5x_3,\; -4x_1 + 2x_2 + 3x_3)$, then the kernel of L is the set of solutions to the equations $2x_1 + 3x_2 + 5x_3 = 0$ and $-4x_1 + 2x_2 + 3x_3 = 0$.
  • Let C[0,1] denote the vector space of all continuous real-valued functions on the interval [0,1], and define $L : C[0,1] \to \mathbb{R}$ by the rule $L(f) = f(0.3)$. Then the kernel of L consists of all functions f ∈ C[0,1] for which f(0.3) = 0.
  • Let $C^\infty(\mathbb{R})$ be the vector space of all infinitely differentiable functions $\mathbb{R} \to \mathbb{R}$, and let $D : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ be the differentiation operator $D(f) = \frac{df}{dx}$. Then the kernel of D consists of all functions in $C^\infty(\mathbb{R})$ whose derivatives are zero, i.e. the set of all constant functions.
  • Let $\mathbb{R}^\infty$ be the direct product of infinitely many copies of $\mathbb{R}$, and let $s : \mathbb{R}^\infty \to \mathbb{R}^\infty$ be the shift operator $s(x_1, x_2, x_3, x_4, \ldots) = (x_2, x_3, x_4, \ldots)$. Then the kernel of s is the one-dimensional subspace consisting of all vectors $(x_1, 0, 0, 0, \ldots)$.
  • If V is an inner product space and W is a subspace, the kernel of the orthogonal projection V → W is the orthogonal complement to W in V.

Computation by Gaussian elimination


A basis of the kernel of a matrix may be computed by Gaussian elimination.

For this purpose, given an m × n matrix A, we construct first the row augmented matrix
$$\begin{bmatrix} A \\ \hline I \end{bmatrix},$$
where I is the n × n identity matrix.

Computing its column echelon form by Gaussian elimination (or any other suitable method), we get a matrix
$$\begin{bmatrix} B \\ \hline C \end{bmatrix}.$$
A basis of the kernel of A consists of the nonzero columns of C such that the corresponding column of B is a zero column.

In fact, the computation may be stopped as soon as the upper matrix is in column echelon form: the remainder of the computation consists in changing the basis of the vector space generated by the columns whose upper part is zero.

For example, suppose that
$$A = \begin{bmatrix} 1 & 0 & -3 & 0 & 2 & -8 \\ 0 & 1 & 5 & 0 & -1 & 4 \\ 0 & 0 & 0 & 1 & 7 & -9 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
Then
$$\begin{bmatrix} A \\ \hline I \end{bmatrix} = \begin{bmatrix} 1 & 0 & -3 & 0 & 2 & -8 \\ 0 & 1 & 5 & 0 & -1 & 4 \\ 0 & 0 & 0 & 1 & 7 & -9 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$

Putting the upper part in column echelon form by column operations on the whole matrix gives
$$\begin{bmatrix} B \\ \hline C \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 1 & 0 & 0 & 3 & -2 & 8 \\ 0 & 1 & 0 & -5 & 1 & -4 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & -7 & 9 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$

The last three columns of B are zero columns. Therefore, the three last vectors of C,
$$\begin{bmatrix} 3 \\ -5 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -2 \\ 1 \\ 0 \\ -7 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 8 \\ -4 \\ 0 \\ 9 \\ 0 \\ 1 \end{bmatrix},$$
are a basis of the kernel of A.

Proof that the method computes the kernel: Since column operations correspond to post-multiplication by invertible matrices, the fact that $\begin{bmatrix} A \\ \hline I \end{bmatrix}$ reduces to $\begin{bmatrix} B \\ \hline C \end{bmatrix}$ means that there exists an invertible matrix $P$ such that $\begin{bmatrix} A \\ \hline I \end{bmatrix} P = \begin{bmatrix} B \\ \hline C \end{bmatrix}$, with $B$ in column echelon form. Thus $AP = B$, $IP = C$, and $AC = B$. A column vector $\mathbf{v}$ belongs to the kernel of $A$ (that is, $A\mathbf{v} = \mathbf{0}$) if and only if $B\mathbf{w} = \mathbf{0}$, where $\mathbf{w} = P^{-1}\mathbf{v}$. As $B$ is in column echelon form, $B\mathbf{w} = \mathbf{0}$ if and only if the nonzero entries of $\mathbf{w}$ correspond to the zero columns of $B$. By multiplying by $C$, one may deduce that this is the case if and only if $\mathbf{v} = C\mathbf{w}$ is a linear combination of the corresponding columns of $C$.
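This procedure can be sketched in a few lines of plain Python with exact fractions (a minimal illustration, not a production implementation); the columns of the stacked matrix [A; I] are manipulated directly, and the example matrix is the one used above:

```python
# Minimal sketch (standard library only): kernel basis via the augmented-identity
# column-elimination method, using exact Fraction arithmetic.
from fractions import Fraction


def kernel_basis(A):
    m, n = len(A), len(A[0])
    # Column j of the stacked matrix [A; I] is (column j of A) followed by e_j.
    cols = [[Fraction(A[i][j]) for i in range(m)] + [Fraction(i == j) for i in range(n)]
            for j in range(n)]
    next_col = 0
    for row in range(m):                        # eliminate using the upper block only
        pivot = next((j for j in range(next_col, n) if cols[j][row] != 0), None)
        if pivot is None:
            continue
        cols[next_col], cols[pivot] = cols[pivot], cols[next_col]
        for j in range(n):
            if j != next_col and cols[j][row] != 0:
                factor = cols[j][row] / cols[next_col][row]
                cols[j] = [a - factor * b for a, b in zip(cols[j], cols[next_col])]
        next_col += 1
    # Columns whose upper part became zero carry kernel vectors in their lower part.
    return [c[m:] for c in cols if all(x == 0 for x in c[:m])]


A = [[1, 0, -3, 0, 2, -8],
     [0, 1, 5, 0, -1, 4],
     [0, 0, 0, 1, 7, -9],
     [0, 0, 0, 0, 0, 0]]
for v in kernel_basis(A):
    print(v)   # (3, -5, 1, 0, 0, 0), (-2, 1, 0, -7, 1, 0), (8, -4, 0, 9, 0, 1)
```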

Numerical computation


The problem of computing the kernel on a computer depends on the nature of the coefficients.

Exact coefficients


If the coefficients of the matrix are exactly given numbers, the column echelon form of the matrix may be computed with the Bareiss algorithm more efficiently than with Gaussian elimination. It is even more efficient to use modular arithmetic and the Chinese remainder theorem, which reduce the problem to several similar ones over finite fields (this avoids the overhead induced by the non-linearity of the computational complexity of integer multiplication).[citation needed]

For coefficients in a finite field, Gaussian elimination works well, but for the large matrices that occur in cryptography and Gröbner basis computation, better algorithms are known, which have roughly the same computational complexity, but are faster and behave better with modern computer hardware.[citation needed]

Floating point computation


For matrices whose entries are floating-point numbers, the problem of computing the kernel makes sense only for matrices whose number of rows equals their rank: because of rounding errors, a floating-point matrix almost always has full rank, even when it is an approximation of a matrix of much smaller rank. Even for a full-rank matrix, it is possible to compute its kernel only if it is well conditioned, i.e. it has a low condition number.[5][citation needed]

Even for a well-conditioned full-rank matrix, Gaussian elimination does not behave correctly: it introduces rounding errors that are too large for getting a significant result. As the computation of the kernel of a matrix is a special instance of solving a homogeneous system of linear equations, the kernel may be computed with any of the various algorithms designed to solve homogeneous systems. State-of-the-art software for this purpose is the LAPACK library.[citation needed]
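A minimal floating-point sketch (assuming NumPy and SciPy, both of which call LAPACK underneath; the matrix and the tolerance rule are illustrative) of computing an approximate kernel via the SVD with a rank tolerance:

```python
# Minimal sketch (assumes NumPy/SciPy): in floating point the kernel is
# approximated from an SVD, treating singular values below a tolerance as zero.
import numpy as np
from scipy.linalg import null_space

A = np.array([[2.0, 3.0, 5.0],
              [-4.0, 2.0, 3.0]])           # the matrix from the illustration above

U, s, Vt = np.linalg.svd(A)                # full SVD: Vt is 3 x 3
tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))                # numerical rank
kernel = Vt[rank:].T                       # right singular vectors for "zero" sigma
print(kernel.ravel())                      # proportional to (-1, -26, 16)
print(null_space(A).ravel())               # SciPy's convenience routine agrees
```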

from Grokipedia
In linear algebra, the kernel of a linear transformation $T: V \to W$ between vector spaces, also known as the null space, is defined as the set of all vectors $\mathbf{v} \in V$ such that $T(\mathbf{v}) = \mathbf{0}$, where $\mathbf{0}$ is the zero vector in $W$. The kernel is always a vector subspace of the domain $V$, containing the zero vector and closed under addition and scalar multiplication. For a linear transformation represented by an $m \times n$ matrix $A$, the kernel consists of all vectors $\mathbf{x} \in \mathbb{R}^n$ (or the appropriate field) satisfying $A\mathbf{x} = \mathbf{0}$, and its dimension, called the nullity, equals $n - r$, where $r$ is the rank of $A$ (the dimension of the image). This relationship is formalized in the rank–nullity theorem, which states that for any linear transformation $T: V \to W$, $\dim(\ker(T)) + \dim(\operatorname{im}(T)) = \dim(V)$. The kernel plays a central role in understanding linear dependence, solvability of systems of equations, and the structure of the four fundamental subspaces associated with a matrix, where it is orthogonal to the row space. If the kernel is trivial (containing only the zero vector), the transformation is injective; otherwise, it reveals the redundancies in the mapping.

Definition and Fundamentals

Definition for linear transformations

In linear algebra, the kernel of a linear transformation captures the set of vectors in the domain that map to the zero vector in the codomain, serving as a fundamental concept for understanding the behavior of linear maps. Formally, let $T: V \to W$ be a linear transformation between vector spaces $V$ and $W$ over a field $F$, where $V$ and $W$ are finite-dimensional unless otherwise specified. The kernel of $T$, denoted $\ker(T)$, is defined as the set
$$\ker(T) = \{ v \in V \mid T(v) = 0_W \},$$
with $0_W$ denoting the zero vector in $W$. This set consists precisely of all vectors in $V$ that are sent to the origin under $T$. Alternative notations for the kernel include $N(T)$ and $\operatorname{null}(T)$, reflecting its equivalence to the null space of the transformation. The kernel is always non-empty, as it contains at least the zero vector of the domain $V$. To see this, note that linearity of $T$ implies $T(0_V) = 0_W$, where $0_V$ is the zero vector in $V$. Thus, $0_V \in \ker(T)$. The kernel forms a vector subspace of $V$.

Kernel as a vector subspace

The kernel of a linear transformation $T: V \to W$ between vector spaces is a subspace of the domain $V$. To establish this, verify the subspace properties. The zero vector $\mathbf{0}_V$ belongs to $\ker(T)$ because linearity implies $T(\mathbf{0}_V) = \mathbf{0}_W$. For closure under addition, if $\mathbf{u}, \mathbf{v} \in \ker(T)$, then $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) = \mathbf{0}_W + \mathbf{0}_W = \mathbf{0}_W$, so $\mathbf{u} + \mathbf{v} \in \ker(T)$. Similarly, for scalar multiplication, if $c$ is a scalar and $\mathbf{u} \in \ker(T)$, then $T(c\mathbf{u}) = c\,T(\mathbf{u}) = c\,\mathbf{0}_W = \mathbf{0}_W$, so $c\mathbf{u} \in \ker(T)$. Thus, $\ker(T)$ satisfies the axioms of a vector subspace of $V$. As a subspace, $\ker(T) \subseteq V$ and comprises precisely the solution set to the homogeneous equation $T(\mathbf{v}) = \mathbf{0}_W$. The kernel is trivial, $\ker(T) = \{\mathbf{0}_V\}$, if and only if $T$ is injective, as injectivity ensures only the zero vector maps to $\mathbf{0}_W$, and conversely, a trivial kernel implies distinct inputs yield distinct outputs. Additionally, $\ker(T) = V$ if $T$ is the zero transformation, since every vector in $V$ maps to $\mathbf{0}_W$.

Algebraic Properties

Dimension and rank-nullity theorem

The dimension of the kernel of a linear transformation $T: V \to W$ between finite-dimensional vector spaces is called the nullity of $T$, denoted $\operatorname{nullity}(T) = \dim(\ker(T))$. The rank–nullity theorem establishes a fundamental relationship between the nullity and the dimension of the image of $T$. Specifically, if $V$ and $W$ are finite-dimensional vector spaces and $T: V \to W$ is a linear transformation, then
$$\dim(V) = \dim(\ker(T)) + \dim(\operatorname{im}(T)).$$
Equivalently, denoting $\operatorname{rank}(T) = \dim(\operatorname{im}(T))$, the theorem states that
$$\operatorname{nullity}(T) + \operatorname{rank}(T) = \dim(V).$$
To sketch the proof, suppose $\{v_1, \dots, v_k\}$ is a basis for $\ker(T)$. Extend this to a basis $\{v_1, \dots, v_k, u_1, \dots, u_m\}$ for $V$. The set $\{T(u_1), \dots, T(u_m)\}$ is linearly independent (since if $\sum c_i T(u_i) = 0$, then $\sum c_i u_i \in \ker(T)$, so the $c_i = 0$) and spans $\operatorname{im}(T)$ (since any $T(v)$ with $v \in V$ equals $\sum a_i T(u_i)$). Thus $\dim(\operatorname{im}(T)) = m$, and $\dim(V) = k + m = \dim(\ker(T)) + \dim(\operatorname{im}(T))$. As corollaries, $T$ is injective if and only if $\ker(T) = \{0\}$ (equivalently, $\operatorname{nullity}(T) = 0$). Moreover, assuming $\dim(W)$ is finite, $T$ is surjective if and only if $\dim(\ker(T)) = \dim(V) - \dim(W)$ (equivalently, $\operatorname{rank}(T) = \dim(W)$).

Invariance under change of basis

The kernel of a linear transformation $T: V \to W$ between finite-dimensional vector spaces is defined as $\ker(T) = \{ \mathbf{v} \in V \mid T(\mathbf{v}) = \mathbf{0}_W \}$, which forms a subspace of the domain $V$. This definition relies solely on the abstract properties of the transformation and the vectors themselves, without reference to coordinates or specific bases for $V$ or $W$. Consequently, $\ker(T)$ is an intrinsic geometric object associated with $T$, independent of any choice of bases.

To illustrate this invariance, consider matrix representations of $T$. Let $A$ be the matrix of $T$ with respect to bases $\mathcal{B}$ for $V$ and $\mathcal{C}$ for $W$. A vector $\mathbf{v} \in V$ satisfies $T(\mathbf{v}) = \mathbf{0}_W$ if and only if the coordinate vector $[\mathbf{v}]_{\mathcal{B}}$ satisfies $A [\mathbf{v}]_{\mathcal{B}} = \mathbf{0}$. Now suppose the bases are changed to $\mathcal{B}'$ for $V$ and $\mathcal{C}'$ for $W$, with change-of-basis matrices $P$ (such that $[\mathbf{v}]_{\mathcal{B}} = P [\mathbf{v}]_{\mathcal{B}'}$) and $Q$ (such that coordinates in $\mathcal{C}$ equal $Q$ times coordinates in $\mathcal{C}'$). The new matrix representation is $A' = Q^{-1} A P$. Then $T(\mathbf{v}) = \mathbf{0}_W$ if and only if $A [\mathbf{v}]_{\mathcal{B}} = \mathbf{0}$, which, after substituting, becomes $A P [\mathbf{v}]_{\mathcal{B}'} = \mathbf{0}$, or equivalently $Q^{-1} A P [\mathbf{v}]_{\mathcal{B}'} = \mathbf{0}$, hence $A' [\mathbf{v}]_{\mathcal{B}'} = \mathbf{0}$ since $Q$ is invertible. This equivalence confirms that the condition $T(\mathbf{v}) = \mathbf{0}_W$ holds for precisely the same vectors $\mathbf{v} \in V$, regardless of the bases chosen, so $\ker(T)$ remains unchanged as a subspace. This basis independence underscores the kernel's role as a fundamental invariant of the linear transformation, distinct from its matrix representations, which vary with the choice of bases.

In applications, such as the analysis of invariant subspaces, the kernel serves as the prototypical example of a $T$-invariant subspace (where $T(\ker(T)) = \{\mathbf{0}\}$), enabling the decomposition of $V$ into invariant components that preserve structural properties under $T$. This property facilitates the study of decompositions like the rational canonical form, where kernels of powers of $T$ help identify cyclic subspaces invariant under $T$.
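The following is a minimal numerical sketch of these two facts (assuming NumPy and SciPy; the matrices are hypothetical, generated at random): the rank and nullity sum to the dimension of the domain, and the kernel, as a subspace, survives a change of bases.

```python
# Minimal sketch (assumes NumPy/SciPy; the matrices are illustrative only).
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
A[2] = A[0] + A[1]                     # force rank 2, so the kernel is nontrivial

rank = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]
print(rank + nullity == A.shape[1])    # True: rank + nullity = dim(domain)

# Change of bases: A' = Q^{-1} A P represents the same map; x lies in ker(A')
# exactly when P x lies in ker(A), so the kernel is the same subspace of V.
P = rng.standard_normal((4, 4))        # generic random matrices are invertible
Q = rng.standard_normal((3, 3))
A_prime = np.linalg.inv(Q) @ A @ P
x = null_space(A_prime)[:, 0]
print(np.allclose(A @ (P @ x), 0))     # True
```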
Matrix Representation

Kernel via matrix multiplication

In the context of matrix representations of linear transformations, the kernel of an $m \times n$ matrix $A$ over a field $F$, denoted $\ker(A)$, is defined as the set of all column vectors $x \in F^n$ such that $Ax = 0_m$, where $0_m$ is the zero vector in $F^m$. This set consists of all vectors in the domain that are mapped to the zero vector in the codomain under the linear transformation represented by $A$. The equation $Ax = 0$ forms a homogeneous system of $m$ linear equations in $n$ unknowns, whose solution set is precisely $\ker(A)$. As established in the fundamentals of linear algebra, this solution set is always a vector subspace of $F^n$. The columns of $A$ are linearly independent if and only if $\ker(A) = \{0\}$, meaning the only solution to $Ax = 0$ is the zero vector. For a square $n \times n$ matrix $A$, the kernel is trivial (i.e., $\ker(A) = \{0\}$) if and only if $A$ is invertible.

Relation to row space and nullity

The kernel of a matrix $A \in \mathbb{R}^{m \times n}$, denoted $\ker(A)$, is intimately related to the row space of $A$ through the rank–nullity theorem, which states that the dimension of the kernel equals the number of columns minus the rank of $A$: $\dim(\ker(A)) = n - \operatorname{rank}(A)$, where $\operatorname{rank}(A) = \dim(\operatorname{Row}(A))$. The row space of $A$, denoted $\operatorname{Row}(A)$, is the subspace of $\mathbb{R}^n$ spanned by the row vectors of $A$. Equivalently, $\operatorname{Row}(A)$ is the column space of the transpose $A^T$, so $\operatorname{Row}(A) = \operatorname{Col}(A^T)$. The nullity of $A$, defined as $\operatorname{nullity}(A) = \dim(\ker(A))$, corresponds to the number of free variables in the solution set of the homogeneous equation $Ax = 0$ after performing row reduction on $A$. In the row echelon form of $A$, the number of free variables is $n$ minus the number of pivot positions, directly yielding the nullity. In inner product spaces equipped with the standard dot product, the kernel $\ker(A)$ is orthogonal to the row space $\operatorname{Row}(A)$, meaning every vector in $\ker(A)$ is perpendicular to every vector in $\operatorname{Row}(A)$. This orthogonality follows from the fact that $Ax = 0$ implies the dot product of $x$ with each row of $A$ is zero.

Left kernel and its distinctions

The left kernel of an $m \times n$ matrix $A$ over a field $F$ is defined as the set of all vectors $y \in F^m$ satisfying $y^T A = 0_n^T$. This subspace consists of vectors in the codomain that, when multiplied from the left, yield the zero vector. Unlike the standard (right) kernel, which captures vectors in the domain mapped to zero by $A$, the left kernel operates on the codomain and is equivalently the kernel of the transpose $A^T$, denoted $\ker(A^T)$. This equivalence arises because $y^T A = 0$ if and only if $A^T y = 0$. The dimension of the left kernel is $m - \operatorname{rank}(A)$, following from the rank–nullity theorem applied to $A^T$ and the fact that $\operatorname{rank}(A^T) = \operatorname{rank}(A)$. Over fields supporting an inner product, such as the reals or complexes, the left kernel forms the orthogonal complement to the column space of $A$. Thus, any vector in the left kernel is perpendicular to every column of $A$, providing a dual perspective to the row space relations for the standard kernel. In applications like least squares problems, the left kernel characterizes the residuals, as the error vector must lie in this space to ensure orthogonality to the column space.
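As a rough illustration (assuming NumPy and SciPy; the matrix is an arbitrary rank-2 example), the kernel and the left kernel can both be obtained from `scipy.linalg.null_space`, applied to $A$ and to $A^T$:

```python
# Minimal sketch (assumes NumPy/SciPy): kernel vs. left kernel of a rank-2 matrix.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],          # = 2 * first row, so rank(A) = 2
              [1.0, 0.0, 1.0]])

N = null_space(A)                       # kernel: orthogonal to every row of A
LN = null_space(A.T)                    # left kernel: orthogonal to every column

print(np.allclose(A @ N, 0))            # True
print(np.allclose(LN.T @ A, 0))         # True: y^T A = 0 for y in the left kernel
print(N.shape[1], LN.shape[1])          # 1 1  (both nullities equal 3 - rank = 1)
```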
Computational Methods

Exact computation using Gaussian elimination

To compute the kernel of an $m \times n$ matrix $A$ over a field supporting exact arithmetic, such as the rational numbers, Gaussian elimination is applied to solve the homogeneous equation $Ax = 0$. This method transforms $A$ into an equivalent row echelon form $R$ through elementary row operations—interchanging rows, scaling a row by a nonzero scalar from the field, and adding a scalar multiple of one row to another—which do not alter the solution set of the system. The row echelon form $R$ features leading nonzero entries (pivots) that are staggered to the right in successive nonzero rows, with zeros below each pivot, facilitating the identification of dependent and independent variables. The columns of $R$ without pivots correspond to free variables, while those with pivots correspond to pivot variables (or basic variables). The number of free variables equals the nullity of $A$, which is $n$ minus the rank (the number of pivots). To find a basis for the kernel, introduce parameters for the free variables, expressing each pivot variable as a linear combination of these parameters via back-substitution starting from the bottom row of $R$. A basis consists of $n - \operatorname{rank}(A)$ vectors, each obtained by setting one free variable to 1 and all others to 0, then computing the corresponding values for the pivot variables; these special solutions span the kernel.

Consider a workflow for a $3 \times 4$ matrix $A$. After Gaussian elimination, suppose $R$ has pivots in columns 1 and 3, leaving columns 2 and 4 free. The system $Rx = 0$ then involves two equations with parameters $s$ for $x_2$ and $t$ for $x_4$, allowing expression of $x_1$ and $x_3$ in terms of $s$ and $t$. The basis vectors are constructed by first setting $s = 1$, $t = 0$ to get one vector with 1 in the second position and the solved values elsewhere, then setting $s = 0$, $t = 1$ for the second vector with 1 in the fourth position. This parametric representation yields a basis of dimension 2 for the kernel.

The computational complexity of Gaussian elimination for this purpose is $O(m n^2)$ arithmetic operations when solving a system with $m$ equations in $n$ variables, dominated by the elimination steps across columns. This assumes exact arithmetic to avoid rounding issues, ensuring the kernel basis is computed precisely without approximation.
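A minimal sketch of this special-solutions construction (assuming SymPy; the $3 \times 4$ matrix below is a hypothetical example whose pivots land in columns 1 and 3, as in the workflow just described):

```python
# Minimal sketch (assumes SymPy): special solutions of A x = 0 read off the RREF.
from sympy import Matrix, zeros

A = Matrix([[1, 2, 0, 1],
            [0, 0, 1, 3],
            [1, 2, 1, 4]])               # third row = first + second, rank 2

R, pivots = A.rref()                     # pivots == (0, 2): columns 1 and 3
free = [j for j in range(A.cols) if j not in pivots]

basis = []
for f in free:                           # one special solution per free variable
    x = zeros(A.cols, 1)
    x[f] = 1
    for row, p in enumerate(pivots):     # pivot variables come from the RREF rows
        x[p] = -R[row, f]
    basis.append(x)

for v in basis:
    print(v.T, (A * v).T)                # the second factor is always the zero row
```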
Numerical computation with floating-point arithmetic

In floating-point arithmetic, computing the kernel of a matrix introduces challenges due to round-off errors, which can result in small non-zero residuals for vectors that should ideally satisfy $Ax = 0$. These errors arise because finite-precision representations, such as IEEE 754 double precision, approximate real numbers, leading to perturbations in matrix entries during operations like subtraction or multiplication. To address this, numerical algorithms employ a tolerance parameter, often set to $\tau = \max(m,n) \cdot \epsilon \cdot \|A\|$, where $\epsilon$ is machine epsilon and $\|A\|$ is a matrix norm, to identify near-zero pivots or singular values as effectively zero, thus determining the numerical rank and the kernel dimension.

Robust methods for approximating the kernel in floating-point settings include QR decomposition and singular value decomposition (SVD), both of which provide numerically stable ways to reveal the kernel structure. In QR decomposition with column pivoting, the kernel basis is obtained from the orthogonal factor $Q$ corresponding to the zero (or near-zero) diagonal entries in the upper triangular factor $R$; specifically, for a matrix $A \in \mathbb{R}^{m \times n}$ with $m \geq n$, the last $n - r$ columns of $Q$ from the decomposition of $A^T$ form an orthonormal basis for the kernel, where $r$ is the numerical rank. This approach is backward stable, meaning the computed factorization satisfies $A + \Delta A = QR$ with $\|\Delta A\| = O(\epsilon \|A\|)$, avoiding significant error amplification. SVD offers even higher reliability for kernel approximation, as the right singular vectors associated with singular values below the tolerance span the kernel; the decomposition $A = U \Sigma V^T$ ensures that the columns of $V$ corresponding to zero singular values provide an orthonormal kernel basis, with stability guaranteed under the same perturbation bounds. Rank determination via SVD is particularly effective for ill-conditioned matrices, as singular values decay gradually, allowing precise separation using the tolerance.

To enhance stability in QR-based kernel computation, algorithms often use Householder reflections for orthogonalization, which introduce minimal round-off errors compared to classical Gram–Schmidt. A Householder reflection $H = I - 2vv^T/(v^T v)$ zeros out subdiagonal elements in a column while preserving the Euclidean norm, and successive applications yield the QR factors without squaring the condition number, unlike LU decomposition. This method ensures that the computed kernel vectors satisfy $\|Ax\|/\|A\| \leq O(\epsilon)$ for unit vectors $x$ in the approximate kernel, mitigating error growth in ill-conditioned cases.

Implementations in numerical software libraries handle these computations efficiently. In MATLAB, the `null(A)` function computes an orthonormal basis for the kernel using SVD, applying a default tolerance of `max(size(A)) * eps(norm(A,2))` to account for floating-point precision and returning vectors with residuals below this threshold. Similarly, LAPACK provides routines like `dgeqp3` for rank-revealing QR and `dgesvd` for SVD, which users combine to extract kernel bases; these are designed for high-performance floating-point operations on dense matrices, ensuring stability across hardware.
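A minimal sketch of the QR-of-the-transpose approach described above (assuming NumPy and SciPy; the matrix and the tolerance rule are illustrative): the trailing columns of $Q$ from a pivoted QR factorization of $A^T$ give an orthonormal basis for the kernel.

```python
# Minimal sketch (assumes NumPy/SciPy): approximate kernel from a pivoted QR of A^T.
import numpy as np
from scipy.linalg import qr

A = np.array([[2.0, 3.0, 5.0],
              [-4.0, 2.0, 3.0],
              [-2.0, 5.0, 8.0]])              # third row = first + second, rank 2

Q, R, piv = qr(A.T, pivoting=True)            # A^T P = Q R, with Q an n x n orthogonal factor
tol = max(A.shape) * np.finfo(float).eps * abs(R[0, 0])
r = int(np.sum(np.abs(np.diag(R)) > tol))     # numerical rank from the diagonal of R
kernel = Q[:, r:]                             # last n - r columns span ker(A)

print(np.allclose(A @ kernel, 0))             # True
```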
Examples and Illustrations

Concrete matrix examples

To illustrate the kernel of a linear transformation represented by a matrix, consider specific small matrices and compute their kernels by solving the homogeneous equation $Ax = 0$. For the $2 \times 2$ identity matrix
$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
the equation $I_2 \mathbf{x} = \mathbf{0}$ yields $\mathbf{x} = \mathbf{0}$ as the only solution, so the kernel is $\{\mathbf{0}\}$, with nullity 0.

Next, examine the $2 \times 3$ matrix
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$
Solving $A \mathbf{x} = \mathbf{0}$ gives the system $x_1 + x_3 = 0$ and $x_2 = 0$, so $x_1 = -x_3$, $x_2 = 0$, with $x_3$ free. The kernel is spanned by the basis vector $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, and the nullity is 1. To verify, compute
$$A \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \cdot (-1) + 0 \cdot 0 + 1 \cdot 1 \\ 0 \cdot (-1) + 1 \cdot 0 + 0 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

For a singular $3 \times 3$ matrix, consider
$$B = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{pmatrix},$$
which has rank 1. Solving $B \mathbf{x} = \mathbf{0}$ (via row reduction to identify the free variables $x_2$ and $x_3$) yields solutions of the form $x_1 = -2x_2 - 3x_3$, with $x_2$ and $x_3$ free. A basis for the kernel is $\left\{ \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix} \right\}$, and the nullity is 2. Verification for the first basis vector:
$$B \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot (-2) + 2 \cdot 1 + 3 \cdot 0 \\ 2 \cdot (-2) + 4 \cdot 1 + 6 \cdot 0 \\ 3 \cdot (-2) + 6 \cdot 1 + 9 \cdot 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Similarly, for the second basis vector:
$$B \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \cdot (-3) + 2 \cdot 0 + 3 \cdot 1 \\ 2 \cdot (-3) + 4 \cdot 0 + 6 \cdot 1 \\ 3 \cdot (-3) + 6 \cdot 0 + 9 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Geometric interpretations

Geometrically, the kernel of a linear transformation $T: V \to W$ between finite-dimensional vector spaces represents the subspace of $V$ consisting of all vectors that are mapped to the zero vector in $W$, effectively capturing the "directions" in $V$ that are collapsed or annihilated by $T$. This subspace always passes through the origin, as linearity ensures $T(0) = 0$. In low-dimensional Euclidean spaces, these kernels manifest as familiar geometric objects such as lines or planes, providing intuition for how transformations distort or preserve spatial structure.

A classic example occurs with the projection map $T: \mathbb{R}^2 \to \mathbb{R}^1$ defined by $T(x, y) = x$, which projects points onto the x-axis. Here, the kernel is the set $\{(0, y) \mid y \in \mathbb{R}\}$, the y-axis, forming a line through the origin in the plane. Vectors along this line are invisible under the projection, as they contribute nothing to the output coordinate. Similarly, for a projection $T: \mathbb{R}^3 \to \mathbb{R}^2$ onto the xy-plane given by $T(x, y, z) = (x, y)$, the kernel is the z-axis $\{(0, 0, z) \mid z \in \mathbb{R}\}$, again a line through the origin, illustrating how the transformation ignores the height dimension.

In higher ranks, kernels can fill larger subspaces; for instance, consider a rank-1 linear map $T: \mathbb{R}^3 \to \mathbb{R}^2$ such as $T(x, y, z) = (x + y + z, 0)$. The kernel consists of all vectors satisfying $x + y + z = 0$, which is a plane through the origin in $\mathbb{R}^3$, perpendicular to the vector $(1, 1, 1)$. This plane represents the two-dimensional set of directions mapped to zero, with the codimension of the kernel matching the rank of $T$. Conversely, when $T$ is injective, its kernel is the trivial subspace $\{0\}$, the single origin point, meaning no non-zero vector is collapsed and all directions are preserved distinctly.

More abstractly, the kernel aligns with the concept of fibers of a map: it is precisely the fiber over the zero element of the codomain, $T^{-1}(\{0\})$, the preimage of the origin under $T$. This perspective underscores the kernel's role as the solution set of the homogeneous equation $T(\mathbf{v}) = \mathbf{0}$, generalizing the geometric picture of collapse to arbitrary dimensions while retaining intuitive visualizations in low-dimensional cases.

Generalizations

Kernels over modules

In the setting of modules over a commutative ring $R$, the kernel of a homomorphism $\phi: M \to N$ between $R$-modules $M$ and $N$ is defined as $\ker(\phi) = \{ m \in M \mid \phi(m) = 0_N \}$, where $0_N$ denotes the zero element of $N$. This set forms an $R$-submodule of $M$, as it is closed under addition and scalar multiplication by elements of $R$.

A key property is that the kernel always constitutes a submodule, enabling the formation of quotient modules. The first isomorphism theorem for modules states that if $\phi: M \to N$ is an $R$-module homomorphism, then $M / \ker(\phi) \cong \operatorname{im}(\phi)$, where $\operatorname{im}(\phi)$ is the image submodule of $N$. This isomorphism identifies the structure of the quotient by the kernel with the actual image of the map.

Unlike the case of vector spaces over fields, where every subspace (including kernels) admits a basis, kernels of module homomorphisms may lack a basis and fail to be free modules. For instance, over the ring $R = \mathbb{Z}$, consider the $\mathbb{Z}$-module $M = \mathbb{Z}/6\mathbb{Z}$ and the endomorphism $\phi: M \to M$ given by multiplication by 3, so $\phi(m) = 3m \bmod 6$. The kernel is $\{ m \in \mathbb{Z}/6\mathbb{Z} \mid 3m \equiv 0 \pmod 6 \} = \{0, 2, 4\}$, which is isomorphic to $\mathbb{Z}/3\mathbb{Z}$, a torsion module that is not free and thus has no basis.
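A tiny sketch of this module example (plain Python; the brute-force enumeration is only for illustration) confirms that the kernel of multiplication by 3 on $\mathbb{Z}/6\mathbb{Z}$ is $\{0, 2, 4\}$:

```python
# Minimal sketch (standard library only): kernel of the map m -> 3*m on Z/6Z.
n, k = 6, 3
kernel = [m for m in range(n) if (k * m) % n == 0]
image = sorted({(k * m) % n for m in range(n)})
print(kernel)   # [0, 2, 4]: a submodule isomorphic to Z/3Z, torsion, not free
print(image)    # [0, 3]: by the first isomorphism theorem, (Z/6Z)/ker is isomorphic to the image
```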
In homological algebra, kernels are fundamental to the study of chain complexes and exact sequences of modules. For a chain complex $\cdots \to M_{n+1} \xrightarrow{d_{n+1}} M_n \xrightarrow{d_n} M_{n-1} \to \cdots$, the $n$th homology group is defined as $H_n = \ker(d_n) / \operatorname{im}(d_{n+1})$, capturing the failure of exactness at $M_n$ and enabling computations of derived functors and invariants in algebraic topology and geometry.

Kernels in functional analysis

In functional analysis, the kernel of a bounded linear operator $T: X \to Y$ between Banach spaces $X$ and $Y$ is the set $\ker(T) = \{ x \in X \mid Tx = 0 \}$, which forms a subspace of $X$. Since bounded linear operators are continuous, their kernels are always closed subspaces. This closedness distinguishes kernels of bounded operators from those of unbounded operators, where the kernel may fail to be closed. A key property is that finite-dimensional subspaces of Banach spaces are always closed, so finite-dimensional kernels are closed regardless of the operator's boundedness. However, infinite codimension does not by itself guarantee closedness for arbitrary subspaces; counterexamples exist where dense, proper subspaces have infinite codimension but are not closed. For bounded operators, the kernel's closedness holds even in infinite dimensions, facilitating applications like the open mapping theorem, which relates the kernel to the operator's range.

Examples illustrate these concepts. Consider the differentiation operator $D = \frac{d}{dx}$, defined on the dense subspace of continuously differentiable functions $C^1[0,1]$ in the Banach space $C[0,1]$ of continuous functions on $[0,1]$ with the sup norm, mapping to $C[0,1]$; this is an unbounded linear operator whose kernel consists of the constant functions, a one-dimensional closed subspace. In contrast, integral operators are typically bounded. For instance, the Fredholm integral operator $(Tf)(x) = \int_0^1 K(x,y) f(y) \, dy$ on $L^2[0,1]$, where $K$ is a square-integrable kernel, has a kernel comprising the functions $f$ orthogonal to the range of the adjoint operator, and this kernel is closed as a consequence of boundedness.

Kernels play a central role in spectral theory for bounded linear operators on Banach spaces. The resolvent set consists of scalars $\lambda$ for which $T - \lambda I$ is invertible, implying $\ker(T - \lambda I) = \{0\}$; the spectrum includes the point spectrum, where $\ker(T - \lambda I) \neq \{0\}$, corresponding to eigenvalues whose eigenspaces are the associated kernels. This extends the finite-dimensional notion of eigenvalues to infinite dimensions, with closed kernels ensuring well-defined spectral projections in many cases.