Kernel principal component analysis

In the field of multivariate statistics, kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are performed in a reproducing kernel Hilbert space.

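Recall that conventional PCA operates on zero-centered data; that is,

$$\frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i = \mathbf{0},$$

where $\mathbf{x}_i$ is one of the $N$ multivariate observations. It operates by diagonalizing the covariance matrix,

$$C = \frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i \mathbf{x}_i^\top;$$

in other words, it gives an eigendecomposition of the covariance matrix:

$$\lambda \mathbf{v} = C\mathbf{v},$$

which can be rewritten as

$$\lambda\, \mathbf{x}_i^\top \mathbf{v} = \mathbf{x}_i^\top C \mathbf{v} \quad \text{for } i = 1, \ldots, N.$$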

(See also: Covariance matrix as a linear operator)
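
As an illustration, the linear PCA steps described above (centering the data, forming the covariance matrix, and solving its eigenvalue problem) can be sketched in Python with NumPy. This is a minimal sketch, not a reference implementation: the function name `linear_pca`, the synthetic data, and the random seed are assumptions made for the example.

```python
import numpy as np

def linear_pca(X):
    """Eigendecomposition of the covariance matrix of X (hypothetical helper).

    X has shape (N, d), one observation x_i per row.
    Returns eigenvalues in descending order, the matching eigenvectors
    (as columns), the centered data, and the covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # enforce (1/N) * sum_i x_i = 0
    N = Xc.shape[0]
    C = Xc.T @ Xc / N                     # C = (1/N) * sum_i x_i x_i^T
    eigvals, eigvecs = np.linalg.eigh(C)  # C is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]     # eigh returns ascending order
    return eigvals[order], eigvecs[:, order], Xc, C

# Synthetic data (an assumption for the example): 200 points in 3 dimensions
# with different spreads along each axis and a nonzero mean.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2]) + 5.0

eigvals, eigvecs, Xc, C = linear_pca(X)
lam, v = eigvals[0], eigvecs[:, 0]

# lambda * v = C v
assert np.allclose(lam * v, C @ v)
# lambda * x_i^T v = x_i^T C v for every observation x_i
assert np.allclose(lam * (Xc @ v), Xc @ (C @ v))
```

The second assertion checks the rewritten form of the eigenvalue equation, in which every quantity is expressed through inner products with the observations.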

To understand the utility of kernel PCA, particularly for clustering, observe that, while $N$ points cannot, in general, be linearly separated in $d < N$ dimensions, they can almost always be linearly separated in $d \geq N$ dimensions. That is, given $N$ points, $\mathbf{x}_i$, if we map them to an $N$-dimensional space with

$$\Phi(\mathbf{x}_i), \quad \text{where } \Phi : \mathbb{R}^d \to \mathbb{R}^N,$$

it is easy to construct a hyperplane that divides the points into arbitrary clusters. Of course, this $\Phi$ creates linearly independent vectors, so there is no covariance on which to perform eigendecomposition explicitly as we would in linear PCA.
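
The separability claim can be made concrete with one particular choice of map, which is an assumption of this sketch and not something the text prescribes: send the i-th point to the i-th standard basis vector of the N-dimensional space. These images are linearly independent, and for any assignment of the points to two clusters, a hyperplane through the origin whose normal vector has the labels as its components puts every point on the correct side. The names `phi`, `separating_normal`, and `dot` are hypothetical helpers.

```python
def phi(i, n):
    """Hypothetical feature map: image of the i-th of n points is e_i."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def separating_normal(labels):
    """Normal vector w satisfying w . phi(x_i) = labels[i] (labels are +1 or -1)."""
    return [float(y) for y in labels]

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# An arbitrary assignment of 5 points to two clusters.
labels = [+1, -1, -1, +1, -1]
n = len(labels)

w = separating_normal(labels)
sides = [dot(w, phi(i, n)) for i in range(n)]

# Every mapped point lies on the side of the hyperplane given by its label.
assert all(s * y > 0 for s, y in zip(sides, labels))
```

The same construction works for every possible labeling, which is why the original coordinates of the points never enter the computation here.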
