Neighbourhood components analysis

Neighbourhood components analysis is a supervised learning method for classifying multivariate data into distinct classes according to a given distance metric over the data. Functionally, it serves the same purposes as the k-nearest neighbours algorithm and makes direct use of a related concept termed stochastic nearest neighbours.

Neighbourhood components analysis aims at "learning" a distance metric by finding a linear transformation of the input data such that the average leave-one-out (LOO) classification performance is maximized in the transformed space. The key insight of the algorithm is that a matrix $A$ corresponding to the transformation can be found by defining a differentiable objective function for $A$ and then applying an iterative solver such as the conjugate gradient method. One of the benefits of this algorithm is that the number of neighbours $k$ does not have to be chosen by hand: it is determined implicitly by $A$, up to a scalar constant. This use of the algorithm, therefore, addresses the issue of model selection.

In order to define $A$, we define an objective function describing classification accuracy in the transformed space and try to determine $A^{*}$ such that this objective function is maximized.

$A^{*}={\mbox{argmax}}_{A}f(A)$
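The maximisation can be illustrated with a generic iterative solver. The sketch below uses plain gradient ascent with a finite-difference gradient and step halving, which is a simpler stand-in for a conjugate gradient solver. The function names, the step sizes and the toy objective are hypothetical and serve only to exercise the loop; the toy objective is not the NCA objective.

```python
def numerical_gradient(f, A, eps=1e-6):
    """Central-difference estimate of df/dA for a matrix A given as a list of rows."""
    grad = [[0.0] * len(row) for row in A]
    for r in range(len(A)):
        for c in range(len(A[r])):
            original = A[r][c]
            A[r][c] = original + eps
            f_plus = f(A)
            A[r][c] = original - eps
            f_minus = f(A)
            A[r][c] = original  # restore the entry before moving on
            grad[r][c] = (f_plus - f_minus) / (2 * eps)
    return grad


def gradient_ascent(f, A, steps=200, step_size=0.25):
    """Maximise f by gradient ascent, halving the step until f improves."""
    A = [row[:] for row in A]  # work on a copy of the starting matrix
    for _ in range(steps):
        grad = numerical_gradient(f, A)
        t = step_size
        while t > 1e-12:
            candidate = [[a + t * g for a, g in zip(row_a, row_g)]
                         for row_a, row_g in zip(A, grad)]
            if f(candidate) > f(A):
                A = candidate
                break
            t /= 2
        else:
            break  # no improving step was found: treat this as convergence
    return A


# Hypothetical stand-in objective with a known maximiser (TARGET).
TARGET = [[2.0, 0.0], [0.0, 0.5]]


def toy_objective(A):
    return -sum((a - b) ** 2
                for row_a, row_b in zip(A, TARGET)
                for a, b in zip(row_a, row_b))


A_star = gradient_ascent(toy_objective, [[1.0, 0.0], [0.0, 1.0]])
```

Any differentiable $f$ can be passed in place of `toy_objective`. In practice an analytic gradient would replace the finite differences, which cost two evaluations of $f$ per matrix entry.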

Consider predicting the class label of a single data point by consensus of its $k$ nearest neighbours among all the other points, under a given distance metric. This is known as leave-one-out classification. However, the set of nearest neighbours of a point can be quite different after passing all the points through a linear transformation. Specifically, the set of neighbours for a point can undergo discrete changes in response to smooth changes in the elements of $A$, implying that any objective function $f(\cdot )$ based on the neighbours of a point will be piecewise-constant, and hence not differentiable.
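To make the piecewise-constant behaviour concrete, the following sketch evaluates leave-one-out 1-nearest-neighbour accuracy while one entry of $A$ is varied smoothly. The data set, the diagonal form of $A$ and all function names are hypothetical choices for illustration only.

```python
def transform(A, x):
    """Apply the linear map A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def loo_knn_accuracy(A, X, y, k=1):
    """Leave-one-out k-NN accuracy in the space transformed by A."""
    Z = [transform(A, x) for x in X]
    correct = 0
    for i, zi in enumerate(Z):
        # Rank every other point by its distance to point i.
        others = sorted((j for j in range(len(Z)) if j != i),
                        key=lambda j: sq_dist(zi, Z[j]))
        votes = [y[j] for j in others[:k]]
        predicted = max(set(votes), key=votes.count)
        if predicted == y[i]:
            correct += 1
    return correct / len(X)


# Toy data: the first feature separates the classes, the second is noise.
X = [[0.0, 0.0], [0.3, 2.0], [1.0, 0.1], [1.2, 3.1]]
y = [0, 0, 1, 1]

# Sweep the weight given to the noisy feature: A = diag(1, s) for s in [0, 1].
accuracies = [loo_knn_accuracy([[1.0, 0.0], [0.0, s / 100]], X, y)
              for s in range(101)]
distinct_values = sorted(set(accuracies))
```

With four points the accuracy can only be a multiple of 0.25, so it moves in jumps as the weight changes and has zero gradient between the jumps. A gradient-based solver therefore receives no useful signal from it.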

We can resolve this difficulty by using an approach inspired by stochastic gradient descent. Rather than considering the $k$ nearest neighbours at each transformed point in LOO classification, we consider the entire transformed data set as stochastic nearest neighbours. We define these using a softmax function of the negative squared Euclidean distance between a given LOO-classification point and each other point in the transformed space:

$p_{ij}={\begin{cases}{\frac {e^{-||Ax_{i}-Ax_{j}||^{2}}}{\sum _{k\neq i}e^{-||Ax_{i}-Ax_{k}||^{2}}}},&{\mbox{if }}j\neq i\\0,&{\mbox{if }}j=i\end{cases}}$
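As a minimal sketch of this definition, the block below computes the matrix of $p_{ij}$ values for a small data set. The function names and the sample points are hypothetical. Subtracting the smallest distance before exponentiating is an added numerical safeguard that leaves the probabilities unchanged.

```python
import math


def transform(A, x):
    """Apply the linear map A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def stochastic_neighbours(A, X):
    """Return P with P[i][j] = p_ij; needs at least two points."""
    Z = [transform(A, x) for x in X]
    n = len(Z)
    P = []
    for i in range(n):
        dists = [sq_dist(Z[i], Z[j]) for j in range(n)]
        # Shifting by the smallest off-diagonal distance cancels in the ratio
        # but avoids underflow when every distance is large.
        d_min = min(d for j, d in enumerate(dists) if j != i)
        weights = [0.0 if j == i else math.exp(-(d - d_min))
                   for j, d in enumerate(dists)]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


# Three hypothetical points on a line, with A set to the identity.
X = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
P = stochastic_neighbours(identity, X)
```

Each row of `P` sums to one, the diagonal is zero, and closer points receive higher probability. Because every entry is a smooth function of $A$, quantities built from `P` can be differentiated with respect to $A$.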

The probability of correctly classifying data point $i$ is the probability that its stochastically selected neighbour belongs to the same class. Writing $C_{i}$ for the set of points in the same class as $i$:

$p_{i}=\sum _{j\in C_{i}}p_{ij}$
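As a sketch of this quantity, the block below computes $p_{i}$ by summing $p_{ij}$ over the points $j$ that share the class of $i$, and adds the results into an objective $f(A)=\sum _{i}p_{i}$. That objective is an assumption taken from the usual formulation of NCA, since it is not stated in the text above. The function names and the toy data are hypothetical.

```python
import math


def transform(A, x):
    """Apply the linear map A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def stochastic_neighbours(A, X):
    """Return P with P[i][j] = p_ij (softmax over negative squared distances)."""
    Z = [transform(A, x) for x in X]
    P = []
    for i, zi in enumerate(Z):
        weights = [0.0 if j == i else math.exp(-sq_dist(zi, zj))
                   for j, zj in enumerate(Z)]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


def correct_probabilities(A, X, y):
    """p_i: probability that point i selects a neighbour of its own class."""
    P = stochastic_neighbours(A, X)
    # P[i][i] is zero, so including j == i in the sum changes nothing.
    return [sum(p for j, p in enumerate(row) if y[j] == y[i])
            for i, row in enumerate(P)]


def objective(A, X, y):
    """Assumed objective f(A) = sum_i p_i."""
    return sum(correct_probabilities(A, X, y))


# Toy data: the first feature separates the classes, the second is noise.
X = [[0.0, 0.0], [0.3, 2.0], [1.0, 0.1], [1.2, 3.1]]
y = [0, 0, 1, 1]

f_identity = objective([[1.0, 0.0], [0.0, 1.0]], X, y)
f_ignore_noise = objective([[1.0, 0.0], [0.0, 0.0]], X, y)
```

On this toy data the transformation that discards the noisy second feature scores higher than the identity, which is the kind of preference the objective is meant to express. Unlike a hard nearest-neighbour accuracy, the value changes smoothly with the entries of $A$.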
