Neighbourhood components analysis
Neighbourhood components analysis is a supervised learning method for classifying multivariate data into distinct classes according to a given distance metric over the data. Functionally, it serves the same purposes as the K-nearest neighbors algorithm and makes direct use of a related concept termed stochastic nearest neighbours.
Neighbourhood components analysis aims at "learning" a distance metric by finding a linear transformation of input data such that the average leave-one-out (LOO) classification performance is maximized in the transformed space. The key insight to the algorithm is that a matrix $A$ corresponding to the transformation can be found by defining a differentiable objective function for $A$, followed by the use of an iterative solver such as conjugate gradient descent. One of the benefits of this algorithm is that the number of classes $k$ can be determined as a function of $A$, up to a scalar constant. This use of the algorithm, therefore, addresses the issue of model selection.
In order to define $A$, we define an objective function describing classification accuracy in the transformed space and try to determine $A$ such that this objective function is maximized.
Consider predicting the class label of a single data point by consensus of its $k$-nearest neighbours with a given distance metric. This is known as leave-one-out classification. However, the set of nearest neighbours can be quite different after passing all the points through a linear transformation. Specifically, the set of neighbours for a point can undergo discrete changes in response to smooth changes in the elements of $A$, implying that any objective function based on the neighbours of a point will be piecewise-constant, and hence not differentiable.
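The piecewise-constant behaviour can be illustrated with a small sketch. The function name `loo_1nn_accuracy`, the toy data, and the choice of a single nearest neighbour (rather than a general number of neighbours) are assumptions made for illustration, not part of the original description. The transformation is a diagonal matrix whose second entry is varied smoothly.

```python
import numpy as np

def loo_1nn_accuracy(X, y, A):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule after mapping x -> A x."""
    Z = X @ A.T                                   # transformed points, one per row
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)             # a point may not vote for itself
    nearest = np.argmin(sq_dist, axis=1)          # hard (discrete) neighbour choice
    return np.mean(y[nearest] == y)

# Hypothetical toy data: two classes, four points.
X = np.array([[0.0, 0.0], [1.0, 0.2], [0.1, 1.0], [1.1, 1.3]])
y = np.array([0, 0, 1, 1])

# Smoothly vary one element of the transformation matrix.
scales = np.linspace(0.1, 3.0, 30)
accuracies = np.array(
    [loo_1nn_accuracy(X, y, np.diag([1.0, s])) for s in scales]
)
```

With four points, the accuracy can only be a multiple of 1/4, so as the scale changes it stays flat and then jumps. A gradient-based solver gets no useful signal from such a function.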
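We can resolve this difficulty by using an approach inspired by stochastic gradient descent. Rather than considering the $k$-nearest neighbours at each transformed point in LOO-classification, we'll consider the entire transformed data set as stochastic nearest neighbours. We define these using a softmax function of the squared Euclidean distance between a given LOO-classification point $x_i$ and each other point $x_j$ in the transformed space:

$$p_{ij} = \begin{cases} \dfrac{\exp\left(-\lVert Ax_i - Ax_j \rVert^2\right)}{\sum_{l \neq i} \exp\left(-\lVert Ax_i - Ax_l \rVert^2\right)}, & \text{if } j \neq i \\ 0, & \text{if } j = i \end{cases}$$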
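A minimal NumPy sketch of these stochastic nearest neighbours follows. Each point assigns every other point a softmax weight based on the negative squared Euclidean distance in the transformed space, and assigns itself a weight of zero. The function name `stochastic_neighbour_probs`, the row-wise data layout, and the toy data are assumptions made for illustration.

```python
import numpy as np

def stochastic_neighbour_probs(X, A):
    """Return P where P[i, j] is the probability that point i selects point j.

    X has one data point per row; A is the linear transformation matrix.
    """
    Z = X @ A.T                                   # transformed points, one per row
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)             # exclude self: exp(-inf) == 0
    logits = -sq_dist
    logits -= logits.max(axis=1, keepdims=True)   # shift for numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)             # normalise each row (softmax)
    return P

# Hypothetical toy data with the identity transformation.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
A = np.eye(2)
P = stochastic_neighbour_probs(X, A)
```

Each row of `P` sums to one, and closer points receive larger probabilities. Because every entry varies smoothly with the elements of `A`, quantities built from `P` are differentiable, unlike a hard nearest-neighbour assignment.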
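The probability of correctly classifying data point $i$ is the sum of the probabilities of selecting each neighbour that shares its class $C_i$:

$$p_i = \sum_{j \in C_i} p_{ij}$$

where $p_{ij}$ is the probability that point $i$ selects point $j$ as its neighbour.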
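As a rough sketch, the per-point probability of correct classification can be computed by summing each point's softmax neighbour probabilities over the points that share its label. The function name `correct_class_probs` and the toy data are assumptions for illustration. Averaging the result gives one estimate of the expected leave-one-out accuracy that the transformation is meant to maximize.

```python
import numpy as np

def correct_class_probs(X, y, A):
    """Return p where p[i] is the probability that point i is classified correctly."""
    Z = X @ A.T                                   # transformed points, one per row
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)             # a point never selects itself
    logits = -sq_dist
    logits -= logits.max(axis=1, keepdims=True)   # shift for numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)             # softmax neighbour probabilities
    same_class = y[:, None] == y[None, :]         # mask of same-label pairs
    return np.sum(P * same_class, axis=1)

# Hypothetical toy data: two well-separated classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = np.array([0, 0, 1, 1])

p_identity = correct_class_probs(X, y, np.eye(2))
p_collapsed = correct_class_probs(X, y, np.zeros((1, 2)))
expected_loo_accuracy = p_identity.mean()
```

With the identity transformation the classes are far apart, so each probability is close to one. With the all-zero transformation every point collapses to the same location, each of the three other points is selected with equal probability, and only one of them shares the label.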