Newton's method in optimization
Newton's method in optimization
Main page
2088654

Newton's method in optimization

logo
Community Hub0 subscribers
2088654

Newton's method in optimization

logo
Community Hub0 subscribers
What are your thoughts?
Be the first to start a discussion here.
Be the first to start a discussion here.
Newton's method in optimization

In calculus, Newton's method (also called Newton–Raphson) is an iterative method for finding the roots of a differentiable function , which are solutions to the equation . However, to optimize a twice-differentiable , our goal is to find the roots of . We can therefore use Newton's method on its derivative to find solutions to , also known as the critical points of . These solutions may be minima, maxima, or saddle points; see section "Several variables" in Critical point (mathematics) and also section "Geometric interpretation" in this article. This is relevant in optimization, which aims to find (global) minima of the function .

The central problem of optimization is minimization of functions. Let us first consider the case of univariate functions, i.e., functions of a single real variable. We will later consider the more general and more practically useful multivariate case.

Given a twice differentiable function , we seek to solve the optimization problem

Newton's method attempts to solve this problem by constructing a sequence from an initial guess (starting point) that converges towards a minimizer of by using a sequence of second-order Taylor approximations of around the iterates. The second-order Taylor expansion of f around is

The next iterate is defined so as to minimize this quadratic approximation in , and setting . If the second derivative is positive, the quadratic approximation is a convex function of , and its minimum can be found by setting the derivative to zero. Since

the minimum is achieved for

Putting everything together, Newton's method performs the iteration

The geometric interpretation of Newton's method is that at each iteration, it amounts to the fitting of a parabola to the graph of at the trial value , having the same slope and curvature as the graph at that point, and then proceeding to the maximum or minimum of that parabola (in higher dimensions, this may also be a saddle point), see below. Note that if happens to be a quadratic function, then the exact extremum is found in one step.

See all
User Avatar
No comments yet.