Active-set method

from Wikipedia

In mathematical optimization, the active-set method is an algorithm used to identify the active constraints in a set of inequality constraints. The active constraints are then expressed as equality constraints, thereby transforming an inequality-constrained problem into a simpler equality-constrained subproblem.

An optimization problem is defined using an objective function to minimize or maximize, and a set of constraints

$g_1(x) \ge 0, \dots, g_k(x) \ge 0$

that define the feasible region, that is, the set of all $x$ to search for the optimal solution. Given a point $x$ in the feasible region, a constraint

$g_i(x) \ge 0$

is called active at $x$ if $g_i(x) = 0$, and inactive at $x$ if $g_i(x) > 0$. Equality constraints are always active. The active set at $x$ is made up of those constraints $g_i(x)$ that are active at the current point (Nocedal & Wright 2006, p. 308).

The active set is particularly important in optimization theory, as it determines which constraints will influence the final result of optimization. For example, in solving the linear programming problem, the active set gives the hyperplanes that intersect at the solution point. In quadratic programming, as the solution is not necessarily on one of the edges of the bounding polygon, an estimation of the active set gives us a subset of inequalities to watch while searching the solution, which reduces the complexity of the search.

Active-set methods


In general an active-set algorithm has the following structure:

Find a feasible starting point
repeat until "optimal enough"
solve the equality problem defined by the active set (approximately)
compute the Lagrange multipliers of the active set
remove a subset of the constraints with negative Lagrange multipliers
search for infeasible constraints among the inactive constraints and add them to the problem
end repeat

The motivation for this approach is that, near the optimum, usually only a small number of constraints are binding, while the cost of the solve step typically grows superlinearly with the number of constraints. The method therefore repeatedly solves a series of equality-constrained problems, dropping constraints that are not violated but stand in the way of improvement (those with negative Lagrange multipliers) and adding constraints that the current solution violates; this process can converge to the true solution. The optimum of the previous problem often provides an initial guess in case the equality-constrained solver needs a starting value. A runnable sketch of this loop follows.
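The following is a minimal Python sketch of the loop above for a convex QP, minimizing $\frac{1}{2}x^T Q x + c^T x$ subject to $Ax \leq b$. It assumes $Q$ is positive definite and that the working-set rows stay linearly independent; the function name and structure are illustrative, not taken from any particular library.

```python
import numpy as np

def active_set_qp(Q, c, A, b, x, max_iter=100, tol=1e-10):
    """Minimize 1/2 x'Qx + c'x subject to Ax <= b, from a feasible x."""
    W = set()  # working set: indices of constraints treated as equalities
    for _ in range(max_iter):
        idx = sorted(W)
        g = Q @ x + c
        Aw = A[idx] if idx else np.zeros((0, x.size))
        k = Aw.shape[0]
        # Solve the equality-constrained subproblem via its KKT system:
        #   [ Q  Aw^T ] [d]   [-g]
        #   [ Aw  0   ] [l] = [ 0]
        K = np.block([[Q, Aw.T], [Aw, np.zeros((k, k))]])
        sol = np.linalg.solve(K, np.concatenate([-g, np.zeros(k)]))
        d, lam = sol[:x.size], sol[x.size:]
        if np.linalg.norm(d) < tol:
            if k == 0 or lam.min() >= -tol:
                return x, dict(zip(idx, lam))        # KKT holds: optimal
            W.discard(idx[int(np.argmin(lam))])      # drop most negative multiplier
        else:
            alpha, blocking = 1.0, None              # ratio test over inactive rows
            for i in range(A.shape[0]):
                step = A[i] @ d
                if i not in W and step > tol:
                    ratio = (b[i] - A[i] @ x) / step
                    if ratio < alpha:
                        alpha, blocking = ratio, i
            x = x + alpha * d
            if blocking is not None:
                W.add(blocking)                      # blocking constraint joins W
    raise RuntimeError("no convergence within max_iter iterations")

# Example: minimize (x1 - 1)^2 + (x2 - 2.5)^2 over a small polygon.
Q, c = 2 * np.eye(2), np.array([-2., -5.])
A = np.array([[-1., 2.], [1., 2.], [1., -2.], [-1., 0.], [0., -1.]])
b = np.array([2., 6., 2., 0., 0.])
print(active_set_qp(Q, c, A, b, x=np.array([2., 0.])))  # -> about [1.4, 1.7]
```

The example data encodes a textbook-style test problem; the solver reaches the optimum in three iterations, adding one blocking constraint along the way.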

Methods that can be described as active-set methods include:[1]

Successive linear programming (SLP)
Sequential quadratic programming (SQP)
Sequential linear-quadratic programming (SLQP)
Reduced gradient method (RG)
Generalized reduced gradient method (GRG)

Performance


Consider the problem of linearly constrained convex quadratic programming. Under reasonable assumptions (the problem is feasible, the system of constraints is regular at every point, and the quadratic objective is strongly convex), the active-set method terminates after finitely many steps and yields a global solution to the problem. Theoretically, the active-set method may perform a number of iterations exponential in the number of constraints $m$, like the simplex method. However, its practical behaviour is typically much better.[2]: Sec. 9.1

from Grokipedia
The active-set method is an iterative algorithm in mathematical optimization for solving quadratic programming (QP) problems subject to linear equality and inequality constraints, by identifying and maintaining a working set of active constraints, those binding at the solution, and treating them as equalities while solving reduced equality-constrained subproblems. At each iteration, the method computes a search direction by minimizing the quadratic objective over the current active set, then updates the solution and adjusts the active set by removing constraints with negative Lagrange multipliers (indicating non-optimality) or adding violated constraints (to restore feasibility), continuing until the Karush-Kuhn-Tucker (KKT) conditions for optimality are met. This approach assumes a convex QP where the Hessian is positive semidefinite, ensuring finite termination under nondegeneracy assumptions.

Key developments in active-set methods trace back to extensions of the simplex method for linear programming, with seminal work in the 1970s including Fletcher's general QP algorithm and Gill and Murray's numerically stable implementations for problems with bound constraints. The algorithm typically begins with an initial feasible point and a guess for the working set (often empty or based on simple bounds), solves the KKT system for the current subproblem using techniques like conjugate gradients or matrix factorizations, and employs a line search or step-length computation to ensure progress toward feasibility and optimality. Variants address nonconvex QPs, degeneracy, or large-scale problems by incorporating trust regions, regularization, or dual formulations to improve convergence rates, which can be superlinear or quadratic in nondegenerate cases.

Active-set methods excel in small- to medium-scale QPs due to their ability to exploit sparsity, structure, and warm-start capabilities (reusing factorizations from prior solutions), which makes them suitable for sequential QP solves in applications like model predictive control and support vector machines. They offer robustness over interior-point methods for certain constraint configurations and provide exact active sets at termination, aiding interpretability, though they may suffer from zigzagging in degenerate problems without safeguards. Extensions to nonlinear programming via sequential quadratic programming (SQP) frameworks integrate active-set strategies for general nonlinear constraints, balancing efficiency and reliability in engineering and economic models.

Introduction and History

Overview

The active-set method is an iterative algorithm employed in constrained optimization to solve problems involving inequality constraints by identifying and maintaining a working set of "active" constraints, those that bind at the current iterate, meaning they are satisfied with equality. This approach focuses on a subset of the constraints presumed to be active at the optimal solution, iteratively refining the set through additions or removals based on feasibility and optimality assessments.

The method's core motivation stems from the combinatorial challenge posed by inequality constraints in optimization problems of the form $\min f(\mathbf{x})$ subject to $g_i(\mathbf{x}) \leq 0$ for $i = 1, \dots, m$, where the active constraints at a solution $\mathbf{x}^*$ satisfy $g_i(\mathbf{x}^*) = 0$. By treating these active inequalities as equalities, the problem reduces to an easier equality-constrained subproblem, often solved using techniques like Lagrange multipliers, thereby simplifying computations and leveraging the structure of quadratic or linear objectives. This reduction enhances efficiency, particularly when the number of truly active constraints is small compared to the total, avoiding the need to consider all possible constraint combinations.

To illustrate intuitively, consider a simple problem: maximize $3x_1 + 2x_2$ subject to $x_1 + x_2 \leq 4$, $2x_1 + x_2 \leq 5$, $x_1 \geq 0$, $x_2 \geq 0$. The method, akin to the simplex method, starts at a vertex like $(0,0)$ where the non-negativity constraints are active, then computes an improving feasible direction (e.g., increasing $x_1$), moving along the boundary until another constraint becomes active, eventually reaching the optimum $(1,3)$ where $x_1 + x_2 = 4$ and $2x_1 + x_2 = 5$ bind.
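The vertex arithmetic in this example can be checked by brute-force enumeration (a hypothetical check, not the active-set method itself): each candidate vertex is the intersection of two active constraints.

```python
import numpy as np
from itertools import combinations

# All constraints as rows of A x <= b, including -x1 <= 0 and -x2 <= 0.
A = np.array([[1., 1.], [2., 1.], [-1., 0.], [0., -1.]])
b = np.array([4., 5., 0., 0.])
c = np.array([3., 2.])                      # objective 3*x1 + 2*x2 to maximize

best = None
for i, j in combinations(range(len(b)), 2):  # a vertex = two active constraints
    try:
        v = np.linalg.solve(A[[i, j]], b[[i, j]])
    except np.linalg.LinAlgError:
        continue                             # parallel constraints: no vertex
    if np.all(A @ v <= b + 1e-9) and (best is None or c @ v > c @ best):
        best = v
print(best)                                  # -> [1. 3.], where both cuts bind
```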

Historical Development

The active-set method has its origins in the simplex method for linear programming, introduced by George B. Dantzig in 1947, which iteratively updates a basis corresponding to an active set of constraints to solve linear optimization problems. This foundational approach influenced early extensions to quadratic programming (QP), where active-set ideas emerged to handle inequality constraints by identifying and updating subsets of binding constraints. In the late 1950s, E. M. L. Beale developed one of the first explicit active-set methods for QP, proposing an algorithm that solves a sequence of equality-constrained subproblems by adjusting the active set based on Lagrange multipliers.

During the 1960s and 1970s, active-set methods gained prominence in nonlinear optimization. Roger Fletcher contributed significantly in 1971 with a general QP algorithm that refined active-set strategies for nonconvex cases, emphasizing efficient updates to avoid cycling and ensure finite termination. The extension to nonlinear problems accelerated in the 1970s through sequential quadratic programming (SQP), originating in R. B. Wilson's 1963 PhD thesis and further developed by S.-P. Han and M. J. D. Powell, who established local and global convergence properties using quasi-Newton approximations and merit functions. These advancements positioned active-set methods as robust alternatives to penalty and barrier approaches for constrained nonlinear optimization.

In the 1980s, primal-dual variants enhanced efficiency, particularly for QP subproblems in SQP. A landmark was the dual active-set method by D. Goldfarb and A. Idnani in 1983, which solves strictly convex QPs by iteratively adjusting dual variables and active bounds, offering numerical stability and finite convergence. The 1990s saw integration into machine learning, notably for training support vector machines (SVMs), where decomposition-based QP solvers employing strategies inspired by active-set methods, such as SVMlight (Joachims, 1999), exploited sparsity in the dual formulation to handle large-scale problems.

Post-2000 developments focused on parametric and warm-start capabilities for real-time applications, such as model predictive control. The qpOASES solver by H. J. Ferreau et al. (2008) introduced an online active-set method that efficiently warm-starts sequences of related QPs by reusing prior active sets, enabling millisecond solutions in embedded systems. Since the 2010s, active-set methods have continued to evolve with advancements in complexity certification, parallel implementations, and machine learning-assisted techniques for predicting active sets to accelerate solving sequences of QPs. The classical theory and algorithms are comprehensively treated in J. Nocedal and S. J. Wright's Numerical Optimization (2006), a standard reference highlighting the enduring impact of active-set methods.

Core Concepts

Definition of Active Set

In constrained optimization, consider the general problem of minimizing an objective function $f(x)$ subject to inequality constraints $g_i(x) \leq 0$ for $i = 1, \dots, m$, where $x \in \mathbb{R}^n$. The active set $A(x)$ at a feasible point $x$ is formally defined as the index set of all equality constraints (if present) together with the indices $i$ of inequality constraints that hold with equality, i.e., $A(x) = E \cup \{ i \in I \mid g_i(x) = 0 \}$, where $E$ denotes the set of equality constraint indices and $I = \{1, \dots, m\}$ the inequality indices. This set identifies the binding constraints that limit the feasible region at $x$, and standard notation uses $A$ to represent these indices.

In active-set methods, the algorithm maintains an estimate known as the working set, which approximates the true active set during iterative steps and guides the solution of equality-constrained subproblems. The working set is typically a linearly independent subset of the estimated active constraints to ensure computational feasibility. The inactive set comprises the remaining constraint indices not in the active set, which are treated as non-binding and thus ignored in the current subproblem, allowing the search direction to potentially violate them temporarily. Lagrange multipliers associated with constraints in the active set play a key role in verifying optimality conditions, as they must be nonnegative for inequalities to indicate binding status.

For example, in bound-constrained optimization where each variable $x_j$ satisfies $l_j \leq x_j \leq u_j$, the active set $A(\hat{x})$ at a point $\hat{x}$ consists of the indices $j$ where $\hat{x}_j = l_j$ (lower bound active) or $\hat{x}_j = u_j$ (upper bound active), often denoted with sign conventions to reflect directions at stationarity.
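A direct transcription of this definition is short (a minimal sketch; the constraint data and tolerance are illustrative, and equality constraints, which are always active, are omitted for brevity):

```python
import numpy as np

def active_set(x, constraints, tol=1e-8):
    """Return indices i of inequality constraints g_i(x) <= 0 that are
    binding (g_i(x) = 0 up to a tolerance) at the point x."""
    return {i for i, g in enumerate(constraints) if abs(g(x)) <= tol}

# Bound-constrained example: 0 <= x_1 <= 1 and 0 <= x_2 <= 1, written g(x) <= 0.
g = [lambda x: -x[0], lambda x: x[0] - 1.0,
     lambda x: -x[1], lambda x: x[1] - 1.0]
print(active_set(np.array([0.0, 0.5]), g))   # -> {0}: lower bound on x_1 binds
```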

Role in Constrained Optimization

The active-set method plays a central role in constrained optimization by leveraging the Karush-Kuhn-Tucker (KKT) conditions to identify and manage binding constraints. The KKT conditions consist of four key components: stationarity, which requires the gradient of the objective to lie in the cone spanned by the constraint gradients; primal feasibility, ensuring the solution satisfies all constraints; dual feasibility, mandating non-negative multipliers for inequalities; and complementary slackness, which enforces that multipliers are zero for non-binding inequalities. In this framework, the active set, comprising the indices of inequality constraints that are binding at equality, determines which inequalities are treated as equalities, thereby satisfying complementary slackness for the optimal solution.

Lagrange multipliers associated with active constraints provide insight into their binding nature and guide active-set updates. For an active constraint $g_i(x) = 0$, a corresponding multiplier $\lambda_i > 0$ indicates that the constraint is binding and contributes to the stationarity condition $\nabla f(x) + \sum_i \lambda_i \nabla g_i(x) = 0$; conversely, a negative $\lambda_i$ signals that the constraint is not optimally binding and should be removed from the active set to restore dual feasibility. This mechanism ensures that the method iteratively refines the active set to approach a KKT point.

By treating active constraints as equalities, the active-set method reduces the original inequality-constrained problem to a sequence of equality-constrained subproblems, often formulated as quadratic programs (QPs) in nonlinear programming contexts. Specifically, the search direction $d$ is obtained by solving the equality-constrained QP subproblem
$$\min_d \ \frac{1}{2} d^T H d + g^T d \quad \text{subject to} \quad A^T d = 0,$$
where $H$ approximates the Hessian of the Lagrangian, $g = \nabla f(x)$, and the columns of $A$ are the gradients of the active constraints. This corresponds to solving the Newton system $\nabla^2 L \, d = -\nabla L$, with $L$ the Lagrangian, effectively projecting the Newton step onto the null space of the active constraints.

In contrast to interior-point methods, which remain strictly inside the feasible region by incorporating barrier terms to handle inequalities softly, active-set methods focus on the boundary of the feasible set, exactly enforcing active constraints while ignoring inactive ones during each iteration. This boundary-oriented approach facilitates precise handling of sparsity and structure in the constraints but requires careful management of active-set changes to ensure progress toward feasibility and optimality.
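The subproblem above is a linear solve. The following is a minimal numpy/scipy sketch of the null-space version (the matrix values are illustrative):

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative data: H approximates the Hessian of the Lagrangian, g = grad f(x),
# and the single column of A is the gradient of the one active constraint.
H = np.array([[2., 0.], [0., 2.]])
g = np.array([-2., -2.])
A = np.array([[1.], [1.]])

Z = null_space(A.T)                  # basis of the null space {d : A^T d = 0}
p = np.linalg.solve(Z.T @ H @ Z, -Z.T @ g)
d = Z @ p                            # Newton step projected onto the null space
lam = np.linalg.lstsq(A, -(g + H @ d), rcond=None)[0]   # multiplier estimate
print(d, lam)                        # d ~ [0, 0] and lam = [2.]: a KKT point
```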

Algorithm

Step-by-Step Procedure

The active-set method begins with the initialization of a feasible starting point $x^0$ that satisfies all inequality constraints $Ax^0 \leq b$, where $A$ is the constraint matrix and $b$ the right-hand-side vector; such a point can be obtained via a phase-one solver if necessary. An initial active set $\mathcal{A}^0$ is selected, often empty or consisting of the constraints that are exactly tight at $x^0$, i.e., $\mathcal{A}^0 = \{ i \mid a_i^T x^0 = b_i \}$, where $a_i^T$ is the $i$-th row of $A$.

In each iteration $k$, given the current feasible point $x^k$ and active set $\mathcal{A}^k$, the method solves an equality-constrained quadratic subproblem to determine a search direction $d^k$ and associated Lagrange multipliers $\lambda^k$: minimize $\frac{1}{2}(x^k + d)^T Q (x^k + d) + c^T (x^k + d)$ subject to $A_{\mathcal{A}^k}(x^k + d) = b_{\mathcal{A}^k}$, where $Q$ is the positive semidefinite Hessian matrix, $c$ the linear term, and $A_{\mathcal{A}^k}$ the rows of $A$ indexed by $\mathcal{A}^k$. This subproblem is solved using techniques such as range-space or null-space decomposition, yielding $d^k$ and $\lambda^k$. If $d^k = 0$ and all $\lambda^k \geq 0$, then $x^k$ satisfies the Karush-Kuhn-Tucker (KKT) conditions and is optimal. Otherwise, the algorithm proceeds to update the iterate.

To maintain feasibility, the step length $\alpha^k$ is computed as the largest value in $(0, 1]$ such that $x^k + \alpha^k d^k$ remains feasible for all inactive constraints $i \notin \mathcal{A}^k$; specifically, $\alpha^k = \min_{i \notin \mathcal{A}^k,\, a_i^T d^k > 0} \frac{b_i - a_i^T x^k}{a_i^T d^k}$ if such a blocking constraint exists, and $\alpha^k = 1$ otherwise (i.e., when $x^k + d^k$ is feasible). The next iterate is then $x^{k+1} = x^k + \alpha^k d^k$, and the active set is updated by adding the index of the blocking constraint if $\alpha^k < 1$. If $d^k = 0$ but some $\lambda^k < 0$, a constraint with negative multiplier is removed from $\mathcal{A}^k$ instead, and $x^{k+1} = x^k$. The process repeats until convergence. A line search may be employed along $d^k$ to ensure sufficient decrease in the objective if exact feasibility steps are insufficient. The high-level algorithm can be expressed in pseudocode as follows:

Initialize: feasible x^0, active set A^0 = {i | a_i^T x^0 = b_i}
While true:
    Solve the equality-constrained QP subproblem for d^k and λ^k on A^k
    If d^k = 0:
        If all λ^k ≥ 0:
            Return x^k as optimal
        Else:
            Remove j ∈ A^k with minimal (most negative) λ_j^k
            Set A^{k+1} = A^k \ {j}, x^{k+1} = x^k
    Else:
        Compute α^k = min{1, min_{i ∉ A^k, a_i^T d^k > 0} (b_i - a_i^T x^k)/(a_i^T d^k)}
        Set x^{k+1} = x^k + α^k d^k
        If α^k < 1:
            Add the blocking index j: A^{k+1} = A^k ∪ {j}
        Else:
            Set A^{k+1} = A^k ∪ {i ∉ A^k | a_i^T x^{k+1} = b_i}
    k ← k + 1


This procedure ensures primal feasibility at each step while iteratively refining the active set toward optimality. As a simple illustrative example, consider the 2D quadratic program: minimize $x_1^2 + x_2^2 - 4x_1 - 4x_2$ subject to $x_1 + x_2 \leq 2$ and $x_1 - 2x_2 \leq 2$, with initial feasible point $x^0 = [0, 0]^T$ and empty active set $\mathcal{A}^0 = \emptyset$. In the first iteration, the unconstrained subproblem yields the direction $d^0 = [2, 2]^T$ (the Newton direction). The step length is computed considering the constraints: for the first constraint, $a_1^T d^0 = 4 > 0$, so $\alpha^0 = (2 - 0)/4 = 0.5$; the second does not block. Thus $x^1 = [1, 1]^T$, and the first constraint is added, giving $\mathcal{A}^1 = \{1\}$. In the next iteration, solving the subproblem with this active set yields $d^1 = 0$ and $\lambda^1 > 0$, confirming $x^1 = [1, 1]^T$ as the optimum satisfying the KKT conditions.
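These two iterations can be checked numerically with a standalone numpy sketch (variable names are illustrative):

```python
import numpy as np

Q, c = 2 * np.eye(2), np.array([-4., -4.])       # objective 1/2 x'Qx + c'x
A, b = np.array([[1., 1.], [1., -2.]]), np.array([2., 2.])
x = np.zeros(2)

# Iteration 1: empty active set, so the subproblem is unconstrained.
d = np.linalg.solve(Q, -(Q @ x + c))             # Newton direction [2, 2]
ratios = [(b[i] - A[i] @ x) / (A[i] @ d) for i in range(2) if A[i] @ d > 0]
alpha = min(1.0, min(ratios))                    # ratio test gives 0.5
x = x + alpha * d                                # x^1 = [1, 1]; constraint 1 active

# Iteration 2: solve the KKT system with a_1^T d = 0 enforced.
K = np.block([[Q, A[[0]].T], [A[[0]], np.zeros((1, 1))]])
d_lam = np.linalg.solve(K, np.concatenate([-(Q @ x + c), [0.]]))
print(x, d_lam)       # x = [1, 1]; d = [0, 0] and lambda = 2 > 0, so optimal
```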

Active Set Updates

In active-set methods for constrained optimization, the active set $A$ is updated iteratively to better approximate the constraints binding at the solution. A key update mechanism involves removing constraints from $A$ when their associated Lagrange multipliers indicate infeasibility in the dual problem. Specifically, after solving the equality-constrained subproblem with the current active set, if the multiplier $\lambda_i < 0$ for any constraint $i \in A$, that constraint is dropped from $A$, as a negative multiplier suggests the constraint should not be active at the optimum. This removal rule ensures progress toward dual feasibility while maintaining primal feasibility.

To add constraints to $A$, the method computes a search direction $d$ from the subproblem solution and determines the step length $\alpha$ along this direction that reaches the boundary of the feasible region defined by the inactive constraints. The step length is calculated as
$$\alpha = \min\left\{1,\ \min_{j \notin A,\ \nabla g_j(x)^T d > 0} \frac{-g_j(x)}{\nabla g_j(x)^T d}\right\},$$
where $g_j(x)$ is the $j$-th constraint function and $\nabla g_j(x)$ is its gradient (often denoted $A_j$). The constraint $j$ attaining the smallest such ratio, i.e., the first one violated along $d$, is then added to $A$ (see the sketch at the end of this subsection). This addition rule identifies newly binding constraints, aligning the active set with the Karush-Kuhn-Tucker (KKT) complementary slackness conditions.

The Lagrange multipliers for the active constraints are computed as part of solving the subproblem, which for quadratic programming takes the form of a KKT system. Given the objective $\frac{1}{2} x^T H x + c^T x$ subject to $A_A x = b_A$ (where $A_A$ and $b_A$ correspond to the active set), the direction $d$ and multipliers $\lambda$ satisfy
$$\begin{bmatrix} H & A_A^T \\ A_A & 0 \end{bmatrix} \begin{bmatrix} d \\ \lambda \end{bmatrix} = \begin{bmatrix} -g \\ 0 \end{bmatrix},$$
where $g = Hx + c$ is the gradient at the current iterate $x$. This yields the multipliers directly, allowing immediate checks for removals. For general nonlinear problems, similar projections or Newton steps adapt this computation.

Standard single-constraint additions can lead to zigzagging, where the algorithm oscillates between nearby active sets without substantial progress, particularly in nearly degenerate cases. To mitigate this, some implementations add multiple violated constraints per iteration, such as all those binding at the computed step or a subset chosen by violation severity, accelerating convergence while preserving feasibility. Active-set methods also face cycling, where degenerate linear dependence among active constraints causes repeated activation and deactivation of the same sets, stalling progress. Prevention strategies include small perturbations to the objective or constraints, or specialized anti-cycling procedures that track iteration history and enforce non-repetition rules. Additionally, warm-starting leverages solutions from prior related problems (e.g., in model predictive control) by initializing with a near-optimal active set, significantly reducing iterations for subsequent solves.
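The addition rule above can be transcribed directly for general constraints $g_j(x) \leq 0$ (a minimal sketch; the constraint data is illustrative and matches the 2D example of the previous subsection):

```python
import numpy as np

def blocking_constraint(gs, grads, x, d):
    """Ratio test: index of the first inactive constraint g_j(x) <= 0 hit
    along direction d, and the corresponding step length alpha <= 1."""
    alpha, j_block = 1.0, None
    for j, (g, grad) in enumerate(zip(gs, grads)):
        slope = grad(x) @ d
        if g(x) < 0 and slope > 0:       # strictly inactive, heading to boundary
            ratio = -g(x) / slope
            if ratio < alpha:
                alpha, j_block = ratio, j
    return alpha, j_block

# Two linear constraints written as g_j(x) <= 0, with constant gradients.
gs = [lambda x: x[0] + x[1] - 2.0, lambda x: x[0] - 2.0 * x[1] - 2.0]
grads = [lambda x: np.array([1., 1.]), lambda x: np.array([1., -2.])]
print(blocking_constraint(gs, grads, np.zeros(2), np.array([2., 2.])))
# -> (0.5, 0): constraint 0 blocks the step at alpha = 0.5
```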

Variants and Extensions

Primal-Dual Active-Set Methods

Primal-dual active-set methods extend the standard active-set approach by simultaneously solving for both the primal variables and the associated dual multipliers, ensuring progress toward satisfying the Karush-Kuhn-Tucker (KKT) conditions at each iteration. These methods are particularly suited for convex quadratic programming (QP) problems with equality constraints and simple bounds on the variables, where the objective is to minimize $\frac{1}{2} x^T H x + c^T x$ subject to $Ax = b$ and $l \leq x \leq u$. Unlike purely primal methods that maintain feasibility in the primal space, primal-dual variants solve a coupled system that advances both primal and dual feasibility.

The core of the primal-dual framework involves solving the KKT system for the equality-constrained subproblem defined by the current active set $\mathcal{A}$:
$$\begin{bmatrix} H & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} d \\ \Delta \lambda \end{bmatrix} = \begin{bmatrix} -g \\ 0 \end{bmatrix},$$
where $H$ is the Hessian matrix (assumed positive semidefinite), the rows of $A$ are the gradients of the active constraints, $g = \nabla f(x)$ is the gradient, $d$ is the primal search direction, and $\Delta\lambda$ is the change in the dual multipliers. This system is solved to compute steps that reduce the duality gap and update both the primal iterate and the multipliers. The method identifies the active set by partitioning variables into fixed (at bounds) and free sets, transforming the problem into an equality-constrained QP over the free variables.

A key advantage of primal-dual active-set methods is their ability to handle degeneracy, situations where multiple constraints are active at the solution without linearly independent gradients, through stabilized decompositions, making them robust for ill-conditioned problems. They are especially useful for QP with simple bounds, as the dual multipliers directly indicate bound violations, allowing efficient identification of the optimal active set without explicit projection steps. Compared to primal-only methods, these variants update both the active set and the multipliers simultaneously; for instance, after computing the step, constraints are added to the active set if the updated multipliers for inactive bounds violate dual feasibility (e.g., $\lambda_j < 0$ for an inactive lower bound, indicating it should be activated). Constraints may also be removed if multipliers become zero or change sign, ensuring monotonic progress toward the KKT point.

The Gill-Murray approach, a foundational primal-dual variant for QP, employs range-space and null-space decompositions to solve the KKT system efficiently, particularly for large-scale problems. In the range-space decomposition, the primal step is projected onto the range of $A^T$ to satisfy the equality constraints, while the null-space method parameterizes the free variables in the kernel of $A$ to reduce the dimensionality of the Hessian solve. These techniques, combined with partial Cholesky factorizations for handling indefiniteness, enable the method to maintain numerical stability and avoid cycling in degenerate cases by using blocking tests on multipliers (e.g., blocking addition of a constraint if $\lambda_j \geq -\epsilon$ for a small tolerance $\epsilon$). This approach has been implemented in solvers like SNOPT and forms the basis for many modern QP algorithms.
Recent extensions of active-set methods include parametric variants for multiparametric QP, enabling explicit solutions for online optimization in model predictive control, and integrations with techniques like graph neural networks to accelerate solving large-scale QPs in embedded systems.

Applications in Quadratic and Nonlinear Programming

The active-set method is widely applied to quadratic programming (QP) problems of the form
$$\min_x \ \frac{1}{2} x^T Q x + c^T x \quad \text{subject to} \quad Ax \leq b,$$
where $Q$ is positive semidefinite, $A$ has full row rank, and strict complementarity holds at the solution. In this setting, the method iteratively identifies and refines the active set of constraints, solving reduced equality-constrained subproblems until optimality is achieved. For convex QP (where $Q$ is positive definite), the algorithm guarantees finite termination due to the finite number of possible active sets and the strict decrease of the objective function at each iteration.

In nonlinear programming, active-set methods are integrated into sequential quadratic programming (SQP) frameworks, where the nonlinear problem is successively approximated by QP subproblems that incorporate a local quadratic model of the objective and linearized constraints. The active set from the previous QP solution serves as a warm start for the current subproblem, with updates occurring across major SQP iterations to reflect changes in the nonlinear constraint approximations and ensure progress toward a KKT point. This approach leverages the efficiency of active-set QP solvers while handling the nonlinearity through iterative refinement.

For bound-constrained QP, specialized active-set variants combine projection steps with active-bound identification to enforce simple bounds like $l \leq x \leq u$. These methods project the gradient onto the region defined by the current active bounds, then update the set by adding or removing bounds based on multiplier signs and projected search directions, often achieving faster convergence for problems with many inactive bounds. Active-set methods extend to piecewise-linear QP by treating nonsmooth terms, such as absolute values in the objective (e.g., $\sum_i |x_i|$), as additional linear constraints that activate at kinks, effectively reformulating the problem as a parametric QP where the active set tracks the piecewise regions. This adaptation maintains finite termination for convex cases by enumerating the finite number of linear pieces.

A representative application arises in portfolio optimization, where the QP minimizes portfolio variance subject to expected return targets and inequality constraints on asset weights (e.g., non-negativity and sum-to-one). Here, the active set typically identifies zero-weight assets as binding non-negativity constraints, allowing the method to solve the reduced problem over the remaining assets efficiently.
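As an illustration of the portfolio application, a toy mean-variance problem can be solved with scipy's SLSQP routine, an SQP-type solver; the covariance, returns, and target below are made up for illustration, not financial data.

```python
import numpy as np
from scipy.optimize import minimize

# Toy mean-variance portfolio: minimize w'Sw subject to a return target,
# full investment, and no short sales.
S = np.array([[0.10, 0.02, 0.04],
              [0.02, 0.08, 0.01],
              [0.04, 0.01, 0.12]])           # covariance matrix
mu = np.array([0.06, 0.05, 0.09])            # expected returns
r_target = 0.07

res = minimize(
    lambda w: w @ S @ w, x0=np.full(3, 1 / 3),
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0},
                 {"type": "ineq", "fun": lambda w: mu @ w - r_target}],
    bounds=[(0.0, None)] * 3, method="SLSQP")
print(np.round(res.x, 4))   # weights; any zeros are binding non-negativity bounds
```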

Analysis

Convergence Properties

The active-set method exhibits finite termination for strictly convex quadratic programming (QP) problems with nondegenerate constraints, stepping through a finite number of possible active sets until an optimal solution is reached. This property holds under key assumptions, including strict convexity of the objective function (ensured by a positive definite Hessian), linear independence of the gradients of the active constraints (LICQ), and primal and dual feasibility at each iteration. Nondegeneracy prevents cycling by ensuring that constraint violations or binding constraints change irreversibly, limiting the algorithm to at most $2^m$ iterations, where $m$ is the number of inequality constraints, though practical performance is typically much better.

For nonlinear optimization, the active-set method is often embedded within sequential quadratic programming (SQP) frameworks, where global convergence to a Karush-Kuhn-Tucker (KKT) point is guaranteed under second-order sufficient conditions, LICQ, and the use of a merit function (such as the $\ell_1$ penalty function) to ensure descent. These conditions include smoothness of the objective and constraint gradients, boundedness of the Lagrange multipliers, and feasibility of the QP subproblems. The algorithm achieves this by solving a sequence of QP subproblems that approximate the nonlinear problem, with line-search or trust-region globalization strategies preventing divergence from remote starting points.

Locally, near a solution satisfying second-order sufficient conditions, the method demonstrates superlinear convergence for QP, with the rate depending on the conditioning of the reduced Hessian in the null space of the active constraints; quadratic convergence is attainable if the Hessian is exact and well-conditioned. In SQP settings, superlinear or quadratic local convergence to the KKT point follows under LICQ and second-order sufficiency, particularly when quasi-Newton approximations are used and the Hessian error diminishes appropriately.

A proof sketch for finite termination in QP relies on the finite number of possible working sets and the strict decrease of the quadratic objective at each iteration under nondegeneracy, ensuring no active set is repeated; each step either solves the subproblem exactly or updates the active set via a nonzero Lagrange multiplier or constraint violation. For global convergence in SQP, the merit-function reduction per iteration, combined with Taylor expansions around the KKT conditions, guarantees descent until stationarity, while local rates are established via analysis of the primal-dual system.

Performance and Complexity

The active-set method exhibits a worst-case complexity that can be exponential in the number of inequality constraints, as there are up to $2^m$ possible working sets for $m$ constraints, potentially requiring the exploration of many combinations in degenerate cases, similar to the simplex method for linear programming. In practice, however, the method demonstrates polynomial-time behavior on most instances, performing efficiently on small- to medium-scale problems with up to roughly 1000 variables, where it benefits from rapid convergence and effective warm-starting for sequences of parametric quadratic programs.

Compared to interior-point methods, active-set approaches excel on sparse or structured problems with fewer constraints, offering faster iteration times due to boundary-focused updates, though interior-point methods scale better to very large quadratic programs with thousands of variables. Relative to penalty methods, active-set strategies provide superior accuracy on ill-conditioned constraints by directly identifying the active set without introducing artificial parameters that exacerbate numerical instability. Each iteration typically incurs a cost of $O(n^2)$ operations for dense quadratic programs, dominated by factorization of, or low-rank updates to, the reduced Hessian, though techniques such as null-space reduction (projecting onto the null space of the active constraints via QR or Cholesky factorizations) can mitigate this to near-linear cost for sparse Hessians.

Applications

In Machine Learning

In machine learning, active-set methods are prominently applied to solve the dual quadratic program underlying support vector machines (SVMs), particularly in the soft-margin case, where inequality constraints on the Lagrange multipliers are managed by iteratively identifying and updating the active set of support vectors. This approach incrementally adds violated constraints to the active set, enabling efficient handling of the convex QP by focusing computation on the binding constraints, which correspond to the support vectors that define the decision boundary.

Active-set methods also play a key role in sparse regression tasks, such as solving L1-regularized problems like the lasso, which can be reformulated as bound-constrained QPs where the active set identifies the subset of features with non-zero coefficients. By maintaining and updating this active set, the method promotes sparsity in the solution, selecting relevant predictors while setting others to zero, as demonstrated in path-following algorithms that trace solutions across regularization levels.

A notable example is the sequential minimal optimization (SMO) algorithm, a specialized active-set variant tailored to SVM training, which decomposes the dual QP by selecting and optimizing pairs of Lagrange multipliers at each step to update the active set. Introduced by Platt in 1998, SMO leverages analytical solutions for these two-variable subproblems, making it particularly effective for large-scale SVMs by avoiding full matrix inversions.

These methods offer advantages in machine learning by naturally yielding sparse solutions, limiting SVMs to essential support vectors and regression models to key features, which reduces model complexity and improves interpretability. They are especially efficient for kernel-based methods, where the dense kernel matrix in the QP Hessian is managed through active-set updates, avoiding explicit computation of the full matrix for positive semi-definite kernels like the RBF kernel. A brief illustration of this sparsity appears below.

In the 2020s, active-set methods have found applications in neural network pruning and constrained deep learning, where they address bound-constrained optimization to enforce sparsity in weights or activations during training. For instance, boundary-wandering strategies inspired by active-set techniques enhance feasibility in equality-constrained deep networks, enabling efficient pruning while maintaining performance on resource-limited devices.
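The sparsity claims can be illustrated with scikit-learn on synthetic data (a hedged sketch: SVC wraps libsvm's SMO solver, while sklearn's Lasso is solved by coordinate descent rather than an active-set method and is shown only to display the active coefficient set):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200))

svc = SVC(kernel="linear", C=1.0).fit(X, y)          # trained by libsvm's SMO
print("support vectors:", len(svc.support_), "of", len(X))

lasso = Lasso(alpha=0.1).fit(X, X[:, 0] + 0.5 * X[:, 1])
print("active coefficients:", np.flatnonzero(lasso.coef_))
```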

In Engineering and Finance

In financial modeling, the active-set method is widely applied to quadratic programming (QP) formulations of portfolio optimization, particularly the mean-variance models pioneered by Markowitz, where constraints on asset weights, risk budgets, and transaction costs are prevalent. These models minimize portfolio variance subject to expected return targets and inequality constraints, such as risk limits that cap exposure to volatile assets; the active-set method efficiently identifies and manages binding constraints by iteratively adjusting the working set of active inequalities, enabling sparse and efficient portfolios. For instance, in the classic Markowitz problem with short-sale prohibitions (non-negativity constraints on weights), the active-set approach determines which assets reach their lower bounds at optimality, effectively reducing the problem dimension and avoiding unnecessary computation on inactive constraints.

In structural engineering, the active-set method supports design optimization by solving QPs that incorporate inequality constraints on member stresses, nodal displacements, and geometric limits to ensure structural integrity under load. This approach is particularly useful in topology optimization, where it handles large-scale problems by activating only the relevant constraints, such as maximum stress thresholds in multi-load scenarios, leading to lightweight yet stable designs like bridges or frameworks. By focusing on the active set, the method scales to problems with billions of potential variables, providing globally optimal solutions for discretized ground structures while respecting displacement bounds to prevent failure modes.

In control systems, parametric active-set methods enable real-time QP solving for model predictive control (MPC) in embedded applications, such as automotive trajectory tracking and aerospace attitude stabilization, where constraints on actuators or states must be respected at each prediction horizon. These methods update the active set across sequential control steps, accommodating binding inequalities like speed limits in autonomous driving or thrust bounds in flight control, thus ensuring feasible and optimal control inputs. The approach's efficiency stems from its low iteration count per QP, making it suitable for embedded systems with millisecond deadlines.

A key benefit of the active-set method in these domains, especially in sequential finance problems like dynamic portfolio rebalancing, is warm-starting: reusing the optimal active set from prior QPs drastically reduces computation time, often converging in fewer than three iterations because constraint patterns are correlated across time steps. This feature enhances scalability in engineering designs requiring iterative refinement and in financial models with frequent market updates.
