Constrained optimization
In mathematical optimization, constrained optimization (in some contexts called constraint optimization) is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized. Constraints can be either hard constraints, which set conditions for the variables that are required to be satisfied, or soft constraints, whose violation is penalized in the objective function to a degree that depends on how far the conditions on the variables are from being satisfied.
Relation to constraint-satisfaction problems
The constrained-optimization problem (COP) is a significant generalization of the classic constraint-satisfaction problem (CSP) model.[1] COP is a CSP that includes an objective function to be optimized. Many algorithms are used to handle the optimization part.
General form
A general constrained minimization problem may be written as follows:[2]

min f(x)
subject to g_i(x) = c_i for i = 1, …, n (equality constraints)
h_j(x) ≥ d_j for j = 1, …, m (inequality constraints)

where g_i(x) = c_i and h_j(x) ≥ d_j are constraints that are required to be satisfied (these are called hard constraints), and f(x) is the objective function that needs to be optimized subject to the constraints.
In some problems, often called constraint optimization problems, the objective function is actually the sum of cost functions, each of which penalizes the extent (if any) to which a soft constraint (a constraint which is preferred but not required to be satisfied) is violated.
Solution methods
Many constrained optimization algorithms can be adapted to the unconstrained case, often via the use of a penalty method. However, search steps taken by the unconstrained method may be unacceptable for the constrained problem, leading to a lack of convergence. This is referred to as the Maratos effect.[3]
Equality constraints
Substitution method
For very simple problems, say a function of two variables subject to a single equality constraint, it is most practical to apply the method of substitution.[4] The idea is to substitute the constraint into the objective function to create a composite function that incorporates the effect of the constraint. For example, assume the objective is to maximize f(x, y) = x·y subject to x + y = 10. The constraint implies y = 10 − x, which can be substituted into the objective function to create p(x) = x(10 − x). The first-order necessary condition gives p′(x) = 10 − 2x = 0, which can be solved for x = 5 and, consequently, y = 10 − 5 = 5.
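The substitution can be checked numerically; below is a minimal Python sketch of the worked example (the function names are illustrative):

```python
# Substitution method for the example above: maximize f(x, y) = x*y
# subject to x + y = 10. Substituting y = 10 - x gives p(x) = x*(10 - x).

def p(x):
    """Objective after substituting the constraint y = 10 - x."""
    return x * (10 - x)

def p_prime(x):
    """Derivative of p; the first-order necessary condition is p'(x) = 0."""
    return 10 - 2 * x

# Solving p'(x) = 10 - 2x = 0 gives the stationary point:
x_star = 5.0
y_star = 10 - x_star   # recover y from the constraint
f_star = p(x_star)     # optimal objective value
```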
Lagrange multiplier
If the constrained problem has only equality constraints, the method of Lagrange multipliers can be used to convert it into an unconstrained problem whose number of variables is the original number of variables plus the original number of equality constraints. Alternatively, if the constraints are all equality constraints and are all linear, they can be solved for some of the variables in terms of the others, and the former can be substituted out of the objective function, leaving an unconstrained problem in a smaller number of variables.
Inequality constraints
With inequality constraints, the problem can be characterized in terms of the geometric optimality conditions, the Fritz John conditions, and the Karush–Kuhn–Tucker conditions, under which simple problems may be solvable.
Linear programming
If the objective function and all of the hard constraints are linear and some hard constraints are inequalities, then the problem is a linear programming problem. This can be solved by the simplex method, which usually works in polynomial time in the problem size but is not guaranteed to, or by interior point methods which are guaranteed to work in polynomial time.
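Because a linear objective attains its optimum at a vertex of the feasible polyhedron, a tiny two-variable LP can be solved by enumerating vertices. The sketch below illustrates this geometry with hypothetical problem data; it is not a production solver (real implementations use the simplex or interior-point methods just described):

```python
from itertools import combinations

# Toy LP (hypothetical data): maximize 3x + 2y subject to
#   x + y <= 4,  x <= 3,  y <= 2,  x >= 0,  y >= 0.
# Each hard constraint is stored as (a1, a2, b), meaning a1*x + a2*y <= b.
constraints = [
    (1, 1, 4),    # x + y <= 4
    (1, 0, 3),    # x <= 3
    (0, 1, 2),    # y <= 2
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def objective(x, y):
    return 3 * x + 2 * y

def feasible(x, y, eps=1e-9):
    return all(a1 * x + a2 * y <= b + eps for a1, a2, b in constraints)

# The optimum lies at a vertex of the feasible polygon, i.e. where two
# constraint boundaries intersect.
vertices = []
for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
    det = a1 * c2 - a2 * c1
    if abs(det) < 1e-12:
        continue                      # parallel boundaries: no vertex
    x = (b * c2 - a2 * d) / det       # Cramer's rule for the 2x2 system
    y = (a1 * d - b * c1) / det
    if feasible(x, y):
        vertices.append((x, y))

best = max(vertices, key=lambda v: objective(*v))
```

Enumerating vertices is exponential in general; the point here is only that the feasible region is a polyhedron and the optimum sits on its boundary.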
Nonlinear programming
If the objective function or some of the constraints are nonlinear, and some constraints are inequalities, then the problem is a nonlinear programming problem.
Quadratic programming
If all the hard constraints are linear and some are inequalities, but the objective function is quadratic, the problem is a quadratic programming problem. It is one type of nonlinear programming. It can still be solved in polynomial time by the ellipsoid method if the objective function is convex; otherwise the problem may be NP-hard.
KKT conditions
Allowing inequality constraints, the KKT approach to nonlinear programming generalizes the method of Lagrange multipliers. It can be applied under differentiability and convexity.
Branch and bound
Constraint optimization can be solved by branch-and-bound algorithms. These are backtracking algorithms storing the cost of the best solution found during execution and using it to avoid part of the search. More precisely, whenever the algorithm encounters a partial solution that cannot be extended to form a solution of better cost than the stored best cost, the algorithm backtracks, instead of trying to extend this solution.
Assuming that cost is to be maximized, the efficiency of these algorithms depends on how the cost that can be obtained from extending a partial solution is evaluated. Indeed, if the algorithm can backtrack from a partial solution, part of the search is skipped. The lower the estimated cost, the better the algorithm, as a lower estimated cost is more likely to be lower than the best cost of solution found so far.
On the other hand, this estimated cost cannot be lower than the effective cost that can be obtained by extending the solution, as otherwise the algorithm could backtrack while a solution better than the best found so far exists. As a result, the algorithm requires an upper bound on the cost that can be obtained from extending a partial solution, and this upper bound should be as small as possible.
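A toy sketch of this scheme in Python, for maximization over two 0/1 variables with hypothetical soft-constraint tables: the bound completes unassigned constraints optimistically, and the search prunes whenever the bound cannot beat the stored best.

```python
# Toy branch and bound (hypothetical tables): maximize the sum of soft
# constraints over 0/1 variables x0, x1. Each soft constraint is
# (scope, table): scope lists variable indices, table maps their values
# to a reward.
constraints = [
    ((0,), {(0,): 0, (1,): 3}),                # prefers x0 = 1
    ((1,), {(0,): 2, (1,): 0}),                # prefers x1 = 0
    ((0, 1), {(0, 0): 0, (0, 1): 0,
              (1, 0): 0, (1, 1): -4}),         # penalizes x0 = x1 = 1
]
n_vars = 2

def constraint_value(scope, table, assignment):
    return table[tuple(assignment[i] for i in scope)]

def bound(assignment):
    """Optimistic estimate: fully assigned constraints contribute their
    actual value, the others their maximum table entry. Exact once the
    assignment is complete."""
    total = 0
    for scope, table in constraints:
        if all(i < len(assignment) for i in scope):
            total += constraint_value(scope, table, assignment)
        else:
            total += max(table.values())
    return total

best_value = float("-inf")
best_assignment = None

def search(assignment):
    global best_value, best_assignment
    if bound(assignment) <= best_value:
        return                           # prune: cannot beat stored best
    if len(assignment) == n_vars:
        best_value = bound(assignment)   # exact for complete assignments
        best_assignment = tuple(assignment)
        return
    for v in (0, 1):
        search(assignment + [v])

search([])
```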
A variation of this approach called Hansen's method uses interval methods.[5] It inherently implements rectangular constraints.
First-choice bounding functions
One way of evaluating this upper bound for a partial solution is to consider each soft constraint separately. For each soft constraint, the maximal possible value over any assignment to the unassigned variables is assumed. The sum of these values is an upper bound because the soft constraints cannot assume a higher value. It is usually not exact, however, because the maximal values of different soft constraints may derive from different assignments: one soft constraint may be maximal for x = a while another constraint is maximal for x = b.
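A small numeric illustration of why this bound is usually not tight (the constraint tables are hypothetical):

```python
# Two soft constraints over a single variable x with domain {"a", "b"}.
# Bounding each constraint separately ignores that their maxima occur at
# different assignments of x.
soft_1 = {"a": 5, "b": 1}   # maximal for x = "a"
soft_2 = {"a": 0, "b": 4}   # maximal for x = "b"

first_choice_bound = max(soft_1.values()) + max(soft_2.values())
true_optimum = max(soft_1[x] + soft_2[x] for x in ("a", "b"))
# The bound (9) is valid but strictly larger than the true optimum (5).
```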
Russian doll search
This method[6] runs a branch-and-bound algorithm on n problems, where n is the number of variables. Each such problem is the subproblem obtained by dropping a sequence of variables x_1, …, x_i from the original problem, along with the constraints containing them. After the problem on variables x_{i+1}, …, x_n is solved, its optimal cost can be used as an upper bound while solving the other problems.
In particular, the cost estimate of a solution having x_{i+1}, …, x_n as unassigned variables is added to the cost that derives from the evaluated variables. In effect, this corresponds to ignoring the evaluated variables and solving the problem on the unassigned ones, except that the latter problem has already been solved. More precisely, the cost of soft constraints containing both assigned and unassigned variables is estimated as above (or using an arbitrary other method); the cost of soft constraints containing only unassigned variables is instead estimated using the optimal solution of the corresponding problem, which is already known at this point.
There is similarity between the Russian doll search method and dynamic programming. Like dynamic programming, Russian doll search solves sub-problems in order to solve the whole problem. But, whereas dynamic programming directly combines the results obtained on sub-problems to get the result of the whole problem, Russian doll search only uses them as bounds during its search.
Bucket elimination
The bucket elimination algorithm can be adapted for constraint optimization. A given variable can be indeed removed from the problem by replacing all soft constraints containing it with a new soft constraint. The cost of this new constraint is computed assuming a maximal value for every value of the removed variable. Formally, if x is the variable to be removed, C_1, …, C_n are the soft constraints containing it, and y_1, …, y_i are their variables except x, the new soft constraint is defined by:

C(y_1, …, y_i) = max over x of [C_1(x, y_1, …, y_i) + ⋯ + C_n(x, y_1, …, y_i)]
Bucket elimination works with an (arbitrary) ordering of the variables. Every variable is associated with a bucket of constraints; the bucket of a variable contains all constraints in which the variable is the highest in the order. Bucket elimination proceeds from the last variable to the first. For each variable, all constraints of the bucket are replaced as above to remove the variable. The resulting constraint is then placed in the appropriate bucket.
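A minimal sketch of a single elimination step, with hypothetical constraint tables over 0/1 variables:

```python
# Eliminate variable x from two soft constraints C1(x, y) and C2(x, z)
# over 0/1 domains (hypothetical tables), producing C_new(y, z).
DOM = (0, 1)
C1 = {(x, y): 2 * x + y for x in DOM for y in DOM}   # contains x and y
C2 = {(x, z): x * z for x in DOM for z in DOM}       # contains x and z

# New constraint over the remaining variables:
# C_new(y, z) = max over x of [C1(x, y) + C2(x, z)]
C_new = {(y, z): max(C1[(x, y)] + C2[(x, z)] for x in DOM)
         for y in DOM for z in DOM}
```

The new table no longer mentions x, so x is removed from the problem while preserving the optimal cost of every assignment to the remaining variables.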
References
- ^ Rossi, Francesca; van Beek, Peter; Walsh, Toby (2006-01-01), Rossi, Francesca; van Beek, Peter; Walsh, Toby (eds.), "Chapter 1 – Introduction", Foundations of Artificial Intelligence, Handbook of Constraint Programming, vol. 2, Elsevier, pp. 3–12, doi:10.1016/s1574-6526(06)80005-2, retrieved 2019-10-04
- ^ Martins, J. R. R. A.; Ning, A. (2021). Engineering Design Optimization. Cambridge University Press. ISBN 978-1108833417.
- ^ Sun, Wenyu; Yuan, Ya-Xiang (2010). Optimization Theory and Methods: Nonlinear Programming. Springer. p. 541. ISBN 978-1441937650.
- ^ Prosser, Mike (1993). "Constrained Optimization by Substitution". Basic Mathematics for Economists. New York: Routledge. pp. 338–346. ISBN 0-415-08424-5.
- ^ Leader, Jeffery J. (2004). Numerical Analysis and Scientific Computation. Addison Wesley. ISBN 0-201-73499-0.
- ^ Verfaillie, Gérard; Lemaître, Michel; Schiex, Thomas (1996). "Russian doll search for solving constraint optimization problems". AAAI/IAAI. Vol. 1.
Further reading
- Bertsekas, Dimitri P. (1982). Constrained Optimization and Lagrange Multiplier Methods. New York: Academic Press. ISBN 0-12-093480-9.
- Dechter, Rina (2003). Constraint Processing. Morgan Kaufmann. ISBN 1-55860-890-7.
- Madsen, K.; Nielsen, H.B.; Tingleff, O. (March 2004). Optimization with Constraints (PDF) (Technical report) (2nd ed.). IMM/DTU. 4213. Retrieved Sep 6, 2025.
Constrained optimization
Fundamentals
Definition and Scope
Constrained optimization refers to the mathematical process of finding the minimum or maximum of an objective function subject to equality or inequality constraints on the decision variables, which define a feasible region for potential solutions. This approach is fundamental in mathematical programming, where the goal is to achieve an optimal value within the boundaries imposed by the constraints, rather than searching an unrestricted space.[1][6] In contrast to unconstrained optimization, where the objective can be evaluated over the entire domain without restrictions, constrained optimization limits the search to the feasible region, which may be bounded or unbounded and can introduce local optima due to the geometry of the constraints.[2] This distinction highlights how constraints model real-world limitations, such as resource availability or regulatory requirements, potentially altering the nature of the solution landscape.[1] The scope of constrained optimization extends to continuous problems with real-valued variables, discrete problems involving integer decisions, and mixed-integer formulations that combine both types. It finds widespread applications in economics for modeling resource allocation under scarcity, in engineering for design optimization subject to physical limits, and in operations research for solving logistics and scheduling challenges.[7][8][9][10] A basic illustrative example involves a manufacturer seeking to minimize the total production cost of two goods while adhering to fixed resource limits on raw materials, where the quantities of each good must collectively not exceed the available supply.[2] Understanding constrained optimization assumes foundational knowledge of calculus for handling function behaviors and linear algebra for representing constraints and variables efficiently.[11]
Historical Development
The roots of constrained optimization trace back to the 17th and 18th centuries, when mathematicians began addressing problems involving extrema under constraints through the emerging field of calculus of variations. Leonhard Euler laid foundational work in 1744 with his publication Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, where he derived necessary conditions for optimizing functionals, introducing what became known as the Euler-Lagrange equation for variational problems with equality constraints.[12] This approach was pivotal in handling continuous optimization tasks, such as finding curves of minimal length or maximal area, and influenced subsequent developments in mechanics and physics. Joseph-Louis Lagrange built upon Euler's ideas in the late 18th century, formalizing the multiplier method in 1762 by simplifying optimality conditions for constrained variational problems and fully articulating it in his 1788 treatise Mécanique Analytique.[13] Lagrange's technique introduced auxiliary variables—now called Lagrange multipliers—to incorporate equality constraints directly into the objective function, enabling the solution of static equilibrium problems in mechanics without explicit substitution.[12] This method marked a significant advancement for equality-constrained optimization, providing a systematic analytical framework that persisted into the 19th century amid growing applications in engineering and economics. 
In the early 20th century, attention shifted toward inequality constraints, with Constantin Carathéodory contributing key insights in 1935 through his theorem on the properties of the inverse of the bordered Hessian in constrained optimization problems.[14] Carathéodory's work extended variational methods to handle inequalities more robustly, laying groundwork for comparative statics and optimal control.[15] This was followed by William Karush's 1939 master's thesis at the University of Chicago, which independently derived necessary optimality conditions for nonlinear problems with both equality and inequality constraints, though it remained unpublished at the time.[16] The mid-20th century saw formalization and broader adoption, as Harold W. Kuhn and Albert W. Tucker published their extension of Lagrange multipliers to inequality constraints in 1951, introducing the Kuhn-Tucker conditions as necessary for local optima in nonlinear programming.[17] Their work, building on Karush's earlier formulation, provided a cornerstone for modern constrained optimization theory. Post-World War II, George B. Dantzig's 1947 invention of the simplex method for linear programming revolutionized computational approaches to constrained problems, offering an efficient iterative algorithm that influenced the shift from analytical to numerical methods across optimization.[18] The modern era integrated constrained optimization with computing advancements, exemplified by Narendra Karmarkar's 1984 development of an interior-point method, a polynomial-time algorithm for linear programming that traversed the interior of the feasible region rather than boundaries, dramatically improving scalability for large-scale problems.[19] This innovation spurred widespread adoption of interior-point techniques in the 1980s and beyond, bridging theoretical guarantees with practical efficiency in fields like operations research.[17]
Formulations
General Problem Statement
Constrained optimization problems involve finding values of decision variables that minimize (or maximize) an objective function while satisfying a set of constraints defining the feasible region. The standard mathematical formulation is to minimize a function f(x) subject to equality constraints h_i(x) = 0 for i = 1, …, p and inequality constraints g_j(x) ≤ 0 for j = 1, …, m, where x ∈ ℝ^n represents the vector of decision variables.[20] This general form encompasses both equality-constrained and inequality-constrained problems as special cases, with the latter arising when all equalities are absent and vice versa.[21] The feasible set is defined as Ω = {x ∈ ℝ^n : h_i(x) = 0 for all i, g_j(x) ≤ 0 for all j}, which must be nonempty for the problem to be well-posed. In many practical settings, assumptions such as convexity of f and the g_j, and affinity of the h_i, ensure that local optima are global, while regularity conditions like the linear independence constraint qualification (LICQ)—requiring the gradients of active constraints to be linearly independent—facilitate the application of optimality conditions.[21] The objective function f can be linear, quadratic, or nonlinear, and the variables may be continuous or discrete, though classical methods typically assume continuous domains.[20] Maximization problems are equivalently formulated by minimizing −f(x), preserving the structure of the constraints. Multi-objective extensions replace the scalar f with a vector of objectives, seeking the Pareto front—a set of nondominated solutions where no objective can improve without degrading another—subject to the same constraints.[22] Stochastic variants arise when f, h_i, or g_j involve expectations over random variables, as in risk-averse or data-driven optimization.[23] For solution methods to apply, functions are often assumed differentiable, with twice-differentiability aiding second-order analyses.[21]
Equality-Constrained Form
In equality-constrained optimization, the problem is formulated as minimizing an objective function f(x) subject to equality constraints h_i(x) = 0 for i = 1, …, p, where each h_i : ℝ^n → ℝ is typically smooth, and p < n to ensure the system is underdetermined with a nontrivial feasible set.[24] This setup contrasts with unconstrained problems by restricting the search space to the solution set of the constraints, often leading to a lower-dimensional feasible region.[25] Geometrically, the feasible set M = {x ∈ ℝ^n : h(x) = 0}, where h = (h_1, …, h_p), forms a smooth manifold of dimension n − p under suitable regularity conditions, embedded in ℝ^n.[26] Optimal points occur at critical points of f restricted to this manifold, where the gradient of f is orthogonal to the tangent space of the constraint surface, ensuring no descent direction lies within the feasible set.[27] To guarantee the existence of optimal solutions and the validity of necessary conditions, constraint qualifications are imposed on the equality constraints. The linear independence constraint qualification (LICQ) requires that the gradients ∇h_i(x) for i = 1, …, p are linearly independent at a point in the feasible set.[28] For equality-only problems, the Mangasarian-Fromovitz constraint qualification (MFCQ) simplifies to LICQ, as it demands linear independence of the equality gradient vectors without additional inequality considerations.[27] Duality in equality-constrained optimization introduces the Lagrangian function L(x, λ) = f(x) + Σ_i λ_i h_i(x), where the λ_i are Lagrange multipliers.[29] This formulation encodes the constraints into an unconstrained saddle-point problem, with the dual function q(λ) = inf_x L(x, λ) providing a lower bound on the primal objective under strong duality conditions such as convexity and a constraint qualification like the linear independence constraint qualification (LICQ).[30] A simple example is minimizing f(x, y) = x² + y² subject to x + y = 1. The feasible set is the line y = 1 − x, and substituting yields f(x) = x² + (1 − x)², with minimum at x = 1/2, y = 1/2, and f = 1/2.
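The example can be verified numerically by restricting the objective to the constraint line; a simple grid scan, for illustration only:

```python
# Minimize x^2 + y^2 subject to x + y = 1: substitute y = 1 - x and scan
# the feasible line. (Grid scan for illustration; the analytic minimizer
# is x = 1/2.)

def objective(x, y):
    return x ** 2 + y ** 2

def reduced(x):
    return objective(x, 1 - x)   # objective restricted to y = 1 - x

xs = [i / 1000 for i in range(-1000, 2001)]   # grid on [-1, 2]
x_best = min(xs, key=reduced)
y_best = 1 - x_best
```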
Equality-constrained forms provide foundational building blocks for extending to inequality constraints, where active inequalities mimic equalities at optima.[25]
Inequality-Constrained Form
In inequality-constrained optimization, the problem is typically formulated as finding the minimum of an objective function over a feasible region defined by inequality constraints: minimize f(x) subject to g_j(x) ≤ 0 for j = 1, …, m, where f is the objective function (often assumed differentiable), and each g_j is a constraint function.[21] This setup contrasts with unconstrained optimization by restricting the search space to the intersection of sublevel sets {x : g_j(x) ≤ 0}. At a local optimum x*, constraints are classified as active if g_j(x*) = 0 or inactive if g_j(x*) < 0; active constraints bind the solution, effectively reducing the dimensionality of the problem, while inactive ones do not influence the local behavior.[31] The feasible region is a closed convex set if the g_j are convex; for linear inequalities Ax ≤ b, it forms a convex polyhedron bounded by hyperplanes. A key conceptual aspect involves complementarity, where at the optimum, each constraint is either inactive (g_j(x*) < 0) with zero associated multiplier, or active with a non-negative multiplier, satisfying μ_j g_j(x*) = 0 for multiplier μ_j; this ensures that inactive constraints do not penalize the objective unduly.[21] For convex problems (where f and the g_j are convex), Slater's condition guarantees strong duality: if there exists a strictly feasible point x̃ such that g_j(x̃) < 0 for all j, then the primal and dual optimal values coincide, and the dual problem attains its optimum.[32] This condition ensures that local minima are global and facilitates efficient solvability, as the duality gap vanishes under mild assumptions.
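A short sketch of classifying constraints as active or inactive at a candidate point, for a hypothetical one-dimensional problem (minimize f(x) = (x − 3)² subject to g_1(x) = x − 1 ≤ 0 and g_2(x) = −x − 3 ≤ 0):

```python
# Hypothetical problem: minimize f(x) = (x - 3)^2 subject to
# g1(x) = x - 1 <= 0 and g2(x) = -x - 3 <= 0. The minimizer is x* = 1.

def g1(x):
    return x - 1

def g2(x):
    return -x - 3

x_star = 1.0
values = {"g1": g1(x_star), "g2": g2(x_star)}

# Active: holds with equality at x*; inactive: strict inequality.
active = [name for name, v in values.items() if abs(v) < 1e-9]
inactive = [name for name, v in values.items() if v < -1e-9]

# Complementarity: mu_j * g_j(x*) = 0. Stationarity f'(x*) + mu1*g1'(x*) = 0
# gives mu1 = 4 for the active constraint; the inactive one gets mu2 = 0.
mu = {"g1": 4.0, "g2": 0.0}
complementarity = [mu[name] * values[name] for name in values]
```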
Equality constraints can be incorporated by converting each h(x) = 0 into a pair of inequalities h(x) ≤ 0 and −h(x) ≤ 0, transforming the problem into a purely inequality-constrained form; however, this requires constraint qualifications (such as linear independence of gradients) to preserve optimality conditions and avoid ill-posedness.[33] A representative example is resource allocation in linear programming, formulated as minimizing c^T x subject to Ax ≤ b and x ≥ 0, where x represents allocations, Ax ≤ b captures resource limits, and x ≥ 0 are bounds; here, nonnegativity constraints x_i ≥ 0 (equivalently −x_i ≤ 0) model physical limits, and the feasible region is a polyhedron.[34] When the objective or constraints are nonconvex, the feasible region may be nonconvex, leading to multiple local optima where active sets vary across basins of attraction; in such cases, distinguishing local from global optima is challenging, often requiring global search techniques, as local methods may converge to suboptimal points.[35]
Solution Methods for Equality Constraints
Direct Substitution
The direct substitution method, also known as the elimination method, addresses equality-constrained optimization problems by solving the constraint equations explicitly for a subset of the decision variables and substituting these expressions into the objective function, thereby transforming the problem into an unconstrained minimization over the remaining free variables. This approach is particularly applicable when the number of equality constraints m is small relative to the number of variables n, assuming the constraints are independent and solvable for m dependent variables x_D in terms of the n − m free variables x_F. The resulting reduced objective function is then F(x_F) = f(x_F, φ(x_F)), where (x_F, φ(x_F)) denotes the full vector of variables after substitution, and optimization proceeds using standard unconstrained techniques such as gradient descent or Newton's method.[21] The method follows a structured process: first, select and solve the equality constraints for the dependent variables, yielding explicit functions x_D = φ(x_F) under suitable regularity conditions; second, insert this into the objective to form F(x_F); third, compute the minimum of F by setting its gradient to zero or applying iterative algorithms, ensuring the solution satisfies the original constraints. For linear constraints, this substitution is exact and preserves the problem's structure without approximation.[21] This technique offers simplicity and computational efficiency in low-dimensional settings with few constraints, as it avoids introducing auxiliary variables and directly yields an unconstrained problem that can leverage well-established solvers. It is exact for linear equality constraints, where the substitution maintains convexity if the original objective is convex. However, explicit solvability of nonlinear constraints is often infeasible, limiting applicability to problems where closed-form expressions exist.
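The three-step process can be sketched on a hypothetical problem, using plain gradient descent for the unconstrained stage:

```python
# Hypothetical problem: minimize f(x, y) = (x - 1)^2 + (y - 2)^2 subject
# to the linear constraint y = 2x. Step 1: solve the constraint (already
# explicit). Step 2: substitute to get the reduced objective. Step 3:
# minimize the reduced objective with a standard unconstrained method.

def f_red(x):
    return (x - 1) ** 2 + (2 * x - 2) ** 2

def f_red_grad(x):
    return 2 * (x - 1) + 4 * (2 * x - 2)   # = 10x - 10

x = 0.0
for _ in range(200):                       # plain gradient descent
    x -= 0.05 * f_red_grad(x)

y = 2 * x   # recover the dependent variable from the constraint
```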
Additionally, numerical instability arises for nonlinear cases if the implicit functions are poorly conditioned, potentially amplifying errors in constraint satisfaction during optimization.[21][36] The validity of local substitution relies on the implicit function theorem, which guarantees the existence and differentiability of the implicit functions provided the Jacobian matrix of the constraints with respect to the dependent variables, ∂h/∂x_D, is nonsingular (i.e., full rank m) at the solution point. Violation or near-violation of this nonsingularity assumption—such as when the Jacobian determinant approaches zero—leads to high sensitivity: minor perturbations in the constraints or objective can cause substantial deviations in the substituted variables, resulting in ill-conditioned reduced problems and unreliable optima. This error amplification is particularly pronounced in numerical implementations, where floating-point precision exacerbates the instability, often necessitating regularization or alternative methods like Lagrange multipliers for robustness.[37][36] A representative example illustrates the method's application. Consider minimizing f(x, y) = x² + y² subject to the linear equality constraint x + y = 1. Solve the constraint for y = 1 − x, yielding the reduced objective F(x) = x² + (1 − x)². Differentiating gives F′(x) = 4x − 2 = 0, so x = 1/2 and y = 1/2, with minimum value 1/2. Here, the linear constraint allows exact substitution without numerical concerns.[21]
Lagrange Multiplier Method
The Lagrange multiplier method provides a systematic approach to finding local optima of a function subject to equality constraints by transforming the constrained problem into an unconstrained one involving auxiliary variables known as Lagrange multipliers. Developed originally by Joseph-Louis Lagrange in his 1788 work Mécanique Analytique to address problems in classical mechanics, the method has become a cornerstone of optimization theory for equality-constrained problems.[13] It applies to nonlinear optimization problems where the objective function and constraints are differentiable, enabling the identification of stationary points that satisfy necessary conditions for optimality.[21] Consider the equality-constrained optimization problem of minimizing a smooth objective function f(x) subject to equality constraints h_i(x) = 0 for i = 1, …, p, with p < n. The Lagrangian function is formed as L(x, λ) = f(x) + Σ_i λ_i h_i(x), where λ = (λ_1, …, λ_p) are the Lagrange multipliers. The method seeks stationary points of the Lagrangian, satisfying the stationarity conditions ∇_x L = 0 and ∇_λ L = 0, which translate to the coupled nonlinear system ∇f(x) + Σ_i λ_i ∇h_i(x) = 0, h_i(x) = 0 for i = 1, …, p. This system is solved simultaneously for x and λ, avoiding explicit elimination of variables as in direct substitution methods.[21] The derivation of these conditions can be obtained using the implicit function theorem applied to the constraint manifold. Near a regular point where the constraint gradients ∇h_i are linearly independent, the feasible set is locally a smooth (n − p)-dimensional manifold. Parameterizing this manifold implicitly and considering a Taylor expansion of f along feasible directions yields that at a local optimum x*, the gradient ∇f(x*) must lie in the span of the constraint normals ∇h_i(x*), leading to ∇f(x*) = −Σ_i λ_i ∇h_i(x*) for some multipliers λ_i. The signs in the Lagrangian convention account for minimization.[37] Geometrically, the method interprets the optimum as the point where the level sets of the objective function are tangent to the constraint surface, implying that ∇f is a linear combination of the ∇h_i.
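For a small hypothetical instance, the stationarity system can be written out and solved by hand; the sketch below verifies that the solution satisfies all three equations:

```python
# Hypothetical instance: minimize f(x, y) = x^2 + 2*y^2 subject to
# h(x, y) = x + y - 3 = 0, with L = f + lam * h. Stationarity:
#   dL/dx = 2x + lam = 0
#   dL/dy = 4y + lam = 0
#   dL/dlam = x + y - 3 = 0
# Eliminating x = -lam/2, y = -lam/4 in the constraint gives -3*lam/4 = 3.

lam = -4.0
x_star = -lam / 2   # 2.0
y_star = -lam / 4   # 1.0

# All three stationarity equations should vanish at the solution:
residuals = (2 * x_star + lam, 4 * y_star + lam, x_star + y_star - 3)
```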
The multipliers quantify the sensitivity of the optimal value to perturbations in the i-th constraint; specifically, λ_i represents the rate of change of the optimal objective value with respect to a relaxation of h_i(x) = 0 to h_i(x) = ε_i, often termed shadow prices in economic applications. Under suitable regularity, the envelope theorem confirms dF*/dε_i = λ_i at ε_i = 0, where F* is the optimal value function.[21] A key assumption for the existence and uniqueness of multipliers is the linear independence constraint qualification (LICQ), which requires that the gradients ∇h_i(x*) are linearly independent at the candidate optimum x*. This ensures the constraint Jacobian has full row rank, guaranteeing a unique λ and that x* is a regular point on the feasible set. Without LICQ, multiple multipliers may exist or the conditions may fail to hold.[38] To classify the stationary point as a local minimum, maximum, or saddle, second-order sufficient conditions involve the bordered Hessian of the Lagrangian, defined as the matrix [[0, ∇h(x*)], [∇h(x*)^T, ∇²_xx L(x*, λ*)]]. For a local minimum, the last n − p leading principal minors of this matrix must satisfy the appropriate sign conditions (all of sign (−1)^p for a minimum, alternating with the last of sign (−1)^n for a maximum). Equivalently, in the reduced Hessian approach, directions d in the null space of the constraint Jacobian (i.e., ∇h(x*) d = 0) must satisfy d^T ∇²_xx L(x*, λ*) d > 0. These conditions ensure positive definiteness on the tangent space to the feasible set.[39][21] In practice, the nonlinear system is solved iteratively, for example, using Newton's method on the KKT conditions, which linearizes to [[∇²_xx L, ∇h^T], [∇h, 0]] (Δx, Δλ)^T = −(∇_x L, h)^T with updates x ← x + Δx and λ ← λ + Δλ. Gradient-based methods on the augmented Lagrangian L_A(x, λ; ρ) = L(x, λ) + (ρ/2)‖h(x)‖² (with penalty parameter ρ) can also be employed, updating λ via λ ← λ + ρ h(x) after each unconstrained minimization of L_A. Convergence requires sufficiently large ρ and LICQ.[21] As an illustrative example, consider maximizing f(x, y) = xy subject to x + y = 10. The Lagrangian is L(x, y, λ) = xy − λ(x + y − 10). Stationarity gives ∇L = 0, so y = λ, x = λ, and x + y = 10, yielding x = 5, y = 5, λ = 5, with f = 25. The bordered Hessian confirms this as a maximum, as its determinant has the sign required for a maximum. Here, λ = 5 indicates that relaxing the constraint by ε increases the maximum by approximately 5ε.[21]
Solution Methods for Inequality Constraints
Karush-Kuhn-Tucker Conditions
The Karush-Kuhn-Tucker (KKT) conditions provide first-order necessary conditions for a solution to be optimal in nonlinear programming problems involving both equality and inequality constraints. These conditions generalize the method of Lagrange multipliers to handle inequality constraints by introducing complementary slackness and nonnegativity requirements on the multipliers. They were first derived by William Karush in his 1939 master's thesis at the University of Chicago, though this work remained unpublished and overlooked for decades. Independently, Harold W. Kuhn and Albert W. Tucker formalized and published the conditions in 1951, establishing them as a cornerstone of constrained optimization theory. The oversight of Karush's contribution was later recognized in the optimization community, leading to the full attribution as KKT conditions. Consider the general nonlinear programming problem of minimizing an objective function f(x) subject to equality constraints h_i(x) = 0 for i = 1, …, p and inequality constraints g_j(x) ≤ 0 for j = 1, …, m, where f, h_i, and g_j are continuously differentiable. The Lagrangian is defined as L(x, λ, μ) = f(x) + Σ_i λ_i h_i(x) + Σ_j μ_j g_j(x), extending the equality-only Lagrangian by incorporating nonnegative multipliers μ_j for the inequalities. The KKT conditions require that there exist multipliers λ ∈ ℝ^p and μ ∈ ℝ^m such that a point x* satisfies the full system:
- Stationarity: ∇_x L(x*, λ, μ) = 0, or equivalently, ∇f(x*) + Σ_i λ_i ∇h_i(x*) + Σ_j μ_j ∇g_j(x*) = 0.
- Primal feasibility: h_i(x*) = 0 for all i and g_j(x*) ≤ 0 for all j.
- Dual feasibility: μ_j ≥ 0 for all j.
- Complementary slackness: μ_j g_j(x*) = 0 for all j.
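The four conditions above can be checked mechanically at a candidate point; a sketch for a hypothetical one-dimensional problem (minimize f(x) = (x − 2)² subject to g(x) = x − 1 ≤ 0, no equality constraints):

```python
# Hypothetical problem: minimize f(x) = (x - 2)^2 subject to
# g(x) = x - 1 <= 0. Candidate optimum: x* = 1 with multiplier mu = 2,
# since stationarity f'(x*) + mu * g'(x*) = -2 + 2 = 0.

def f_prime(x):
    return 2 * (x - 2)

def g(x):
    return x - 1

def g_prime(x):
    return 1.0

x_star, mu = 1.0, 2.0

stationarity = f_prime(x_star) + mu * g_prime(x_star)   # should be 0
primal_feasible = g(x_star) <= 0
dual_feasible = mu >= 0
complementary_slackness = mu * g(x_star)                # should be 0
```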

