Quadratic Programming
Mathwrist White Paper Series
Abstract
This document presents a technical overview of the Quadratic Programming (QP) feature implemented in Mathwrist's C++ Numerical Programming Library (NPL). We use the active set method with an inertia control strategy to handle both convex and general QP problems.
1 Introduction
Quadratic programming (QP) solves the following minimization problem with a quadratic objective function subject to linear constraints.
$$\min_x \; c^T x + \tfrac{1}{2} x^T H x \quad \text{s.t.} \quad b_l \le A x \le b_u, \quad l \le x \le u \tag{1}$$
Compared to linear programming (LP), the QP objective function has an additional quadratic term with Hessian matrix $H$. If $H$ is positive definite, the problem is known as a convex QP. Otherwise, it is called a general QP problem.
Following the same notation used in our LP white paper, we shall refer to $c$ as the cost vector, to $b_l \le A x \le b_u$ as the general linear constraints and to $l \le x \le u$ as the simple bound (box) constraints respectively. It is possible that some of the linear constraints are equality constraints, in which case their lower and upper bounds are the same.
Our main references include [1] by Jorge Nocedal and Stephen J. Wright, and [2] and [3] by Gill et al. Particularly inspired by [2], we implemented a framework as the outline algorithm for all active-set based methods in NPL. Different optimization problems customize the search direction, the search step length and the optimality conditions, but they are largely common in terms of active-set operations. Please refer to our LP white paper for an overview of the active set method.
2 Equality Constrained QP
If all constraints are equality constraints $Ax = b$ and the Hessian matrix $H$ is positive definite, then from any given point $x$ we can reach the optimal point $x^* = x + p$ with a single step $p$. This step is the unique solution of the KKT system,
$$\begin{bmatrix} H & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} p \\ -\lambda \end{bmatrix} = \begin{bmatrix} -g \\ b - Ax \end{bmatrix},$$

where $g$ is the gradient of the objective, $g = c + Hx$.
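To make the KKT step concrete, the following sketch assembles and solves this system for a tiny two-variable example. The problem data is made up for illustration, and plain Python is used for brevity rather than NPL's C++ interface:

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(M)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for j in range(k, n + 1):
                a[r][j] -= f * a[k][j]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (a[k][n] - sum(a[k][j] * x[j] for j in range(k + 1, n))) / a[k][k]
    return x

# min (x1-1)^2 + (x2-2.5)^2  subject to  x1 + x2 = 1
H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]
A = [[1.0, 1.0]]
b = [1.0]

x = [0.0, 0.0]                       # any starting point, feasible or not
g = [c[i] + sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]

# Assemble [[H, A'], [A, 0]] [p; -lambda] = [-g; b - Ax] and solve for p.
K = [[H[0][0], H[0][1], A[0][0]],
     [H[1][0], H[1][1], A[0][1]],
     [A[0][0], A[0][1], 0.0]]
rhs = [-g[0], -g[1], b[0] - (A[0][0] * x[0] + A[0][1] * x[1])]
p = solve(K, rhs)[:2]
x_opt = [x[i] + p[i] for i in range(2)]
print(x_opt)                         # [-0.25, 1.25]
```

Starting from the infeasible point $(0, 0)$, a single step lands on the constrained minimizer $(-0.25, 1.25)$.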
Let $A^T$ have the following QR factorization,

$$A^T = Q \begin{bmatrix} R \\ 0 \end{bmatrix} = \begin{bmatrix} Y & Z \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix} = Y R.$$
Since $Q = [\,Y \; Z\,]$ is a full basis, we can write $p = Y p_Y + Z p_Z$ in terms of a range space component $p_Y$ and a null space component $p_Z$. Clearly, they individually are the solutions of
$$A Y\, p_Y = b - A x \tag{2}$$
$$(Z^T H Z)\, p_Z = -Z^T (g + H Y p_Y) \tag{3}$$
The matrix $Z^T H Z$ is called the reduced Hessian. We can interpret the component $Y p_Y$ as a feasibility step and the component $Z p_Z$ as an optimality step. In order to solve $p_Z$ from equation (3), we compute the Cholesky factorization of $Z^T H Z$.
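The same tiny example from Section 2 can be traced through the range/null-space split (2)-(3). With a single constraint, the QR factors of $A^T$ can be written down by hand; the data below is illustrative only, not NPL's API:

```python
# Range/null-space solve of min (x1-1)^2 + (x2-2.5)^2 s.t. x1 + x2 = 1.
import math

H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]
A = [1.0, 1.0]                      # single equality constraint A x = b
b = 1.0
x = [0.0, 0.0]
g = [c[i] + sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]

s = math.sqrt(2.0)
Y = [1.0 / s, 1.0 / s]              # range-space basis of A'
Z = [1.0 / s, -1.0 / s]             # null-space basis: A Z = 0

# (2) feasibility step: (A Y) pY = b - A x
pY = (b - (A[0] * x[0] + A[1] * x[1])) / (A[0] * Y[0] + A[1] * Y[1])

# (3) optimality step: (Z' H Z) pZ = -Z' (g + H Y pY)
HYpY = [sum(H[i][j] * Y[j] * pY for j in range(2)) for i in range(2)]
ZHZ = sum(Z[i] * H[i][j] * Z[j] for i in range(2) for j in range(2))
pZ = -sum(Z[i] * (g[i] + HYpY[i]) for i in range(2)) / ZHZ

p = [Y[i] * pY + Z[i] * pZ for i in range(2)]
x_opt = [x[i] + p[i] for i in range(2)]
print(x_opt)                        # approximately [-0.25, 1.25]
```

The feasibility step $Y p_Y$ moves onto the constraint, the optimality step $Z p_Z$ then minimizes within it, and the combined step reproduces the direct KKT solution.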
At $x^* = x + p$, the gradient is $g^* = c + H x^*$. If we want to make any further move $\Delta x$ yet retain feasibility with respect to all equality constraints, $\Delta x$ needs to be in the null space, $\Delta x = Z v$ for some $v$. The change of objective function introduced by $\Delta x$ is,

$$\Delta F = (g^*)^T Z v + \tfrac{1}{2} v^T Z^T H Z v.$$
Because $g^* = A^T \lambda$ and $\Delta x = Zv$ is in the null space of $A$, $(g^*)^T \Delta x = \lambda^T A Z v = 0$. The objective function change introduced by $\Delta x$ is just the quadratic term $\tfrac{1}{2} v^T Z^T H Z v$. For positive definite $H$, any move from $x^*$ causes the objective function to increase. This confirms $x^*$ is indeed the optimal point. In fact, we only need the reduced Hessian $Z^T H Z$ to be positive definite.
Finally, if $Z^T H Z$ is indefinite, we can first make a feasibility step move and then find a descent null space direction $Zv$ such that the quadratic term $\tfrac{1}{2} v^T Z^T H Z v$ is negative. Because there is no other constraint, we can move at infinite step length along $Zv$. The problem has an unbounded solution.
3 Convex QP
For convenience and without loss of generality, let's now consider a standard QP formulation with only lower bounded linear constraints,
$$\min_x \; c^T x + \tfrac{1}{2} x^T H x \quad \text{s.t.} \quad A x \ge b, \tag{4}$$
where the Hessian matrix $H$ is positive definite. Our QP active set implementation doesn't convert the general form (1) to this standard form (4). We will discuss how the general constraints and simple bounds are handled in a later section.
Let $\mathcal{E}$ denote the index set of linear equality constraints, $a_i^T x = b_i,\; i \in \mathcal{E}$. Assume $A_{\mathcal{E}}$ has full row rank. The number of equality constraints $|\mathcal{E}|$ is less than the number of unknowns $n$.
At the start of the $k$-th iteration, the method is given a feasible point $x_k$. It partitions all constraints into a working set $\mathcal{W}_k$ and a non-working set $\mathcal{N}_k$. The working index set $\mathcal{W}_k$ includes all equality constraints and those inequality constraints for which equality $a_i^T x_k = b_i$ holds exactly at the current point $x_k$. $\mathcal{N}_k$ includes all other constraints.
3.1 Sub QP
At each iteration, the active set method solves a sub QP problem subject to only the equality constraints $a_i^T (x_k + p) = b_i,\; i \in \mathcal{W}_k$. If the reduced Hessian $Z^T H Z$ is positive definite, we can reach the local optimum at a unit step $x_{k+1} = x_k + p$. This search step $p$ is the unique solution of the KKT system,
$$\begin{bmatrix} H & A_{\mathcal{W}}^T \\ A_{\mathcal{W}} & 0 \end{bmatrix} \begin{bmatrix} p \\ -\lambda \end{bmatrix} = \begin{bmatrix} -g_k \\ 0 \end{bmatrix} \tag{5}$$
Note that the second equation in the KKT system is now $A_{\mathcal{W}}\, p = 0$, which retains equality with respect to the working constraints. In other words, $p$ is a null space (optimality) step.
The KKT system (5) simultaneously solves the Lagrange multipliers $\lambda$ at $x_k + p$ for all working constraints. If an inequality constraint in the working set, $j \in \mathcal{W}_k$, has negative Lagrange multiplier $\lambda_j < 0$, we can remove constraint $j$ from $\mathcal{W}_k$ and start a new iteration. It can be shown that the search direction computed from the KKT system (5) using the updated working set is a feasible direction with respect to the deleted constraint $j$.
3.2 Objective Function Change
When we move along a null space direction $p = Zv$ for some $v$, the objective function change is

$$\Delta F = g^T Z v + \tfrac{1}{2} v^T Z^T H Z v.$$
The term $Z^T g$ is often called the reduced gradient, denoted as $g_Z$ here. The change is,
$$\Delta F = g_Z^T v + \tfrac{1}{2} v^T Z^T H Z v \tag{6}$$
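The identity (6) is exact for a quadratic objective and can be checked numerically. The data below is illustrative; $Z$ is the null space basis of a single constraint $x_1 + x_2 = 1$:

```python
# Verify: for a null-space move p = Z v, the objective change equals
# gZ'v + 0.5 v'(Z'H Z)v, where gZ = Z'g is the reduced gradient.
import math

H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]

def f(x):
    return sum(c[i] * x[i] for i in range(2)) + 0.5 * sum(
        x[i] * H[i][j] * x[j] for i in range(2) for j in range(2))

x = [0.5, 0.5]                                       # feasible for x1 + x2 = 1
Z = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]    # null space of A = [1, 1]
g = [c[i] + sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]

v = 0.3                                  # arbitrary null-space coordinate
gZ = Z[0] * g[0] + Z[1] * g[1]           # reduced gradient
ZHZ = sum(Z[i] * H[i][j] * Z[j] for i in range(2) for j in range(2))
predicted = gZ * v + 0.5 * v * ZHZ * v
actual = f([x[i] + Z[i] * v for i in range(2)]) - f(x)
print(abs(predicted - actual) < 1e-12)   # True
```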
3.3 Working Set Operations
After solving $p$ from the KKT system (5) and moving along this search direction, we could hit a non-working constraint before a unit step. Let $r \in \mathcal{N}_k$ be the first blocking constraint among all non-working constraints. Let $\alpha < 1$ be the step length that we have moved along $p$ to hit $r$. In this case we update $x_{k+1} = x_k + \alpha p$. At this point, equality $a_r^T x_{k+1} = b_r$ holds exactly. We therefore remove $r$ from $\mathcal{N}_k$ and add it to $\mathcal{W}_k$. With this updated feasible point and new working set, we start the next iteration.
On the other hand, if we are able to make a unit step move $x_{k+1} = x_k + p$, the local optimum is reached. We test the Lagrange multipliers for all inequality constraints $i \in \mathcal{W}_k \setminus \mathcal{E}$. The constraint with the most negative Lagrange multiplier is removed from the current working set and added to $\mathcal{N}_k$. We then start the next iteration with the updated working set.
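The working set operations of this section can be condensed into a small active-set loop for the standard form (4). The sketch below re-solves a dense KKT system at each iteration instead of maintaining QR and Cholesky factors, and the example problem (in the style of [1]) and tolerances are purely illustrative, not NPL's implementation:

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(M)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for j in range(k, n + 1):
                a[r][j] -= f * a[k][j]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (a[k][n] - sum(a[k][j] * x[j] for j in range(k + 1, n))) / a[k][k]
    return x

def kkt_step(H, g, rows):
    """Solve [[H, A'], [A, 0]] [p; mu] = [-g; 0]; return p and lambda = -mu."""
    n, m = len(g), len(rows)
    K = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            K[i][j] = H[i][j]
    for k, a in enumerate(rows):
        for j in range(n):
            K[j][n + k] = a[j]
            K[n + k][j] = a[j]
    y = solve(K, [-v for v in g] + [0.0] * m)
    return y[:n], [-v for v in y[n:]]

# min (x1-1)^2 + (x2-2.5)^2 subject to five constraints a_i'x >= b_i
H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]
A = [[1.0, -2.0], [-1.0, -2.0], [-1.0, 2.0], [1.0, 0.0], [0.0, 1.0]]
b = [-2.0, -6.0, -2.0, 0.0, 0.0]
x, W = [2.0, 0.0], [2, 4]            # feasible start; constraints 3, 5 active

for _ in range(20):
    g = [c[i] + sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]
    p, lam = kkt_step(H, g, [A[i] for i in W])
    if max(abs(v) for v in p) < 1e-9:
        if not W or min(lam) >= -1e-9:
            break                    # all multipliers nonnegative: optimal
        W.pop(lam.index(min(lam)))   # drop the most negative multiplier
        continue
    alpha, block = 1.0, None         # ratio test over non-working constraints
    for i in range(len(A)):
        if i in W:
            continue
        ap = sum(A[i][j] * p[j] for j in range(2))
        if ap < -1e-12:
            step = (b[i] - sum(A[i][j] * x[j] for j in range(2))) / ap
            if step < alpha:
                alpha, block = step, i
    x = [x[j] + alpha * p[j] for j in range(2)]
    if block is not None:
        W.append(block)              # blocking constraint joins the working set

print(x)                             # approximately [1.4, 1.7]
```

Each pass either takes a full step, takes a partial step and adds the blocking constraint, or, at a sub-problem optimum, deletes the constraint with the most negative multiplier until all multipliers are nonnegative.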
3.4 Upper Bounds and Mixed Constraints
The NPL QP active set solver directly works on the general formulation (1). At a local optimal point, the Lagrange multiplier requirement is $\lambda_i \ge 0$ for lower bounded constraints and $\lambda_i \le 0$ for upper bounded constraints. When we move along a descent feasible direction $p$, for an inequality constraint $i$, if $a_i^T p < 0$, we check whether the lower bound blocks the move. If $a_i^T p > 0$, we check whether its upper bound blocks. The working set not only tracks the constraint index, but also which bound is satisfied with equality.
NPL handles the mixture of general linear constraints and simple bounds in the same active set framework. When a variable is equal to one of its boundary values, it is considered a fixed variable. A variable that stays strictly inside its bounds is a free variable. Let $x_{FX}$ and $x_{FR}$ denote the fixed and free variables respectively. Let $A_{FX}$ and $A_{FR}$ be the working constraint coefficient sub-matrices corresponding to $x_{FX}$ and $x_{FR}$. The set of working constraints effectively is

$$\begin{bmatrix} I & 0 \\ A_{FX} & A_{FR} \end{bmatrix} \begin{bmatrix} x_{FX} \\ x_{FR} \end{bmatrix} = \begin{bmatrix} \beta \\ b_{\mathcal{W}} \end{bmatrix}, \tag{7}$$

where $\beta$ collects the active bound values of the fixed variables.
At each sub QP iteration, conceptually we have a KKT system in block structure
$$\begin{bmatrix} H_{FR} & A_{FR}^T \\ A_{FR} & 0 \end{bmatrix} \begin{bmatrix} p_{FR} \\ -\lambda \end{bmatrix} = \begin{bmatrix} -g_{FR} \\ 0 \end{bmatrix}, \qquad p_{FX} = 0, \tag{8}$$
where we have arranged the Hessian matrix and gradient into fixed-variable and free-variable blocks. The search direction is now only relevant to the free variables. At a local optimum, we now also need to test the Lagrange multipliers for the simple bound constraints on the fixed variables.
3.5 Null Space Direction and QR Update
From the KKT system (8), clearly $p_{FR}$ is a null space direction with respect to $A_{FR}$. We maintain a QR factorization of $A_{FR}^T$,

$$A_{FR}^T = Q \begin{bmatrix} R \\ 0 \end{bmatrix} = \begin{bmatrix} Y & Z \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix}.$$
Initially, a full QR decomposition is performed. Subsequently, at each iteration, both simple bounds and general constraints can join or leave the working set. The block structure in (8) therefore changes accordingly. It is cost prohibitive to recompute a full QR factorization whenever $A_{FR}$ changes. An efficient QR update scheme is implemented based on [2], section 5.2.4.
3.6 Cholesky Update of $Z^T H Z$
Because we use Cholesky factorization to solve equation (3), it is also cost prohibitive to refactorize $Z^T H Z$ whenever $Z$ is modified due to working set changes. We developed an efficient updating scheme to adapt to the working set changes and economically obtain the new Cholesky factor from the previous Cholesky factor.
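The details of NPL's updating scheme are beyond this paper, but the classic primitive behind such schemes is the rank-one Cholesky update: given $M = LL^T$, obtain the factor of $M + vv^T$ in $O(n^2)$ instead of refactorizing at $O(n^3)$ cost. The sketch below is a generic illustration of that primitive, not the NPL code:

```python
import math

def cholesky(M):
    """Plain lower-triangular Cholesky factor of a small SPD matrix."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

def rank_one_update(L, v):
    """Overwrite L so that L L' becomes L L' + v v' (O(n^2) work)."""
    v = v[:]
    n = len(v)
    for k in range(n):
        r = math.hypot(L[k][k], v[k])
        c, s = r / L[k][k], v[k] / L[k][k]
        L[k][k] = r
        for j in range(k + 1, n):
            L[j][k] = (L[j][k] + s * v[j]) / c
            v[j] = c * v[j] - s * L[j][k]

M = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
v = [1.0, -1.0, 2.0]
L = cholesky(M)
rank_one_update(L, v)

# Compare against a fresh factorization of M + v v'.
M2 = [[M[i][j] + v[i] * v[j] for j in range(3)] for i in range(3)]
L2 = cholesky(M2)
err = max(abs(L[i][j] - L2[i][j]) for i in range(3) for j in range(3))
print(err < 1e-12)                   # True
```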
4 General QP
In this section, we again assume the standard QP formulation (4) for ease of discussion.
First of all, given a general Hessian matrix $H$, the reduced Hessian $Z^T H Z$ could still be positive definite at an iteration, in which case the active set method continues exactly the same as when it solves convex QP problems.
Secondly, if the current iterate $x_k$ is a stationary point with respect to the working set and $Z^T H Z$ is positive semi-definite, $x_k$ is then at a local optimal point. We cannot reduce the objective function value unless the working set changes. The active set method proceeds to test the Lagrange multiplier conditions, again the same as for convex QP problems.
Next, we consider the situations when $Z^T H Z$ is positive semi-definite but $x_k$ is not a stationary point, and when $Z^T H Z$ is indefinite. In both cases, we just need to find a descent feasible direction $p$ with respect to the working set. By moving along this direction, if there is no blocking constraint, the problem is unbounded. Otherwise, the blocking constraint is added to the working set. $Z$ changes accordingly. We then start a new iteration and test the definiteness of the new $Z^T H Z$.
4.1 Inertia Control
Based on [3], we implemented the inertia control strategy in combination with an extended KKT system. In summary, inertia control ensures that $Z^T H Z$ has at most one negative eigenvalue at any point in time. At a local optimal point, when a constraint $s$ is to be deleted from the working set, inertia control identifies whether the deletion operation would cause $Z^T H Z$ to become singular or indefinite. If so, $s$ is marked as deletion-pending but still kept in the extended KKT system. A zero curvature or negative curvature direction is then computed accordingly.
When we move along $p$ and hit a blocking constraint $r$, if adding $r$ to the working set lets the iteration reach a vertex point, then clearly that is a local optimal point. We can remove $s$ from the extended KKT system. If it is not a vertex, by testing the independency of $a_r$ with respect to $a_s$ and the working constraints, inertia control can identify whether such a constraint addition restores positive definiteness of $Z^T H Z$. If so, again $s$ can be physically removed from the extended KKT system. Otherwise, the algorithm keeps searching along zero or negative curvature directions and adding blocking constraints. Ultimately, either $Z^T H Z$ becomes positive definite or the current working set forms a vertex point.
4.2 Positive Semi-definite and Non-stationary
When $x_k$ is not a stationary point, a null space direction $p$ that satisfies the following linear system is a zero curvature direction. Hence, the quadratic term in (6) vanishes.
$$\begin{bmatrix} H & A_{\mathcal{W}}^T \\ A_{\mathcal{W}} & 0 \end{bmatrix} \begin{bmatrix} p \\ \mu \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad p \ne 0 \tag{9}$$
Then we just need to flip the sign of $p$ if necessary so that the first term in equation (6) is negative.
4.3 Indefinite and Non-stationary
When $Z^T H Z$ is indefinite at a non-stationary point, it is possible to find a negative curvature direction $p$ such that the quadratic term in (6) is negative. Note that we can again flip the sign of $p$ if necessary to make the first term in (6) negative as well. Such a direction can be solved from the linear system
$$\begin{bmatrix} H & A_{\mathcal{W}}^T \\ A_{\mathcal{W}} & 0 \end{bmatrix} \begin{bmatrix} p \\ \mu \end{bmatrix} = \begin{bmatrix} -g \\ 0 \end{bmatrix} \tag{10}$$
4.4 Indefinite and Stationary
If $Z^T H Z$ is indefinite but $x_k$ is a stationary point with respect to the current working set, the gradient $g$ is necessarily a linear combination of the working constraint gradients, $g = A_{\mathcal{W}}^T \lambda$. Multiplying $Z^T$ to the right-hand side of the first equation in (10) gives $-Z^T A_{\mathcal{W}}^T \lambda = 0$. So we cannot use (10) to find a negative curvature direction.
It can be shown that, by following an inertia control strategy [3], the only possibility of having an indefinite $Z^T H Z$ at a stationary point is by deleting a constraint $s$ with $\lambda_s < 0$ at the most recent local optimal point. This observation allows us to find a negative curvature direction by solving,
$$\begin{bmatrix} H & A_{\mathcal{W}}^T \\ A_{\mathcal{W}} & 0 \end{bmatrix} \begin{bmatrix} p \\ \mu \end{bmatrix} = \begin{bmatrix} a_s \\ 0 \end{bmatrix}, \tag{11}$$

where $a_s$ is the gradient of the deleted constraint.
4.5 Initial Working Set
Our active set framework always uses the "crash start" procedure [2] to find an initial feasible point $x_0$ and an initial working set $\mathcal{W}_0$. Then a full QR factorization is performed to obtain the range space $Y$ and null space $Z$. Please refer to our LP white paper for details of this procedure.
For QP problems, the inertia control strategy requires the initial reduced Hessian $Z^T H Z$ to be positive definite. If that is the case, as it surely is for convex QP, we carry out an initial Cholesky factorization of $Z^T H Z$. The active set method is then ready to start its first iteration.
Given a general Hessian $H$, it is possible that this initial $Z^T H Z$ is indefinite. We then perform a partial Cholesky factorization [5],

$$P\,(Z^T H Z)\,P^T = \begin{bmatrix} L_{11} & 0 \\ L_{21} & I \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & S \end{bmatrix} \begin{bmatrix} L_{11}^T & L_{21}^T \\ 0 & I \end{bmatrix},$$

where the permutation $P$ effectively arranges the null space basis as $Z = [\,Z_1 \; Z_2\,]$ such that $Z_1^T H Z_1$ is positive definite. All columns of $Z_2$ are treated as artificial constraints and added to the initial working set. The active set method then starts the first iteration. At a local optimum, these artificial constraints are deleted before any real constraint.
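A partial Cholesky factorization with diagonal pivoting can be sketched as follows: pivot on the largest remaining diagonal entry and stop once no acceptable positive pivot remains. The returned rank tells how large a positive definite leading block was found; the matrix and tolerance below are illustrative, not the NPL interface:

```python
import math

def partial_cholesky(H, tol=1e-12):
    """Factor as much of H as stays positive definite, with diagonal pivoting.

    Returns the working array (lower triangle holds the partial factor),
    the permutation applied, and the rank r of the positive definite block.
    """
    n = len(H)
    A = [row[:] for row in H]
    perm = list(range(n))
    r = 0
    for k in range(n):
        p = max(range(k, n), key=lambda i: A[i][i])
        if A[p][p] <= tol:
            break                        # no positive pivot left
        # symmetric row/column interchange k <-> p
        A[k], A[p] = A[p], A[k]
        for row in A:
            row[k], row[p] = row[p], row[k]
        perm[k], perm[p] = perm[p], perm[k]
        A[k][k] = math.sqrt(A[k][k])
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]
        for i in range(k + 1, n):        # update trailing Schur complement
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[j][k]
        r = k + 1
    return A, perm, r

# An indefinite matrix with one direction of negative curvature.
H = [[1.0, 0.5, 0.0], [0.5, 2.0, 0.0], [0.0, 0.0, -3.0]]
A, perm, r = partial_cholesky(H)
print(r, perm)                           # 2 [1, 0, 2]
```

Here the factorization stops after a 2x2 positive definite block; in the QP context, the remaining direction would be guarded by an artificial constraint.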
5 Termination Control
The active set method has a control parameter for the maximum number of iterations allowed; the algorithm always terminates once this limit is reached. Further, there is an exception-throw control. Once enabled, the algorithm throws an exception when the iteration limit is hit. If disabled, it just reports the "half-way" solution found so far.
At a local optimum, we test the Lagrange multipliers with a "convergence tolerance" parameter $\epsilon$. If all upper bounded constraints have Lagrange multipliers $\lambda_i \le \epsilon$ and all lower bounded constraints have Lagrange multipliers $\lambda_i \ge -\epsilon$, the Lagrange multiplier condition is considered satisfied. The algorithm then terminates.
The inertia control strategy internally performs various numerical tests to determine whether the stationary condition is satisfied. The numerical tolerances used by those tests are critical to the correctness of the algorithm, hence are not exposed as user controls. However, we do provide a user-interface-level "stationary tolerance" that can be used as termination control. If the norm of the search direction computed from (5), (9), (10) or (11) falls below this tolerance, then we consider a stationary point reached with respect to the current working set.
References
- [1] Jorge Nocedal and Stephen J. Wright: Numerical Optimization, Springer, 1999
- [2] Philip E. Gill, Walter Murray and Margaret H. Wright: Practical Optimization, Academic Press, 1981
- [3] Philip E. Gill, Walter Murray, Michael A. Saunders and Margaret H. Wright: Inertia-Controlling Methods For General Quadratic Programming, SIAM Review, Volume 33, Number 1, 1991, pages 1-36.
- [4] Philip E. Gill, Walter Murray, Michael A. Saunders and Margaret H. Wright: Procedures for Optimization Problems with a Mixture of Bounds and General Linear Constraints, ACM Transactions on Mathematical Software, Vol.10, No. 3, September 1984, Pages 282-298
- [5] Anders Forsgren, Philip E. Gill and Walter Murray: Computing Modified Newton Directions Using a Partial Cholesky Factorization, SIAM J. Sci. Comput., Vol. 16, No. 1, pp. 139-150