Linear Programming
Mathwrist White Paper Series

Copyright © Mathwrist LLC 2023
(January 20, 2023)
Abstract

This document presents a technical overview of the Linear Programming (LP) feature in Mathwrist's C++ Numerical Programming Library (NPL). Two algorithms are implemented to solve LP problems. The first is the classic Simplex method; the second is an active-set method specialized to a linear objective function.

1 Introduction

In canonical form, linear programming (LP) solves the following minimization problem with a linear objective function subject to linear constraints,

$$\min_{\mathbf{x}\in\mathbb{R}^n} \mathbf{c}^T\mathbf{x}, \text{ s.t. } \mathbf{A}\mathbf{x}=\mathbf{b} \text{ and } \mathbf{x}\ge 0 \qquad (1)$$

To better support practical use, we actually formulate LP problems in a more general form,

$$\min_{\mathbf{x}\in\mathbb{R}^n} \mathbf{c}^T\mathbf{x}, \text{ s.t. } \mathbf{b}_l\le\mathbf{A}\mathbf{x}\le\mathbf{b}_u \text{ and } \mathbf{l}\le\mathbf{x}\le\mathbf{u} \qquad (2)$$

We shall refer to $\mathbf{c}$ as the cost vector, to $\mathbf{b}_l\le\mathbf{A}\mathbf{x}\le\mathbf{b}_u$ as the general linear constraints and to $\mathbf{l}\le\mathbf{x}\le\mathbf{u}$ as the simple bound (box) constraints. Both the Simplex method and the active-set method take this general LP formulation at the user interface level. Internally, the Simplex method needs to convert this general form to the canonical form.

In the general form, let $\mathcal{E}$ denote the index subset of general constraints that require equality $\mathbf{A}_\mathcal{E}\mathbf{x}=\mathbf{b}_\mathcal{E}$ to hold. Assume $\mathbf{A}_\mathcal{E}$ has full row rank. Clearly, the number of equality constraints should be less than $n$; otherwise the solution is purely determined by $\mathbf{A}_\mathcal{E}$.

Our main references for the Simplex method are [1] by David G. Luenberger and [2] by Jorge Nocedal and Stephen J. Wright. Inspired by [3] by Gill et al., we implemented an active-set framework that can be customized to handle LP and many other problems.

2 Simplex Method

The classic Simplex algorithm is derived from the canonical LP formulation (1), in which all general constraints are equality constraints. We shall assume $\mathbf{A}$ has full row rank. Let $m$ be the number of linear equality constraints and $n$ the number of unknowns. Again, $m<n$; otherwise we either don't have a feasible solution, or the solution is just determined by the linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$.

Next, let $\mathcal{B}\subset\{1,2,\cdots,n\}$ be an index set with exactly $m$ elements, and let $\mathcal{N}$ denote its complement. An unknown variable $x_i$ with index $i\in\mathcal{B}$ or $i\in\mathcal{N}$ is called a basic variable or a non-basic variable respectively. We can shuffle the order of the unknown variables and rewrite the linear equality constraints as

$$\begin{pmatrix}\mathbf{B} & \mathbf{N}\end{pmatrix}\begin{pmatrix}\mathbf{x}_B\\ \mathbf{x}_N\end{pmatrix}=\mathbf{b},$$

where $\mathbf{x}_B$ and $\mathbf{x}_N$ are the basic and non-basic variables, and $\mathbf{B}$ and $\mathbf{N}$ are the constraint coefficient submatrices for $\mathbf{x}_B$ and $\mathbf{x}_N$ respectively. If we can find an index set $\mathcal{B}$ such that $\mathbf{B}$ is nonsingular and $\mathbf{x}_B=\mathbf{B}^{-1}\mathbf{b}\ge 0$, we can then fix $\mathbf{x}_N=0$ to obtain a basic feasible solution.
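To make the basic/non-basic partition concrete, here is a small illustrative sketch in Python (NPL itself is a C++ library; the tiny LP data and the `solve` helper below are made up for illustration, not part of NPL). It picks the basis $\mathcal{B}=\{x_1, s_1\}$ for a two-constraint canonical problem, solves $\mathbf{B}\mathbf{x}_B=\mathbf{b}$ and checks nonnegativity:

```python
# Canonical-form data A x = b, x >= 0 (made-up example), variables (x1, x2, s1, s2):
#   x1 + x2 + s1      = 4
#   x1           + s2 = 2
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
b = [4.0, 2.0]

def solve(M, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(M)
    T = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(T[i][k]))
        T[k], T[piv] = T[piv], T[k]
        for i in range(k + 1, n):
            f = T[i][k] / T[k][k]
            for j in range(k, n + 1):
                T[i][j] -= f * T[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))) / T[i][i]
    return x

def basic_feasible_point(A, b, basis):
    """Solve B x_B = b on the basis columns; feasible iff x_B >= 0."""
    B = [[row[j] for j in basis] for row in A]
    xB = solve(B, b)
    return xB, all(v >= 0.0 for v in xB)

xB, feasible = basic_feasible_point(A, b, [0, 2])   # basis {x1, s1}
# xB = [2.0, 2.0]: fixing x2 = s2 = 0 gives the basic feasible point (2, 0, 2, 0).
```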

The Fundamental Theorem of Linear Programming, i.e. [2] section 13.2, ensures that if the LP problem (1) has solutions, at least one of them is a basic point. Given an initial basic feasible point $\mathbf{x}_0$ and the associated index set, the Simplex method iteratively searches in the space of basic feasible points. At each iteration, it tests whether the optimality conditions are satisfied. If not, a non-basic index $q\in\mathcal{N}$ is identified as the "enter index" and a basic index $p\in\mathcal{B}$ as the "quit index". By increasing variable $x_q$ from zero and decreasing variable $x_p$ to 0, the objective function $\mathbf{c}^T\mathbf{x}$ decreases while maintaining all feasibility conditions, hence obtaining an improved basic feasible point.

2.1 Optimality Conditions

At the $k$-th iteration, let $(\mathbf{x}_k,\mathbf{B}_k,\mathbf{N}_k)$ denote the basic feasible point, the constraint coefficient matrix for the basic variables $\mathbf{x}_B$ and the constraint coefficient matrix for the non-basic variables $\mathbf{x}_N$. Let $\mathbf{c}_B$ and $\mathbf{c}_N$ be the cost vector entries relevant to $\mathbf{x}_B$ and $\mathbf{x}_N$. The following conditions must hold for $\mathbf{x}_k$ to be an optimal solution,

$$\mathbf{B}_k^T\lambda_B = \mathbf{c}_B$$
$$\mathbf{N}_k^T\lambda_B+\lambda_N = \mathbf{c}_N$$
$$\lambda_N \ge 0,$$

where $\lambda_B$ is the vector of Lagrange multipliers for the strict equality conditions $\mathbf{B}_k\mathbf{x}_B=\mathbf{b}$, and $\lambda_N$ is the vector of Lagrange multipliers for the non-negativity conditions $\mathbf{x}_N\ge 0$, where $\mathbf{x}_N$ is currently fixed exactly at 0.

We first solve an $m\times m$ linear system to obtain

$$\lambda_B=\mathbf{B}_k^{-T}\mathbf{c}_B,$$

and then

$$\lambda_N=\mathbf{c}_N-\mathbf{N}_k^T\lambda_B.$$

If all elements in $\lambda_N$ are nonnegative, the Simplex algorithm terminates and returns $\mathbf{x}_k$ as the solution.
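Continuing the same made-up example with cost vector $\mathbf{c}=(-3,-2,0,0)$ (again an illustrative Python sketch, not NPL code), the two solves look like:

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(M)
    T = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(T[i][k]))
        T[k], T[piv] = T[piv], T[k]
        for i in range(k + 1, n):
            f = T[i][k] / T[k][k]
            for j in range(k, n + 1):
                T[i][j] -= f * T[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))) / T[i][i]
    return x

def transpose(M):
    return [list(col) for col in zip(*M)]

# Made-up canonical LP over (x1, x2, s1, s2).
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
c = [-3.0, -2.0, 0.0, 0.0]
basis, nonbasis = [0, 2], [1, 3]        # basic: x1, s1; non-basic: x2, s2

B = [[row[j] for j in basis] for row in A]
N = [[row[j] for j in nonbasis] for row in A]
cB = [c[j] for j in basis]
cN = [c[j] for j in nonbasis]

lamB = solve(transpose(B), cB)          # solve B^T lambda_B = c_B
NT = transpose(N)
lamN = [cN[k] - sum(NT[k][i] * lamB[i] for i in range(len(lamB)))
        for k in range(len(cN))]        # lambda_N = c_N - N^T lambda_B

k = min(range(len(lamN)), key=lambda i: lamN[i])
enter = nonbasis[k] if lamN[k] < 0 else None
# lamN = [-2.0, 3.0]: x2 (index 1) has a negative multiplier and enters.
```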

2.2 Pivoting and LU Update

When ๐ฑk does not satisfy the optimality conditions, we have ฮปq<0 for some qโˆˆ๐’ฉ. Let ๐€q be the q-th column of matrix ๐€, which is the constraint coefficient vector for variable xq. It can be shown that there exists a feasible descent direction ๐ซ,

๐ซ=-๐k-1โข๐€q

, such that by moving along this direction,

  1. $x_q$ becomes positive,
  2. all other non-basic variables remain fixed at 0,
  3. the equality condition $\mathbf{A}\mathbf{x}=\mathbf{b}$ is retained, and
  4. the objective function value $\mathbf{c}^T\mathbf{x}$ decreases.

Also, by moving along $\mathbf{r}$, if all basic variables $\mathbf{x}_B$ increase in value, we declare the given LP problem unbounded. Otherwise, $\exists p\in\mathcal{B}$ such that the basic variable $x_p$ hits zero before any other basic variables. Now we have an "enter index" $q$ to join the basic index set $\mathcal{B}$ and a "quit index" $p$ to join the non-basic index set $\mathcal{N}$. This is the pivoting operation. By the end of it, we have obtained a new basic feasible point $(\mathbf{x}_{k+1},\mathbf{B}_{k+1},\mathbf{N}_{k+1})$. One can show that $\mathbf{B}_{k+1}$ obtained by the pivoting operation also has full rank.
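For the same made-up data (illustrative Python, not NPL's interface), the ratio test that determines the quit index can be sketched as follows:

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(M)
    T = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(T[i][k]))
        T[k], T[piv] = T[piv], T[k]
        for i in range(k + 1, n):
            f = T[i][k] / T[k][k]
            for j in range(k, n + 1):
                T[i][j] -= f * T[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))) / T[i][i]
    return x

# Made-up canonical LP over (x1, x2, s1, s2).
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
b = [4.0, 2.0]
basis = [0, 2]                      # x1, s1 with x_B = (2, 2)
q = 1                               # enter index from a negative multiplier

B = [[row[j] for j in basis] for row in A]
xB = solve(B, b)
Aq = [row[q] for row in A]
r = [-v for v in solve(B, Aq)]      # feasible descent direction r = -B^{-1} A_q

def ratio_test(xB, r):
    """Largest step before some basic variable hits 0; (None, None) signals
    an unbounded problem along r."""
    steps = [(-xB[i] / r[i], i) for i in range(len(r)) if r[i] < 0]
    return min(steps) if steps else (None, None)

alpha, p_pos = ratio_test(xB, r)
# alpha = 2.0 and p_pos = 1: s1 (basis position 1) drops to zero and quits.
```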

A crucial part of a Simplex implementation is to efficiently solve the linear system $\mathbf{B}\mathbf{y}=\mathbf{z}$ for some right-hand side vector $\mathbf{z}$. At the first iteration, we have an LU decomposition $\mathbf{B}_0=\mathbf{L}_0\mathbf{U}_0$. Subsequently, $\mathbf{B}_{k+1}$ is obtained by swapping columns $\mathbf{A}_p$ and $\mathbf{A}_q$ for quit and enter indices $p$ and $q$. The LU decomposition of $\mathbf{B}_{k+1}$ can be economically obtained by updating $\mathbf{L}_k$ and $\mathbf{U}_k$. Our implementation is based on the LU update algorithm [4] by J.J. Forrest and J.A. Tomlin.

2.3 Canonical Form Conversion

Given an LP problem described in the general form (2), we need to convert it to the canonical form (1). First of all, for an unknown variable $x_i$ with simple bound constraint $l_i\le x_i\le u_i$, we can convert the bounds to two equality constraints $x_i-\bar{s}_i=l_i$ and $x_i+\hat{s}_i=u_i$ by introducing slack and surplus variables $\bar{s}_i\ge 0$ and $\hat{s}_i\ge 0$. Then $x_i$ becomes a free variable. All free variables $x_i$ can be further decomposed into a positive part $x_i^+\ge 0$ and a negative part $x_i^-\ge 0$ as $x_i=x_i^+-x_i^-$. This approach effectively makes the augmented unknown space very large.

To implement an efficient algorithm, NPL actually only introduces slack and surplus variables for general inequality constraints. For unknowns with simple bound constraints, we construct the index sets $\mathcal{B}$ and $\mathcal{N}$ such that the basic variables $\mathbf{x}_B$ stay in their feasible region and the non-basic variables $\mathbf{x}_N$ are fixed at one of their bounds. When we test the optimality conditions, we need to flip the sign of a Lagrange multiplier $\lambda_i\in\lambda_N$ if the non-basic variable $x_i$ is fixed at its upper bound.

All general inequality constraints are split into a set of upper-bounded constraints $\mathbf{A}_u\mathbf{x}\le\mathbf{b}_u$ and a set of lower-bounded constraints $\mathbf{A}_l\mathbf{x}\ge\mathbf{b}_l$. By introducing slack and surplus variables $\bar{\mathbf{s}}\ge 0$ and $\hat{\mathbf{s}}\ge 0$, we construct a new set of linear equality constraints with respect to the augmented unknowns $(\mathbf{x},\bar{\mathbf{s}},\hat{\mathbf{s}})$,

(๐€โ„ฐ๐€u๐ˆ๐€l-๐ˆ)=(๐ฑ๐ฌยฏ๐ฌ^)โข(๐›โ„ฐ๐›u๐›l)

We shall refer to this new set of linear constraints as $\mathbf{A}_{aug}\mathbf{x}_{aug}=\mathbf{b}$. Note that by this construction, we have more unknowns than linear equality constraints. The constraint specification is then converted to the canonical form (1).
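The block structure above can be assembled mechanically. The sketch below (illustrative Python with made-up two-variable data; NPL's internal storage is of course different) builds the dense augmented matrix:

```python
def assemble_augmented(AE, Au, Al):
    """Build the block matrix [[A_E, 0, 0], [A_u, I, 0], [A_l, 0, -I]]
    as dense row lists, one slack per upper row, one surplus per lower row."""
    mu, ml = len(Au), len(Al)
    rows = []
    for r in AE:
        rows.append(r + [0.0] * (mu + ml))
    for i, r in enumerate(Au):
        slack = [1.0 if j == i else 0.0 for j in range(mu)]
        rows.append(r + slack + [0.0] * ml)
    for i, r in enumerate(Al):
        surplus = [-1.0 if j == i else 0.0 for j in range(ml)]
        rows.append(r + [0.0] * mu + surplus)
    return rows

# Hypothetical data: one equality, one upper-bounded and one lower-bounded row.
Aaug = assemble_augmented(AE=[[1.0, 1.0]],
                          Au=[[2.0, 0.0]],
                          Al=[[0.0, 1.0]])
# Aaug = [[1, 1, 0, 0],
#         [2, 0, 1, 0],
#         [0, 1, 0, -1]]
```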

2.4 Initial Basic Feasible Point

To obtain an initial basic feasible point $(\mathbf{x}_0,\mathbf{B}_0,\mathbf{N}_0)$, we first enforce feasibility of the simple bound constraints. If an unknown variable provided by the initial guess is outside of its simple bounds, we fix it at the nearest boundary. Secondly, we add artificial variables $\mathbf{z}\ge 0$ so that equality strictly holds for all linear constraints.

The initial basic index set contains all artificial variables $\mathbf{z}$ and some of $\bar{\mathbf{s}}$ and $\hat{\mathbf{s}}$. This gives us a diagonal matrix $\mathbf{B}_0$ with diagonal elements being either 1 or -1. Lastly, we solve the following phase-I LP feasibility subproblem,

min๐ฑ,๐ฌยฏ,๐ฌ^,๐ณโก๐žTโข๐ณ,ย s.t. ๐€aโขuโขgโข๐ฑaโขuโขgยฑ๐ณ=๐›
๐ฅโ‰ค๐ฑโ‰ค๐ฎ,
๐ฌยฏโ‰ฅ0,
๐ฌ^โ‰ฅ0,
๐ณโ‰ฅ0,
๐ฑaโขuโขg=(๐ฑ,๐ฌยฏ,๐ฌ^)

Clearly, when the Simplex method terminates at the end of this phase-I feasibility problem, we either declare that the original LP problem has no feasible solution, if any artificial variable in $\mathbf{z}$ is still positive, or have obtained an initial basic feasible point.
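One simple way to realize the $\pm$ signs above is to set each artificial variable to the absolute residual of its row and record the residual's sign. A sketch (illustrative Python with made-up data; NPL's internal bookkeeping may differ):

```python
def artificials(Aaug, b, x0):
    """Choose z_i = |residual_i| and sign sigma_i so that
    (A x0)_i + sigma_i * z_i = b_i holds with z_i >= 0."""
    resid = [bi - sum(a * x for a, x in zip(row, x0))
             for row, bi in zip(Aaug, b)]
    z = [abs(r) for r in resid]
    sigma = [1.0 if r >= 0 else -1.0 for r in resid]
    return z, sigma

# Hypothetical augmented constraints and an infeasible starting guess.
Aaug = [[1.0, 1.0], [1.0, -1.0]]
b = [3.0, -2.0]
x0 = [0.0, 0.0]
z, sigma = artificials(Aaug, b, x0)
# Residuals are (3, -2), so z = (3, 2) and sigma = (+1, -1); the initial
# basis matrix B0 = diag(sigma) is diagonal with entries +/-1 as described.
phase1_cost = sum(z)    # e^T z; zero iff x0 is already feasible
```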

2.5 Selection of the Enter Index q

When we test the optimality conditions, there are likely multiple negative Lagrange multipliers in $\lambda_N$. There are different strategies in practice to choose the "enter index" $q$ from those negative elements. We use the "Steepest Edge" method outlined in [2], section 13.5. This strategy has the advantage of finding a search direction $\mathbf{r}$ that tends to produce large step moves.

3 Active-set Method

Active-set based methods can work directly on the general form (2) of an LP problem. Similar to the Simplex method, an active-set method also generates a sequence of feasible points. At each iteration, it partitions all linear constraints into two sets. One set is the "working set", which includes the set $\mathcal{E}$ of equality constraints and those inequality constraints that hold with equality at the current feasible point $\mathbf{x}$. The other set is the non-working set. Both working and non-working sets are feasible at $\mathbf{x}$.

Let ๐’ฒ and ๐’ฉ denote the index set of working and non-working general constraints respectively. The size of ๐’ฒ never exceeds n. Let ๐€๐’ฒ and ๐€๐’ฉ be the corresponding working and non-working constraint coefficient matrix. At each iteration, the method computes a null space direction ๐ฉ wrt ๐€๐’ฒ such that moving along this direction does not change the value of ๐€๐’ฒ. Hence we shall refer ๐ฉ as a feasible direction.

By moving along a descent feasible direction $\mathbf{p}$, the objective function value decreases. Non-working constraints will also change values, unless they are linearly dependent on $\mathbf{A}_\mathcal{W}$. If constraint $\mathbf{A}_i$, $i\in\mathcal{N}$, hits its bound along $\mathbf{p}$ before any other non-working constraint, strict equality holds for constraint $\mathbf{A}_i$ at its bound. It then becomes a working constraint and moves from $\mathcal{N}$ to $\mathcal{W}$. Whenever the working set $\mathcal{W}$ has exactly $n$ constraints, a vertex point is reached, which is a stationary point.

On the other hand, if we can move an infinite step length along $\mathbf{p}$ without hitting any non-working constraint, we declare that the LP problem has an unbounded solution. This could happen when the rows of $\mathbf{A}_\mathcal{N}$ are just linear combinations of $\mathbf{A}_\mathcal{W}$, in other words, when there is redundancy in the constraint specification.

At a vertex point, optimality conditions are tested on all inequality working constraints $\{i, i\in\mathcal{W}-\mathcal{E}\}$. If the Lagrange multiplier condition for working constraint $\mathbf{A}_i$ is not satisfied, we can move along a negative Lagrange direction such that $\mathbf{A}_i$ leaves its bound and becomes strictly feasible. We also move index $i$ from $\mathcal{W}$ to $\mathcal{N}$. It can be shown that such a direction is a descent direction. At each iteration, the algorithm repeatedly updates $\mathcal{W}$ and moves along a descent feasible direction $\mathbf{p}$ until all optimality conditions are satisfied.
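The step to the nearest blocking non-working constraint is again a ratio test. A sketch with made-up lower-bounded constraints $\mathbf{a}_i^T\mathbf{x}\ge b_i$ (illustrative Python, not NPL code):

```python
def max_step(x, p, rows, rhs):
    """Largest alpha with a_i.(x + alpha p) >= b_i for every non-working row.

    Returns (alpha, blocking_row); (None, None) signals an unbounded ray."""
    best = None
    for i, (a, bi) in enumerate(zip(rows, rhs)):
        ap = sum(aj * pj for aj, pj in zip(a, p))
        if ap < 0:                  # constraint value decreases along p
            slack = sum(aj * xj for aj, xj in zip(a, x)) - bi
            alpha = slack / (-ap)
            if best is None or alpha < best[0]:
                best = (alpha, i)
    return best if best else (None, None)

# Hypothetical non-working set: x1 <= 2 and x1 + x2 <= 3, written in
# lower-bounded form as -x1 >= -2 and -x1 - x2 >= -3.
rows = [[-1.0, 0.0], [-1.0, -1.0]]
rhs = [-2.0, -3.0]
alpha, blocking = max_step([0.0, 0.0], [1.0, 1.0], rows, rhs)
# alpha = 1.5: constraint x1 + x2 <= 3 blocks first and joins the working set.
```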

3.1 Simple Bound Constraints

The overall strategy of an active-set method is to maintain a set of working constraints and search in a direction that retains the values of these working constraints. The same principle applies to simple bound constraints as well. Without loss of generality, assume all simple bounds are lower-bounded constraints $\mathbf{x}\ge\mathbf{l}$. At each iteration, if a variable is equal to its lower bound, it is considered a fixed variable. A variable that stays strictly inside its feasible region is considered a free variable.

Let ๐ฑb and ๐ฑf denote the set of fixed and free variables respectively. We could treat the simple bounds for ๐ฑb as general inequality constraints ๐ˆ๐ฑbโ‰ฅ๐ฅb with equality exactly being held currently. Therefore we can include as many fixed variables as possible in the current working set and arrange the working constraints in the following block structure,

(๐ˆ๐๐’ฒ๐…๐’ฒ)โข(๐ฑb๐ฑf)=(๐ฅb๐›๐’ฒ)

, where the coefficient matrix ๐€๐’ฒ of general working constraints is partitioned to ๐๐’ฒ and ๐…๐’ฒ for fixed and free variables accordingly. The search direction is a change to those free variables. We just need bookkeep when a free variable hits its bounds and becomes a fixed variable, and when a fixed variable is removed from the working set and becomes free.

3.2 Null Space Direction

Assume there are $m$ general constraints in a working set $\mathcal{W}$. Let $\mathbf{F}_\mathcal{W}^T$ have a full QR factorization,

$$\mathbf{F}_\mathcal{W}^T=\underbrace{\begin{pmatrix}\mathbf{Y} & \mathbf{Z}\end{pmatrix}}_{\mathbf{Q}}\begin{pmatrix}\mathbf{R}\\ 0\end{pmatrix}$$

The first $m$ columns of $\mathbf{Q}$ form the range space $\mathbf{Y}$ of $\mathbf{F}_\mathcal{W}^T$. The remaining columns of $\mathbf{Q}$ form the null space $\mathbf{Z}$, $\mathbf{F}_\mathcal{W}\mathbf{Z}=0$. If we seek a null space direction of free variables $\mathbf{p}_f=\mathbf{Z}\mathbf{p}_z$ for some $\mathbf{p}_z$, then $\mathbf{F}_\mathcal{W}\mathbf{p}_f=0$. Moving along such a null space direction retains feasibility with respect to the current working constraints.

To obtain a descent feasible direction, we simply choose $\mathbf{p}_z$ to be the projection of the negative gradient onto the null space, $\mathbf{p}_z=-\mathbf{Z}^T\mathbf{g}(\mathbf{x}_f)$. For LP problems, $\mathbf{g}(\mathbf{x})$ is the cost vector $\mathbf{c}$. So $\mathbf{p}_f=-\mathbf{Z}\mathbf{Z}^T\mathbf{c}_f$, where $\mathbf{c}_f$ is the cost of the free variables. Let $\mathbf{c}_b$ denote the cost of the fixed variables on the other hand.
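The sketch below builds $\mathbf{Z}$ for a tiny made-up working set via Gram-Schmidt (illustrative Python; NPL uses a proper QR factorization as described above) and forms $\mathbf{p}_f=-\mathbf{Z}\mathbf{Z}^T\mathbf{c}_f$, verifying that the direction is both feasible and descent:

```python
import math

def null_space(FW, n):
    """Orthonormal null-space basis Z of FW: Gram-Schmidt on the rows of FW
    completed with coordinate vectors (fine for small dense examples)."""
    basis = []
    candidates = [row[:] for row in FW] + \
                 [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for v in candidates:
        w = v[:]
        for q in basis:
            proj = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        nrm = math.sqrt(sum(wi * wi for wi in w))
        if nrm > 1e-10:
            basis.append([wi / nrm for wi in w])
    return basis[len(FW):]          # drop the range-space part Y

FW = [[1.0, 1.0, 0.0]]              # one working constraint, three free vars
cf = [1.0, 0.0, 0.0]
Z = null_space(FW, 3)               # rows of Z span the null space of FW

# p_f = -Z Z^T c_f: project the negative gradient onto the null space.
Zt_c = [sum(zj * cj for zj, cj in zip(z, cf)) for z in Z]
pf = [-sum(Zt_c[k] * Z[k][j] for k in range(len(Z))) for j in range(3)]
# F_W pf = 0 (feasible) and c_f . pf < 0 (descent) by construction.
```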

3.3 Optimality Conditions

Including the fixed variables as working constraints, the stationary condition requires

$$\begin{pmatrix}\mathbf{I} & \mathbf{B}_\mathcal{W}^T\\ & \mathbf{F}_\mathcal{W}^T\end{pmatrix}\begin{pmatrix}\lambda_b\\ \lambda_\mathcal{W}\end{pmatrix}=\begin{pmatrix}\mathbf{c}_b\\ \mathbf{c}_f\end{pmatrix}$$

Accordingly, we can solve

$$\lambda_\mathcal{W} = \mathbf{F}_\mathcal{W}^{-T}\mathbf{c}_f,\qquad \lambda_b = \mathbf{c}_b-\mathbf{B}_\mathcal{W}^T\lambda_\mathcal{W}$$

Next we test the Lagrange multipliers $\lambda_i, i\in\mathcal{W}-\mathcal{E}$ and $\lambda_i, x_i\in\mathbf{x}_b$. For upper-bounded constraints, the sign of the Lagrange multiplier is flipped. If all $\lambda_i$ are positive, the optimal solution is found. Otherwise, either the general constraint $\mathbf{A}_i$ or the fixed variable $x_i$ with the most negative $\lambda_i$ is selected to quit from the current working set or from the set of fixed variables. It can be shown that the feasible direction with respect to the updated working set is a descent direction.
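The multiplier test with the sign flip can be sketched as follows (illustrative Python; the labels and multiplier values are made up):

```python
def pick_quit(multipliers, eps=1e-9):
    """multipliers: list of (label, lam, is_upper) for inequality working
    constraints and fixed variables. Flip the sign for upper bounds, then
    return the label with the most negative multiplier, or None at optimum."""
    adjusted = [(label, -lam if is_upper else lam)
                for label, lam, is_upper in multipliers]
    label, lam = min(adjusted, key=lambda t: t[1])
    return label if lam < -eps else None

# Hypothetical multipliers: a lower bound that is satisfied, and an upper
# bound whose raw multiplier is positive, hence negative after the flip.
quit_label = pick_quit([("A_3 (lower)", 0.7, False),
                        ("x_5 (upper)", 0.4, True)])
# quit_label == "x_5 (upper)": it leaves the working set.
```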

3.4 QR Update

The active-set method needs to repeatedly solve the linear system $\mathbf{F}_\mathcal{W}^T\mathbf{y}=\mathbf{z}$ for some right-hand side vector $\mathbf{z}$. Given the QR factorization of $\mathbf{F}_\mathcal{W}^T$, we solve

$$\mathbf{R}\mathbf{y}=\mathbf{Y}^T\mathbf{z}$$

๐…๐’ฒ changes whenever a working constraint is added or removed from the working set or the set of free variables changes. An efficient QR factorization update scheme is implemented based on [3], section 5.2.4.

3.5 Initial Working Set and Feasible Point

The active-set method also has a phase-I feasibility stage to find an initial feasible point and a subsequent optimality stage to solve the actual LP problem. A "crash start" strategy, [3] section 5.8.1, is implemented to seamlessly connect these two stages. By the end of solving the phase-I feasibility problem, we have not only obtained a pair of initial feasible point and initial working set, but also have all internal resources, i.e. the QR factorization, ready to immediately start the phase-II iterations.

Again, without loss of generality, assume all inequality constraints are lower bound constraints. We first construct a crash index set $\mathcal{C}$ to include all equality constraints $\mathcal{E}$. Linear dependency is tested so that only independent constraints are added to $\mathcal{C}$. Next, given an initial guess $\hat{\mathbf{x}}$ and a "crash radius" control parameter $r$ provided by a user, we compute the violation amount $(\mathbf{b}-\mathbf{A}\hat{\mathbf{x}})^+$ for all general constraints, and $(\mathbf{l}-\hat{\mathbf{x}})^+$ for simple bounds.

A violated constraint is added to $\mathcal{C}$ if its violation amount is smaller than $r$ and it is independent of all existing constraints in $\mathcal{C}$. This procedure stops as soon as $\mathcal{C}$ has $n$ constraints.

Next, we fix all violated variables in $\mathcal{C}$ at the corresponding simple bound values and obtain $\mathbf{x}_b$. Now let $\mathcal{W}$ include all general constraints in $\mathcal{C}$ and solve for the free variables from

$$\mathbf{F}_\mathcal{W}\mathbf{x}_f=\mathbf{b}-\mathbf{B}_\mathcal{W}\mathbf{x}_b$$

At this point, we have obtained an initial working set $\mathcal{W}$ and an initial point $(\mathbf{x}_b,\mathbf{x}_f)$ that is feasible with respect to the working general constraints $\mathbf{A}_\mathcal{W}$. Of course, some of the non-working general constraints $\mathbf{A}_\mathcal{N}$ or simple bounds could still be violated. Conceptually, we introduce artificial variables $\mathbf{y}$ and $\mathbf{z}$ and solve a phase-I feasibility subproblem,

min๐ฑ,๐ฒ,๐ณโก๐žTโข๐ฒ+๐žTโข๐ณ,ย s.t. ๐›lโ‰ค๐€๐ฑยฑ๐ณโ‰ค๐›u
๐ฅโ‰ค๐ฑยฑ๐ฒโ‰ค๐ฎ,
๐ฒโ‰ฅ0
๐ณโ‰ฅ0

When the active-set method terminates after its phase-I stage, if the objective function value is not zero, we declare that the LP problem has no feasible solution. Otherwise, the solution and working set after phase-I are the starting point of phase-II.

4 Termination Control

Both the Simplex method and the active-set method have a control parameter $N$ for the maximum number of iterations allowed. The algorithm always terminates after $N$ iterations. Further, there is an exception-throw control. Once enabled, the algorithm throws an exception after $N$ iterations. If disabled, it just reports the "half-way" solution found after $N$ iterations.

Both algorithms also have a "convergence tolerance" parameter $\epsilon_c>0$ that is used to check the Lagrange multipliers in the optimality condition test. If the most negative Lagrange multiplier satisfies $\lambda\ge-\epsilon_c$, the Lagrange multiplier condition is considered satisfied.

In the Simplex method, the number of basic variables is exactly equal to the number of equality constraints, so the basic feasible points generated by the algorithm are always vertex points. The stationary condition is automatically satisfied at a vertex point.

In the active-set method, the number of working constraints is usually equal to the number of variables $n$, hence a vertex point as well. However, it is possible that the cost vector $\mathbf{c}_f$ for free variables happens to be 0 or a linear combination of $\mathbf{F}_\mathcal{W}$ at a non-vertex point. The active-set method has a "stationary tolerance" control parameter $\epsilon_s>0$. When $|\mathbf{c}_f|_\infty\le\epsilon_s$ or $|\mathbf{c}_f^T\mathbf{p}_f|\le\epsilon_s$, we consider the stationary condition satisfied at a non-vertex point.

5 Summary

The active-set method is a very powerful tool to handle LP and many other constrained optimization problems. It directly takes the general LP form (2) without having to introduce slack and artificial variables. When the number of general constraints $m$ is significantly greater than the number of unknowns $n$, it has the advantage of working on a linear system that is no bigger than the number of free variables.

The Simplex method simultaneously works on all linear constraints. Initially, we need to convert the general form to the canonical form (1) with slack and artificial variables. It is most suitable for the situation $n\gg m$.

References

  • [1] David G. Luenberger: Linear and Nonlinear Programming, 2nd edition, Springer 2003
  • [2] Jorge Nocedal and Stephen J. Wright: Numerical Optimization, Springer, 1999
  • [3] Philip E. Gill, Walter Murray and Margaret H. Wright: Practical Optimization, Academic Press, 1981
  • [4] J.J. Forrest and J.A. Tomlin: Updated Triangular Factors of the Basis to Maintain Sparsity in the Product Form Simplex Method, Mathematical Programming 2 (1972), 263-278