Unconstrained Optimization
Mathwrist C++ API Series
Abstract
This document details the C++ programming interface of multivariate unconstrained optimization features in Mathwrist’s Numerical Programming Library (NPL). Please refer to our unconstrained optimization white paper for a technical overview.
1 Introduction
Let f : ℝⁿ → ℝ be a twice continuously differentiable function with gradient ∇f(x) and Hessian ∇²f(x). Unconstrained optimization solves the following minimization problem,

min_{x ∈ ℝⁿ} f(x).

The optimality conditions at a solution x* are ∇f(x*) = 0 and that ∇²f(x*) is at least positive semi-definite.
Users need to define an objective function by deriving it from the FunctionND base class and implementing the eval() virtual function. The eval() function computes function values and gradients. Depending on the actual optimization solver being used, the objective function may also need to supply an implementation of the Hessian matrix calculation. Please refer to our “1-d and n-d Functions” white paper and API documentation for the details of the n-dimensional functions used to represent f(x).
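The exact FunctionND interface is defined in NPL's headers; the sketch below is only an illustration, with a hypothetical stand-in base class and an assumed eval() signature (function value returned, optional gradient and row-major Hessian outputs), showing how an objective such as the Rosenbrock function could be expressed:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for NPL's FunctionND base class; the real
// eval() signature is defined in the NPL headers and may differ.
struct FunctionNDSketch
{
  virtual ~FunctionNDSketch() = default;
  // Evaluate f(x); optionally fill gradient g and Hessian H (row-major).
  virtual double eval(const std::vector<double>& x,
                      std::vector<double>* g,
                      std::vector<double>* H) const = 0;
};

// Rosenbrock objective f(x,y) = 100(y - x^2)^2 + (1 - x)^2,
// minimized at (1, 1).
struct Rosenbrock : FunctionNDSketch
{
  double eval(const std::vector<double>& x,
              std::vector<double>* g,
              std::vector<double>* H) const override
  {
    const double a = x[1] - x[0] * x[0];
    const double b = 1.0 - x[0];
    if (g)
    {
      (*g)[0] = -400.0 * a * x[0] - 2.0 * b;
      (*g)[1] = 200.0 * a;
    }
    if (H)
    {
      (*H)[0] = 1200.0 * x[0] * x[0] - 400.0 * x[1] + 2.0; // d2f/dx2
      (*H)[1] = (*H)[2] = -400.0 * x[0];                   // d2f/dx dy
      (*H)[3] = 200.0;                                     // d2f/dy2
    }
    return 100.0 * a * a + b * b;
  }
};
```

A solver that only needs gradients would pass a null Hessian pointer under this assumed signature.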
In Mathwrist NPL, we provide both line search based and trust region based solvers, derived from the LineSearch and TrustRegion base classes respectively. Classes LineSearch and TrustRegion are in turn derived from the IterativeMethod class for control of the max number of iterations and the convergence tolerance. Class IterativeMethod is first introduced and documented in the “Iterative Method” section of our “1-d and n-d Functions” API chapter.
Both line search and trust region algorithms iteratively generate a sequence of improved data points x_k converging to a solution x*. The algorithm terminates at iteration k if k reaches the max number of iterations or the optimality conditions are satisfied. In particular, if ‖∇f(x_k)‖ ≤ ε, where ε is the convergence tolerance, we consider the stationary condition satisfied.
2 Line Search Method
All line search based solvers are defined in the header file line_search.h. Base class LineSearch is a pure virtual class derived from the IterativeMethod class, which provides controls on the max number of iterations and the convergence tolerance ε.
The LineSearch base class carries out the main work of iteratively making a descent step move x_{k+1} = x_k + α_k p_k. Concrete line search derived classes are responsible for computing a descent direction p_k at each iteration point x_k. The line search base class then calculates a step length α_k satisfying the Wolfe conditions and makes the step move.
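The Wolfe conditions imposed on α_k are standard: a sufficient-decrease (Armijo) test f(x + αp) ≤ f(x) + c₁ α ∇f(x)ᵀp and a curvature test ∇f(x + αp)ᵀp ≥ c₂ ∇f(x)ᵀp with 0 < c₁ < c₂ < 1. A minimal self-contained check on a one-dimensional quadratic (the constants c1 and c2 are illustrative textbook values, not NPL's internal settings):

```cpp
#include <cassert>
#include <cmath>

// Test objective f(x) = x^2 with derivative f'(x) = 2x.
static double f(double x) { return x * x; }
static double fp(double x) { return 2.0 * x; }

// Check both Wolfe conditions for step length alpha from point x
// along direction p; c1/c2 are the usual Wolfe constants.
bool wolfe_ok(double x, double p, double alpha,
              double c1 = 1e-4, double c2 = 0.9)
{
  const double slope0 = fp(x) * p;  // directional derivative at x
  const bool armijo =
      f(x + alpha * p) <= f(x) + c1 * alpha * slope0;
  const bool curvature =
      fp(x + alpha * p) * p >= c2 * slope0;
  return armijo && curvature;
}
```

From x = 1 with steepest descent direction p = −f'(1) = −2, the step α = 0.5 lands exactly on the minimizer and satisfies both conditions, while α = 1 overshoots and fails the sufficient-decrease test.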
2.1 Objective Function
Description:
Return a constant reference to the objective function, which is an instance of FunctionND class. LineSearch base class holds a reference to the objective function as data member.
2.2 Solver Operator
Description:
Solve the unconstrained optimization problem using the line search method.
• Parameter max_step_move specifies a control parameter such that the step length α_k is upper bounded so that ‖α_k p_k‖ ≤ max_step_move.
• Parameter x holds the initial guess on call. On return, it stores the solution found by the solver.
• The function returns the value of f(x) at the solution.
2.3 Restart Control
Description:
Line search solvers internally perform certain checks to protect the algorithm from deteriorating due to accumulated numerical noise. Once a solver is determined to be in a “dirty” state at point x_k, a restart operation occurs as if x_k were supplied as the initial guess. Users can also force a restart to occur after every given number of iterations.
2.4 Step Convergence Control
1. Set the convergence tolerance of the step length. A step length interval is considered converged when its two endpoints are numerically equal at this tolerance level.
2. Return the step length convergence tolerance being used.
2.5 Steepest Descent Method
LineSearchSteepestDescent is a concrete line search subtype that computes the search direction p_k = −∇f(x_k) using the steepest descent method. The constructor takes a constant reference of a FunctionND object as the objective function.
REQUIRED: This objective function needs to be alive during the lifecycle of the line search solver.
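Independent of the NPL classes, the direction choice itself is simply the negative gradient. The self-contained sketch below minimizes a diagonal quadratic f(x) = ½ xᵀDx with an exact step length along −∇f, illustrating the method's typical linear convergence:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimize f(x) = 0.5 * x^T D x for diagonal D via steepest descent
// with exact step length; the gradient is g = D x.
std::vector<double> steepest_descent(std::vector<double> x,
                                     const std::vector<double>& D,
                                     int max_iter, double tol)
{
  const std::size_t n = x.size();
  for (int k = 0; k < max_iter; ++k)
  {
    std::vector<double> g(n);
    double gnorm2 = 0.0, gDg = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      g[i] = D[i] * x[i];
      gnorm2 += g[i] * g[i];
      gDg += D[i] * g[i] * g[i];
    }
    if (std::sqrt(gnorm2) <= tol)     // stationary condition met
      break;
    const double alpha = gnorm2 / gDg;  // exact minimizer along -g
    for (std::size_t i = 0; i < n; ++i)
      x[i] -= alpha * g[i];
  }
  return x;
}
```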
2.6 Newton Method
LineSearchNewton is a concrete line search subtype that computes the search direction using a modified Newton method. The constructor takes a constant reference of a FunctionND object as the objective function.
REQUIRED: This objective function needs to be alive during the lifecycle of the line search solver.
If the second constructor parameter discrete is passed true, the solver applies a forward finite-difference method to the gradients to approximate the Hessian matrix. In this case, the objective function can ignore the Hessian calculation when implementing the FunctionND::eval() virtual function. If discrete is passed false, the objective function is expected to compute the Hessian.
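The forward finite-difference idea can be sketched independently of NPL: column j of the Hessian is approximated by (∇f(x + h eⱼ) − ∇f(x)) / h, where h is a small perturbation. A generic sketch taking the gradient as a callback:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Approximate the Hessian (row-major n-by-n) from a gradient callback
// using forward differences; h is the perturbation size.
std::vector<double> fd_hessian(
    const std::function<std::vector<double>(const std::vector<double>&)>& grad,
    std::vector<double> x, double h)
{
  const std::size_t n = x.size();
  std::vector<double> H(n * n);
  const std::vector<double> g0 = grad(x);
  for (std::size_t j = 0; j < n; ++j)
  {
    x[j] += h;                            // perturb coordinate j
    const std::vector<double> gj = grad(x);
    x[j] -= h;                            // restore
    for (std::size_t i = 0; i < n; ++i)
      H[i * n + j] = (gj[i] - g0[i]) / h;
  }
  return H;
}
```

Since the difference quotient only sees gradient values, the objective never has to implement second derivatives; the price is one extra gradient evaluation per dimension and truncation error of order h.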
2.7 Quasi-Newton Method
LineSearchQuasiNewton is a concrete line search subtype that computes the search direction using the quasi-Newton (BFGS) method. The objective function can ignore the Hessian calculation when implementing the FunctionND::eval() virtual function.
The constructor takes a constant reference of a FunctionND object as the objective function.
REQUIRED: This objective function needs to be alive during the lifecycle of the line search solver.
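For reference, the BFGS recursion that makes an explicit Hessian unnecessary maintains an approximation B from the step s = x_{k+1} − x_k and the gradient change y = ∇f(x_{k+1}) − ∇f(x_k): B ← B − (Bs sᵀB)/(sᵀBs) + (y yᵀ)/(yᵀs). This is a generic textbook sketch, not NPL's internal implementation:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One BFGS update of the Hessian approximation B (row-major n-by-n)
// from step s and gradient difference y. Skips the update when the
// curvature condition y^T s > 0 does not hold numerically.
void bfgs_update(std::vector<double>& B, const std::vector<double>& s,
                 const std::vector<double>& y)
{
  const std::size_t n = s.size();
  std::vector<double> Bs(n, 0.0);
  double sBs = 0.0, ys = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    for (std::size_t j = 0; j < n; ++j)
      Bs[i] += B[i * n + j] * s[j];
    ys += y[i] * s[i];
  }
  for (std::size_t i = 0; i < n; ++i)
    sBs += s[i] * Bs[i];
  if (ys <= 1e-12 || sBs <= 1e-12)
    return;  // skip: update would destroy positive definiteness
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      B[i * n + j] += -Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys;
}
```

The updated B satisfies the secant condition Bs = y, so curvature information accumulates from gradient differences alone.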
2.8 Conjugate Gradient Method
LineSearchConjugateGradient is a concrete line search subtype that computes search directions as a sequence of conjugate gradient directions using the PR+ (Polak–Ribière plus) method. The constructor takes a constant reference of a FunctionND object as the objective function.
REQUIRED: This objective function needs to be alive during the lifecycle of the line search solver.
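The PR+ rule itself is easy to state: β = max(0, ∇f_{k+1}ᵀ(∇f_{k+1} − ∇f_k) / ‖∇f_k‖²) and p_{k+1} = −∇f_{k+1} + β p_k; clamping β at zero effectively restarts with a steepest descent step. A generic sketch:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// PR+ beta from consecutive gradients g_old, g_new.
double pr_plus_beta(const std::vector<double>& g_old,
                    const std::vector<double>& g_new)
{
  double num = 0.0, den = 0.0;
  for (std::size_t i = 0; i < g_old.size(); ++i)
  {
    num += g_new[i] * (g_new[i] - g_old[i]);
    den += g_old[i] * g_old[i];
  }
  const double beta = num / den;
  return beta > 0.0 ? beta : 0.0;  // PR+ clamps negative beta to zero
}

// Next search direction p_new = -g_new + beta * p_old.
std::vector<double> cg_direction(const std::vector<double>& g_old,
                                 const std::vector<double>& g_new,
                                 const std::vector<double>& p_old)
{
  const double beta = pr_plus_beta(g_old, g_new);
  std::vector<double> p(g_new.size());
  for (std::size_t i = 0; i < p.size(); ++i)
    p[i] = -g_new[i] + beta * p_old[i];
  return p;
}
```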
3 Trust Region Method
All trust region based solvers are defined in the header file trust_region.h. Class TrustRegion is a pure virtual class derived from the IterativeMethod class, which provides controls on the max number of iterations and the convergence tolerance ε. Concrete trust region methods are further derived from the TrustRegion base class. These derived subclasses are responsible for computing a descent step move p_k with ‖p_k‖ ≤ Δ_k, where Δ_k is the trust region radius. The TrustRegion base class iteratively makes step moves and adjusts Δ_k if necessary.
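The radius adjustment in a trust region method is typically driven by the ratio ρ between the actual reduction in f and the reduction predicted by the quadratic model; the thresholds in the sketch below are textbook choices, not necessarily the ones NPL uses internally:

```cpp
#include <algorithm>
#include <cassert>

// Textbook trust-region radius update: shrink on poor model agreement,
// expand (up to max_radius) when the model predicts well and the step
// reached the trust-region boundary.
double update_radius(double radius, double rho, bool step_at_boundary,
                     double max_radius)
{
  if (rho < 0.25)
    radius *= 0.25;                                // model poor: shrink
  else if (rho > 0.75 && step_at_boundary)
    radius = std::min(2.0 * radius, max_radius);   // model good: expand
  return radius;                                   // otherwise keep
}
```

A step is additionally rejected (x stays put) when ρ is negative or very small, so shrinking the radius is what eventually forces progress.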
3.1 Objective Function
Description:
Return a constant reference to the objective function, which is an instance of FunctionND class. TrustRegion base class holds a reference to the objective function as data member.
3.2 Solver Operator
Description:
Solve the unconstrained optimization problem using the trust region method.
• Parameter radius is the initial trust region radius Δ_0 suggested by the caller.
• Parameter x holds the initial guess on call. On return, it stores the solution found by the solver.
• The function returns the value of f(x) at the solution.
3.3 Max Radius
Description:
At any iteration, the actual radius Δ_k being used is upper bounded by the max radius. These two functions get and set this max trust region radius.
3.4 Ellipse Scaling
Description:
This is an experimental feature as of the current release. It enables/disables search in an ellipse-scaled trust region. Once enabled, the search step p is bounded by ‖Sp‖ ≤ Δ_k for some scaling matrix S chosen so that the scaled Hessian has a better condition number than the original Hessian ∇²f(x). By default, ellipse scaling is disabled.
3.5 Trust Region with Hessian
Class TrustRegionWithHessian is an intermediate class derived from the TrustRegion base class. It serves only internal implementation purposes and doesn’t have a public interface.
3.6 Exact Direction
Class TrustRegionExact is a concrete trust region subtype that computes a “nearly” exact search step p_k for medium-size problems, where p_k is computed by solving

(∇²f(x_k) + λI) p_k = −∇f(x_k)  (1)

for some λ ≥ 0.
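For intuition on equation (1): when the Hessian is diagonal, ‖p(λ)‖ is monotone decreasing in λ, so the λ that places the step on the trust region boundary ‖p‖ = Δ can be bracketed and found by simple root finding. A self-contained sketch (bisection here for clarity; production code would typically use a safeguarded Newton iteration on 1/‖p(λ)‖):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// ||p(lambda)|| for p solving (H + lambda I) p = -g, with H = diag(h).
static double step_norm(const std::vector<double>& h,
                        const std::vector<double>& g, double lambda)
{
  double s = 0.0;
  for (std::size_t i = 0; i < h.size(); ++i)
  {
    const double pi = -g[i] / (h[i] + lambda);
    s += pi * pi;
  }
  return std::sqrt(s);
}

// Bisection for lambda >= 0 with ||p(lambda)|| = delta, assuming the
// unconstrained Newton step exceeds delta (so a root exists).
double solve_lambda(const std::vector<double>& h,
                    const std::vector<double>& g, double delta)
{
  double lo = 0.0, hi = 1.0;
  while (step_norm(h, g, hi) > delta)   // grow the bracket upward
    hi *= 2.0;
  for (int k = 0; k < 200; ++k)         // bisect to high precision
  {
    const double mid = 0.5 * (lo + hi);
    if (step_norm(h, g, mid) > delta)
      lo = mid;
    else
      hi = mid;
  }
  return 0.5 * (lo + hi);
}
```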
3.6.1 Constructor
Description:
The constructor takes a constant reference of a FunctionND object as the objective function. REQUIRED: This objective function needs to be alive during the lifecycle of the trust region solver. The second constructor parameter max_radius specifies the max trust region radius.
3.6.2 Iterations to Find λ
Description:
Finding the λ in equation (1) most of the time involves root finding. These two functions set and get the max number of iterations to be used in the root finding process. By default, the root finder throws an exception when the max number of iterations is reached. This throw behavior can be disabled by calling the enable_exception_on_max_iter() function through the IterativeMethod base class. If the exception throw is disabled, the algorithm will use the “half-way” solution of λ computed from root finding.
3.7 Conjugate Gradient-Steihaug Direction
Class TrustRegionSteihaug is a concrete trust region subtype that computes a descent search step p_k, typically for large-size problems, where p_k is computed by the Conjugate Gradient-Steihaug method.
3.7.1 Constructor
Description:
The constructor takes a constant reference of a FunctionND object as the objective function. REQUIRED: This objective function needs to be alive during the lifecycle of the trust region solver. The second constructor parameter max_radius specifies the max trust region radius.
3.7.2 Early Termination Control
Description:
The search direction is a linear combination of a sequence of conjugate gradient intermediate step directions. At any conjugate gradient intermediate step, we compute the residual vector r. If ‖r‖ ≤ ε_r, or ‖r‖ falls below the fraction η of the initial residual norm, we stop the conjugate gradient sequence. ε_r is the residual tolerance control parameter. η is the “exactness” control parameter.
1. Set the residual tolerance control parameter.
2. Get the residual tolerance control parameter.
3. Set the “exactness” control parameter.
4. Get the “exactness” control parameter.
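The termination tests above can be seen in a generic CG-Steihaug sketch (dense Hessian, illustrative only; the boundary and negative-curvature cases are simplified to return the current interior iterate, whereas the full method steps to the boundary intersection):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CG-Steihaug sketch for min g^T p + 0.5 p^T H p, ||p|| <= delta,
// with dense row-major H. Stops on small residual, negative curvature,
// or when an iterate would cross the trust-region boundary.
std::vector<double> steihaug(const std::vector<double>& H,
                             const std::vector<double>& g,
                             double delta, double tol)
{
  const std::size_t n = g.size();
  std::vector<double> p(n, 0.0), r = g, d(n);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = -r[i];
  for (std::size_t iter = 0; iter < n; ++iter)
  {
    std::vector<double> Hd(n, 0.0);
    double dHd = 0.0, rr = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      for (std::size_t j = 0; j < n; ++j)
        Hd[i] += H[i * n + j] * d[j];
      dHd += d[i] * Hd[i];
      rr += r[i] * r[i];
    }
    if (dHd <= 0.0)
      return p;       // negative curvature: full method goes to boundary
    const double alpha = rr / dHd;
    std::vector<double> pn(n);
    double pnorm2 = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      pn[i] = p[i] + alpha * d[i];
      pnorm2 += pn[i] * pn[i];
    }
    if (std::sqrt(pnorm2) >= delta)
      return p;       // boundary reached: full method clips ||p|| = delta
    double rr_new = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      r[i] += alpha * Hd[i];
      rr_new += r[i] * r[i];
    }
    p = pn;
    if (std::sqrt(rr_new) <= tol)
      return p;       // residual small enough: early termination
    const double beta = rr_new / rr;
    for (std::size_t i = 0; i < n; ++i)
      d[i] = -r[i] + beta * d[i];
  }
  return p;
}
```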