Numerical Linear Algebra
Mathwrist White Paper Series

Copyright ©Mathwrist LLC 2023
(January 20, 2023)
Abstract

This document presents a technical overview of the numerical linear algebra (LA) functionality implemented in Mathwrist’s C++ Numerical Programming Library (NPL). The LA portion of NPL is a mixture of in-house implementations and wrappers of third-party optimized Blas/Lapack functions, exposed in an object-oriented fashion, i.e. as member functions of the Matrix class. For the latter, we explicitly list those functions in the relevant sections. For the in-house implementations, we give more implementation details.

1 Matrix, View and Iterator

1.1 Matrix

One fundamental building block of NPL is the Matrix class. We don’t introduce a separate data type for vectors. Instead, vectors are simply 1-row or 1-column instances of the Matrix class. Matrix row and column indices start from 0.

A matrix object is always associated with some underlying memory storage. The actual memory allocation scheme is handled by the library as an implementation detail. From a user’s perspective, it is sufficient to know that dynamic memory is allocated to store matrix elements. When a matrix object goes out of scope, its memory is freed automatically.

The standard use case of creating an m×n matrix is to call the constructor Matrix::Matrix(m,n), which allocates an (m×n)-element buffer on the heap. In this case, matrix elements are contiguous double values stored in row-major order. Internally, the library may use different memory representation schemes for special matrices, but only the standard use case is exposed at the API level. Most of the linear algebra functionality is implemented as member functions of the Matrix class in matrix.h.
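
As a quick illustration of this standard use case, the small sketch below creates a matrix and a 1-row “vector” and accesses elements with operator(); it relies only on the constructor and element access described above.

// A 2 x 3 matrix; elements start as zeros.
Matrix a(2, 3);

// 0-based element access through operator().
a(0, 0) = 1.0;
a(1, 2) = -2.5;

// A "vector" is just a 1-row (or 1-column) matrix.
Matrix x(1, 3);
x(0, 1) = 4.0;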

1.2 Matrix Views

Very often, we need to work on a part of a matrix, or on a transformed representation of it. For efficiency, we introduce the concept of matrix views, which can be used conveniently just like regular matrix objects. A matrix view shares the same data storage as the original matrix. Hence it is important to make sure the original matrix object stays alive while its views are in use.

There are two types of matrix views, Matrix::View and Matrix::ViewConst, both nested classes of the Matrix class. From a constant matrix object, one can only obtain a view of type Matrix::ViewConst, which ensures that the content of the original matrix cannot be modified through the view. A non-constant matrix, on the other hand, can be modified through its views. The following code example illustrates this.

Listing 1: Constant vs Non-constant Views
// create a 5 x 5 matrix of zero elements
Matrix m(5,5);

// Obtain a 3 x 3 leading sub matrix view
Matrix::View v = m.sub(0,0,3,3);

// Changing the value through the view also
// changes the content of the original matrix.
v(1,1) = 0.5;

// Now the (1,1) element is 0.5.
std::cout << m(1,1) << std::endl;

// Let's have a constant reference to matrix m.
const Matrix& mr = m;

// From a const matrix, we can only get a constant view.
Matrix::ViewConst vc = mr.row(0);

// This triggers a compilation error.
vc(0,1) = 0.5;

Attention needs to be paid when we copy matrix and view objects using the assignment operator. The behavior of the assignment depends on the actual types of its left hand side (LHS) and right hand side (RHS) operands.

  1. LHS is a matrix: The matrix object is first resized to the same size as the RHS object and then an element-wise copy (deep copy) is performed.

  2. LHS is a non-constant view:

     (a) RHS is a matrix: Because a view has a predefined size, an element-wise copy happens if both sides have the same size. Otherwise, an exception is thrown.

     (b) RHS is also a non-constant view: No deep copy is involved. LHS becomes the same view as the RHS view object.

     (c) RHS is a constant view: RHS is implicitly converted to a constant matrix object, then 2.(a) applies.

  3. LHS is a constant view: RHS can only be a constant view as well. In this case, both LHS and RHS become the same view on the original matrix. All other RHS types cause a compilation error.

The following code example reflects the behavior of view assignment operations described above.

Listing 2: View Assignment Operator
Matrix m(3, 3);
Matrix n(2, 2);

// m is a diagonal matrix with values = 0.5.
m.diagonalize(0.5);

// set n with some values.
n(0, 1) = 0.2;
n(1, 0) = 0.3;

// v is a leading sub matrix view of m.
Matrix::View v = m.sub(0, 0, 2, 2);

// RHS is a matrix, content of n is copied to m.
v = n;

// The leading 2 x 2 sub matrix of m has the same
// values as n now.
std::cout << m(0,0) << std::endl;
std::cout << m(0,1) << std::endl;
std::cout << m(1,0) << std::endl;
std::cout << m(1,1) << std::endl;

// u is the first row of n.
Matrix::View u = n.row(0);

// Now v is also the first row view of n.
v = u;

// Changing v changes n but not m.
v(0, 0) = 1;
std::cout << m(0,0) << std::endl; // value = 0
std::cout << n(0,0) << std::endl; // value = 1

// This should trigger an exception since the size
// of v is 1 x 2, different from the size of m.
bool got_error = false;
try
{
   v = m;
}
catch (...)
{
   got_error = true;
}
assert(got_error);

Matrix views can be passed as arguments anywhere a matrix object is expected. We can also call a matrix member function through a view object using the operator-> notation.

Listing 3: Implicit Conversion of Views
Matrix m(5,5);

// v is a view of m itself
Matrix::View v = m.self();

// same way to access elements
m(0,0) = 1.0;
assert(v(0,0) == 1.0);

// using -> notation to access matrix member functions.
assert(v->row_size() == 5);

Matrix b(2,5);
Matrix c;

// pass v to a function that takes matrix as arguments.
b.multiply(v, c);

1.3 Matrix Expression

A special type of matrix view is the matrix expression, introduced via the following nested classes of Matrix, all derived from Matrix::View.

  • Matrix::Negation, unary expression;

  • Matrix::Addition, binary expression;

  • Matrix::Substraction, binary expression;

  • Matrix::Multiplication, binary expression;

These expressions are designed with efficiency in mind and provide some syntactic sugar for programmers. The C++ language ensures the usual arithmetic precedence when multiple operators appear in a single expression. We can write cleaner code using matrix expressions, as illustrated by the code example below:

Listing 4: Matrix Expressions
// A single expression
Matrix a = -b * c + (d - e) * f;

// Equivalent sequence by individual function calls.
Matrix temp1 = b;
temp1.negate();
Matrix temp2;
temp1.multiply(c, temp2);
Matrix temp3 = d - e;
Matrix temp4;
temp3.multiply(f, temp4);
a = temp2 + temp4;

Matrix expressions are special in the sense that each expression object has its own memory allocated to store the elements of the expression result. A regular view, on the contrary, just shares the same data storage as its original source matrix.

1.4 Permutations

A matrix permutation reflects a sequence of row or column interchanges on a target matrix. Conceptually, we can think of the permutation itself as a square matrix with exactly one element equal to 1 in each row and each column; all other elements are 0. When this permutation matrix left or right multiplies a regular matrix M, row or column interchanges are effectively applied to M.

We represent matrix permutation as a nested class Matrix::Permutation in matrix.h. It can left multiply or right multiply a regular matrix object to generate a permuted view of the original matrix. The permuted view can then be used conveniently as a regular matrix.

Listing 5: Matrix Permutation
// Create a matrix
//  | 1 2 |
//  | 3 4 |
Matrix a(2,2);
a(0,0) = 1;
a(0,1) = 2;
a(1,0) = 3;
a(1,1) = 4;

// A 2 x 2 permutation matrix representing
// (0,1) row or column interchange.
Matrix::Permutation p(2, 0, 1);

// (p * a) generates the (0,1) row
// interchanged view of a.
Matrix::View v = p * a;

// b has content
//  | 3 4 |
//  | 1 2 |
Matrix b = v;

// Changing v also changes a in
// permuted position.
v(0,1) = 1;

// a becomes
//  | 1 2 |
//  | 3 1 |

// (a * p) generates the (0,1) column
// interchanged view of a.
v = a * p;

// c has content
//  | 2 1 |
//  | 1 3 |
Matrix c = v;

Multiplication can also happen between two permutation objects. The result is a new permutation object that reflects the accumulated row/column interchanges. Let 𝐏_{i,j} and 𝐏_{k,l} denote the (i,j) and (k,l) interchanges respectively. 𝐏_{k,l} × 𝐏_{i,j} × 𝐌 first performs the (i,j) row interchange on matrix 𝐌, followed by the (k,l) row interchange on the result of the first permutation. To perform the same sequence of interchanges on columns, we use right multiplication 𝐌 × 𝐏_{i,j} × 𝐏_{k,l}.

Listing 6: Permutated View
// 3 x 3 permutation matrix representing
// (0,1) row or column interchange.
Matrix::Permutation p01(3,0,1);

// 3 x 3 permutation matrix representing
// (1,2) row or column interchange.
Matrix::Permutation p12(3,1,2);

// A target 3 x 4 matrix.
Matrix m(3, 4);

// First perform (0,1) row interchange and
// then (1,2) row interchange.
Matrix::View v = p12 * p01 * m;

// A target 4 x 3 matrix.
Matrix n(4, 3);

// First perform (0,1) column interchange and
// then (1,2) column interchange.
Matrix::View u = n * p01 * p12;

Finally, when a permutation is applied to a constant matrix, it results in a Matrix::ViewConst object, a constant view of the matrix, as shown in the code example below.

Listing 7: Constant Permutated View
Matrix a(2,2);
Matrix::Permutation p(2, 0, 1);

// Hold a constant reference on a.
const Matrix& ac = a;

// Permutation on a constant matrix can
// only result in a constant view.
Matrix::ViewConst v = p * ac;

1.5 Iterators

The elements of a matrix can be accessed via the Matrix::operator() member function. Each such access is a separate function call, which could cause a lot of overhead if an application needs frequent access to matrix elements. In most cases, though, we work with matrix objects whose elements are contiguous in memory, i.e. directly on an original matrix object, or on a non-permuted view of the matrix. In these typical use cases, we can use matrix iterators for more efficient element access.

Matrix iterators are designed by the following mechanism inside the definition of the Matrix class.

Listing 8: Definition of Matrix Iterator
class Matrix
{
  template<typename _Element>
  struct _DenseTraits
  {
    typedef ftl::PointerJumpingIterator<_Element> row_iterator;
    typedef ftl::PointerJumpingIterator<_Element> col_iterator;
    ...
    row_iterator row_begin(size_t i) const;
    row_iterator row_end(size_t i) const;
    col_iterator col_begin(size_t i) const;
    col_iterator col_end(size_t i) const;
  };

public:
  typedef _DenseTraits<double> iterator_traits;
  typedef _DenseTraits<const double> const_iterator_traits;
  typedef std::shared_ptr<iterator_traits> iterator_support;
  typedef std::shared_ptr<const_iterator_traits>
    const_iterator_support;

  ...

  iterator_support it_support();
  const_iterator_support it_support() const;
};

Essentially, an iterator support is a shared pointer to dense matrix traits that implement iterators by pointer arithmetic. When we obtain an iterator support from a non-permuted view or from a matrix directly, this shared pointer is not null. From that point on, we can simply use iterators to access matrix elements. The matrix iterator implementation consists of very light inline functions, which are more efficient than the Matrix::operator() member access function.

From a non-constant matrix we obtain an iterator_support that allows us to both read and write matrix elements. From a constant matrix, we obtain a const_iterator_support that supports read access only. The code example below demonstrates how to use iterators.

Listing 9: Use Matrix Iterators
Matrix m(5,5);

// Elements of m are contiguous in memory.
// Hence an iterator support is available.
Matrix::iterator_support its = m.it_support();

// its should not be null.
assert(its);

// Change the element values of the first row.
Matrix::iterator_traits::row_iterator
  it = its->row_begin(0);
Matrix::iterator_traits::row_iterator
  it_end = its->row_end(0);

for (; it != it_end; ++it)
  (*it) = 0.5;

// A constant matrix has read-only iterator support.
const Matrix& mr = m;
Matrix::const_iterator_support itcs = mr.it_support();
assert(itcs);

Matrix::const_iterator_traits::col_iterator
  itc = itcs->col_begin(0);
std::cout << *itc << std::endl; // read-only

// A permuted view doesn't have iterator support,
// because its elements are no longer contiguous in memory.
Matrix::Permutation p(5, 0, 1);
Matrix::View v = p * m;

// Now the iterator support is a null pointer.
its = v->it_support();
assert(its.get() == 0);

2 Multiplication, Inverse and Factorization

Table 1 lists the matrix multiplication, inverse and factorization methods supported in NPL. They are all member functions of the Matrix class. Each of them takes matrix references as output arguments; on function return, these arguments hold the result. The remark column in this table indicates whether a method wraps a third-party optimized Blas/Lapack implementation or is an in-house implementation. For the former, the corresponding Blas/Lapack routine name is given. For the in-house implementations, we give further details in separate subsections.

Table 1: Matrix Multiplication/Factorization Functions
Type Function name Remark
Multiplication Matrix::multiply() 3rd party, dgemm()
Inverse Matrix::inv() 3rd party, dsytrf(), dpotrf(), dgetrf()
Eigen Matrix::eigen() 3rd party, dsyev(), dgeev()
SVD Matrix::svd() 3rd party, dgesvd()
LU Matrix::lu() 3rd party, dgetrf()
QR Matrix::qr() 3rd party, dgeqrf()
Full QR Matrix::full_qr() In-house
Cholesky Matrix::cholesky() In-house
Reduced Cholesky Matrix::cholesky_reduced() In-house
Modified Cholesky Matrix::cholesky_modified() In-house
Partial Cholesky Matrix::cholesky_partial() In-house

2.1 Inverse

Let 𝐀 be the n×n matrix whose inverse 𝐀^{-1} we want to compute. The Matrix::inv() function takes an optional Matrix::Form parameter that helps choose which algorithm to use, based on the form of 𝐀. There are three special forms that are relevant in this context.

2.1.1 Singular Form

First of all, if 𝐀 is singular, 𝐀^{-1} doesn’t exist. If 𝐀 is nearly singular, computing 𝐀^{-1} by other methods could introduce too much numerical noise. In this situation, passing the “SINGULAR” matrix form parameter tells the function to use the SVD method underneath.

Let 𝐔𝐒𝐕^T be the SVD of 𝐀. Because 𝐔 and 𝐕 are unitary, we have

\mathbf{A}^{-1} = \mathbf{V}\,\mathbf{S}^{-1}\,\mathbf{U}^T

The elements in 𝐒 are non-negative and arranged in descending order. Let S_i be the last element of 𝐒 with condition number greater than the internal threshold 1.e-12. We truncate to the first i columns of 𝐕, the first i rows of 𝐔^T and the first i diagonal elements of 𝐒 to obtain sub matrices 𝐕_i, 𝐔_i^T and 𝐒_i respectively. Finally we compute

\mathbf{A}^{-1} = \mathbf{V}_i\,\mathbf{S}_i^{-1}\,\mathbf{U}_i^T

2.1.2 Symmetric Form

When 𝐀 is symmetric, we can set the Matrix::Form parameter to “SYMMETRIC”. This indicates that an 𝐋𝐃𝐋^T factorization should be used to compute

\mathbf{A} = \mathbf{L}\mathbf{D}\mathbf{L}^T
\mathbf{A}^{-1} = \mathbf{L}^{-T}\mathbf{D}^{-1}\mathbf{L}^{-1}

The 𝐋𝐃𝐋^T factorization has a floating point operation cost of roughly n^3/3.

2.1.3 Positive Definite

When 𝐀 is positive definite, we can set the Matrix::Form parameter to “POS_DEFINITE”. This indicates that a Cholesky factorization should be used, which also has a floating point operation cost of roughly n^3/3 but avoids the subsequent minor cost of n floating point divisions to compute 𝐃^{-1} in the symmetric case.

2.1.4 General Forms

This is the default case where 𝐀 is a general non-singular matrix. The inverse function uses 𝐋𝐔 factorization, which has a floating point operation cost of roughly 2n^3/3.
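
As a usage sketch, the snippet below passes the form hint to Matrix::inv(). The argument order and the exact Matrix::Form enumerator spelling are assumptions made for illustration only; please consult matrix.h for the actual signature.

Matrix a(4, 4);
// ... fill a with a symmetric positive definite matrix ...

Matrix inv_a;

// Hypothetical call: output matrix plus an optional form hint
// (enumerator spelling assumed); see matrix.h for the real signature.
a.inv(inv_a, Matrix::Form::POS_DEFINITE);

// Default general form, LU based.
a.inv(inv_a);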

2.2 Eigen

Given an n×n matrix 𝐀, its eigen decomposition is computed in two phases. The first phase requires on the order of 𝒪(n^3) floating point operations. The second phase generally also requires 𝒪(n^3) floating point operations to converge to machine precision. However, when 𝐀 is symmetric, the second phase can be computed much faster, even in the order of 𝒪(n^2) if only eigenvalues are required. Details of the eigen algorithms can be found in Part V of the book [1] by Lloyd N. Trefethen and David Bau. Our Matrix::eigen() function uses the Matrix::Form parameter to distinguish the symmetric case from the general case.

2.3 SVD

Given a general m×n matrix 𝐀, the svd() function computes a compact form of the singular value decomposition 𝐀 = 𝐔𝐒𝐕^T. Let r = min(m,n). 𝐔 is resized to m×r, 𝐒 is resized to r×1 and 𝐕 is resized to n×r. The singular values in 𝐒 are non-negative and arranged in descending order.
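
A hypothetical call sketch follows; the (𝐔, 𝐒, 𝐕) argument order of svd() is an assumption here, only the output shapes are stated above.

Matrix a(6, 4);
// ... fill a ...

Matrix u, s, v;

// Assumed argument order (u, s, v); see matrix.h for the actual
// signature. With m = 6, n = 4 and r = min(m, n) = 4, the compact
// form gives: u is 6 x 4, s is 4 x 1, v is 4 x 4.
a.svd(u, s, v);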

2.4 QR vs. Full QR

Given an m×n matrix 𝐀, let r = min(m,n). The QR decomposition computes a reduced-size m×r orthonormal basis matrix 𝐐 and an r×n upper triangular matrix 𝐑 such that 𝐐𝐑 = 𝐀. In this reduced version, when m > n, 𝐐 consists of only the first n orthonormal basis vectors in its columns. The full QR decomposition, on the other hand, computes a full m×m basis expansion matrix 𝐐 using the Householder reflector algorithm.

The reduced-size QR and full QR have the same behavior on matrix 𝐑. When m > n, 𝐑 is an n×n upper triangular matrix. When m < n, 𝐑 is an m×n upper trapezoidal matrix.
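
The sketch below contrasts the two output shapes on a tall matrix; the (𝐐, 𝐑) argument order of qr() and full_qr() is assumed for illustration and should be checked against matrix.h.

// A tall 5 x 3 matrix, so m > n.
Matrix a(5, 3);
// ... fill a ...

Matrix q, r;

// Assumed argument order (q, r); see matrix.h.
a.qr(q, r);        // reduced: q is 5 x 3, r is 3 x 3
a.full_qr(q, r);   // full:    q is 5 x 5, r is 3 x 3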

2.5 Variety of Cholesky

Our Cholesky factorization method is based on Algorithm 23.1 in the book [1] by Lloyd N. Trefethen and David Bau. In addition, we have implemented a few other Cholesky decomposition methods of different flavors. They are used internally in our numerical optimization algorithms and could be relevant to other practical problems.

2.5.1 Reduced Cholesky

Very often in practice, we face the problem of computing the Cholesky factor of 𝐖^T𝐇𝐖, where 𝐇 is an n×n matrix and 𝐖 is an n×m matrix with m < n. So 𝐖^T𝐇𝐖 has reduced dimension (and hence rank) compared to 𝐇.

If we know 𝐇 is positive definite, Matrix::cholesky_reduced() provides a more efficient factorization than first computing the matrix products and then performing the standard Cholesky decomposition.

This reduced Cholesky factorization method stops and returns false as soon as it finds that 𝐇 is actually indefinite. Note that when 𝐇 is indefinite, the reduced matrix 𝐖^T𝐇𝐖 could still be positive definite; it is just that Matrix::cholesky_reduced() only succeeds when 𝐇 is positive definite.
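
A hypothetical call sketch, assuming cholesky_reduced() takes 𝐖 and an output factor and returns the success flag described above; the argument order is an assumption and should be verified against matrix.h.

Matrix h(5, 5);   // assumed symmetric positive definite
Matrix w(5, 2);
// ... fill h and w ...

Matrix l;

// Hypothetical argument order (w, l); see matrix.h.
// Returns false as soon as h is found to be indefinite.
bool ok = h.cholesky_reduced(w, l);
if (!ok)
{
  // h is indefinite; fall back to another factorization.
}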

2.5.2 Modified Cholesky

When a matrix 𝐇 is indefinite, the Matrix::cholesky_modified() function finds a remedy diagonal matrix 𝐄 with minimized norm ‖𝐄‖ such that the modified matrix 𝐇¯ = 𝐇 + 𝐄 becomes positive definite. This function is based on the algorithm in the book by Gill, Murray and Wright [3], p. 108.

2.5.3 Partial Cholesky

For a general square matrix 𝐇, we can explore the positive definiteness structure of the matrix by pivoting as much as possible and then compute the Cholesky factor of its leading sub matrix. Matrix::cholesky_partial() is an implementation of the partial Cholesky algorithm originally proposed in the paper by A. Forsgren, P. E. Gill, and W. Murray [4].

Let 𝐏 be the permutation matrix introduced by the pivoting operations in the algorithm. The permuted matrix has a block structure and can be decomposed as

\mathbf{P}^T \mathbf{H} \mathbf{P} =
\begin{pmatrix} \mathbf{H}_{11} & \mathbf{H}_{12} \\ \mathbf{H}_{21} & \mathbf{H}_{22} \end{pmatrix} =
\begin{pmatrix} \mathbf{L}_{11} & \\ \mathbf{L}_{21} & \mathbf{I} \end{pmatrix}
\begin{pmatrix} \mathbf{D} & \\ & \mathbf{K} \end{pmatrix}
\begin{pmatrix} \mathbf{L}_{11}^T & \mathbf{L}_{21}^T \\ & \mathbf{I} \end{pmatrix}

where 𝐋_{11} is a unit lower triangular matrix and 𝐃 is a positive definite diagonal matrix, such that the leading sub matrix 𝐇_{11} has Cholesky factor

\mathbf{H}_{11} = \mathbf{L}_{11}\,\mathbf{D}\,\mathbf{L}_{11}^T

3 Linear System

Table 2 lists the available methods to solve a linear system 𝐀𝐱 = 𝐛, depending on both the (m,n) size and the matrix form of 𝐀. Please refer to the matrix inverse section 2.1 for details of the Matrix::Form parameter.

Once again, the remark column in the table indicates whether a method is an in-house implementation or wraps a third-party Lapack implementation. For the latter, the Lapack routine name is provided.

Table 2: Linear System Solvers
Form of 𝐀 Methodology Remark
m<n QR 3rd party, dgelsy()
𝐀 is singular or m>n SVD 3rd party, dgelsd()
𝐀 is symmetric 𝐋𝐃𝐋T 3rd party, dsysv()
𝐀 is positive definite Cholesky 3rd party, dposv()
𝐀 is a general matrix LU 3rd party, dgesv()
𝐀 is upper triangular backward substitution In-house
𝐀 is lower triangular forward substitution In-house
𝐀 is tridiagonal backward/forward elimination In-house
𝐀 is positive definite conjugate gradient update In-house
m<n and QR is known range space solver In-house
𝐀_i is the result of column pivoting 𝐀_{i-1} = 𝐋_{i-1}𝐔_{i-1} LU update In-house

General guidelines and details of using the appropriate methodology are given in the following subsections.

3.1 m<n

When m<n, the linear system is underdetermined and has an infinite number of solutions. The QR factorization method effectively produces a minimum norm solution 𝐱. We first compute 𝐀^T = 𝐐𝐑 = (𝐘 𝐙)𝐑, where 𝐘 is the n×m range space basis and 𝐙 is the n×(n-m) null space basis.

Since 𝐐 is an orthonormal basis, we can write the solution vector 𝐱 = 𝐘𝐱_Y + 𝐙𝐱_Z in terms of a range space component 𝐱_Y and a null space component 𝐱_Z. Clearly,

\mathbf{A}\mathbf{x} = \mathbf{A}\mathbf{Y}\mathbf{x}_Y = \mathbf{R}^T\mathbf{x}_Y

where 𝐱_Y can be found uniquely by forward substitution on the equation 𝐑^T𝐱_Y = 𝐛.
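
To see why the null space component drops out, recall that the columns of 𝐐 = (𝐘 𝐙) are orthonormal and only the first m rows of 𝐑 are nonzero; a short derivation in the notation above is

\mathbf{A}\mathbf{x}
  = \mathbf{R}^T \mathbf{Q}^T \left(\mathbf{Y}\mathbf{x}_Y + \mathbf{Z}\mathbf{x}_Z\right)
  = \mathbf{R}^T \begin{pmatrix} \mathbf{x}_Y \\ \mathbf{x}_Z \end{pmatrix}
  = \mathbf{R}^T \begin{pmatrix} \mathbf{x}_Y \\ \mathbf{0} \end{pmatrix}
  + \mathbf{R}^T \begin{pmatrix} \mathbf{0} \\ \mathbf{x}_Z \end{pmatrix}
  = \mathbf{R}^T \mathbf{x}_Y ,

where the second term vanishes because the trailing n-m columns of 𝐑^T are zero, and 𝐑^T𝐱_Y is understood as the leading m×m lower triangular block of 𝐑^T applied to 𝐱_Y.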

Different choices of 𝐱_Z produce different solutions 𝐱. In practice, we let 𝐱_Z = 0 to obtain a minimum norm solution. If the QR factorization is already known, we can directly use the range space solver LinearSystem::range_space_solve().

3.2 Singular 𝐀 or m>n

When m>n, the linear system is overdetermined and generally doesn’t have an exact solution. When 𝐀 is singular, the linear system either has no exact solution or has infinitely many solutions. The solver effectively computes the least squares solution 𝐱 using SVD.

3.3 Symmetric 𝐀

For any symmetric matrix 𝐀, there exists an 𝐋𝐃𝐋^T factorization where 𝐋 is a unit lower triangular matrix and 𝐃 is a diagonal matrix. Further, if 𝐀 is positive definite, all elements in 𝐃 are positive and 𝐀 then has a Cholesky factor.

3.4 Large Positive Definite 𝐀

When 𝐀 is a very large n×n positive definite matrix, computing the exact solution 𝐱 could be very time consuming. Instead, one might be satisfied with a “near” solution 𝐱¯ such that the residual 𝐫 = 𝐀𝐱¯ - 𝐛 is small enough.

In this situation, we can use the conjugate gradient (CG) update method to generate a sequence of (𝐱_k, 𝐩_k, 𝐫_k). For some initial guess 𝐱_0, we compute the residual vector 𝐫_0 = 𝐀𝐱_0 - 𝐛. The initial search direction is taken as 𝐩_0 = -𝐫_0. We pass (𝐱_k, 𝐩_k, 𝐫_k) as input at the k-th iteration. The CG method then updates 𝐱_{k+1} as an improved guess with smaller residual 𝐫_{k+1}, and also updates the search direction 𝐩_{k+1} to be taken at the next iteration. Users can stop the iterations when 𝐫_k is considered small enough. In the worst case, the CG method ensures the exact solution can be found after n iterations. Our implementation is based on [2], chapter 5.
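
For reference, the sketch below spells out the standard CG update formulas from [2] in plain C++ on std::vector storage. It is not the NPL interface, just an illustration of the (𝐱_k, 𝐩_k, 𝐫_k) update step and the stopping test described above.

#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<std::vector<double>>;   // dense n x n, row major

static Vec matvec(const Mat& A, const Vec& x)
{
    Vec y(x.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < A[i].size(); ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

static double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        s += a[i] * b[i];
    return s;
}

// Solve A x = b for symmetric positive definite A by conjugate
// gradient updates; stops when ||r_k|| <= tol or after n steps.
Vec conjugate_gradient(const Mat& A, const Vec& b, double tol = 1e-10)
{
    const std::size_t n = b.size();
    Vec x(n, 0.0);                 // initial guess x_0 = 0
    Vec r = matvec(A, x);          // r_0 = A x_0 - b
    for (std::size_t i = 0; i < n; ++i)
        r[i] -= b[i];
    Vec p(n);                      // p_0 = -r_0
    for (std::size_t i = 0; i < n; ++i)
        p[i] = -r[i];

    double rr = dot(r, r);
    for (std::size_t k = 0; k < n && std::sqrt(rr) > tol; ++k)
    {
        Vec Ap = matvec(A, p);
        double alpha = rr / dot(p, Ap);       // step length
        for (std::size_t i = 0; i < n; ++i)
        {
            x[i] += alpha * p[i];             // x_{k+1} = x_k + alpha p_k
            r[i] += alpha * Ap[i];            // r_{k+1} = r_k + alpha A p_k
        }
        double rr_new = dot(r, r);
        double beta = rr_new / rr;            // beta_{k+1}
        for (std::size_t i = 0; i < n; ++i)
            p[i] = -r[i] + beta * p[i];       // p_{k+1} = -r_{k+1} + beta p_k
        rr = rr_new;
    }
    return x;
}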

3.5 LU Update of 𝐀

Suppose initially we know an LU decomposition of matrix 𝐀_0 = 𝐋_0𝐔_0. Subsequently, matrix 𝐀_i is obtained by pivoting the j-th column of 𝐀_{i-1} with a vector 𝐩_i. In this situation, the LU decomposition of 𝐀_i can be computed efficiently by updating 𝐋_{i-1} and 𝐔_{i-1}. LinearSystem::LU provides a linear system solver for 𝐀_i𝐱 = 𝐛. Our implementation is based on [5].

References

  • [1] Lloyd N. Trefethen and David Bau, III: Numerical Linear Algebra, SIAM, 1997
  • [2] Jorge Nocedal and Stephen J. Wright: Numerical Optimization, Springer, 1999
  • [3] Philip E. Gill, Walter Murray and Margaret H. Wright: Practical Optimization, Academic Press, 1981
  • [4] Anders Forsgren, Philip E. Gill and Walter Murray: Computing Modified Newton Directions Using a Partial Cholesky Factorization, SIAM J. Sci. Comput., Vol. 16, No. 1, pp. 139-150
  • [5] J. J. Forrest and J. A. Tomlin: Updated Triangular Factors of the Basis to Maintain Sparsity in the Product Form Simplex Method, Mathematical Programming 2 (1972) 263-278