Data and Model Fitting
Mathwrist Code Example Series

Copyright ©Mathwrist LLC 2023
(October 5, 2023)
Abstract

This document gives some code examples on how to use the data and model fitting features in Mathwrist’s C++ Numerical Programming Library (NPL).

1 Nonlinear Least Square

Assme a nonlinear model function f(t)=aebt with model parameters a=0.02 and b=0.6. We have observed (ti,yi),i=0,99 where yi=f(ti)+ϵ for some random noise ϵ. The following example illustrates using modified Gauss-Newton method to recover model parameters from the observation data.

1 // Model parameters used to populate some observation data.
2 constexpr double a = 0.02;
3 constexpr double b = 0.6;
5 // A vector valued function to compute residuals.
6 class Residual : public VtrValueFunctionND
7 {
8   std::vector<double> m_y;
9   std::vector<double> m_t;
11 public:
12   Residual()
13   {
14     for (double t = 0; t < 1; t += 0.01)
15     {
16       m_t.push_back(t);
18       // Generate some data noise
19       double noise = std::rand() * 0.001 / RAND_MAX;
21       // Populate observation data with noise.
22       m_y.push_back(a * std::exp(b * t) + noise);
23     }
24   }
26   void eval(const Matrix& x, Matrix* _v=0, Matrix* _J=0) const
27   {
28     size_t n = m_y.size();
30     // Model parameters recovered during calibration.
31     double a = x(0, 0);
32     double b = x(1, 0);
34     // Residuals
35     if (_v != 0)
36     {
37       assert(_v->row_size() == n);
39       Matrix::iterator_support its = _v->it_support();
40       assert(its.get() != 0);
42       Matrix::iterator_traits::col_iterator it =
43         its->col_begin(0);
45       Matrix::iterator_traits::col_iterator it_end =
46         its->col_end(0);
48       for (size_t i = 0; it != it_end; ++i, ++it)
49       {
50         *it = a * std::exp(b * m_t[i]) - m_y[i];
51       }
52     }
54     // Jacobian
55     if (_J != 0)
56     {
57       assert(_J->row_size() == n);
58       assert(_J->col_size() == 2);
60       Matrix::iterator_support its = _J->it_support();
61       assert(its.get() != 0);
63       Matrix::iterator_traits::col_iterator it1 =
64         its->col_begin(0);
66       Matrix::iterator_traits::col_iterator it2 =
67         its->col_begin(1);
69       for (size_t i = 0; i < n; ++i)
70       {
71         double temp = std::exp(b * m_t[i]);
72         *it1++ = temp;
73         *it2++ = a * temp * m_t[i];
74       }
75     }
76   }
78   size_t vector_size() const { return m_y.size(); }
79   size_t dimension_size() const { return 2; }
80 };
82 Residual residual;
84 NLS_GaussNewton newton;
86 // Initial guess x = 0
87 Matrix x(2, 1);
89 newton(residual, x);
90 assert(std::abs(x(0, 0) - a) < 0.001);
91 assert(std::abs(x(1, 0) - b) < 0.01);

Note that the calibrated model parameters at line 90 and 91 are not same as the true parameters because we added some random noise to the observation data.

2 Model Fitting

This example is an application of smooth curve fitting in finance industry. In finance, it is a common practice to build a smooth interest rate curve that is a 1-dimensional function wrt time, i.e. r(t). The discount factor f(t) at time t is computed from a nonlinear function f(t)=e-r(t)t. In this toy example, we assume N discount factors are observed from the financial market at a list of time points {ti,fi},i=0,,N-1, i.e. fi=f(ti)+ϵ for some data noise ϵ around 1.e-3.

The model fitting exercise here will use a cubic B-spline curve to represent the interest rate curve, r(t;θ)=ϕ(t)Tθ, where ϕ(t) is a vector of B-spline basis and θ is a vector of basis coefficients. Assume we choose B-spline knot points to be [0,0.25,1,5,10,30] in time horizon. From the observed data {ti,fi}, we want to recover the model parameter θ hence obtain a smooth interest curve.

We choose to build the curve r(t;θ) by minimizing the curve roughness subject to discount factor fitting errors being within 1.e-3 tolerance. In addition, we impose additional curve constraints to require the interest rates being positive at observed time points, i.e. r(ti;θ)0,i=0,,N-1.

1 // time in unit of years
2 const double t[12] =
3 {
4    0.083333333, 0.166666667, 0.25, 0.5, 1,
5    2, 3, 5, 7, 10,
6    20, 30
7 };
9 // a smooth curve of interest rates
10 const double r[12] =
11 {
12    0.0001, 0.0002, 0.0003, 0.0004, 0.0005,
13    0.0016, 0.0031, 0.0076, 0.0116, 0.0147,
14    0.0208, 0.0215
15 };
17 // The observed discount factors: exp(-rt) + some random noise
18 const double df[12] =
19 {
20    0.999837189, 0.999659956, 0.999875655, 0.999845438, 0.999600265,
21    0.996395796, 0.990311641, 0.962516537, 0.921680833, 0.863478084,
22    0.660144366, 0.524662124
23 };
25 // A residual function for these 12 observed time and
26 // discount factors.
27 class Residual : public VtrValueFunctionND
28 {
29    const double* m_df;
30    const double* m_t;
31    mutable BSplineCurve m_spline;
33 public:
34    Residual(
35       const double* _df,
36       const double* _t,
37       const BSplineCurve& sp) : m_df(_df), m_t(_t), m_spline(sp)
38    {
39    }
41    void eval(const Matrix& x, Matrix* _v = 0, Matrix* _J = 0) const
42    {
43       // x is the vector of spline basis coefficients to calibrate.
44       ftl::Assert(x.row_size() == this->dimension_size());
45       m_spline.set_basis_coeff(x);
47       if (_v != 0)
48       {
49          ftl::Assert(_v->row_size() == 12);
51          for (size_t i = 0; i < 12; ++i)
52          {
53             // interest rate from the B-spline curve
54             double r = m_spline.eval(m_t[i]);
56             // Fitting error
57             (*_v)(i,0) = std::exp(-r * m_t[i]) - m_df[i];
58          }
59       }
61       // Jacobian by finite difference.
62       if (_J != 0)
63       {
64          ftl::Assert(_J->row_size() == 12);
65          ftl::Assert(_J->col_size() == x.row_size());
66          this->jacobian_fwd_diff(x, *_J, _v);
67       }
68    }
70    size_t vector_size() const { return 12; }
71    size_t dimension_size() const { return m_spline.basis_coeff().size(); }
72 };
74 // B-spline knot points.
75 double t_knots[] = { 0, 0.25, 1, 5, 10, 30 };
77 // Create a cubic spline curve fitting solver with 6
78 // knot points. Also set curve value constraints to
79 // be positive at all 12 time data points.
80 SmoothSplineCurve smooth_spline(t_knots, 6, 3, 12);
81 {
82    for (size_t i = 0; i < 12; ++i)
83    {
84       SmoothCurveFitting::Constraint c;
85       c.x = t[i];
86       c.type = SmoothCurveFitting::Constraint::VALUE;
87       c.value = Bound::lower(0);
88       smooth_spline.add_interior_constr(c);
89    }
90 }
92 // Set initial guess to a flat rate.
93 size_t n_params = smooth_spline.n_parameters();
94 Matrix beta(n_params, 1);
95 beta = 0.01;
96 smooth_spline.set_parameters(beta);
98 // Create the residual function by passing the time,
99 // discount factor and underlying spline curve.
100 Residual residual(df, t, smooth_spline.spline());
102 // Tolerance of fitting to discount factors.
103 Matrix tolerance(12, 1);
104 tolerance = 1.e-3;
106 smooth_spline(residual, tolerance, 100, true, 1.e-8, 0.01);
108 // This spline curve object can be directly plotted in GDE.
109 const BSplineCurve& spline = smooth_spline.spline();
111 // Check fitting errors.
112 for (size_t i = 0; i < 12; ++i)
113 {
114    double rate = smooth_spline.curve().eval(t[i]);
115    double disc = std::exp(-rate * t[i]);
117    ftl::Assert(std::abs(rate - r[i]) < 1.e-3);
118    ftl::Assert(std::abs(disc - df[i]) < 1.001e-3);
119 }

Note that at line 104, we set the discount factor fitting error tolerance to be 1.e-3, same magnitude as the noise that we artificially added to observation data. Setting the tolerance too small will be fitting to the noise. In practice, financial professionals have good knowledge on how tight market data should be fit.

After calibration, we can retrieve the smooth spline curve at line 109. This curve object is recogonized by GDE and can be plotted as illustrated in Figure 1.

Smooth Interest Rate Curve
Figure 1: Smooth Interest Rate Curve