37 Linear and Cubic Splines

In the last section, we looked at how to fit a Lagrangian polynomial to a data set. However, a Lagrangian polynomial can quickly become a polynomial with a very high order, which is disadvantageous for a couple of reasons.

Let x represent the independent variable and y the dependent variable. One disadvantage is that a high-degree polynomial can oscillate strongly near the smallest and largest values of x [8]. This leads to large errors when predicting the corresponding y for an x near either end of the data.

Another disadvantage is that the coefficients of the polynomial are very sensitive to small changes in the data [8]. This means a slight difference in rounding or a small measurement error may produce a polynomial that is very different from the one we would have obtained otherwise. This too leads to inaccurate predictions.
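The oscillation near the ends can be seen in a small experiment. The sketch below is only an illustration under assumptions of my own: the test function 1/(1+25x^2), the 11 equally spaced points on [-1,1], and the helper names lagrange_eval and max_error are not taken from the previous section. It compares how far the interpolating polynomial strays from the function near the middle of the data and near the right end.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def runge(x):
    # Illustrative test function (an assumption, not from the text).
    return 1.0 / (1.0 + 25.0 * x * x)


# 11 equally spaced points on [-1, 1] give a degree-10 polynomial.
nodes = [-1.0 + 0.2 * k for k in range(11)]
values = [runge(x) for x in nodes]


def max_error(lo, hi, steps=200):
    """Largest |polynomial - function| on a uniform grid over [lo, hi]."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(abs(lagrange_eval(nodes, values, x) - runge(x)) for x in grid)


center_error = max_error(-0.2, 0.2)  # near the middle of the data
edge_error = max_error(0.8, 1.0)     # near the largest x value
print(center_error, edge_error)
```

For this particular function, the error near the end of the interval comes out much larger than the error near the middle, which is the behavior described above.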

Because of these disadvantages, we will look at two more modeling methods, linear splines and cubic splines, that allow us to model the given data and predict the corresponding y for an x that is between the smallest and largest values given in the data for the independent variable.

Linear Splines

Consider a data set of independent values x_1,x_2,...,x_n in increasing order and corresponding dependent values y_1, y_2,...,y_n. The method of linear splines is to find a line to connect each pair of adjacent points. That is, we find the equation of the line that connects (x_1,y_1) and (x_2,y_2), the line that connects (x_2,y_2) and (x_3,y_3), and so on until we find the line that connects (x_{n-1},y_{n-1}) and (x_n,y_n) [8]. Each of these lines is referred to as a spline, and each is used only on the interval between its two points. Let’s look at an example of how to do this. I completed this problem for Mathematical Models/Applications 2 [26].


Example 72
Find the linear splines for the following data set and predict the y-value when x=35.

    \[\begin{array}{c|ccccc} x&14&22&30&38&46\\ \hline y&320&490&540&500&480 \end{array}\]


Solution
A linear spline is of the form S(x)=a+bx. Since the first spline S_1(x) must pass through (14, 320) and (22, 490), we have the following system of equations.

    \begin{align*} 320&=a+14b\\ 490&=a+22b \end{align*}

Subtracting the top equation from the bottom one gives us

    \begin{align*} 170&=8b\\ b&=\frac{170}{8} \end{align*}

Substituting b into the first equation gives us

    \begin{align*} 320&=a+14\left(\frac{170}{8}\right)\\ 320&=a+\frac{2380}{8}\\ a&=\frac{45}{2} \end{align*}

Therefore,

    \[S_1(x)=\frac{45}{2}+\frac{170x}{8}\,\,\text{with domain}\,\, [14,22].\]

Following the same method, it can be found that

    \begin{align*} S_2(x)&=\frac{705}{2}+\frac{25}{4}x &&\text{with domain $[22,30]$}\\ S_3(x)&=690-5x &&\text{with domain $[30,38]$}\\ S_4(x)&=595-\frac{5}{2}x &&\text{with domain $[38,46].$}\\ \end{align*}
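As a shortcut, each spline can also be written directly in two-point form, which avoids solving a system of equations for every pair of points. The index i below is my own notation for the i-th pair of adjacent points; it is a sketch of the same method, not a different one.

    \[S_i(x)=y_i+\frac{y_{i+1}-y_i}{x_{i+1}-x_i}\,(x-x_i)\,\,\text{with domain}\,\,[x_i,x_{i+1}],\quad i=1,\dots,n-1\]

For the first pair of points, (14, 320) and (22, 490), this gives

    \begin{align*} S_1(x)&=320+\frac{490-320}{22-14}(x-14)\\ &=320+\frac{85}{4}(x-14)\\ &=\frac{45}{2}+\frac{85}{4}x \end{align*}

which agrees with the first spline found by elimination, since 170/8 = 85/4.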

Because 35\in [30,38], we will use the third linear spline to predict the y-value for x=35.

    \begin{align*} S_3(35)&=690-5(35)\\ &=515 \end{align*}
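The whole example can be checked with a short program. This is a minimal sketch, not the method used in the course: the function names linear_splines and predict are hypothetical, and the last three data points are inferred from the endpoints of S_2, S_3, and S_4 rather than read from the data table. Exact fractions are used so the coefficients can be compared directly with the ones above.

```python
from fractions import Fraction

# Data points: the first two are stated in the solution; the remaining three
# are inferred from the endpoints of S_2, S_3, and S_4 (an assumption).
points = [(14, 320), (22, 490), (30, 540), (38, 500), (46, 480)]


def linear_splines(pts):
    """Return (a, b, x_left, x_right) for each interval, with S(x) = a + b*x."""
    pieces = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        b = Fraction(y1 - y0, x1 - x0)  # slope of the line through both points
        a = y0 - b * x0                 # intercept from S(x0) = y0
        pieces.append((a, b, x0, x1))
    return pieces


def predict(pieces, x):
    """Evaluate the spline whose interval contains x."""
    for a, b, x_left, x_right in pieces:
        if x_left <= x <= x_right:
            return a + b * x
    raise ValueError("x lies outside the range of the data")


splines = linear_splines(points)
for a, b, x_left, x_right in splines:
    print(f"S(x) = {a} + ({b})x on [{x_left}, {x_right}]")
print(predict(splines, 35))  # 515
```

The predict function refuses values of x outside [14, 46] on purpose, because splines are meant for predicting between the smallest and largest x values in the data.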

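Linear splines are piecewise first-degree polynomials. They pass through every data point exactly, but the overall curve has corners where adjacent lines meet. We can also use piecewise third-degree polynomials, called cubic splines, which join smoothly and are sometimes more accurate, depending on the data.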

Cubic Splines

The method of cubic splines is very similar to that of linear splines. We want to find a curve for each pair of adjacent points by finding the equation of a third-degree polynomial, or cubic spline, that connects each pair [8]. Let’s go through the following problem that I completed for Mathematical Models/Applications 2 [26].


Example 73
Find the natural cubic splines for the data set consisting of the points (2, 2), (4, 8), and (7, 12).


Solution
Because we have three data points, we need to find two cubic splines, S_1(x)=a+bx+cx^2+dx^3 and S_2(x)=e+fx+gx^2+hx^3. Because we have eight unknown coefficients to solve for, we need a system of eight equations. Because the first spline connects (2, 2) and (4, 8), we have

(1)   \begin{align*} a+2b+4c+8d&=2 \\ a+4b+16c+64d &=8 \end{align*}

Similarly, because the second spline connects (4, 8) and (7, 12), we have

(2)   \begin{align*} e+4f+16g+64h&=8 \\ e+7f+49g+343h &=12 \end{align*}

These next equations involve derivatives (see Part I: Chapter 2). We know that the slope at (4, 8) is the same for both splines. That is, S_1'(4)=S_2'(4). Because S_1'(x)=b+2cx+3dx^2 and S_2'(x)=f+2gx+3hx^2, we have

    \[b+2(4)c+3(16)d=f+2(4)g+3(16)h\]

which is equivalent to

(3)   \begin{align*} b+8c+48d-f-8g-48h&=0. \end{align*}

Furthermore, a cubic spline also requires the second derivatives to agree at the interior point. That is, S_1''(4)=S_2''(4). This is a separate condition; it does not follow from the first derivatives being equal. Because S_1''(x)=2c+6dx and S_2''(x)=2g+6hx, we have

    \[2c+6(4)d=2g+6(4)h\]

which is equivalent to

(4)   \begin{align*} 2c+24d-2g-24h=0. \end{align*}

Finally, because we are asked to find the natural cubic spline, the second derivatives at the two endpoints of the data, S_1''(2) and S_2''(7), are equal to zero [8]. This gives us

(5)   \begin{align*} 2c+12d&=0 \\ 2g+42h&=0 \end{align*}
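As a check on the bookkeeping, the same count of equations works for any number of data points. Suppose there are n data points (n is my notation here, not part of the example). Then there are n-1 cubic splines with 4(n-1) unknown coefficients, and the four kinds of conditions used above supply exactly that many equations:

    \begin{align*} \underbrace{2(n-1)}_{\text{pass through the points}}+\underbrace{(n-2)}_{\text{first derivatives}}+\underbrace{(n-2)}_{\text{second derivatives}}+\underbrace{2}_{\text{natural conditions}}&=4(n-1) \end{align*}

With n=3, this gives 4+1+1+2=8 equations for the eight unknowns a through h.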

We can represent this system of eight equations as a single matrix equation in the following way.

    \[\left[ \begin{array}{@{}*{8}{r}@{}} 1&2&4&8&0&0&0&0\\ 1&4&16&64&0&0&0&0\\ 0&0&0&0&1&4&16&64\\ 0&0&0&0&1&7&49&343\\ 0&1&8&48&0&-1&-8&-48\\ 0&0&2&24&0&0&-2&-24\\ 0&0&2&12&0&0&0&0\\ 0&0&0&0&0&0&2&42\\ \end{array} \right]\left[ \begin{array}{@{}*{7}{r}@{}} a\\ b\\ c\\ d\\ e\\ f\\ g\\ h\\ \end{array} \right]=\left[\begin{array}{@{}*{7}{r}@{}} 2\\ 8\\ 8\\ 12\\ 0\\ 0\\ 0\\ 0\\ \end{array} \right]\]

Using Excel, we can find the inverse of the first matrix, which (rounded to two decimal places) is

    \[\left[ \begin{array}{@{}*{8}{r}@{}} 2&-1&0&0&0&0&4&0\\ -0.3&0.3&-0.13&0.13&0.4&0.4&-3.8&-0.2\\ -0.15&0.15&0.1&-0.1&-0.3&-0.3&1.1&0.15\\ 0.03&-0.03&-0.02&0.02&0.05&0.05&-0.1&-0.03\\ 4.67&-4.67&-0.78&1.78&9.33&-6.22&-3.11&-6.22\\ -2.3&2.3&1.2&-1.2&-4.6&3.07&1.53&4.47\\ 0.35&-0.35&-0.23&0.23&0.7&-0.47&-0.23&-1.02\\ -0.02&0.02&0.01&-0.01&-0.03&0.02&0.01&0.07\\ \end{array} \right]\]

Multiplying both sides of the equation on the left by this inverse matrix gives us

    \[a=-4; b=2.33; c=0.5; d=-0.08\]

and

    \[e=-12.89; f=9; g=-1.17; h=0.06\]

which gives us the following cubic splines:

    \begin{align*} S_1(x)&=-4+2.33x+0.5x^2-0.08x^3\\ S_2(x)&=-12.89+9x-1.17x^2+0.06x^3. \end{align*}
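The system can also be solved without computing an inverse. The following is a minimal sketch that swaps in Gauss-Jordan elimination with exact fractions in place of the Excel inverse; the function name solve is hypothetical, and the matrix and right-hand side are copied from equations (1) through (5) with the unknowns in the order a, b, c, d, e, f, g, h.

```python
from fractions import Fraction

# Coefficient matrix and right-hand side from equations (1)-(5);
# unknowns are ordered a, b, c, d, e, f, g, h.
A = [
    [1, 2, 4, 8, 0, 0, 0, 0],
    [1, 4, 16, 64, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 4, 16, 64],
    [0, 0, 0, 0, 1, 7, 49, 343],
    [0, 1, 8, 48, 0, -1, -8, -48],
    [0, 0, 2, 24, 0, 0, -2, -24],
    [0, 0, 2, 12, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 2, 42],
]
rhs = [2, 8, 8, 12, 0, 0, 0, 0]


def solve(matrix, vector):
    """Solve matrix * x = vector exactly by Gauss-Jordan elimination."""
    n = len(vector)
    # Build the augmented matrix with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, vector)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry here.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


a, b, c, d, e, f, g, h = solve(A, rhs)
print(a, b, c, d)  # -4 7/3 1/2 -1/12
print(e, f, g, h)  # -116/9 9 -7/6 1/18
```

Rounded to two decimal places, these exact values agree with the coefficients found from the inverse matrix. The fractions are worth keeping when the splines are evaluated: with the rounded coefficients, S_1(4) comes out to about 8.2 instead of 8.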

We have now seen a few different ways that we can model data. We can find one polynomial that passes through all of the data points, such as a Lagrangian polynomial, which works as long as we only have a few data points. If we have many data points, a more accurate model can be derived by using splines.

Our next topic is how we can model probabilistic behavior and predict long-term behavior using Markov chains.

License

Portfolio for Bachelor of Science in Mathematics Copyright © by Abigail E. Huettig. All Rights Reserved.
