Fitting Models to Data Graphically
Published:
Continuing with the series of posts on numerical analysis, in this post, I am going to discuss Curve Fitting. One may wonder what the difference between interpolation and curve fitting is. To illustrate, one may consider interpolation is a method of constructing a curve to go through all the data points, while curve fitting is a method of constructing a curve that seems to best represent the data points which may not necessarily go through all the data points.
Curve Fitting
For some cases, interpolation may not be the best method to represent the data. This is because the data may contain noise or errors, and interpolating through all the data points may not be the best representation of the underlying trend. In such cases, curve fitting can be used to find a function that best fits the data.
Then one may propose a question: How do we know a function fits the data well? One way to determine this is by calculating the residuals, which are the differences between the observed values and the values predicted by the model. The goal is to minimize these residuals by adjusting the parameters of the model. We have different criteria to determine the best fit:
- Chebyshev Approximation Criterion: This criterion minimizes the maximum absolute error between the data points and the model.
- Least Squares Criterion: This criterion minimizes the sum of the squares of the residuals.
Moreover, suppose we want to use a polynomial to fit the data. In that case, we can use the method of least squares to find the coefficients of the polynomial that minimize the sum of the squares of the residuals. This method involves solving a system of linear equations to find the coefficients of the polynomial.
Least Squares Method
The least squares method is a popular technique for fitting a model to data. It involves minimizing the sum of the squares of the residuals between the observed values and the values predicted by the model. The least squares method can be used to fit a wide range of models, including linear models, polynomial models, exponential models, and more.
Suppose we have a set of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, and we want to fit a polynomial of degree $m$ to the data. The polynomial can be written as:
\[f(x)=a_0+a_1 x+a_2 x^2+\ldots+a_m x^m\]The goal is to find the coefficients $a_0, a_1, \ldots, a_m$ that minimize the sum of the squares of the residuals. This can be done by solving a system of linear equations, which can be written in matrix form as: \(\mathbf{A} \mathbf{a}=\mathbf{b}\) where:
- $\mathbf{A}$ is an $n \times (m+1)$ matrix whose $i$-th row is $(1, x_i, x_i^2, \ldots, x_i^m)$,
- $\mathbf{a}$ is a vector of coefficients $(a_0, a_1, \ldots, a_m)$,
- $\mathbf{b}$ is a vector of observed values $(y_1, y_2, \ldots, y_n)$.
- The solution to this system of equations gives the coefficients of the polynomial that best fits the data.
Let us consider the following example to illustrate the least squares method. Suppose we have the following data points:
| $x$ | $y$ |
|---|---|
| 1 | 1 |
| 2 | 3.9 |
| 3 | 9.1 |
| 4 | 15.8 |
We want to fit a quadratic polynomial of the form $f(x)=a_0+a_1 x+a_2 x^2$ to the data. We can use the least squares method to find the coefficients $a_0, a_1, a_2$ that minimize the sum of the squares of the residuals. To find the coefficients, we need to solve the following system of equations:
\[\begin{aligned} a_0+1 \cdot a_1+1 \cdot a_2 &=1 \\ a_0+2 \cdot a_1+2^2 \cdot a_2 &=3.9 \\ a_0+3 \cdot a_1+3^2 \cdot a_2 &=9.1 \\ a_0+4 \cdot a_1+4^2 \cdot a_2 &=15.8 \end{aligned}\]Solving this system of equations gives the coefficients $a_0, a_1, a_2$ that best fit the data. Then we may have that \(a_0=0.1, \quad a_1=0.9, \quad a_2=2.1\)
My Project
In this project, I first presented the idea of fitting models to data graphically. I then implemented the curve fitting method using Python with the help of the numpy and matplotlib libraries. In numpy, we use the polyfit function to fit a polynomial to the data, and in matplotlib, we use the plot function to visualize the data and the fitted model.
To showcase the results of my project, I have created an interactive visualization using HTML. You can view the visualization below:
Comments