Given numeric data points, find an equation that approximates the data
with a least square fit. This is one of many techniques for getting
an analytical approximation to numeric data.
The problem is stated as follows:
Given measured data for values of Y based on values of X1, X2 and X3, e.g.
  Y_actual    X1     X2     X3    observation, i
  --------   -----  -----  -----  --------------
    32.5      1.0    2.5    3.7         1
     7.2      2.0    2.5    3.6         2
     6.9      3.0    2.7    3.5         3
    22.4      2.2    2.1    3.1         4
    10.4      1.5    2.0    2.6         5
    11.3      1.6    2.0    3.1         6
Find a, b and c such that Y_approximate = a * X1 + b * X2 + c * X3
and such that the sum of (Y_actual - Y_approximate) squared is minimized.
(We are minimizing RMS error.)
The method for determining the coefficients a, b and c follows directly
from the problem definition and the mathematical analysis given below.
Set up and solve the system of linear equations:
(Each SUM is for i=1 thru 6 per table above, note symmetry)
| SUM(X1*X1) SUM(X1*X2) SUM(X1*X3) | | a | | SUM(X1*Y) |
| SUM(X2*X1) SUM(X2*X2) SUM(X2*X3) | x | b | = | SUM(X2*Y) |
| SUM(X3*X1) SUM(X3*X2) SUM(X3*X3) | | c | | SUM(X3*Y) |
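As a sketch of how these sums might be computed directly from the table
above (the array names, the use of main(), and the printout are
illustrative, not taken from the linked course code):

  /* build the 3x3 normal equations from the 6 observations in the table above */
  #include <stdio.h>

  int main(void)
  {
    double Y[6]    = {32.5, 7.2, 6.9, 22.4, 10.4, 11.3};
    double X[3][6] = {{1.0, 2.0, 3.0, 2.2, 1.5, 1.6},    /* X1 */
                      {2.5, 2.5, 2.7, 2.1, 2.0, 2.0},    /* X2 */
                      {3.7, 3.6, 3.5, 3.1, 2.6, 3.1}};   /* X3 */
    double A[3][3] = {{0.0}};   /* A[i][j] = SUM(Xi*Xj), note symmetry */
    double B[3]    = {0.0};     /* B[i]    = SUM(Xi*Y)                 */
    int i, j, k;

    for(k=0; k<6; k++)          /* accumulate the sums over the 6 observations */
      for(i=0; i<3; i++)
      {
        for(j=0; j<3; j++) A[i][j] += X[i][k]*X[j][k];
        B[i] += X[i][k]*Y[k];
      }

    for(i=0; i<3; i++)          /* print the system; any simultaneous equation */
      printf("%10.4f %10.4f %10.4f | %10.4f\n",   /* solver then gives a, b, c */
             A[i][0], A[i][1], A[i][2], B[i]);
    return 0;
  }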
Easy to program, not good data:
lsfit_lect.c
lsfit_lect_c.out
lsfit_lect.java
lsfit_lect_java.out
lsfit_lect.py
lsfit_lect_py.out
lsfit_lect.f90
lsfit_lect_f90.out
lsfit_lect.m similar to c
lsfit_lect_m.out
test_lsfit.py3 Python3 two methods
test_lsfit_py3.out output
test_lsfit2.py3 Python3 two methods
test_lsfit2_py3.out output better
Now, suppose you wanted a constant term to make the fit:
Y_approximate = Y0 + a * X1 + b * X2 + c * X3
Then the linear equations would be:
| SUM( 1* 1) SUM( 1*X1) SUM( 1*X2) SUM( 1*X3) | | Y0 | | SUM( 1*Y) |
| SUM(X1* 1) SUM(X1*X1) SUM(X1*X2) SUM(X1*X3) | | a | | SUM(X1*Y) |
| SUM(X2* 1) SUM(X2*X1) SUM(X2*X2) SUM(X2*X3) | x | b | = | SUM(X2*Y) |
| SUM(X3* 1) SUM(X3*X1) SUM(X3*X2) SUM(X3*X3) | | c | | SUM(X3*Y) |
Note the symmetry! Easy to program.
Note the simultaneous equations, from Lecture 3: |A| x |X| = |Y|
|A| and |Y| easily computable, solve for |X| to get Y0, a, b and c
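The linked programs use a simeq routine to solve such systems. As a
minimal sketch of what such a solver does, here is Gaussian elimination
with partial pivoting for this 4 unknown case (the function name and the
fixed 4-column array are my own, not the course simeq interface; there is
no singular matrix check):

  /* solve A * x = y for x, overwriting A and y; n is the number of unknowns */
  #include <math.h>

  void solve(int n, double A[][4], double y[], double x[])
  {
    int i, j, k, p;
    double t;
    for(k=0; k<n; k++)
    {
      p = k;                                     /* find largest pivot below k */
      for(i=k+1; i<n; i++) if(fabs(A[i][k]) > fabs(A[p][k])) p = i;
      for(j=k; j<n; j++) { t = A[k][j]; A[k][j] = A[p][j]; A[p][j] = t; }
      t = y[k]; y[k] = y[p]; y[p] = t;
      for(i=k+1; i<n; i++)                       /* eliminate below the pivot  */
      {
        t = A[i][k]/A[k][k];
        for(j=k; j<n; j++) A[i][j] -= t*A[k][j];
        y[i] -= t*y[k];
      }
    }
    for(i=n-1; i>=0; i--)                        /* back substitution          */
    {
      t = y[i];
      for(j=i+1; j<n; j++) t -= A[i][j]*x[j];
      x[i] = t/A[i][i];
    }
  }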
We now have a simple equation to compute Y approximately from a reasonable
range of X1, X2, and X3.
Y is called the dependent variable and X1 .. Xn the independent variables.
The procedures below implement a few special cases and the general case.
The number of independent variables can vary, e.g. 2D, 3D, etc.
The approximation equation may use powers of the independent variables.
The user may create additional independent variables, e.g. X2 = SIN(X1),
with the restriction that the independent variables are linearly
independent, i.e. no Xi is equal to p*Xj + q for any i, j and constants p, q.
Mathematical derivation
The mathematical derivation of the least square fit is as follows:
Given data for the dependent variable Y in terms of the independent
variables S, T, U and V, consider that there exists a function F
such that Y = F(S,T,U,V)
The problem is to find coefficients a,b,c and d such that
Y_approximate = a * S + b * T + c * U + d * V
and such that the sum of ( Y - Y_approximate ) squared is minimized.
Note: a, b, c, d are scalars. S, T, U, V, Y, Y_approximate are vectors.
To find the minimum of SUM((Y - Y_approximate)^2)
the derivatives must be taken with respect to a,b,c and d and
all must equal zero simultaneously. The steps follow :
SUM((Y - Y_approximate)^2) = SUM((Y - a*S - b*T - c*U - d*V)^2)
d/da = -2 * S * SUM( Y - a*S - b*T - c*U - d*V )
d/db = -2 * T * SUM( Y - a*S - b*T - c*U - d*V )
d/dc = -2 * U * SUM( Y - a*S - b*T - c*U - d*V )
d/dd = -2 * V * SUM( Y - a*S - b*T - c*U - d*V )
Setting each of the above equal to zero (the derivative is zero at the
minimum), dividing out the -2, moving the independent variable inside
the summation, and moving the Y term to the right side gives
  SUM( a*S*S + b*S*T + c*S*U + d*S*V ) = SUM( S*Y )
  SUM( a*T*S + b*T*T + c*T*U + d*T*V ) = SUM( T*Y )
  SUM( a*U*S + b*U*T + c*U*U + d*U*V ) = SUM( U*Y )
  SUM( a*V*S + b*V*T + c*V*U + d*V*V ) = SUM( V*Y )
Distributing the SUM over the terms and moving the coefficients
a, b, c, d outside each sum yields
a * SUM(S*S) + b * SUM(S*T) + c * SUM(S*U) + d * SUM(S*V) = SUM(S*Y)
a * SUM(T*S) + b * SUM(T*T) + c * SUM(T*U) + d * SUM(T*V) = SUM(T*Y)
a * SUM(U*S) + b * SUM(U*T) + c * SUM(U*U) + d * SUM(U*V) = SUM(U*Y)
a * SUM(V*S) + b * SUM(V*T) + c * SUM(V*U) + d * SUM(V*V) = SUM(V*Y)
To find the coefficients a,b,c and d solve the linear system of equations
| SUM(S*S) SUM(S*T) SUM(S*U) SUM(S*V) | | a | | SUM(S*Y) |
| SUM(T*S) SUM(T*T) SUM(T*U) SUM(T*V) | x | b | = | SUM(T*Y) |
| SUM(U*S) SUM(U*T) SUM(U*U) SUM(U*V) | | c | | SUM(U*Y) |
| SUM(V*S) SUM(V*T) SUM(V*U) SUM(V*V) | | d | | SUM(V*Y) |
Some observations :
S,T,U and V must be linearly independent.
There must be at least as many data sets (Y, S, T, U, V) as coefficients
to be fit, and usually many more.
The analysis did not depend on the number of independent variables.
A polynomial fit results from the substitutions S=1, T=x, U=x^2, V=x^3.
The general case, any order polynomial in any number of variables,
results from substitutions such as S=1, T=x, U=y, V=x^2, W=x*y, X=y^2,
etc., and may also include terms such as exp(x), log(x), sin(x), cos(x).
Any number of terms may be used. The "1" is for the constant term.
Using the S,T,U,V notation above, fitting Y_approx to find a, b, c, d
Y_approx = a*S + b*T + c*U + d*V + ...
Choose S = 1.0 thus a is the constant term
Choose T = log(x) thus b is the coefficient of log(x)
Choose U = log(x*x) thus c is the coefficient of log(x*x)
Choose V = sin(x) thus d is the coefficient of sin(x)
see 4 additional terms in lsfit_log.c
Thus our x data, n = 21 samples in code, is fit to
Y_approx = a + b*log(x) + c*log(x*x) + d*sin(x) + ...
By putting the terms in a vector, simple indexing builds the matrix:
A[i][j] = A[i][j] + term_i * term_j, summing over the n samples using index k
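A sketch of that indexing, using the four terms named above (the function
and array names here are illustrative, not the code in lsfit_log.c; this
shows only the accumulation step, before handing A and Y to a
simultaneous equation solver):

  #include <math.h>

  #define NTERM 4

  /* evaluate all terms for one sample, then accumulate A and Y by indexing */
  void accumulate(int n, double xd[], double yd[],
                  double A[NTERM][NTERM], double Y[NTERM])
  {
    double term[NTERM];
    int i, j, k;
    for(i=0; i<NTERM; i++) { Y[i]=0.0; for(j=0; j<NTERM; j++) A[i][j]=0.0; }
    for(k=0; k<n; k++)                     /* sum over the n data samples      */
    {
      term[0] = 1.0;                       /* constant term, coefficient a     */
      term[1] = log(xd[k]);                /* coefficient b                    */
      term[2] = log(xd[k]*xd[k]);          /* coefficient c                    */
      term[3] = sin(xd[k]);                /* coefficient d                    */
      for(i=0; i<NTERM; i++)
      {
        for(j=0; j<NTERM; j++) A[i][j] += term[i]*term[j];
        Y[i] += yd[k]*term[i];
      }
    }
  }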
lsfit_log.c fitting log and other terms
lsfit_log_c.out
fitting a simple polynomial, 1D
Now, suppose you wanted to fit a simple polynomial:
Given the value of Y for at least four values of X,
Y_approximate = C0 + C1 * X + C2 * X^2 + C3 * X^3
Then the linear equations would be A*X=Y:
| SUM(1 *1) SUM(1 *X) SUM(1 *X^2) SUM(1 *X^3) | | C0 | | SUM(1 *Y) |
| SUM(X *1) SUM(X *X) SUM(X *X^2) SUM(X *X^3) | | C1 | | SUM(X *Y) |
| SUM(X^2*1) SUM(X^2*X) SUM(X^2*X^2) SUM(X^2*X^3) |x| C2 | = | SUM(X^2*Y) |
| SUM(X^3*1) SUM(X^3*X) SUM(X^3*X^2) SUM(X^3*X^3) | | C3 | | SUM(X^3*Y) |
Note that the (i,j) subscript in the A matrix has SUM(X^(i)*X^(j))
for i=0..3, j=0..3
In C, to build A matrix and Y vector for solving simultaneous equations:
  /* sample polynomial least square fit, nth power, m values of xd[] and yd[] */
  /* assumes A[n+1][n+1], Y[n+1], pwr[n+1], xd[m], yd[m] are declared         */
  for(i=0; i<n+1; i++)            /* zero the matrix and right hand side */
  {
    for(j=0; j<n+1; j++)
    {
      A[i][j] = 0.0;
    }
    Y[i] = 0.0;
  }
  for(k=0; k<m; k++)              /* one pass per data point */
  {
    y = yd[k];
    x = xd[k];
    pwr[0] = 1.0;                 /* powers x^0 .. x^n of this data point */
    for(i=1; i<=n; i++) pwr[i] = pwr[i-1]*x;
    for(i=0; i<n+1; i++)
    {
      for(j=0; j<n+1; j++)
      {
        A[i][j] = A[i][j] + pwr[i]*pwr[j]; // SUM
      }
      Y[i] = Y[i] + y*pwr[i];
    }
  }
Solve the simultaneous equations A*X=Y for X[0]=C0, X[1]=C1, X[2]=C2, X[3]=C3
Note that the sum is taken over all observations and the "1" is
just shown to emphasize the symmetry.
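Once the C coefficients are known, the fitted polynomial can be evaluated
at any x. A minimal sketch in the style of the polyval/peval routines
linked below (the function name here is illustrative):

  /* evaluate C[0] + C[1]*x + ... + C[n]*x^n at one point using Horner's rule */
  double peval(int n, double C[], double x)
  {
    double y = C[n];
    int i;
    for(i=n-1; i>=0; i--) y = y*x + C[i];
    return y;
  }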
Sample code in various languages:
least_square_fit.c really old
least_square_fit_c.out
least_square.py3
least_square_py3.out
test_lsfit.py3 Python3 two methods
test_lsfit_py3.out output
lsfit.py3 just 1D version for copying
peval.py3 just 1D version for copying
least_square_fit.f90
least_square_fit_f90.out
Least_square.rb Ruby class new
test_least_square.rb test
test_least_square_rb.out test output
least_square_fit.java
least_square_fit_java.out
least_square_fit_3d.java
least_square_fit_3d_java.out
least_square_fit_4d.java
least_square_fit_4d_java.out
A specialized version for use later with PDE's
lsfit.java
test_lsfit.java
test_lsfit_java.out
test_lsfit2.java
test_lsfit2_java.out
test_lsfit3.java
test_lsfit3_java.out
test_lsfit4.java
test_lsfit4_java.out
test_lsfit5.java
test_lsfit5_java.out
test_lsfit6.java
test_lsfit6_java.out
test_lsfit7.java
test_lsfit7_java.out
uses simeq.java
The Makefile entry that makes test_lsfit_java.out
test_lsfit_java.out: test_lsfit.java lsfit.java simeq.java
	javac -cp . simeq.java
	javac -cp . lsfit.java
	javac -cp . test_lsfit.java
	java -cp . test_lsfit > test_lsfit_java.out
	rm -f *.class
least_square_fit.adb
least_square_fit_ada.out
least_square_fit_2d.adb
least_square_fit_2d_ada.out
least_square_fit_3d.adb
least_square_fit_3d_ada.out
least_square_fit_4d.adb
least_square_fit_4d_ada.out
real_arrays.ads
real_arrays.adb
lsfit.ads has 1D through 6D
lsfit.adb
test_lsfit6.adb
test_lsfit6_ada.out
array6d.ads
test_lsfit5.adb
test_lsfit5_ada.out
array5d.ads
test_lsfit4.adb
test_lsfit4_ada.out
array4d.ads
test_lsfit3.adb
test_lsfit3_ada.out
array3d.ads
test_lsfit2.adb
test_lsfit2_ada.out
integer_arrays.ads
integer_arrays.adb
real_arrays.ads
real_arrays.adb
The Makefile entry to make test_lsfit6_ada.out
test_lsfit6_ada.out: test_lsfit6.adb lsfit.ads lsfit.adb \
                     array3d.ads array4d.ads array5d.ads array6d.ads \
                     real_arrays.ads real_arrays.adb \
                     integer_arrays.ads integer_arrays.adb
	gnatmake test_lsfit6.adb
	./test_lsfit6 > test_lsfit6_ada.out
	rm -f test_lsfit
	rm -f *.ali
	rm -f *.o
similarly for test_lsfit2_ada.out, test_lsfit3_ada.out,
test_lsfit4_ada.out, test_lsfit5_ada.out
Similar code in plain C (up to 7th power in up to 6 or 4 dimensions)
lsfit.h
lsfit.c
test_lsfit.c
test_lsfit_c.out
test_lsfit7.c
test_lsfit7_c.out
test_write_lsfit7.c
test_write_lsfit7_c.out
test_write_lsfit71.c generated
test_write_lsfit72.c generated
Simple one variable versions polyfit polyval
polyfit.h
polyfit.c
polyval.h
polyval.c
test_polyfit.c
test_polyfit_c.out
test_polyfit.py3
test_polyfit_py3.out
test_polyfit_py3.png
poly.java
test_poly.java
test_poly_java.out
Poly.scala
Test_poly.scala
Test_poly_scala.out
truncated series vs lsfit for sin(x+y) and sin(x+y+z)
Using code from above, another test case:
fit_sin.adb demonstrates that a least square fit to a specified power
does not give the same coefficients as a series truncated at the same
power, and that the least square fit is more accurate.
fit_sin.adb
fit_sin_ada.out
comparison to very small Matlab code
generic_real_least_square_fit.ada
lsfit.m MatLab source code (tiny!)
lsfit_m.out MatLab output and plot
comparison to small Python code
test_polyfit.py Python source code
test_polyfit_py.out Python output and plot
Terms for fitting two and three variables, 2D and 3D
And to see how polynomials in two and three variables may be fit:
Given a value of Y for each value of X, get polynomial with X^2
Y_approximate = C0 + C1 * X + C2 * X^2
Then the linear equations would be:
| SUM(1 *1) SUM(1 *X) SUM(1 *X^2) | | C0 | | SUM(1 *Y) |
| SUM(X *1) SUM(X *X) SUM(X *X^2) | | C1 | | SUM(X *Y) |
| SUM(X^2*1) SUM(X^2*X) SUM(X^2*X^2) |x| C2 | = | SUM(X^2*Y) |
Given a value of Y for each value of X, get polynomial with X^3
Y_approximate = C0 + C1 * X + C2 * X^2 + C3 * X^3
Then the linear equations would be:
| SUM(1 *1) SUM(1 *X) SUM(1 *X^2) SUM(1 *X^3) | | C0 | | SUM(1 *Y) |
| SUM(X *1) SUM(X *X) SUM(X *X^2) SUM(X *X^3) | | C1 | | SUM(X *Y) |
| SUM(X^2*1) SUM(X^2*X) SUM(X^2*X^2) SUM(X^2*X^3) |x| C2 | = | SUM(X^2*Y) |
| SUM(X^3*1) SUM(X^3*X) SUM(X^3*X^2) SUM(X^3*X^3) | | C3 | | SUM(X^3*Y) |
Note that index [i][j] of the matrix has SUM(X^i*X^j) for subscripts i,j = 0..3
Note that the terms for two and three variables in a polynomial are:
  two          three            can optimize
  a^0 b^0      a^0 b^0 c^0      constant, "1"
  a^1 b^0      a^1 b^0 c^0      just a
  a^0 b^1      a^0 b^1 c^0      just b
               a^0 b^0 c^1
  a^2 b^0      a^2 b^0 c^0      just a^2
  a^1 b^1      a^1 b^1 c^0      a*b
  a^0 b^2      a^1 b^0 c^1
               a^0 b^2 c^0
               a^0 b^1 c^1
               a^0 b^0 c^2
Then terms with a sum of the powers equal to 3, 4, 5, 6 are available
Note that the matrix has the first row as the sum of each term
multiplied by the first term. The second row is the sum of each
term multiplied by the second term, etc.
The data for the terms comes from the raw data sets,
(y_actual, a, b)  or  (y_actual, a, b, c),
being used to determine a fit
y_approx = F(a,b)  or  y_approx = F(a,b,c)
The terms for the data point (y1, a1, b1) are:
  1   a1   b1   a1^2   a1*b1   b1^2     with right hand side value y1
These terms are multiplied by the first term, "1", and added to row 1.
These terms are multiplied by the second term, "a1", and added to row 2, etc.
Then the terms for the data point (y2, a2, b2) are:
  1   a2   b2   a2^2   a2*b2   b2^2     with right hand side value y2
These terms are multiplied by the first term, "1", and added to row 1.
These terms are multiplied by the second term, "a2", and added to row 2, etc.
Then the simultaneous equations are solved for the coefficients C1, C2, ...
to get the approximating function
y_approx = F(a,b) = C1 + C2*a + C3*b + C4*a^2 + C5*a*b + C6*b^2
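A sketch of this row-by-row build for the two variable quadratic case
(the function and array names are illustrative, not the linked course
code; C[0]..C[5] below correspond to C1..C6 above):

  /* build the 6x6 system for y_approx = F(a,b), quadratic in two variables */
  void build_2d(int m, double ad[], double bd[], double yd[],
                double A[6][6], double Y[6])
  {
    double term[6];
    int i, j, k;
    for(i=0; i<6; i++) { Y[i]=0.0; for(j=0; j<6; j++) A[i][j]=0.0; }
    for(k=0; k<m; k++)                           /* one contribution per data point */
    {
      double a = ad[k], b = bd[k];
      term[0]=1.0;  term[1]=a;    term[2]=b;     /* 1, a, b       */
      term[3]=a*a;  term[4]=a*b;  term[5]=b*b;   /* a^2, a*b, b^2 */
      for(i=0; i<6; i++)
      {
        for(j=0; j<6; j++) A[i][j] += term[i]*term[j];  /* row i gets term_i * each term */
        Y[i] += yd[k]*term[i];                          /* right hand side               */
      }
    }
  }
  /* solving A*C = Y then gives                                          */
  /* y_approx = C[0] + C[1]*a + C[2]*b + C[3]*a^2 + C[4]*a*b + C[5]*b^2  */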
The following sample programs compute the least square fit of
internally generated data for low polynomial powers and compute
the accuracy of the fit in terms of root-mean-square error,
average error and maximum error. Note that the fit becomes exact
when the data is from a low order polynomial and the fit uses at
least that order polynomial.
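A sketch of those three accuracy measures, computed from the actual and
approximated values (the names are illustrative; "average error" is taken
here as the mean of the absolute errors):

  /* root-mean-square, average absolute, and maximum absolute error of a fit */
  #include <math.h>

  void fit_errors(int m, double y_actual[], double y_approx[],
                  double *rms, double *avg, double *max_err)
  {
    double e, sum_sq = 0.0, sum_abs = 0.0, max_abs = 0.0;
    int k;
    for(k=0; k<m; k++)
    {
      e = y_actual[k] - y_approx[k];
      sum_sq  += e*e;
      sum_abs += fabs(e);
      if(fabs(e) > max_abs) max_abs = fabs(e);
    }
    *rms     = sqrt(sum_sq/m);
    *avg     = sum_abs/m;
    *max_err = max_abs;
  }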
least_square_fit_2d.c
least_square_fit_2d.out
least_square_fit_3d.c
least_square_fit_3d.out
least_square_fit_4d.c
least_square_fit_4d.out
least_square_fit2d.adb
least_square_fit2d_ada.out
real_arrays.ads
real_arrays.adb
Development of Python 2d and 3d least square fit
polyfit2.py3
polyfit2_py3.out
polyfit3.py3
polyfit3_py3.out
You can translate the above to your favorite language.
Now, if everything works, a live interactive demonstration of
least square fit.
The files needed are:
Matrix.java
LeastSquareFit.java
LeastSquareFitFrame.java
LeastSquareFit.out
LeastSquareFitAbout.txt
LeastSquareFitHelp.txt
LeastSquareFitEvaluate.txt
LeastSquareFitAlgorithm.txt
LeastSquareFitIntegrate.txt
The default parameter is n, a fit that uses all n points.
Then set parameter to 3, for n=3, third order polynomial fit.
Then set parameter to 4 and 5 to see improvement.
Homework 2
Spline fit, possible demonstration
There are many other ways to fit a set of points.
The Spline fit smooths the fit by controlling derivatives.
SplineFrame.java
(Replace "LeastSquareFit" with "Spline" then get same files as above.)
Demonstration to see a different type of fit.
Right click on points, left to right increasing X.
Left click to see plot.
Optional lecture on Taylor series fit to implement math functions
that may not be available in the programming language you are using.
optional lecture 4.5
updated 10/30/2021