Matlab Class Home      Class Outline      Previous Task      Next Task      Main Class Page      Evaluation 5

Task 5.3 Detail: Fit polynomials to data.

Summary of new tools and commands.

Task: Write a script to read the data file introduced in the first task. Extract the first (time) and second columns (D1) of the data matrix. Fit a first and second order polynomial in time to these values. Plot the original data vs time along with the two fitted polynomials (using different colors).

It is useful to calculate trends or smoothed versions of observations by fitting simple functions to data. There are other circumstances were we know from theory that some quantity depends in a specific way on another quantity.

We start with the simplest functions which are polynomials having the following form:

    y(x) = a(n) x^n + a(n-1) x^(n-1) + ... + a(2) x^2 + a(1) x^1 + a(0) x^0
where a is a set of numbers (the coefficients of the polynomial) which determine its shape. The coefficients can be positive or negative or zero. The last term is a constant term because x^0 = 1 for all x.

By definition, an nth order polynomial has the dependent variable (x) to the nth power. Because of this, a(n) must be non-zero.

For a given vector of y's and associated x's, we want to determine the set of a(n)'s that cause the polynomial to "best" fit the values. Specifically, we minimize the "squared error" to get a "least squared error" fit of the polynomial to the values.

The error is e(x)=y(x)-p(x) where p(x) is the polynomial evaluated at x. The squared error is sum(e(x)^2), which is the quantity to minimize. There are formulas in many places showing how to determine the coefficients to minimize the error.

The function polyfit performs these calculations providing the best-fit polynomial coefficients. The command

  a1 = polyfit(time,D1,1);   %  first order (linear) polynomial
  a2 = polyfit(time,D2,2);   %  second order (quadratic) polynomial
In the above statement, a1 will be two numbers: a1(1) is the slope of the line and a1(2) is the constant (y-intercept). And, a2 will be 3 numbers where a2(1) is the coefficient of the squared term, with a2(2) and a2(3) being the coefficient of the linear and constant terms, respectively. Notice the order (numbering) of the coefficients is reveresed compared to the polynomial formula above.

It is easy to change the order of the polynomial to fit in the call to polyfit to see which type of curve provides a good fit to the data points.

As a practical matter, it is not always useful to fit a high order polynomial to noisy data. The curve may go through the data points, but may be very different between the points (referred to as "over-fitting"). It is always useful to plot the fitted curve and the data points together to determine the overall quality of the fit.

A companion function polyval is available to evaluate the polynomial at some set of points given the coefficients. This evaluation of the polynomial can be used to plot the polynomial along with the original data.

  p1=polyval(a1,time); % evaluate line at data times
  e1=D1-p1;            % mis-fit (difference) between data and linear polynomial.
  p2=polyval(a2,time); % evaluate quadratic at data times
  e2=D1-p2;            % mis-fit (difference) between data and quadratic polynomial.
Notice that polyval uses the number of coefficients (the number of values in a1 or a2) to determine the order of the polynomial.

Flow chart for task 5.3

%%%  Task 5.3
%%%  read data, extract variables
%%%  fit polynomials to data (orders 1, 2)
%%%  reconstruct polynomials from fit
%%%  plot original data and fit curves

Here is the script to generate the answer to this task.

%%%  Task 5.3
%%%  read data, extract variables
  data=load('fitdata.dat');
  time=data(:,1);D1=data(:,2);
%%%  fit polynomials to data (orders 1, 2)
%%%  reconstruct polynomials from fit
  p1=polyfit(time,D1,1);D11=polyval(p1,time);
  p2=polyfit(time,D1,2);D12=polyval(p2,time);
%%%  plot original data and fit curves
  figure
    plot(time,D1,'k',time,D11,'r',time,D12,'b')
    title('D1 (k), linear (r), quadratic (b)')
    xlabel('time (day)')
    ylabel('D1')

   SS5_3

Matlab Class Home      Class Outline      Previous Task      Next Task      Main Class Page      Evaluation 5


email: J. Klinck