% Example of least squares fitting. Consider the linear dependency between
% variables x and y:
%
% y = a*x+b,
%
% where a and b are real numbers. Assume that we can measure the value of y
% corresponding to several x, and that there is some random measurement
% error. We call the noisy measurement
%
% m = y + (random noise).
%
% The goal is to recover the numbers a and b from two vectors: x and m.
% The idea is that model the relationshoip between x and y by y = a*x+b,
% and we use the noisy data to find the model that fits the data best.
% (Here "best" means optimal in the least squares sense; there are other
% possibilities as well. It depends on the application which approach is
% the best in a particular situation.)
%
% We construct the system matrix as follows:
%
% |x1 1|
% |x2 1|
% A = |. .|
% |. .|
% |. .|
% |xN 1|
%
% Construct the data vector as
%
% |m1|
% |m2|
% m = |. |
% |. |
% |. |
% |mN|
%
% Denote by z the unknown vertical vector z = [a;b].
% Then we look at the matrix equation Az = m. Now we cannot invert the
% matrix A since it is not a square matrix. But we will solve Az = m in the
% least squares sense as z = A\m. Here \ is Matlab's built-in operator.
%
% Samuli Siltanen October 2013
% Plot parameters. These do not affect the mathematical computations.
% They are for controlling how the result is plotted in the end.
msize = 15;
fsize = 16;
titlefsize = 20;
lwidth = 2;
% Choose amplitude of noise. The bigger number here, the more erratic is
% the measurement. Setting noiselevel=0 means that there is no error in the
% measurement.
noiselevel= 8;
% Choose parameters for the model y = a*x+b. We use these values for
% simulating data. In the end we want to recover these values of a and b
% from the noisy measurement.
a = .8;
b = 10;
% Create ideal (y) and noisy (m) data vectors.
x = 10:5:100;
y = a*x+b;
m = y+noiselevel*randn(size(y));
% Plot the ideal data as black line and noisy data as red dots
figure(1)
clf
subplot(1,2,1)
plot(x,y,'k','linewidth',lwidth)
hold on
plot(x,m,'r.','markersize',msize)
axis equal
set(gca,'ytick',[0:20:100],'fontsize',fsize)
set(gca,'xtick',[0:20:100],'fontsize',fsize)
box off
axis([0 110 0 100])
title(['Simulation of data: a=', num2str(a),' , b=', num2str(b)],'fontsize',titlefsize)
pause
% Construct the system matrix.
A = [x(:),ones(length(x),1)];
% Compute the least squares solution
reg = A\m(:);
% Plot the result of least squares fit (regression) by showing the data
% points as red dots and the fitted line (given by the recovered values of
% a and b) as a blue line.
subplot(1,2,2)
plot(x,m,'r.','markersize',msize)
hold on
axis equal
set(gca,'ytick',[0:20:100],'fontsize',fsize)
set(gca,'yticklabel',{})
set(gca,'xtick',[0:20:100],'fontsize',fsize)
box off
axis([0 110 0 100])
t = linspace(10,100,2);
plot(t,reg(1)*t+reg(2),'b','linewidth',lwidth)
title(['Least squares fit: a=',num2str(round(1000*reg(1))/1000), ', b=', num2str(round(1000*reg(2))/1000)],'fontsize',titlefsize)