Saturday, December 1, 2012

Linear Regression with Matlab Using Batch Gradient Descent Algorithm


In this post I will implement linear regression, which can be adapted to classification easily. I use Matlab and follow Dr. Andrew Ng's class; you can watch the classes online from here.
While implementing it I also came across a very nice blog post that does essentially the same thing; only the dataset differs. Here I use the original dataset given by Dr. Ng, and more details of the code below can be found here (DSPlog).

download data

The linear regression model is

\begin{displaymath}
h_{\theta}(x) = \theta^T x = \sum_{i=0}^n \theta_i x_i
\end{displaymath}
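
As a small illustration (a sketch only, using the same convention as the code below, where a column of ones is prepended to x so that x_0 = 1 carries the intercept term), the hypothesis for all training examples is a single matrix-vector product:

% x is the m-by-2 design matrix [ones(m,1), age], theta_vec is the 2-by-1 parameter vector
h_theta = x*theta_vec;   % m-by-1 vector of predictions h_theta(x^(i))
% for a single example i this is theta_vec(1)*1 + theta_vec(2)*x(i,2)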

and the batch gradient descent update rule is

\begin{displaymath}
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^m \left(h_{\theta}(x^{(i)}) - y^{(i)}\right) x_j^{(i)} \;\;\;\;\;\mbox{(for all $j$)}
\end{displaymath}


theta_vec = [0 0]';               % initialize parameters [theta_0; theta_1]
alpha = 0.007;                    % learning rate
err = zeros(n, 1500);             % store the gradient at each iteration
for kk = 1:1500
  h_theta = x*theta_vec;          % hypothesis h_theta(x) for all m examples
  h_theta_v = h_theta*ones(1,n);  % replicate the predictions across the n columns of x
  y_v = y*ones(1,n);
  grad = 1/m*sum((h_theta_v - y_v).*x).';  % gradient of J(theta)
  theta_vec = theta_vec - alpha*grad;      % batch gradient descent update
  err(:,kk) = grad;
end
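
The replication through ones(1,n) works, but the same gradient can also be written as a single matrix-vector product; this is just an equivalent sketch of the update, not the form used in the script further below:

grad = 1/m*(x.'*(x*theta_vec - y));   % n-by-1 gradient, same value as the column sums above
theta_vec = theta_vec - alpha*grad;   % identical batch update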

Cost Function:
\begin{displaymath}
J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2}
\end{displaymath}
For different values of theta, in this case theta_0 and theta_1, we can plot the cost function J(theta) as a surface in 3D or as a contour plot, as in the code below.
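
The loop below evaluates J(theta) over a grid of (theta_0, theta_1) values; purely for readability, the inner computation could also be factored into a small helper saved in its own file (the name computeCost and the separate computeCost.m file are only an illustration, not part of the original script):

% hypothetical helper, e.g. saved as computeCost.m; x already contains the column of ones
function J = computeCost(x, y, theta)
  m = length(y);
  J = 1/(2*m)*sum((x*theta - y).^2);   % squared-error cost J(theta)
end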

j_theta = zeros(250, 250);   % initialize j_theta
theta0_vals = linspace(-5000, 5000, 250);   % grid of theta_0 values
theta1_vals = linspace(-200, 200, 250);     % grid of theta_1 values
for i = 1:length(theta0_vals)
  for j = 1:length(theta1_vals)
    theta_val_vec = [theta0_vals(i) theta1_vals(j)]';
    h_theta = x*theta_val_vec;
    j_theta(i,j) = 1/(2*m)*sum((h_theta - y).^2);   % cost J(theta) at this grid point
  end
end
figure;
surf(theta0_vals, theta1_vals, 10*log10(j_theta.'));   % surface plot on a log scale
xlabel('theta_0'); ylabel('theta_1'); zlabel('10*log10(Jtheta)');
title('Cost function J(theta)');
figure;
contour(theta0_vals, theta1_vals, 10*log10(j_theta.'))
xlabel('theta_0'); ylabel('theta_1')
title('Cost function J(theta)');
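
As a rough sanity check (just a sketch on top of the grid computed above, limited by the grid resolution), the location of the smallest J(theta) on the grid should fall near the theta_vec found by gradient descent:

[min_val, idx] = min(j_theta(:));             % smallest cost over the whole grid
[i_min, j_min] = ind2sub(size(j_theta), idx); % convert linear index to grid indices
theta0_grid = theta0_vals(i_min)              % approximate minimizing theta_0
theta1_grid = theta1_vals(j_min)              % approximate minimizing theta_1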






% Linear regression using the gradient descent algorithm to find the coefficients.
% There are many ways to find the coefficients, but here we apply gradient descent.
% One-dimensional input data with a single output.
x = load('ex2x.dat');
y = load('ex2y.dat');

figure % open a new figure window
plot(x, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')


m = length(y); % store the number of training examples
x = [ones(m, 1), x]; % Add a column of ones to x
n = size(x,2);
% This part minimizes J(theta) with respect to theta_vec, the coefficient vector.
theta_vec = [0 0]';               % initialize parameters [theta_0; theta_1]
alpha = 0.007;                    % learning rate
err = zeros(n, 10000);            % store the gradient at each iteration
for kk = 1:10000
  h_theta = x*theta_vec;          % hypothesis h_theta(x) for all m examples
  h_theta_v = h_theta*ones(1,n);  % replicate the predictions across the n columns of x
  y_v = y*ones(1,n);
  grad = 1/m*sum((h_theta_v - y_v).*x).';  % gradient of J(theta)
  theta_vec = theta_vec - alpha*grad;      % batch gradient descent update
  err(:,kk) = grad;
end

figure;
plot(x(:,2),y,'bs-');
hold on
plot(x(:,2),x*theta_vec,'rp-');
legend('measured', 'predicted');
grid on;
xlabel('x');
ylabel('y');
title('Measured and predicted');



j_theta = zeros(250, 250);   % initialize j_theta
theta0_vals = linspace(-5000, 5000, 250);   % grid of theta_0 values
theta1_vals = linspace(-200, 200, 250);     % grid of theta_1 values
for i = 1:length(theta0_vals)
  for j = 1:length(theta1_vals)
    theta_val_vec = [theta0_vals(i) theta1_vals(j)]';
    h_theta = x*theta_val_vec;
    j_theta(i,j) = 1/(2*m)*sum((h_theta - y).^2);   % cost J(theta) at this grid point
  end
end
figure;
surf(theta0_vals, theta1_vals, 10*log10(j_theta.'));   % surface plot on a log scale
xlabel('theta_0'); ylabel('theta_1'); zlabel('10*log10(Jtheta)');
title('Cost function J(theta)');
figure;
contour(theta0_vals, theta1_vals, 10*log10(j_theta.'))
xlabel('theta_0'); ylabel('theta_1')
title('Cost function J(theta)');
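
Since there are many ways to find the coefficients, a useful cross-check (only a sketch, not part of the original post) is the closed-form normal equation, which should agree closely with the gradient descent estimate once it has converged:

theta_closed = (x.'*x)\(x.'*y)   % normal equation: minimizes J(theta) in closed form
theta_vec                        % compare with the gradient descent estimate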
