When performing linear regression, we look for the best-fit line through the data, i.e. the line that minimizes the error, and compute its slope and intercept (W and b respectively).
Fig. 1: Linear Regression best fit line
The basic linear regression model that we consider for data (X, Y) can be written in the following form:
$$ Y = W^{T} X + b $$
where W holds the weights learned from the data, W^T is the transpose of W, and b is the bias (intercept) term, which is needed when the line does not pass through the origin.
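As a minimal sketch of how the model equation is evaluated for one input, the snippet below computes W^T X + b in plain Python. The function name `predict` and the example numbers are illustrative assumptions, not taken from the article.

```python
def predict(weights, bias, features):
    """Return W^T x + b for a single input vector x (hypothetical helper)."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    # Dot product of the weight vector and the feature vector, plus the bias.
    return sum(w * x for w, x in zip(weights, features)) + bias


# One feature: a line with slope 2.0 and intercept 1.0, evaluated at x = 3.0.
y_line = predict([2.0], 1.0, [3.0])

# Two features: the same formula, with one weight per feature.
y_multi = predict([0.5, -1.0], 2.0, [4.0, 1.0])
```

With a single feature, the one weight is the slope of the line and the bias is its intercept.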
Maximum Likelihood Estimation of Model Parameters:
One way of solving the linear regression equation is to use maximum likelihood estimation. Here we find the values of the model parameters (W and b) that maximize the likelihood function, i.e. that make the observed data most probable.
Let the probability of the data given the parameters theta = (W, b) be:
$$ P(Y \mid X; \theta) $$
The higher this probability, the better the chosen parameters theta explain the data, so we want to maximize it. Hence,
$$ \theta_{ML} = \arg\max_{\theta} P(Y \mid X; \theta) $$
where theta_ML is the maximum likelihood estimate of the parameters.
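To make the maximization concrete, here is a rough sketch that evaluates a log-likelihood on a small grid of candidate parameters and keeps the best pair. It rests on assumptions the article does not state: the noise around the line is Gaussian with a fixed standard deviation `sigma`, there is a single input feature, and the data points and the grid are made up for illustration. A real implementation would use an optimizer or a closed-form solution rather than a grid.

```python
import math


def log_likelihood(w, b, xs, ys, sigma=1.0):
    """Log of P(Y | X; theta), ASSUMING y = w*x + b + Gaussian noise."""
    n = len(xs)
    # Sum of squared residuals between observed and predicted outputs.
    sse = sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


# Made-up data lying roughly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

# Candidate values 0.0, 0.5, ..., 4.0 for both the slope and the intercept.
grid = [i * 0.5 for i in range(9)]

# theta_ML is the candidate (w, b) with the highest log-likelihood.
theta_ml = max(
    ((w, b) for w in grid for b in grid),
    key=lambda theta: log_likelihood(theta[0], theta[1], xs, ys),
)
```

The log is used instead of the raw probability because it has the same maximizer and is numerically better behaved. Under the Gaussian assumption, the only term that depends on the parameters is the sum of squared residuals, so maximizing this log-likelihood amounts to minimizing that sum.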
Least Squares Estimation of Model Parameters:
In least squares, instead of maximizing the probability of the data for a given set of parameters, we minimize the sum of the squared errors over all observed data points. The error here is the difference between the true output value and the value predicted by the estimated regression line.
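The least squares idea can be sketched for the single-feature case with the standard closed-form expressions for the slope and intercept. The function names and the data below are illustrative assumptions. With several features, one would normally solve the problem with a linear algebra routine instead.

```python
def sum_squared_error(w, b, xs, ys):
    """Sum over all points of (true output - predicted output) squared."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))


def fit_least_squares(xs, ys):
    """Return the slope w and intercept b that minimize the squared error."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Spread of the inputs; zero means every x is identical and no line fits.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = y_mean - w * x_mean
    return w, b


# Made-up data lying roughly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

w_ls, b_ls = fit_least_squares(xs, ys)
error_at_fit = sum_squared_error(w_ls, b_ls, xs, ys)
error_at_guess = sum_squared_error(2.0, 1.0, xs, ys)
```

Here `error_at_fit` is no larger than `error_at_guess`: any other line, including the hand-picked guess of slope 2.0 and intercept 1.0, gives a squared error at least as large as the fitted one.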