A gradient descent calculator for the linear regression model \(y = ax + b\) is presented. The values of the coefficients \(a\) and \(b\) are compared with those obtained from the analytical method based on linear least squares fitting; the comparison is included for educational purposes.
Let us assume that we have a set of \( n \) data points \( (x_i , y_i) \) and we need to find a linear model of the form \(y = ax + b\) by minimizing the sum
\[ \sum_{i=1}^{n}\bigl(y_i - (a \cdot x_i + b)\bigr)^2 \]
which is the sum of the squared vertical distances (residuals) between the data points and the corresponding points on the line, as shown below.
Note that the sum of squares is used instead of the sum of absolute values for practical reasons: the squared error is differentiable everywhere, which simplifies taking partial derivatives and other analytical computations.
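As a small illustration of this objective, the following Python sketch evaluates the sum of squared residuals for a few sample points and a candidate line; the data values and variable names are purely illustrative and are not taken from the calculator.

```python
# Illustrative data points (x_i, y_i) and a candidate line y = a*x + b;
# none of these values come from the calculator itself.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
a, b = 2.0, 0.0

# Sum of squared vertical distances (residuals) between the points and the line
sse = sum((y - (a * x + b)) ** 2 for x, y in points)
print(sse)  # smaller values mean the line fits the data better
```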
We now present two methods for calculating the coefficients \( a \) and \( b \) of the linear model.
Gradient descent is an iterative optimization algorithm that finds the parameters \(a\) and \(b\) by minimizing the cost function defined by:
\[ J(a, b) = \frac{1}{2n} \sum_{i=1}^{n}\bigl(y_i - (a \cdot x_i + b)\bigr)^2 \]

Its partial derivatives, which are needed in the update rules, are given by:

\[ \dfrac{\partial J}{\partial a} = \frac{1}{n} \sum_{i=1}^{n}\bigl(-x_i (y_i - (a x_i + b))\bigr) \]

\[ \dfrac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n}\bigl(-(y_i - (a x_i + b))\bigr) \]

The update rules for the parameters \( a \) and \( b \) are:
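A minimal Python sketch of the cost function and its partial derivatives, written directly from the formulas above; the names `cost`, `gradients`, `xs`, and `ys` are assumptions made for illustration and are not part of the calculator.

```python
# Cost J(a, b) and its partial derivatives for the model y = a*x + b.
def cost(a, b, xs, ys):
    n = len(xs)
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / (2 * n)

def gradients(a, b, xs, ys):
    n = len(xs)
    dJ_da = sum(-x * (y - (a * x + b)) for x, y in zip(xs, ys)) / n
    dJ_db = sum(-(y - (a * x + b)) for x, y in zip(xs, ys)) / n
    return dJ_da, dJ_db

# Example: gradients at the starting point a = b = 0 for illustrative data
print(gradients(0.0, 0.0, [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]))
```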
\[ a_{k+1} = a_k - r \left.\dfrac{\partial J}{\partial a}\right|_{a=a_k,\,b=b_k} = a_k - r \, \frac{1}{n} \sum_{i=1}^{n}\bigl(-x_i (y_i - (a_k x_i + b_k))\bigr) \]

\[ b_{k+1} = b_k - r \left.\dfrac{\partial J}{\partial b}\right|_{a=a_k,\,b=b_k} = b_k - r \, \frac{1}{n} \sum_{i=1}^{n}\bigl(-(y_i - (a_k x_i + b_k))\bigr) \]

where \( r \) is the learning rate and \( k \) is the iteration index.
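The update rules can be turned into a simple iteration loop. The sketch below is one possible implementation, assuming a fixed learning rate, a fixed number of iterations, and a starting point of \( a = b = 0 \); the sample data and default values are illustrative and not the calculator's actual settings.

```python
# One possible gradient descent loop for y = a*x + b, starting from a = b = 0.
# The learning rate r, iteration count, and sample data are assumptions
# chosen for illustration.
def gradient_descent(xs, ys, r=0.01, iterations=10_000):
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Partial derivatives of J evaluated at the current (a_k, b_k)
        dJ_da = sum(-x * (y - (a * x + b)) for x, y in zip(xs, ys)) / n
        dJ_db = sum(-(y - (a * x + b)) for x, y in zip(xs, ys)) / n
        # Update rules: step against the gradient, scaled by the learning rate
        a -= r * dJ_da
        b -= r * dJ_db
    return a, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
print(gradient_descent(xs, ys))
```

With a sufficiently small learning rate and enough iterations, the result should approach the closed-form least squares solution presented next.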
The analytical method based on linear least squares fitting uses closed-form formulas derived from minimizing the sum of squared errors. The formulas for the parameters \( a \) and \( b \) are:

\[ a = \dfrac{n \displaystyle\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \cdot \sum_{i=1}^{n} y_i}{n \displaystyle\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \]

\[ b = \dfrac{\displaystyle\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n} \]
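For comparison, a short Python sketch of the closed-form computation, following the formulas above; the function name `least_squares_fit` and the sample data are illustrative assumptions.

```python
# Closed-form least squares fit for y = a*x + b, following the formulas above.
def least_squares_fit(xs, ys):
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
print(least_squares_fit(xs, ys))
```

On the same data, both methods should agree closely, which is exactly the comparison the calculator below performs.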
Enter the number of points:
\( n = \)
\( \dfrac{\partial J}{\partial a} = \)
\( \dfrac{\partial J}{\partial b} = \)
Results From Gradient Descent Method
\( a = \)
\(b = \)
Results From Linear Least Squares Method (using the formulas for \( a \) and \( b \) above)
\( a = \)
\(b = \)
Minimum Value of the cost function \( J(a, b) \):
Graph Legend: blue for the gradient descent method; green for the linear least squares method.