Normal Equation Using Python

Jan 29, 2019

Terms like Artificial Intelligence and Machine Learning are now quite common, and many devices, small or large, integrate AI algorithms to some extent.

On my journey to learn AI, I followed the most popular MOOC on Coursera, taught by Andrew Ng, which has amazing content and serves as a very good introduction for anyone who wants to enter the vast world of Artificial Intelligence/Machine Learning. However, the practical exercises in that course are done in Octave.

In this blog/story I'll focus on using the Normal Equation in Python to find the parameters that minimize the cost, and on plotting the dataset along with the resulting linear regression line.

Prerequisites

Basic knowledge of Python (beginner level)
Familiarity with NumPy and Matplotlib (beginner level)
Familiarity with Pandas (not necessary)

Linear Regression with One Variable

Here I will fit a linear regression with one variable to 2 datasets:

Sample Dataset
Salary VS. Years of Experience Dataset

Necessary Imports

Since we are using Python we will need to import certain libraries to speed up work and calculations, plot graphs etc.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Reading and Plotting the Data

1. Sample Dataset: We will use numpy to define a small feature array (X) and criterion array (Y) by hand.

x = np.array([1,2,3,4,5])
y = np.array([7,9,12,15,16])

Graph:
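A minimal sketch to draw this scatter plot, assuming the sample arrays defined above. The axis labels and the output file name sample_scatter.png are my own placeholders:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 9, 12, 15, 16])

fig, ax = plt.subplots()
ax.scatter(x, y, color='red')        # one red marker per record
ax.set_xlabel('x (feature)')
ax.set_ylabel('y (criterion)')
fig.savefig('sample_scatter.png')    # or plt.show() to open a window instead
```

The same few lines work for the salary dataset once x and y are read from the CSV file.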

2. Salary VS. Years of Experience Dataset

Here I will be using Salary_Data.csv. X corresponds to Years of experience and Y corresponds to Salaries.

dataset = pd.read_csv('Salary_Data.csv')
x = dataset.iloc[:, 0].values #Feature matrix
y = dataset.iloc[:, 1].values #Criterion Matrix

Graph :

Understanding Normal Equation

The cost function for a linear regression problem is given by:
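Written out, and assuming the convention of Andrew Ng's course (the 1/2 factor is an assumption on my part and does not change where the minimum lies), the cost function is:

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2
```

Here m is the number of records in the dataset.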

The cost function measures the average of the squared errors, i.e. the average squared difference between the predicted values (after model training) and the original values.

Here ŷ⁽ⁱ⁾ is the predicted value for the i-th record. It is calculated as:

ŷ⁽ⁱ⁾ = Ɵ₀ + Ɵ₁x⁽ⁱ⁾
# Ɵ₀ is the intercept and Ɵ₁ is the slope of the line.

y⁽ⁱ⁾ is the original (observed) value for the i-th record. In other words, the y⁽ⁱ⁾ values in our cost function J(Ɵ) are simply the entries of our criterion vector y.

Since the cost function J(Ɵ) measures the error of our model, we want to make it as small as possible: the lower the error, the better the line fits the data.
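To make this concrete, here is a small sketch that evaluates J(Ɵ) on the sample dataset for two candidate lines. The helper name compute_cost and the 1/(2m) scaling (the convention of Andrew Ng's course) are my assumptions, not something fixed by this post:

```python
import numpy as np

def compute_cost(x, y, theta_0, theta_1):
    """Half the mean squared error of the line theta_0 + theta_1 * x."""
    m = len(y)
    y_hat = theta_0 + theta_1 * x          # predicted values
    return np.sum((y_hat - y) ** 2) / (2 * m)

x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 9, 12, 15, 16])

cost_guess = compute_cost(x, y, 5.0, 2.0)  # an arbitrary guess
cost_fit = compute_cost(x, y, 4.6, 2.4)    # the line reported later in this post
print(cost_guess, cost_fit)
```

The arbitrary guess costs about 0.6, while the fitted line costs about 0.12, which is why we look for the Ɵ that gives the smallest J(Ɵ).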

The Normal Equation helps us minimize the cost function J(Ɵ) directly, in a single calculation.

The Normal Equation is:

Ɵ = (XᵀX)⁻¹ Xᵀy

Here X is our feature matrix (with an extra column of 1s, added in the code below) and y is the criterion vector.

The resulting vector Ɵ contains the values (Ɵ₀ and Ɵ₁) that minimize the cost function.
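For readers who want to see where the equation comes from, here is a short derivation sketch. It assumes the 1/(2m) cost convention from Andrew Ng's course and that XᵀX is invertible; X denotes the feature matrix including the column of 1s:

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m} (X\theta - y)^{T} (X\theta - y) \\
\nabla_{\theta} J(\theta) &= \frac{1}{m} X^{T} (X\theta - y) = 0 \\
X^{T} X \theta &= X^{T} y \\
\theta &= (X^{T} X)^{-1} X^{T} y
\end{aligned}
```

Setting the gradient to zero is what turns the minimization into a plain linear system.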

Python Code

After importing the required libraries and assigning x and y, we create a column of 1s, called x_bias, that will be added to our feature matrix. This column lets the intercept Ɵ₀ be treated like any other coefficient.

m = x.shape[0]  # m is the number of records in the dataset
x_bias = np.ones((m, 1))

Now we need to join x_bias and x (the feature matrix) column-wise. For this to work, both arrays must have the same number of dimensions and the same number of rows.

# The shape of x can be checked with
print(x.shape)  # (5,) for our sample dataset.
                # It must be converted to (5, 1) before joining,
                # because np.ones((m, 1)) returns a (5, 1) array.

Reshaping the array and appending to x_bias

x = np.reshape(x, (m, 1))
updated_x = np.append(x_bias, x, axis=1)  # axis=1 joins the arrays column-wise
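The sketch below shows why the reshape matters, using the sample dataset. It also shows np.column_stack, an alternative I am adding for comparison (it is not used in this post), which accepts the 1-D array directly:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
m = x.shape[0]
x_bias = np.ones((m, 1))

# Without reshaping, x is 1-D and cannot be joined to the 2-D x_bias.
try:
    np.append(x_bias, x, axis=1)
    append_failed = False
except ValueError:
    append_failed = True

# After reshaping to (m, 1) the join works and gives an (m, 2) matrix.
updated_x = np.append(x_bias, np.reshape(x, (m, 1)), axis=1)

# Alternative: column_stack turns 1-D inputs into columns on its own.
alternative_x = np.column_stack((np.ones(m), x))

print(append_failed, updated_x.shape, np.array_equal(updated_x, alternative_x))
```

Each row of updated_x is now [1, x], so multiplying a row by Ɵ gives Ɵ₀ + Ɵ₁x.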

Now that updated_x is ready, we will calculate the transpose, inverse and dot products using numpy.

First, I will calculate the first term, (XᵀX)⁻¹, and store its value in temp_1:

x_transpose = np.transpose(updated_x)           # calculating transpose
x_transpose_dot_x = x_transpose.dot(updated_x)  # calculating dot product
temp_1 = np.linalg.inv(x_transpose_dot_x)       # calculating inverse
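As a worked check on the sample dataset, the intermediate matrices are small enough to verify by hand. This sketch rebuilds updated_x so that it runs on its own (np.column_stack is my shortcut here, not part of the steps above):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
updated_x = np.column_stack((np.ones(len(x)), x))

x_transpose_dot_x = updated_x.T.dot(updated_x)
temp_1 = np.linalg.inv(x_transpose_dot_x)

# Entries are m, sum(x), sum(x), sum(x**2), i.e. 5, 15, 15, 55.
print(x_transpose_dot_x)
# Determinant is 5*55 - 15*15 = 50, so the inverse is
# approximately [[1.1, -0.3], [-0.3, 0.1]].
print(temp_1)
# A matrix times its inverse should give the identity matrix.
is_identity = np.allclose(temp_1.dot(x_transpose_dot_x), np.eye(2))
print(is_identity)
```

If XᵀX were singular (for example, with duplicated feature columns), np.linalg.inv would raise an error instead.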

Next, I will calculate the second term, Xᵀy, and store it in temp_2:

temp_2 = x_transpose.dot(y)

Finally, calculating Ɵ:

theta = temp_1.dot(temp_2)
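Explicitly inverting XᵀX works fine for small problems like these. As an alternative (a swapped-in technique, not the method used in this post), NumPy can solve the same linear system without forming the inverse, which is generally considered more numerically stable. A sketch, where the function name normal_equation is my own:

```python
import numpy as np

def normal_equation(x, y):
    """Return [intercept, slope] for 1-D arrays x and y."""
    X = np.column_stack((np.ones(len(x)), x))
    # Solve (X^T X) theta = X^T y directly instead of inverting X^T X.
    return np.linalg.solve(X.T.dot(X), X.T.dot(y))

x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 9, 12, 15, 16])

theta = normal_equation(x, y)

# Cross-check with NumPy's built-in least-squares solver.
X = np.column_stack((np.ones(len(x)), x))
theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(theta, theta_lstsq)
```

Both approaches should agree with the step-by-step result, an intercept of about 4.6 and a slope of about 2.4 for the sample dataset.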

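Running the script on each dataset gives the following results:

1. The sample dataset yields Ɵ = [4.6, 2.4], i.e. the line y = 4.6 + 2.4x.

Figure: Regression line fitted to our Sample Dataset

2. The Salary VS. Years of Experience dataset yields Ɵ = [25792.2, 9449.96], i.e. the line y = 25792.2 + 9449.96x.

Figure: Regression line for our Salary VS. Years of Experience Dataset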
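Once Ɵ is known, predicting a new value is a single line. A sketch using the salary coefficients from this post; the function name predict and the example inputs are only illustrative:

```python
import numpy as np

theta = np.array([25792.2, 9449.96])   # [intercept, slope] for the salary dataset

def predict(years_of_experience, theta):
    """Evaluate the fitted line theta[0] + theta[1] * x."""
    return theta[0] + theta[1] * years_of_experience

salary_5_years = predict(5, theta)               # a single value
salaries = predict(np.array([1, 10]), theta)     # also works on arrays
print(salary_5_years, salaries)
```

For 5 years of experience this gives roughly 73042, i.e. 25792.2 + 9449.96 × 5.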

Here is the complete Python script (also available on Git):

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Sample dataset (comment these two lines out when using the salary dataset)
x = np.array([1, 2, 3, 4, 5])
y = np.array([7, 9, 12, 15, 16])

# Salary VS. Years of Experience dataset (uncomment to use it instead)
# dataset = pd.read_csv('Salary_Data.csv')
# x = dataset.iloc[:, 0].values
# y = dataset.iloc[:, 1].values

plt.scatter(x, y, color='red')

m = x.shape[0]                                 # number of records
x_bias = np.ones((m, 1))                       # column of 1s for the intercept
x_column = np.reshape(x, (m, 1))
updated_x = np.append(x_bias, x_column, axis=1)

x_transpose = np.transpose(updated_x)
x_transpose_dot_x = x_transpose.dot(updated_x)
temp_1 = np.linalg.inv(x_transpose_dot_x)
temp_2 = x_transpose.dot(y)
theta = temp_1.dot(temp_2)
print(theta)

# Sample dataset:  y = 4.6 + 2.4*x
# Salary dataset:  y = 25792.2 + 9449.96*x
y_predicted = updated_x.dot(theta)             # points on the regression line
plt.plot(x, y_predicted, color='blue')
plt.show()

That's it. Thank you for reading! This was my first blog ever.

Please hit claps. Thanks in advance.