
Machine Learning - Building Linear Regression Algorithm from scratch

We have seen how a linear regression model is built in R using the predefined function lm(). However, it is also important to understand how machine learning works behind the scenes: what functions are running inside the main function? We will walk through the various stages of the machine learning process for linear regression and create our own functions step by step:

  1. Define Cost Function using regularization
  2. Define Gradient function using regularization
  3. Define predict function
  4. Define feature normalization function
  5. Define polynomial terms function
  6. Define function to plot the model fit
  7. Define function for cross validation
  8. Define function for optimum value of lambda
  9. Initialize parameters (or coefficients) of all the predictors
  10. Use an optimizer to train the model
  11. Make predictions

The algorithm is explained in the video lecture.

Cost Function

Let’s start with the cost function. The main purpose is to minimize this cost to obtain the optimum values of the parameters. The function takes X, y, theta and the regularization term lambda, and returns the cost.
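In math notation, the regularized cost implemented below is

J(\theta) = \frac{1}{2m} (X\theta - y)^{T} (X\theta - y) + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_{j}^{2}

where m is the number of training records and the intercept \theta_{0} is excluded from the regularization sum (which is what setting temp[1] = 0 does in the code below).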

costFunction = function(X, y, theta, lambda) {
  
  m = length(y) # number of training examples
    
  X = as.matrix(X) #Converting to matrix for matrix operations
  
  J = 0 # Initializing cost
  
  temp = theta 
  temp[1] = 0 #We never regularize the bias (or intercept)
  
  #Instead of running loop, we are using vectorized operation to calculate cost for all records
  J = (1/(2*m))*(t(X%*%theta - y))%*%(X%*%theta - y) + (lambda/(2*m))*t(temp)%*%(temp)
  
  return(J) 
  
}

Gradient Function

Let’s define the gradient function. It takes X, y, theta and the regularization term lambda, and returns the gradient value corresponding to each theta.
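In math notation, the gradient computed below is

\nabla J(\theta) = \frac{1}{m} X^{T} (X\theta - y) + \frac{\lambda}{m} \theta

where the intercept component of \theta is set to zero in the regularization term, so the bias is never regularized.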

gradFunction <- function(X, y, theta, lambda){
  
   m = length(y) # number of training examples

   X = as.matrix(X) #Converting to matrix for matrix operations
  
   theta <- as.matrix(theta)
  
   grad = rep(0,length(theta)) # Initializing gradient
  
  
   temp = theta 
  
   temp[1] = 0 #We never regularize the bias (or intercept)
  
    #Instead of running loop, we are using vectorized operation to calculate grads for all records
   grad = (1/m)*(t(X)%*%((X%*%theta - y))) + (lambda/(m))*(temp)
  
  
   grad
}

Training Function

We need to combine the cost function and gradient function to train the model. This function takes X, y and the regularization term lambda, and returns the final cost and theta values.

trainModel <- function(X, y, lambda){
  
  X <- as.matrix(X)
  
  initial_theta = rep(0,ncol(X)) # Initialize Theta values  

  
  #To calculate theta, we use the optim() function in R as the optimizer
  #optim() takes the cost and gradient functions, the X and y values, and lambda
  #We run the optimization for at most 200 iterations
  
  costh <- optim(par=initial_theta, fn=costFunction, 
                 gr=gradFunction, method="BFGS", X=X,y=y,lambda=lambda, 
                 control = list(maxit=200))
  
  cost <- costh$value
  theta <- costh$par
  
  list(cost=cost, theta=theta)
 
}

Cross Validation

Let’s write a function that calculates the cost on the training data and on the validation data as the model is trained on more and more records (a learning curve). For each i, we train the model on the first i training records, then compute the cost on those i records and on the entire validation set.

This function takes X, y, X_val, y_val and the regularization term lambda, and returns the training and validation costs.

crossValidation <- function(X, y, X_val, y_val, lambda){
  
  
    m = length(y) # Number of training records
  
  train.cost = rep(0,m-1) #Vector to store training costs
  valid.cost = rep(0,m-1) #Vector to store validation costs
  
  for(i in 2:m){
    
    theta = trainModel(X[1:i,], y[1:i], lambda); #training model on 1:i training data
    theta = theta$theta
    train.cost[i-1] = costFunction((X[1:i,]), y[1:i], theta, 0); #Calculating cost for 1:i training data  
    valid.cost[i-1] = costFunction(X_val, y_val, theta, 0); #Calculating cost for the entire validation data
    
   }
  
  list(train.cost=train.cost,valid.cost=valid.cost)
}

Feature normalization

For some data sets, we may want to normalize the features because the variables are on very different scales.

So, we need a function that normalizes all the variables. The function takes X (all the features) and returns the normalized features along with the mean and standard deviation of each original feature.
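In formula form, each feature j is centered and scaled as

x_{j}^{norm} = \frac{x_{j} - \mu_{j}}{\sigma_{j}}

where \mu_{j} and \sigma_{j} are the mean and standard deviation of feature j computed on the training data.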

Normalize <- function(X){
  
  X.norm = as.matrix(X) #Converting X into a Matrix to store normalized features
  mu = rep(0, dim(X)[2]) #Vector to store mean values of features
  sigma = rep(0, dim(X)[2]) #Vector to store standard deviation values of features
  
  for (i in 1:dim(X)[2]){
  mu[i] = mean(X[,i])
  sigma[i] = sd(X[,i])
  X.norm[,i] = (X.norm[,i]-mu[i])/sigma[i]
  }
  
 list(X=X.norm, mu=mu, sigma=sigma) 
}

Polynomials of predictors (features)

In some cases, to fit the model well, we need to raise the features (predictors) to higher powers, i.e. compute polynomial terms of the predictors. The function takes X and p (the highest polynomial degree) and returns all polynomial terms up to degree p (i.e. x, x^2, ..., x^p). As written, the function below handles a single predictor, which is all we need for this data set.

poly <- function(X, p){
  
  X <- as.matrix(X) #Converting dataset X into a matrix 
  m = nrow(X) #No. of training records
  X.poly = matrix(rep(0,m*p),m) #Creating an empty matrix to store polynomials of X
  
  for (i in 1:p){ 
  X.poly[,i] = X^i
  
  }
 
  X.poly
   
}
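Note that defining our own poly() masks the stats::poly() function that comes with R; within this post we always call the version defined above. As a quick check (a sketch assuming the definition above, not output from the original post), a single predictor with two rows and p = 3 expands as follows:

poly(c(2, 3), 3)
##      [,1] [,2] [,3]
## [1,]    2    4    8
## [2,]    3    9   27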

Optimum lambda value

To train the model, besides the theta values, we also need to find the optimum value of the regularization term lambda.

The function takes X, y, X_val, y_val and a vector of candidate lambda values (lambda.val). It returns a set of theta values and the training/validation costs corresponding to each value of lambda.

optLambda <- function(X, y, X_val, y_val, lambda.val){
  
  
  m <- length(lambda.val)
  
  #To store different set of theta values for different lambdas
  theta.val <- matrix(rep(0,length(lambda.val)*ncol(X)),ncol=ncol(X)) 
  
  train.cost = rep(0,m) #Vector to store training costs
  valid.cost = rep(0,m) #Vector to store validation costs
  
  for(i in 1:m){
    
    lambda <- lambda.val[i]
    res = trainModel(X, y, lambda);
    theta.val[i,] = res$theta
    train.cost[i] = costFunction(X, y, theta.val[i,], 0);
    valid.cost[i] = costFunction(X_val, y_val, theta.val[i,], 0);
    
  }
  
  
  list(lambda.val=lambda.val, train.cost=train.cost, valid.cost=valid.cost, theta=theta.val)  
}

Model Plot

Once we have trained the model, we need to plot the model fit against the original data. The function takes the range of the x-axis (x.min and x.max), the mean and standard deviation of the original features (since we normalized the data), the theta values and p (the polynomial degree). It uses x.min and x.max to create new, unseen data on which the model predicts the response, and then draws the fitted curve.

modelPlot <- function(x.min, x.max, mu, sigma, theta, p){
  
  x = t(t(seq(from=x.min - 10, to=x.max + 10, by=0.05 ))) #The x-axis
  
  #calculating polynomial terms of x to fit the model on new values of x 
  X.poly = poly(x, p)
 
  #Normalize X.poly using the training mu and sigma
  for(i in 1:p){
    X.poly[,i] = X.poly[,i] - mu[i]
    X.poly[,i] = X.poly[,i]/sigma[i]
  }
  
  #Add bias term
  X.poly = cbind(rep(1, nrow(x)), X.poly)
  
  #Plot
  lines(x, X.poly%*%theta, col="blue",lty="dashed", lwd=2)
  
  
}

Model building process

We load the data. Here we use a data set that has two variables: Experiment (predictor) and Effect (response). We split the data into training, test and validation sets. We use the functions defined above to build the model, plot the fitted model, and assess performance based on the training and validation costs for different lambda values. If needed, we create polynomial features to fit the model well. We use regularization to penalize the model for relying on unnecessary (extra) features. Finally, we predict the response on the test data to see how the model performs.

Load the data and split into training, test and validation sets

data <- read.csv("experiment_3.csv") # data set from base
m = nrow(data)

set.seed(1); m.train = sample(1:m, 60, replace=F)

Train set

X <- data[m.train,1]; X <- as.matrix(X)
y <- data[m.train,2]; y <- as.matrix(y)

Test set

newData = data[-m.train,]; m1 = nrow(newData)

set.seed(1); m.test = sample(1:m1, 20, replace=F)

Xtest <- newData[m.test,1]; Xtest <- as.matrix(Xtest)
ytest <- newData[m.test,2]; ytest <- as.matrix(ytest)

Validation set

Xval <- newData[-m.test,1]; Xval <- as.matrix(Xval)
yval <- newData[-m.test,2]; yval <- as.matrix(yval)

dim(X)
## [1] 60  1
dim(y)
## [1] 60  1
m <- nrow(X)

Plotting the data

Let’s plot the data and see how it looks.

plot(X, y,col="red", lwd=1.5, pch=3, xlab = "Experiment", 
     ylab = 'Effect')

As we can see, the data looks non-linear.

Regularized Linear Regression Cost

Let’s just see how our cost function works with some values of theta and lambda. It should give us a single value.

theta <- c(0.1,1) #One for bias (intercept) and another for Experiment.

X1 <- cbind(rep(1,m),X) #Adding the bias (intercept) term to the data
J = costFunction(X1, y, theta, 0.01) #Calculating cost

cat('Cost at theta = c(0.1 , 1):', J)
## Cost at theta = c(0.1 , 1): 3.168479

Regularized Linear Regression Gradient

Let’s just see how our gradient function works with some values of theta and lambda. It should give us two values for two thetas.

grad = gradFunction(X1, y, theta, 0.01)

cat('Gradient at theta = c(0.1 , 1):', grad)
## Gradient at theta = c(0.1 , 1): -0.54704 4.357158

Train Linear Regression

Let’s train the linear regression model without regularization, i.e. lambda = 0.

lambda=0

res <- trainModel(X1, y, lambda)

theta <- res$theta
cost <- res$cost

plot(X, y,col="red", lwd=1.5, pch=3, xlab = "Experiment", 
     ylab = 'Effect')
lines(X, X1%*%theta, col="blue", lwd=2)

As we can see, the linear regression model is too simple to fit the data.

Learning Curve for Linear Regression

Let’s perform cross validation and plot the learning curve.

lambda = 0

m1 <- nrow(Xval)
Xval1 <- cbind(rep(1,m1),Xval)

lc <- crossValidation(X1, y,Xval1, yval, lambda)
train.cost <- lc$train.cost
valid.cost <- lc$valid.cost

plot(1:(m-1), train.cost,
     xlab ='Number of training records', ylab='Costs', ylim = c(0, 3), type = 'l',col="red")
lines(1:length(valid.cost), valid.cost, col="blue")
legend("topright",legend=c('Training', 'Cross Validation'), lwd = 1, lty=c(1,1),col=c("red","blue"))

df <- data.frame(train.cost, valid.cost,row.names = NULL)
df
##      train.cost valid.cost
## 1  8.238166e-17  2.5329315
## 2  6.208744e-04  1.1782905
## 3  1.958922e-02  0.9198270
## 4  1.598052e-02  0.9180856
## 5  1.471525e-02  0.9270142
## 6  2.709880e-02  1.0237036
## 7  2.380749e-02  1.0228061
## 8  2.186192e-02  1.0243780
## 9  2.472863e-02  1.0677124
## 10 2.289297e-02  1.0731875
## 11 5.978665e-02  1.2330645
## 12 5.584098e-02  1.1983436
## 13 5.375420e-02  1.2381606
## 14 7.353493e-02  1.3377572
## 15 7.018640e-02  1.3683848
## 16 6.685591e-02  1.3624731
## 17 6.349557e-02  1.3580959
## 18 6.185678e-02  1.3886543
## 19 6.054790e-02  1.4132569
## 20 5.923730e-02  1.4263750
## 21 5.663662e-02  1.4347870
## 22 5.544398e-02  1.4502448
## 23 5.314111e-02  1.4512749
## 24 5.202749e-02  1.4661070
## 25 5.017966e-02  1.4629603
## 26 4.930747e-02  1.4665674
## 27 1.107082e-01  1.1665378
## 28 1.069751e-01  1.1651912
## 29 1.054114e-01  1.1884521
## 30 1.023467e-01  1.1930478
## 31 1.162883e-01  1.2511462
## 32 1.145147e-01  1.2718232
## 33 1.795942e-01  1.0409203
## 34 1.750298e-01  1.0528734
## 35 1.701680e-01  1.0529367
## 36 1.655855e-01  1.0550358
## 37 1.639234e-01  1.0672450
## 38 1.627077e-01  1.0804279
## 39 1.766981e-01  1.0015004
## 40 1.741928e-01  1.0049297
## 41 1.714303e-01  1.0118736
## 42 1.703430e-01  1.0284868
## 43 1.686335e-01  1.0356160
## 44 1.901865e-01  1.0849883
## 45 1.864134e-01  1.0759814
## 46 2.499428e-01  0.9372758
## 47 2.502287e-01  0.9090719
## 48 2.492188e-01  0.9204667
## 49 2.474424e-01  0.9337173
## 50 2.444149e-01  0.9355073
## 51 2.416340e-01  0.9471019
## 52 2.382988e-01  0.9473666
## 53 2.365881e-01  0.9543288
## 54 2.327089e-01  0.9532821
## 55 2.292555e-01  0.9527438
## 56 2.262076e-01  0.9531084
## 57 2.223651e-01  0.9505470
## 58 2.190224e-01  0.9497567
## 59 2.161141e-01  0.9497138

As the number of training records increases, the training cost keeps rising while the validation cost levels off at a relatively high value. This indicates that the model suffers from high bias (it underfits). Therefore, we should consider adding polynomial features to the data. Even after adding polynomial features, we are still solving a linear regression problem: you can see it as multivariate linear regression, where each higher-degree polynomial term is just another feature.
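With a single predictor x and degree p, the hypothesis becomes

h_\theta(x) = \theta_{0} + \theta_{1} x + \theta_{2} x^{2} + \dots + \theta_{p} x^{p}

which is still linear in the parameters \theta, so all the functions defined above apply unchanged.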

Adding Higher Polynomial Features

Transform X into Polynomial Features and Normalize

p = 11

X.poly = poly(X, p)

fn <- Normalize(X.poly)

X.poly <- fn$X

X.poly <- cbind(rep(1,nrow(X.poly)),X.poly) #Adding bias term

mu <- fn$mu

sigma <- fn$sigma

Transform Xtest into Polynomial Features and normalize (using mu and sigma)

X.poly.test = poly(Xtest, p)

for(i in 1:p){
X.poly.test[,i] = X.poly.test[,i]-mu[i]
X.poly.test[,i] = X.poly.test[,i]/sigma[i]
}

X.poly.test <- cbind(rep(1,nrow(X.poly.test)),X.poly.test) #Adding bias term

Transform Xval into Polynomial Features and normalize (using mu and sigma)

X.poly.val = poly(Xval, p)

for(i in 1:p){
  X.poly.val[,i] = X.poly.val[,i]-mu[i]
  X.poly.val[,i] = X.poly.val[,i]/sigma[i]
}

X.poly.val <- cbind(rep(1,nrow(X.poly.val)),X.poly.val) #Adding bias term


cat('Normalized Record 1:\n')
## Normalized Record 1:
X.poly[1,]
##  [1]  1.0000000  0.8481296 -0.4429418  0.3967849 -0.4925327  0.2686365
##  [7] -0.4166886  0.2440979 -0.3473548  0.2357370 -0.2985897  0.2278379

Cross Validation for Polynomial Regression

lambda = 0
theta = trainModel(X.poly, y, lambda)
theta <- theta$theta

plot(X, y, col='red', lwd = 1.5,
     xlab = "Experiment", 
     ylab = 'Effect',
     main=paste0('Polynomial Model Fit ','(lambda = ',lambda,')'))
modelPlot(min(X), max(X), mu, sigma, theta, p)

lc <- crossValidation(X.poly, y,X.poly.val, yval, lambda)
train.cost = lc$train.cost
valid.cost = lc$valid.cost

plot(1:length(train.cost), train.cost, ylim = c(0,2),
     xlab ='Number of training records', ylab='Costs', type = 'l',col="red")
lines(1:length(valid.cost), valid.cost, col="blue")
legend("topright",legend=c('Training', 'Cross Validation'), lwd = 1, lty=c(1,1),col=c("red","blue"))

df <- data.frame(train.cost, valid.cost,row.names = NULL)
df
##      train.cost   valid.cost
## 1  1.692218e-30 1.497701e+00
## 2  1.840049e-24 1.763859e+00
## 3  3.809049e-23 1.226121e-02
## 4  5.840967e-12 1.649018e-01
## 5  2.626279e-11 1.011731e-01
## 6  8.645454e-10 6.513967e-02
## 7  6.208707e-10 6.849404e-02
## 8  1.623302e-09 1.660143e-01
## 9  8.812272e-08 2.655083e-01
## 10 1.381842e-07 3.296597e-01
## 11 7.789076e-08 2.520524e-01
## 12 2.511082e-07 2.521174e-01
## 13 1.662805e-08 1.971682e-02
## 14 1.433622e-07 1.503060e-01
## 15 1.537551e-07 1.381756e-01
## 16 1.848025e-07 1.604831e-01
## 17 1.919797e-07 1.756369e-01
## 18 7.570572e-08 7.017080e-02
## 19 3.166883e-07 1.836446e-01
## 20 2.099767e-07 1.540647e-01
## 21 1.182953e-07 7.385023e-02
## 22 2.153670e-07 1.274156e-01
## 23 1.137101e-07 7.745879e-02
## 24 1.121845e-07 7.330662e-02
## 25 1.163703e-07 7.644229e-02
## 26 1.945566e-07 1.298087e-01
## 27 1.478446e-07 1.113172e-05
## 28 1.809334e-07 2.557291e-04
## 29 1.624464e-07 1.177070e-04
## 30 1.653354e-07 1.298465e-04
## 31 2.469471e-07 1.391577e-04
## 32 5.073651e-07 1.559152e-03
## 33 1.766576e-07 9.718768e-05
## 34 2.983483e-07 5.923543e-05
## 35 3.602593e-07 7.906708e-05
## 36 1.079405e-07 1.964334e-04
## 37 9.462420e-08 1.888349e-04
## 38 3.378960e-07 4.277911e-05
## 39 3.004376e-07 2.064315e-04
## 40 9.696974e-08 1.740224e-04
## 41 3.771527e-07 9.689354e-05
## 42 2.574060e-07 6.972879e-05
## 43 1.837707e-07 1.200619e-04
## 44 1.702496e-07 1.413903e-04
## 45 1.755394e-07 1.191719e-04
## 46 4.219377e-07 3.530114e-05
## 47 3.924733e-07 4.133957e-05
## 48 8.644382e-07 2.325064e-05
## 49 6.675353e-07 2.282897e-05
## 50 5.018120e-07 4.772265e-05
## 51 2.590419e-07 4.537684e-05
## 52 2.158058e-07 6.636912e-05
## 53 5.164929e-07 2.336941e-05
## 54 2.766608e-07 6.541890e-05
## 55 4.674697e-07 4.271593e-05
## 56 5.264396e-07 2.681352e-05
## 57 3.921298e-07 5.643553e-05
## 58 4.544231e-07 3.987351e-05
## 59 5.714429e-07 2.181708e-05

As we can see, the model has overfit the data. One way to counter overfitting is to introduce a regularization term.

Selecting optimum Lambda

lambda.val = c(0, 0.001, 0.002, 0.01, 0.02, 0.1, 0.2, 1, 2, 10, 20)
vald <- optLambda(X.poly, y, X.poly.val, yval, lambda.val)

lambda.val <- vald$lambda.val
train.cost <- vald$train.cost
valid.cost <- vald$valid.cost
theta <- vald$theta

plot(lambda.val, train.cost, xlab='lambda', ylab='Cost', type = 'l',col="red")
lines(lambda.val, valid.cost, col="blue")
legend("topright",legend=c('Training', 'Cross Validation'), lwd = 1, lty=c(1,1),col=c("red","blue"))

df <- data.frame(lambda.val, train.cost, valid.cost)
df
##    lambda.val   train.cost   valid.cost
## 1       0e+00 5.714429e-07 2.181708e-05
## 2       1e-03 7.055049e-07 1.753199e-05
## 3       2e-03 7.286189e-07 9.839364e-06
## 4       1e-02 2.668830e-06 2.145467e-05
## 5       2e-02 6.389923e-06 1.161987e-04
## 6       1e-01 4.314566e-05 5.873152e-04
## 7       2e-01 8.153668e-05 4.989860e-04
## 8       1e+00 2.764711e-04 5.345506e-04
## 9       2e+00 4.950950e-04 2.370073e-03
## 10      1e+01 2.930996e-03 2.054137e-02
## 11      2e+01 6.720245e-03 4.268544e-02
theta
##            [,1]       [,2]       [,3]          [,4]      [,5]         [,6]
##  [1,] 0.4718562 -0.2719980 0.04676559  0.0004042597 0.6099101 -0.017455450
##  [2,] 0.4718616 -0.2707171 0.04980207 -0.0051433648 0.6001699 -0.011664797
##  [3,] 0.4718555 -0.2710717 0.05065001 -0.0032369844 0.5983282 -0.013402371
##  [4,] 0.4718563 -0.2695476 0.06364271 -0.0118711389 0.5580440  0.002428919
##  [5,] 0.4718563 -0.2686524 0.07549709 -0.0158698316 0.5237670  0.009451362
##  [6,] 0.4718563 -0.2653539 0.11941639 -0.0179480856 0.4159750  0.001156948
##  [7,] 0.4718562 -0.2605315 0.14259015 -0.0250519295 0.3664384 -0.005648855
##  [8,] 0.4718569 -0.2401197 0.19450192 -0.0563069293 0.2774883 -0.011296353
##  [9,] 0.4718563 -0.2278716 0.21362390 -0.0655929734 0.2505640 -0.013406771
## [10,] 0.4718570 -0.1782921 0.22487381 -0.0759614763 0.1978033 -0.030825727
## [11,] 0.4718562 -0.1477648 0.20485989 -0.0762213497 0.1730292 -0.041510512
##            [,7]          [,8]        [,9]        [,10]       [,11]        [,12]
##  [1,] 0.1590666  0.0348346249 -0.06132644  0.026034959 -0.04489261 -0.064142283
##  [2,] 0.1658679  0.0358205782 -0.05593344  0.024003735 -0.04917703 -0.063498688
##  [3,] 0.1649822  0.0339171024 -0.05449174  0.024158106 -0.04745203 -0.060249360
##  [4,] 0.1906244  0.0275122168 -0.03175709  0.013344632 -0.06543944 -0.048765880
##  [5,] 0.2072500  0.0214537346 -0.01227303  0.008140271 -0.07287281 -0.035306791
##  [6,] 0.2258309  0.0029181418  0.04871520  0.013149089 -0.04897836  0.023974844
##  [7,] 0.2240594  0.0009775677  0.07428020  0.021534374 -0.02493468  0.047479342
##  [8,] 0.1987069  0.0109337452  0.10456051  0.035070142  0.02777985  0.060947668
##  [9,] 0.1815572  0.0111999567  0.10371419  0.031857829  0.03956760  0.051792250
## [10,] 0.1405794 -0.0088217106  0.08857517  0.005644218  0.04798311  0.017199783
## [11,] 0.1248937 -0.0233774928  0.08333742 -0.011224968  0.05144271 -0.001618515

As we can see, the validation cost is lowest for lambda = 0.002. Therefore, we select the theta values corresponding to lambda = 0.002, i.e. the 3rd row, theta[3,]. We do not use regularization when evaluating on the test data.

Let’s calculate the cost on the test data, setting lambda = 0.

cost <- costFunction(X.poly.test, ytest, theta[3,], lambda=0)

cat('Cost of test data using theta values for lambda = 0.002:',cost, "\n")
## Cost of test data using theta values for lambda = 0.002: 9.549702e-07
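The outline at the top lists a predict function. In the next section, predictions are computed directly by matrix multiplication; for completeness, a small wrapper (a sketch only, not part of the original code, assuming the poly() function defined above and the training mu, sigma and degree p) could look like this:

predictEffect <- function(x, theta, mu, sigma, p){
  
  X.poly = poly(as.matrix(x), p) #Polynomial terms of the new data
  
  for(i in 1:p){
    X.poly[,i] = (X.poly[,i] - mu[i])/sigma[i] #Normalize with the training mu and sigma
  }
  
  X.poly = cbind(rep(1, nrow(X.poly)), X.poly) #Add bias term
  
  X.poly %*% theta #Predicted response values
}

For example, predictEffect(Xtest, theta[3,], mu, sigma, p) should reproduce the pred.test values computed below.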

Predictions

Let’s see how the model has done on the validation and test data.

pred.test = X.poly.test%*%theta[3,]
df.test = data.frame(ytest, pred.test)
df.test
##           ytest    pred.test
## 1  -0.086860931 -0.087009032
## 2  -0.070731683 -0.071542451
## 3  -0.001495037 -0.003878185
## 4   0.258682762  0.260004062
## 5   3.667941127  3.667521500
## 6   0.110443078  0.108017968
## 7  -0.097241981 -0.096671944
## 8   0.087963528  0.085600909
## 9   0.075487926  0.073180007
## 10  0.221998098  0.219813305
## 11  0.081498446  0.080194369
## 12  0.163884745  0.164418787
## 13  0.169575876  0.170199380
## 14  0.840536180  0.839515390
## 15  1.658490703  1.658414002
## 16 -0.114621340 -0.113421708
## 17  0.405718935  0.406599033
## 18  0.410398100  0.411251841
## 19 -0.058407489 -0.059661317
## 20  0.503356831  0.503662151
#pred.train = X.poly%*%theta[3,]
#df.train = data.frame(y, pred.train)
#df.train

pred.val = X.poly.val%*%theta[3,]
df.val = data.frame(yval, pred.val)
df.val
##           yval    pred.val
## 1   0.31969428  0.31823229
## 2   1.35273273  1.35200300
## 3   3.90193978  3.90168255
## 4  -0.10625291 -0.10550927
## 5   0.28334866  0.28468469
## 6   0.26803921  0.26937278
## 7   5.39522654  5.41214070
## 8   0.10078797  0.09996767
## 9  -0.04535178 -0.04617349
## 10  5.05634543  5.06579571
## 11  0.10703730  0.10637306
## 12  2.98731094  2.98758038
## 13  0.05717282  0.05531270
## 14  0.22135088  0.22252376
## 15  0.35131289  0.35014563
## 16  0.13182187  0.13174109
## 17  1.34153674  1.34078501
## 18  2.53584707  2.53657788
## 19  0.93626048  0.93512935
## 20 -0.10776809 -0.10682980

As we can see, the model has performed extremely well on both the test and validation data. If you would like to see the results on the training data, remove the # signs and run the code.

