Lab_SVM_3_RM
Asmi Ariv
2022-10-05
Machine Learning - SVM kernel functions and kernlab package
We have seen how an SVM model is built in R using the e1071 package and its function “svm()”. In this lab, we will see how to define linear and RBF (or Gaussian) kernel functions.
The first part of this lab (defining kernel functions) is optional; it is meant to give you some insight into how these kernels work. Just as with the machine learning algorithms we built in previous sections, we could build a fully functional SVM by defining an optimizer, but that requires a lot of coding and is beyond the scope of a basic course like this.
In this lab, we will also go through another very useful package, “kernlab”, and build a support vector machine model with its function ksvm() on the dataset promotergene, which ships with the package. We will also generate simulated data, build an SVM model on it, and see what its plot looks like.
Linear Kernel
This kernel is mainly used for data sets whose classes are linearly separable.
linear.Kernel <- function(x1, x2){
# It is a similarity function
# Calculates the similarity between x1 and x2
# Returns the value
# x1 and x2 must be column vectors
x1 = as.matrix(x1)
x2 = as.matrix(x2)
# Computing the similarity using dot product
lin.sim = t(x1)%*%x2 # dot product
lin.sim
}

Vectorized version of Linear Kernel
linear.Kernel.vec <- function(X){
# Computes the kernel on every pair of examples
# X must be matrix of different variables
# Computing the similarity on every pair in X
K.lin = (X)%*%t(X)
K.lin
}

Radial basis kernel function (RBF) or Gaussian Kernel
This kernel is used for data sets whose classes are separable only by non-linear boundaries.
RBF.Kernel <- function(x1, x2, sigma){
# It is a similarity function
# Calculates the similarity between x1 and x2
# Returns the value
# sigma is a tuning (bandwidth) parameter that controls the non-linearity of the model
# Lower values of sigma make the decision boundary more non-linear
# It also determines how fast the similarity reaches 0 as data points move further apart
# Ensure that x1 and x2 are column vectors
x1 <- as.matrix(x1)
x2 <- as.matrix(x2)
rbf.sim <- exp(-(t(x1-x2)%*%(x1-x2))/(2*sigma^2)) # squared L2 norm in the exponent
rbf.sim
}

Vectorized Radial basis kernel function (RBF) or Gaussian Kernel
RBF.Kernel.vec <- function(X, sigma){
#Vectorized RBF Kernel
# This is equivalent to computing the kernel on every pair of examples
X2 <- matrix(rowSums(X^2))
K.rbf <- sweep(sweep(-2*X%*%t(X),2,t(X2),FUN="+"),1,X2,FUN="+")
K.rbf <- exp(-K.rbf/(2*sigma^2)) # apply the Gaussian elementwise to the squared distances
K.rbf
}

Implementing Linear Kernel
Let’s see how our linear kernel works on some data.
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2);
lin.sim = linear.Kernel(x1, x2)
cat('Linear Kernel between two 4-dimensional data points x1 = c(5, 3, 2, 9) and x2 = c(1, 9, -6, 2):','\n', lin.sim)
## Linear Kernel between two 4-dimensional data points x1 = c(5, 3, 2, 9) and x2 = c(1, 9, -6, 2): 
##  38

Implementing RBF Kernel
Let’s see how our RBF kernel works on some data.
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2); sigma = 2;
rbf.sim = RBF.Kernel(x1, x2, sigma)
cat('RBF Kernel between two 4-dimensional data points x1 = c(5, 3, 2, 9) and x2 = c(1, 9, -6, 2), with sigma = 2:','\n', rbf.sim)
## RBF Kernel between two 4-dimensional data points x1 = c(5, 3, 2, 9) and x2 = c(1, 9, -6, 2), with sigma = 2: 
##  1.103256e-09

Implementing Vectorised Linear Kernel
Let’s see how our vectorised linear kernel works on some data.
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2);
X = rbind(x1, x2)
k.lin = linear.Kernel.vec(X)
cat('Vectorised Linear Kernel on the 4-dimensional matrix X:','\n'); k.lin
## Vectorised Linear Kernel on the 4-dimensional matrix X: 
##     x1  x2
## x1 119  38
## x2  38 122

In the vectorized version of the linear kernel, we get a kernel matrix of dimension nrow(X)-by-nrow(X), because the similarity is computed for every pair of rows. We would get the same result by running the non-vectorised version inside two for loops, calculating the similarity for each pair. Each pair appears twice because of the two loops, and there is also an entry for each record’s similarity with itself. For example, with two records x1 and x2, either the vectorised version or the two-for-loop version calculates the similarities of the pairs x1-x1, x1-x2, x2-x1 and x2-x2. The diagonal values are similarities of a record with itself (e.g. x1-x1, x2-x2), which is why they are large.
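The claims above about the kernel matrix can be checked directly in base R; a minimal sketch using the same two rows:

```r
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2)
X = rbind(x1, x2)
K.lin = X %*% t(X)   # same computation as linear.Kernel.vec(X)

# Diagonal entries are each record's similarity with itself (its squared norm)
all(diag(K.lin) == rowSums(X^2))          # TRUE: 119 and 122
# Every pair appears twice, so the matrix is symmetric
K.lin["x1", "x2"] == K.lin["x2", "x1"]    # TRUE: both are 38
```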
Implementing Vectorised RBF Kernel
Let’s see how our vectorised RBF kernel works on some data.
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2); sigma = 2;
X = rbind(x1, x2)
k.rbf = RBF.Kernel.vec(X, sigma)
cat('Vectorized RBF Kernel on the 4-dimensional matrix X, with sigma = 2:','\n'); k.rbf
## Vectorized RBF Kernel on the 4-dimensional matrix X, with sigma = 2: 
##              x1           x2
## x1 1.000000e+00 1.103256e-09
## x2 1.103256e-09 1.000000e+00

As with the linear kernel, the vectorized RBF kernel returns an nrow(X)-by-nrow(X) matrix of pairwise similarities, identical to what two nested for loops over the non-vectorised function would produce. The diagonal values are similarities of a record with itself, and for the RBF kernel these are exactly 1: an RBF similarity can never exceed 1.
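The RBF-specific claims (diagonal of ones, maximum value 1) can likewise be checked with a loop-free base-R sketch that computes exp(-||xi - xj||^2 / (2*sigma^2)) directly:

```r
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2); sigma = 2
X  = rbind(x1, x2)
D2 = as.matrix(dist(X))^2          # pairwise squared Euclidean distances
K.rbf = exp(-D2 / (2 * sigma^2))   # Gaussian similarity for every pair

all(diag(K.rbf) == 1)   # TRUE: zero distance to itself gives exp(0) = 1
max(K.rbf) <= 1         # TRUE: similarities can never exceed 1
```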
Exercise: Try writing functions that use two for loops with the non-vectorized RBF and linear kernels, and verify the results against the vectorized kernels. You can call the non-vectorized function directly inside the second for loop. You can use the same matrix X (with two rows x1 and x2) that we used above.
Verifying our functions with the kernlab package
library(kernlab)
k.lin = vanilladot() #Creating linear kernel function
k.lin(x1, x2)
##      [,1]
## [1,]   38

k.rbf = rbfdot(sigma=1/8) #Creating rbf kernel function:
#kernlab's equation is exp(-σ ||x - x'||^2), while our equation is
#exp(-||x - x'||^2/(2σ^2)), hence kernlab's σ = 1/(2σ^2) with our sigma
k.rbf(x1, x2)
##              [,1]
## [1,] 1.103256e-09

As we can see, the results of our functions and those of the kernlab functions are the same.
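The sigma conversion in the comment above can be sanity-checked without kernlab; a base-R sketch under the stated equations:

```r
x1 = c(5, 3, 2, 9); x2 = c(1, 9, -6, 2)
our.sigma = 2
kernlab.sigma = 1 / (2 * our.sigma^2)   # = 1/8, the value passed to rbfdot()

d2 = sum((x1 - x2)^2)                   # squared Euclidean distance
# kernlab's form exp(-sigma * d2) equals our form exp(-d2 / (2 * sigma^2))
exp(-kernlab.sigma * d2) == exp(-d2 / (2 * our.sigma^2))   # TRUE
```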
Exercise: Try different sets of values/vectors/matrices, calculate the RBF and linear kernels using the functions we have defined, and verify them against the kernlab functions.
SVM using kernlab package
We will use the package kernlab to build an SVM model on the dataset promotergene (part of the package). You need to install the package.
The dataset promotergene is a data frame with 106 observations and 58 variables. The first variable, Class, is a factor with levels + for a promoter gene and - for a non-promoter gene. The remaining 57 variables, V2 to V58, are factors describing the sequence. The DNA bases are coded as follows: a = adenine, c = cytosine, g = guanine, t = thymine.
data(promotergene)
dim(promotergene)
## [1] 106  58
promotergene[1:4,1:3]
##   Class V2 V3
## 1     +  g  c
## 2     +  a  t
## 3     +  c  c
## 4     +  t  c

106 records with 58 features.
All the features are factor variables.
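That every column really is a factor can be confirmed with a quick check (a minimal sketch, assuming kernlab is installed so the dataset loads):

```r
library(kernlab)   # provides the promotergene dataset
data(promotergene)

# Every variable, including the Class label, is stored as a factor
all(sapply(promotergene, is.factor))   # TRUE

# Distribution of the two classes (+ promoter, - non-promoter)
table(promotergene$Class)
```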
Train and Test dataset
m = nrow(promotergene)
set.seed(1); train.idx = sample(1:m, 0.8*m, replace=F)
train = promotergene[train.idx,]
test = promotergene[-train.idx,]
dim(train)
## [1] 84 58
dim(test)
## [1] 22 58

SVM model
We will use the function ksvm from the package to build the SVM model.
svm.model = ksvm(Class~.,data=train,kernel="rbfdot",kpar="automatic",C=60,cross=3,prob.model=TRUE)
svm.model
## Support Vector Machine object of class "ksvm"
##
## SV type: C-svc (classification)
## parameter : cost C = 60
##
## Gaussian Radial Basis kernel function.
## Hyperparameter : sigma = 0.0160353535353535
##
## Number of Support Vectors : 77
##
## Objective Function Value : -43.9461
## Training error : 0
## Cross validation error : 0.154762
## Probability model included.

So the model has used 77 support vectors with sigma = 0.016, and its training error is 0.
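kernlab also provides accessor functions to pull these quantities out of a fitted ksvm object programmatically. A minimal sketch, refitting a model like the one above (exact numbers can vary slightly across kernlab versions):

```r
library(kernlab)
data(promotergene)

set.seed(1)
m <- nrow(promotergene)
train.idx <- sample(1:m, floor(0.8 * m))
fit <- ksvm(Class ~ ., data = promotergene[train.idx, ],
            kernel = "rbfdot", kpar = "automatic", C = 60, cross = 3)

nSV(fit)    # number of support vectors
error(fit)  # training error
cross(fit)  # cross-validation error
```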
Performance of test data
pred.test = predict(svm.model, newdata=test)
table(test$Class, pred.test)
##    pred.test
##       +  -
##   + 11  1
##   -  0 10

cat('\n Accuracy on test data set:\n',mean((pred.test == test$Class))*100, '\n')
## 
##  Accuracy on test data set:
##  95.45455 

An accuracy of 95.45% is quite good; hence the model has performed well.
Since it is a multi-dimensional data set with all variables as factors, plotting the model may not be possible.
However, if you wish to see how this package could be used to plot a model, we can generate some simulated data.
Plotting SVM model with kernlab package
Let’s generate a simulated data and build an SVM model to plot it.
set.seed(1)
x <- rbind(matrix(rnorm(120), ncol=2), matrix(rnorm(120, mean=3), ncol=2))
y <- matrix(c(rep(1,60),rep(-1,60)))
svm.model2 <- ksvm(x,y,type="C-svc")
svm.model2## Support Vector Machine object of class "ksvm"
##
## SV type: C-svc (classification)
## parameter : cost C = 1
##
## Gaussian Radial Basis kernel function.
## Hyperparameter : sigma = 1.38298912592912
##
## Number of Support Vectors : 31
##
## Objective Function Value : -9.4017
## Training error : 0

plot(svm.model2, data=x)

See how beautifully the model is plotted with different shades of two colors (representing the two classes -1 and +1). All the solid data points (triangles as well as circles) are support vectors.