SVM with multi-class

 

In Lab 1, we used a simulated data set with a two-class response variable to build an SVM classifier. In this lab, we will use another data set, with a multi-class response variable, to build an SVM model.

We will use the Khan data set from the ISLR package. We used this data set in Lab 2 of Logistic Regression; kindly go through that lab to learn more about it. Briefly, the data set consists of tissue samples from four types of small round blue cell tumors (cancer types).

Let’s load packages for this lab

library(ISLR)
library(e1071)

Loading data

Let’s load the data

names(Khan)
## [1] "xtrain" "xtest"  "ytrain" "ytest"
dim(Khan$xtrain)
## [1]   63 2308
dim(Khan$xtest)
## [1]   20 2308
length(Khan$ytrain)
## [1] 63
length(Khan$ytest)
## [1] 20

So, the data set is already split into train and test.

The 2,308 columns represent expression measurements for that many genes.

As we can see, the number of features is much larger than the number of records. In such cases the classes are usually easily separable, so a simple linear hyperplane (i.e., a linear kernel) is a sensible starting point.

train = data.frame(X = Khan$xtrain, y=as.factor(Khan$ytrain))
test = data.frame(X = Khan$xtest, y=as.factor(Khan$ytest))

Training the SVM model

svm.model1 = svm(y ~ ., data = train, kernel = "linear", cost = 5)

summary(svm.model1)
## 
## Call:
## svm(formula = y ~ ., data = train, kernel = "linear", cost = 5)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  5 
## 
## Number of Support Vectors:  58
## 
##  ( 20 20 11 7 )
## 
## 
## Number of Classes:  4 
## 
## Levels: 
##  1 2 3 4

Plotting support vectors

Let’s plot the 58 support vectors used by SVM to build the model.

(Note: we are not plotting the model itself as done in Lab 1; we are only highlighting the support vectors on a scatter plot of the first two genes.)

plot(train$X.1, train$X.2, col = as.integer(train$y))

points(train[svm.model1$index, ], pch = 5, cex = 2)  # svm.model1$index returns the row indices of the support vectors

The points marked with diamonds are the support vectors used by the SVM to train the model.

Predictions

Train Accuracy

pred.train = fitted(svm.model1)

table(train$y, pred.train)
##    pred.train
##      1  2  3  4
##   1  8  0  0  0
##   2  0 23  0  0
##   3  0  0 12  0
##   4  0  0  0 20
cat('\n Accuracy on train data set:\n',mean((pred.train == train$y))*100, '\n')
## 
##  Accuracy on train data set:
##  100

The model has performed extremely well on the train data, with 100% accuracy.

Test Accuracy

pred.test = predict(svm.model1, newdata=test, type="class")

table(test$y, pred.test)
##    pred.test
##     1 2 3 4
##   1 3 0 0 0
##   2 0 6 0 0
##   3 0 2 4 0
##   4 0 0 0 5
cat('\n Accuracy on test data set:\n',mean((pred.test == test$y))*100, '\n')
## 
##  Accuracy on test data set:
##  90

The model has also done well on the test data, with 90% accuracy; the only errors are two class-3 samples misclassified as class 2.

The e1071 package comes with a function tune(), which performs 10-fold cross-validation on a set of models built with different values of the parameters supplied by the user. For example, for the SVM classifier we are building here, we can provide different values of cost.

tune() will then build a model for each supplied parameter value and return the best one.

Following are some of the parameters we can supply to the tune() function:

  • cost: the regularization parameter C, which penalizes margin violations, much like the regularization terms we have used in other machine learning algorithms
  • gamma: a kernel coefficient used by the non-linear kernels (radial basis, polynomial, and sigmoid)
  • degree: the degree of the polynomial kernel
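
As an illustration of how gamma enters the grid, a radial kernel could be tuned over cost and gamma together. This is just a sketch; the grid values below are illustrative choices, not values used in this lab:

tune.radial = tune(svm, y ~ ., data = train, kernel = "radial",
                   ranges = list(cost = c(0.1, 1, 10),
                                 gamma = c(0.001, 0.01, 0.1)))
summary(tune.radial)  # cross-validation error for every cost/gamma pair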

Let’s provide different values of cost: 0.001, 0.01, 0.1, 1, 10, and 100.

tune.mod = tune(svm, y ~ ., data = train, kernel = "linear",
                ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 100)))

Summary of tune.mod

summary(tune.mod)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##   cost
##  0.001
## 
## - best performance: 0.01666667 
## 
## - Detailed performance results:
##    cost      error dispersion
## 1 1e-03 0.01666667 0.05270463
## 2 1e-02 0.01666667 0.05270463
## 3 1e-01 0.01666667 0.05270463
## 4 1e+00 0.01666667 0.05270463
## 5 1e+01 0.01666667 0.05270463
## 6 1e+02 0.01666667 0.05270463

If you look at the error column, all the models have performed equally well. Let’s see which model the tune() function selected as the best.

svm.best = tune.mod$best.model
summary(svm.best)
## 
## Call:
## best.tune(method = svm, train.x = y ~ ., data = train, ranges = list(cost = c(0.001, 
##     0.01, 0.1, 1, 10, 100)), kernel = "linear")
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.001 
## 
## Number of Support Vectors:  58
## 
##  ( 20 20 11 7 )
## 
## 
## Number of Classes:  4 
## 
## Levels: 
##  1 2 3 4

As we can see, tune() has selected the model with the smallest cost, i.e. 0.001, as the best model: when several parameter values tie on cross-validation error, it returns the first one in the grid.
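
Since all six candidates tied on cross-validation error, you can confirm the chosen parameters directly from the tuning object:

tune.mod$best.parameters  # a one-row data frame; here it contains cost = 0.001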

Let’s see the performance of this model on the test data:

pred.test = predict(svm.best, newdata=test, type="class")

table(test$y, pred.test)
##    pred.test
##     1 2 3 4
##   1 3 0 0 0
##   2 0 6 0 0
##   3 0 2 4 0
##   4 0 0 0 5
cat('\n Accuracy on test data set:\n',mean((pred.test == test$y))*100, '\n')
## 
##  Accuracy on test data set:
##  90

It looks the same as before, which is unsurprising: all the models trained by tune() had the same cross-validation error and therefore similar performance on the training and test data.

Exercise: Try different kernels, such as “radial”, “polynomial”, or “sigmoid”, and use the tune() function with different values of cost, gamma (for all kernels except linear), and degree (polynomial only) to see whether the performance on the test data improves.
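
As a starting point for the exercise, a polynomial kernel could be tuned over cost, gamma, and degree at once. This is only a sketch with illustrative grid values, not a worked solution:

tune.poly = tune(svm, y ~ ., data = train, kernel = "polynomial",
                 ranges = list(cost = c(0.1, 1, 10),
                               gamma = c(0.001, 0.01),
                               degree = c(2, 3)))
summary(tune.poly)

pred.poly = predict(tune.poly$best.model, newdata = test)
mean(pred.poly == test$y) * 100  # test accuracy in percent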

