
R Programming 1

 

R Programming

In this lab, we will go through some basic operations in R.

Simple Calculator (try the following in R):

11 + 12

20 - 30

2*3

10/5

1+
## Error: <text>:10:0: unexpected end of input
## 8: 
## 9: 1+
##   ^

Mathematical Operations (try the following in R):

11*9*(2+3+5)/6
## [1] 165
2.334 - 0.297
## [1] 2.037
10^2; 10**2 #Separate statements on one line with a semicolon (;); use a hash (#) for comments
## [1] 100
## [1] 100
40%%10; 40%%9
## [1] 0
## [1] 4
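
The %% operator above returns the remainder of a division. As a small sketch of how it pairs with integer division (%/%), the following splits 40 divided by 9 into a whole-number quotient and a remainder. The variable names q and r are only illustrative.

```r
# Integer quotient and remainder of 40 divided by 9
q <- 40 %/% 9   # integer division: how many whole times 9 fits into 40
r <- 40 %% 9    # remainder left over after that division
q; r
q * 9 + r       # quotient * divisor + remainder gives back 40
```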

Relational/Logical Operations (try the following in R):

4 < 5
## [1] TRUE
2 <= 3
## [1] TRUE
4 > 5
## [1] FALSE
6 >= 3
## [1] TRUE
4 == 4
## [1] TRUE
3 != 2
## [1] TRUE
! (2==2)
## [1] FALSE
(2==2) | (2==3)
## [1] TRUE
(2==2) & (2==3)
## [1] FALSE
(6>=9 & 7<9)
## [1] FALSE
(6>=9 | 7<9)
## [1] TRUE

Creating and using variables (try the following in R):

price = 100; price
## [1] 100
quantity = 10
sale = price * quantity ; sale
## [1] 1000

Examples of some inbuilt functions (try the following in R):

sqrt(9) # Square root of a numerical value
## [1] 3
abs(-10) # Absolute value (non-negative value) of a number without its sign (+ or -)
## [1] 10
round(4.6789) #Rounds off to the nearest whole number; try another number of your choice
## [1] 5
round(4.6789, 3)
## [1] 4.679
round(x = 4.6789, digits = 3) #Same call with named arguments; the second argument (digits) is the number of decimal places
## [1] 4.679
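
Besides round(), base R has a few related rounding functions. The sketch below applies three of them to the same number used above; the variable name x is only illustrative.

```r
x <- 4.6789
floor(x)       # largest whole number that is not greater than x
ceiling(x)     # smallest whole number that is not less than x
signif(x, 3)   # keeps 3 significant digits instead of 3 decimal places
```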

Creating and using vectors (try the following in R):

monthly.income = c(10, 20, 60, 90, 40, 30, 0, 0, 80, 100, 25, 50)
monthly.income
##  [1]  10  20  60  90  40  30   0   0  80 100  25  50
monthly.income[5] #Returns the value of fifth index
## [1] 40
monthly.income[1:3] #Returns the values of 1st  to 3rd indices
## [1] 10 20 60
monthly.income[c(3,6,7)] #Returns the values of 3rd , 6th , and 7th indices
## [1] 60 30  0
monthly.income[7] = 35 #Changes the value of 7th index to 35
monthly.income[7]
## [1] 35
length(monthly.income) #Returns the size of the variable
## [1] 12
A = seq(0, 2, by = 0.5) #Creates a variable of sequence 0 to 2 incremented by 0.5
A
## [1] 0.0 0.5 1.0 1.5 2.0
B = rep(3, 5) #Creates a variable with value 3 repeated 5 times
B
## [1] 3 3 3 3 3
C = rep(c(3,1), 5) #Creates a variable with values (3,1) repeated 5 times
C
##  [1] 3 1 3 1 3 1 3 1 3 1
D = 5:10 #Sequence of numbers from 5 to 10
D
## [1]  5  6  7  8  9 10
E = c(A, B, C, D) # Concatenates all the variables into one
E
##  [1]  0.0  0.5  1.0  1.5  2.0  3.0  3.0  3.0  3.0  3.0  3.0  1.0  3.0  1.0  3.0
## [16]  1.0  3.0  1.0  3.0  1.0  5.0  6.0  7.0  8.0  9.0 10.0
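
seq() and rep() accept other arguments besides the ones shown above. The following is a small sketch of two of them, length.out and each; the variable names are only illustrative.

```r
# Ask for a fixed number of evenly spaced values instead of a step size
A2 <- seq(0, 2, length.out = 5)
A2

# Repeat each value in turn, instead of repeating the whole vector
C2 <- rep(c(3, 1), each = 2)
C2
```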

Vector operations (try the following in R):

monthly.income*10 #Multiplies every value by 10
##  [1]  100  200  600  900  400  300  350    0  800 1000  250  500
monthly.income/10 #Divides every value by 10
##  [1]  1.0  2.0  6.0  9.0  4.0  3.0  3.5  0.0  8.0 10.0  2.5  5.0
monthly.days.worked = c(23, 20, 23, 24, 22, 21, 20, 24, 21, 25, 22, 21)

monthly.income/ monthly.days.worked #division between corresponding values
##  [1] 0.4347826 1.0000000 2.6086957 3.7500000 1.8181818 1.4285714 1.7500000
##  [8] 0.0000000 3.8095238 4.0000000 1.1363636 2.3809524

Exercise: Try the following:

Create a numerical vector with fewer than 12 values (i.e. fewer than 12 indices) and then divide, multiply, add and subtract using your vector and monthly.income. Try to understand what is happening.
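
As a hint for this exercise, here is a minimal sketch with two small made-up vectors (long.vec and short.vec are illustrative names, not part of the lab data). When the lengths differ, R reuses ("recycles") the shorter vector from its beginning until the longer one is covered.

```r
long.vec  <- c(10, 20, 30, 40, 50, 60)
short.vec <- c(1, 2, 3)

# short.vec is used twice: (1, 2, 3, 1, 2, 3)
recycled <- long.vec + short.vec
recycled
```

If the longer length is not a multiple of the shorter one, R still recycles but prints a warning.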

Examples of some functions on vectors (try the following in R):

mean(monthly.days.worked) #Returns mean or average
## [1] 22.16667
sum(monthly.days.worked) #Returns sum
## [1] 266
median(monthly.days.worked) #Returns median
## [1] 22
m = mean(monthly.income)/mean(monthly.days.worked); m
## [1] 2.030075
s = sum(monthly.income)/sum(monthly.days.worked); s
## [1] 2.030075

Exercise: Try and understand why m and s return the same value.

median(monthly.income)/median(monthly.days.worked)
## [1] 1.704545

Character/String/Text Variables and Vectors (try the following in R):

nString = "Good Morning" #Warning: text copied from documents may contain curly “ ” marks, which R does not accept as string quotes and which may cause errors; it is good practice to type your code

nchar(nString) #Returns number of characters (including space) in the variable
## [1] 12
employee.names = c("Jack", "Ravi", "Neil", "Julie", "Reshma", "Robby", "Hellen", "Ninja", "Ajay", "Seth", "Rahul", "David")     #A vector of text  values
employee.names
##  [1] "Jack"   "Ravi"   "Neil"   "Julie"  "Reshma" "Robby"  "Hellen" "Ninja" 
##  [9] "Ajay"   "Seth"   "Rahul"  "David"

Exercise: Try the following:

Using the indexing method, i.e. [], display a single value as well as a set of values in employee.names. Using the same method, try to change the values at some of the indices.

Logical Variables and Vectors (try the following in R):

L = TRUE  #Unlike string variables, “ ” marks are not used here
L
## [1] TRUE
L = 4+9 < 5+3; L
## [1] FALSE
L = 5<4 & 3>2; L
## [1] FALSE
L = 5<4 | 3>2; L
## [1] TRUE
L = c(TRUE, FALSE, FALSE, TRUE, TRUE)
L
## [1]  TRUE FALSE FALSE  TRUE  TRUE

Exercise: Try the following:

Using relational and logical operations, create a logical vector of five values (i.e. only TRUE and FALSE) in one line of code.

Creating empty Vectors (try the following in R):

n=vector("numeric", 10); n  # Empty numeric vector, with ten indices
##  [1] 0 0 0 0 0 0 0 0 0 0
ch=vector("character", 10); ch    # Empty character vector, with ten indices (named ch rather than c, so it is not confused with the c() function)
##  [1] "" "" "" "" "" "" "" "" "" ""
l=vector("logical", 10); l      # Empty logical vector, with ten indices
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Exercise: Try the following:

Try assigning different values to all the indices of all these empty vectors. Choose an index higher than 10 (e.g. 11th , 12th , etc.) for each vector and try assigning a value to it and see what happens. Try assigning a value of different type other than the defined type for each empty vector (you can select any index) and see what happens.
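
As a starting point for this exercise, the sketch below uses a smaller vector (v and grown are illustrative names). It shows two behaviours worth looking for: assigning past the end of a vector makes it longer, and assigning a value of another type changes the type of the whole vector.

```r
v <- vector("numeric", 3)   # 0 0 0
v[5] <- 7                   # assign beyond the current length
grown <- v                  # keep a copy to inspect
grown                       # the skipped index 4 is filled with NA

v[1] <- "text"              # assign a character value to a numeric vector
class(v)                    # every element is now stored as character
```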

Matrices (try the following in R):

The following creates a 5-by-2 matrix using the matrix() function; by default the values are filled in column by column

M = matrix(c(1, 4, 5, 2, 6, 7, 10, 11, 9, 12), nrow=5, ncol=2)

A = c(2,3,4)
B = c(4,5,6)
C = cbind(A,B); C    #matrix of A and B as columns
##      A B
## [1,] 2 4
## [2,] 3 5
## [3,] 4 6
C = rbind(A,B); C    #matrix of A and B as rows
##   [,1] [,2] [,3]
## A    2    3    4
## B    4    5    6
M[,2] #Returns second column; rows and columns are separated by comma (,)
## [1]  7 10 11  9 12
M[1,] #Returns first row
## [1] 1 7

Exercise: Try the following:

Display the 1st and 5th rows. Display the first 3 rows.

M1 = t(M)   #Transpose of matrix M
M%*%M1  #Matrix multiplications 
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   50   74   82   65   90
## [2,]   74  116  130   98  144
## [3,]   82  130  146  109  162
## [4,]   65   98  109   85  120
## [5,]   90  144  162  120  180
M*M         #Element-wise product
##      [,1] [,2]
## [1,]    1   49
## [2,]   16  100
## [3,]   25  121
## [4,]    4   81
## [5,]   36  144
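
The order of the operands matters in matrix multiplication. As a brief sketch using the same matrix M as above (outer.prod and inner.prod are illustrative names), multiplying a 5-by-2 matrix by its 2-by-5 transpose gives a 5-by-5 result, while the reverse order gives a 2-by-2 result.

```r
M <- matrix(c(1, 4, 5, 2, 6, 7, 10, 11, 9, 12), nrow = 5, ncol = 2)

outer.prod <- M %*% t(M)   # (5 x 2) times (2 x 5)
inner.prod <- t(M) %*% M   # (2 x 5) times (5 x 2)

dim(outer.prod)
dim(inner.prod)
inner.prod
```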

Exercise: Try the following:

Try different operations (addition, subtraction, etc.) between two matrices, as well as between a matrix and a scalar value.

Subset of Vectors and matrices:

days.worked = c(23, 20, 23, 24, 22, 21, 20, 24, 21, 25, 22, 21)

employee.names = c("Jack", "Ravi", "Neil", "Julie", "Reshma", "Robby", "Hellen", "Ninja", "Ajay", "Seth", "Rahul", "David")

id = days.worked > 22 #Returns TRUE for indices of days worked greater than 22 and  
id                 #FALSE for indices of days worked less than or equal to 22
##  [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
employee.names[id] #Returns the names of all those who have worked more than 22 days
## [1] "Jack"  "Neil"  "Julie" "Ninja" "Seth"
Y = M>6 ; Y     #Based on the condition, returns the matrix of True and False
##       [,1] [,2]
## [1,] FALSE TRUE
## [2,] FALSE TRUE
## [3,] FALSE TRUE
## [4,] FALSE TRUE
## [5,] FALSE TRUE
M*Y         #Keeps the values greater than 6 and sets all others to 0 (TRUE counts as 1, FALSE as 0)
##      [,1] [,2]
## [1,]    0    7
## [2,]    0   10
## [3,]    0   11
## [4,]    0    9
## [5,]    0   12
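
A logical matrix can also be used directly inside [], in the same way as the logical vector id was used with employee.names. The sketch below, using the same M as above (big.values and big.rows are illustrative names), extracts only the matching values, and then selects whole rows based on a condition on the second column.

```r
M <- matrix(c(1, 4, 5, 2, 6, 7, 10, 11, 9, 12), nrow = 5, ncol = 2)

big.values <- M[M > 6]        # only the values greater than 6, as a plain vector
big.values

big.rows <- M[M[, 2] > 9, ]   # rows whose second-column value is greater than 9
big.rows
```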

Vectorized operations (try the following in R):

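apply() is an important function in R for vectorized operations: it applies a function to every row (second argument 1) or every column (second argument 2) of a matrix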

apply(M,2,mean)      #returns mean / average of all columns
## [1] 3.6 9.8
apply(M,1,mean)      #returns mean / average  of all rows
## [1] 4.0 7.0 8.0 5.5 9.0
apply(M,1,sd)           #returns standard deviation of all rows
## [1] 4.242641 4.242641 4.242641 4.949747 4.242641

Exercise: Using apply() try different operation on a matrix (sum, median, etc.)
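
The function passed to apply() does not have to be a built-in one. As a small sketch (col.range and col.means are illustrative names), the following passes a function written on the spot that computes the spread (maximum minus minimum) of each column, and compares apply() with the built-in shortcut colMeans().

```r
M <- matrix(c(1, 4, 5, 2, 6, 7, 10, 11, 9, 12), nrow = 5, ncol = 2)

# Custom function applied to every column
col.range <- apply(M, 2, function(x) max(x) - min(x))
col.range

# colMeans(M) gives the same result as apply(M, 2, mean)
col.means <- colMeans(M)
col.means
```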

Assigning names to matrices and vectors (try the following in R):

rownames(M) = c("a", "b", "c", "d", "e") #Assigning names to rows of a matrix
colnames(M) = c("c1", "c2") #Assigning names to columns of a matrix

M
##   c1 c2
## a  1  7
## b  4 10
## c  5 11
## d  2  9
## e  6 12
M["a",]         #Indexing by row names
## c1 c2 
##  1  7
M[, "c2"]   #Indexing by column names
##  a  b  c  d  e 
##  7 10 11  9 12

Exercise: Display the subset of matrix M containing 1st , 3rd, 5th rows, using row names.

n <- c( 2.4, 3.1, -1.5, -2.1 ) ; n
## [1]  2.4  3.1 -1.5 -2.1
names(n) <- c("x","y","z","r"); n        #Assigning names to vector  elements 
##    x    y    z    r 
##  2.4  3.1 -1.5 -2.1
n <- c( "x" = 2.4, "y" = 3.1, "z" = -1.5, "r" = -2.1 ); n
##    x    y    z    r 
##  2.4  3.1 -1.5 -2.1
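
Once a vector has names, its elements can be picked by name instead of by position. A minimal sketch using the same named vector n as above (negative.names is an illustrative name):

```r
n <- c("x" = 2.4, "y" = 3.1, "z" = -1.5, "r" = -2.1)

n["y"]             # a single element, selected by name
n[c("x", "r")]     # several elements, selected by name

negative.names <- names(n)[n < 0]   # names of the elements that are negative
negative.names
```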

Data Frames (try the following in R):

df = read.csv("patient.csv") #To load a csv file in R, it loads it as a data frame 

dim(df)               #Returns the dimensions of the data frame
## [1] 10  4
names(df)             #Returns the names of columns
## [1] "Patient" "Gender"  "Age"     "Group"
df[2,4]                   #Returns element from 2nd row and 4th column
## [1] 1
df$Age        #Returns variable from the data whose name matches “Age”
##  [1] 20 25 30 28 29 21 26 32 27 31
df[3,]                    #Returns 3rd row
##   Patient Gender Age Group
## 3     Sam      M  30     3
head(df)              #Returns the first 6 rows of data frame 
##   Patient Gender Age Group
## 1    Dick      M  20     2
## 2    Anna      F  25     1
## 3     Sam      M  30     3
## 4  Jennie      F  28     2
## 5    Joss      M  29     3
## 6     Don      M  21     2
tail(df)                  #Returns the last 6 rows of data frame
##    Patient Gender Age Group
## 5     Joss      M  29     3
## 6      Don      M  21     2
## 7    Annie      F  26     1
## 8     John      M  32     3
## 9     Rose      F  27     2
## 10    Jack      M  31     3

Exercise: Store a csv file (should have at least 10-15 rows and 4-5 columns of data) in your current working directory and perform all the above operations
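
If patient.csv is not at hand, a data frame can also be typed in directly with data.frame(). The sketch below rebuilds the ten rows shown in the head() and tail() output above, then selects rows by a condition and computes a group-wise mean. The names older and mean.age.by.gender are illustrative.

```r
df <- data.frame(
  Patient = c("Dick", "Anna", "Sam", "Jennie", "Joss",
              "Don", "Annie", "John", "Rose", "Jack"),
  Gender  = c("M", "F", "M", "F", "M", "M", "F", "M", "F", "M"),
  Age     = c(20, 25, 30, 28, 29, 21, 26, 32, 27, 31),
  Group   = c(2, 1, 3, 2, 3, 2, 1, 3, 2, 3)
)

older <- df[df$Age > 28, ]   # rows where Age is greater than 28
older

# Mean of Age computed separately for each value of Gender
mean.age.by.gender <- tapply(df$Age, df$Gender, mean)
mean.age.by.gender
```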

List (try the following in R):

lifExp = 70
earthQuack = TRUE 
disease = c("cancer", "fever", "corona")
L = list(lifExp, earthQuack, disease)

class(L) #Displays the type of variable
## [1] "list"
class(L[[1]]) #Returns the class of the first element of the list
## [1] "numeric"
L = list("lifExp"=lifExp, "earthQuack"=earthQuack, "disease"=disease) #Assigning name to each var

L$lifExp ; L[[1]] #Both options return the value of lifExp variable, which is 1st  variable
## [1] 70
## [1] 70
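
Single brackets and double brackets behave differently on a list. The sketch below rebuilds the same named list and shows the difference, along with how to reach one value inside a list element.

```r
lifExp <- 70
earthQuack <- TRUE
disease <- c("cancer", "fever", "corona")
L <- list("lifExp" = lifExp, "earthQuack" = earthQuack, "disease" = disease)

names(L)          # names of the list elements
length(L)         # number of elements in the list

class(L[1])       # single brackets return a smaller list
class(L[[1]])     # double brackets return the element itself

L$disease[2]      # second value inside the disease element
```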

Workspace management:

ls() #To display the content of the current workspace

rm(a) #To remove a variable/object from the workspace

getwd() #To display the current working directory

setwd("C:/Users/xyz/Data Science/folder1/folder2/folder3/finalfolder") #To set the new working directory
list.files() #To display all the files in current working directory

Saving and retrieving files:

save.image(file = "filename.Rdata") #To save the entire workspace

score = c(10,25,56,70); student = c("Vicky", "Joy", "Arun")

name.list = c("score", "student") #To create a vector of names for above two variables

save(file="studentScore.Rdata", list= name.list) #Saves the variables in current WD

load("studentScore.Rdata") #To retrieve the file in R from current WD
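
To see the whole save and load cycle in one go, here is a minimal sketch. It writes to a temporary file created with tempfile() rather than to the working directory (an assumption made so that the example leaves no files behind); tmp.file and loaded.names are illustrative names.

```r
score <- c(10, 25, 56, 70)
student <- c("Vicky", "Joy", "Arun")

tmp.file <- tempfile(fileext = ".Rdata")   # temporary file path
save(score, student, file = tmp.file)      # save the two variables

rm(score, student)                         # remove them from the workspace
exists("score", inherits = FALSE)          # confirms score is gone

loaded.names <- load(tmp.file)             # load() returns the names it restored
loaded.names
score
```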

Exercise: Create five different variables with names and save them in your working directory as fiveVar.Rdata. Close your R session and reopen. Load the file back into your new R session. Use ls() to check if the variables have been loaded back

df = read.csv(file="filename.csv", header=TRUE) #To read csv file from WD

write.csv(df, "filename.csv", row.names=FALSE) #To save a data table as .csv in WD

Exercise: Read a csv file in R as df from your working directory. Write it back into your working directory by a different name.

df = read.table(file = "filename.txt", header=TRUE) #To read a text file from WD

#sep = " " #(space);  sep = "\t" #(tab-delimited); sep = "," #(comma-separated)

#skip = number of lines #no. of lines to be skipped before reading the text file 

Exercise: Read a text file in R as df from your working directory using the appropriate sep argument.
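
If you do not have a text file ready, the sketch below creates a small tab-delimited one and reads it back with the matching sep argument. The file is written to a temporary location with tempfile(), and the data and the names original, tmp.txt and df.txt are made up for illustration.

```r
original <- data.frame(name = c("Vicky", "Joy", "Arun"),
                       score = c(10, 25, 56))

tmp.txt <- tempfile(fileext = ".txt")

# Write a tab-delimited text file with a header row
write.table(original, tmp.txt, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back; sep must match the delimiter used in the file
df.txt <- read.table(tmp.txt, header = TRUE, sep = "\t")
df.txt
```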

Installing a package and accessing it in R

install.packages("e1071") #To install a package in R

library("e1071")  #To access the package in R

exists("svm") #To check if this function (part of the package) exists in current R session

detach("package:e1071", unload=TRUE) #Remove the package from the current R session

exists("svm") #To verify if this function has been removed from the current session 
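
A common pattern is to install a package only when it is missing. The sketch below uses requireNamespace() for the check. It uses the built-in stats package as a stand-in so that it runs without downloading anything (an assumption for illustration); replace the value of pkg with "e1071" to apply it to the package above.

```r
pkg <- "stats"   # stand-in package name; use "e1071" for the package in this lab

# Install only if the package cannot be found
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

available <- requireNamespace(pkg, quietly = TRUE)
available
```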

Exercise: Install the above package in R. You can use any mirror depending on your location, or select the 0-Cloud option when the mirror selection window pops up.

