R-Tutorial - Introduction

Installation of R
Go to the website www.r-project.org
Choose "Download / CRAN"
[CRAN is a network of ftp and web servers around the world that store identical, up-to-date, versions of code
and documentation for R. Please use the CRAN mirror nearest to you to minimize network load.]
Choose a nearby location.

Choose "Download R for Windows"
Choose "base".
Choose "Download".
Starting R
R Console
Setting Working Directory

Create a folder named "R" in the root directory of the C: drive.
Right click the "R Icon"
Choose "Properties"
In the "Start in" box, type "C:\R"
Working with Scalars
Basic Data Types

Numeric
Character
Logical
Complex
Displaying Scalar Values

Displaying numeric values
print(47)
print(47.5)
print(35 + 56)
Displaying textual values

print("rabi")
print(rabi)   # error: object 'rabi' not found, because text values must be quoted
print("Rabi is working")
Displaying logical values

print(TRUE)
print(FALSE)
print(3>2)
print(47.1==47.2)
print(47.1=47.2)   # error: '=' is not a comparison operator; use '==' to test equality
Displaying Complex Values

print(3+2i)
print ( (3 + 2i) + (6 + 7i))
Using Memory with Scalar Values

Storing Scalar Values in Memory as Variables
aa=7
bb<-56.6
cc<-"My Nepal"
dd<-TRUE
ee<-3 + 2i
Displaying values in variables

> print(aa)
> print(bb)
> print(cc)
> print(dd)
> print(ee)
Note: The value stored in a variable can also be displayed simply by typing its name, such as
> aa
> bb
> cc
> dd
> ee
To display the type of a variable, use the 'class()' function or the 'mode()' function.
> class(aa)
> class(bb) ....
To display a combination of text and variable values, use the 'cat()' function.
E.g. > cat("The value in variable 'aa' is",aa,"\n")
> cat("The type of variable dd is",class(dd),"\n")
...............not now...............
To display the names of all variables stored in memory, use the 'ls()' function.
E.g. > ls() # displays all variable names
> ls(pat="m") # displays all variable names containing m
> ls(pat = "^m") # displays all variable names which start with m.
.........Some left..............
R Objects
R supports object-oriented programming (OOP), and everything stored in R is an object. R has many built-in object types. Some common
R objects used to handle data are
Vectors
List
Matrices
Factors
Data Frames
Arrays
Vectors
A vector is an ordered collection of one or more values, all of the same data type. It is the simplest type of R object.
All the variables created so far are vectors containing a single element (member).
Creating Vectors
Using 'c( )' function
> ab <- c(23,35,56)
> ab
> ac <- c("Nepal","India","China")
> ac
> ad<-c(TRUE,TRUE,2<3,0==0)
> ad
> ae<-c(3+4i,7+2i)
> ae
Using 'assign()' function
> assign("a", 7)
> assign("b", c(1,2,3,4))
Creating arithmetic sequences

a) Using ' : ' Operator
> 1:30
> ba<- 1:30
> baa<-5.6:12.6
> bab <- 5.6:12.7
b) Using 'seq()' function

Syntax: seq(start_value, end_value, increment)
> bb <- seq(3,54)   # same as bb <- seq(3,54,1)
> bc<-seq(3,54,2)
> bd<-seq(1,10,0.5)
> be <-seq(50,0,-5)
> bf<-seq(10,100, by = 10)
> bg<-seq(length = 40, from = 4.6, by = 0.2)
(c) Using 'sequence()' function
It creates a series of integer sequences, each running from 1 up to one of the numbers given as arguments. E.g.
> sequence(4 : 7)
> sequence(6 : 3)
> sequence(c(10, 5))
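The three approaches above can be compared with a short sketch. The variable names s1, s2 and s3 are made up for this illustration and do not appear elsewhere in the tutorial.

```r
# ':' steps by 1 from the start value and never goes past the end value
s1 <- 5.6:12.7
print(s1)            # 8 values; the last one is 12.6, not 12.7

# seq() accepts an explicit increment
s2 <- seq(50, 0, by = -5)
print(length(s2))    # 11 values: 50, 45, ..., 0

# sequence() concatenates 1:n for each n supplied
s3 <- sequence(c(3, 2))
print(s3)            # 1 2 3 1 2
```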
Accessing Vector Elements

a) Using position of element
> rb = c("Violet","Indigo","Blue","Green","Yellow","Orange","Red")
> rb[2]
> rb[c(2, 5)]
> rb[2:5]
> rc = rb[3]
> rc
> rd = rb[c(3,4,7)]
> rd
b) Using logical indexing

> re = rb[c(TRUE, TRUE, FALSE,FALSE,TRUE,TRUE,FALSE)]
c) Using negative indexing

> rf = rb[c(-3,-4,-6)]
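Logical indexing is most useful when the logical vector is produced by a condition rather than typed by hand. The following is a small sketch using a hypothetical vector named marks.

```r
marks <- c(34, 65, 76, 21, 23)     # hypothetical marks

by_pos  <- marks[c(1, 3)]          # by position: 34 76
by_cond <- marks[marks > 30]       # by logical condition: 34 65 76
dropped <- marks[-1]               # negative index drops the first element
where   <- which(marks > 30)       # positions where the condition holds: 1 2 3

print(by_pos); print(by_cond); print(dropped); print(where)
```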
Carrying Mathematical Operations on Vectors

Carrying mathematical operations on single vector
ma = c(34,65,76,21,23)
maa = ma + 2
mab = ma - 2
mac = ma * 2
mad = ma/2
mada = 1/ma
mae = ma ^2
maf = ma^(1/2)
mag = ma^(1/3)
Carrying mathematical operations on two vectors
mb = c(12,54,23, 45,32)
mc = ma + mb
md = ma - mb
me = ma * mb
mf = ma/mb
mg = c(1,2,3,2,1)
mh = mb ^ mg
Note: If the two vectors do not have the same length, the values of the shorter vector are recycled
while carrying out the operation. If the length of the longer vector is a whole-number multiple of the
length of the shorter one, this happens silently; otherwise R still returns a result but issues a warning. E.g.
> m1 = c(2, 6, 7)
> m2 = c(3, 6, 8, 7, 4, 5)
> m3 = m1 + m2
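Recycling can be checked directly. The sketch below repeats the example above and then adds a case where the lengths are not multiples of each other; the names short, warned and partial are made up for this illustration.

```r
m1 <- c(2, 6, 7)
m2 <- c(3, 6, 8, 7, 4, 5)
m3 <- m1 + m2            # m1 is reused as 2 6 7 2 6 7
print(m3)                # 5 12 15 9 10 12

short <- c(10, 20)       # length 2 does not divide length 3

# Detect whether the addition raises a warning
warned <- tryCatch({ m1 + short; FALSE }, warning = function(w) TRUE)
print(warned)            # TRUE

# The result is still computed: short is reused as 10 20 10
partial <- suppressWarnings(m1 + short)
print(partial)           # 12 26 17
```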
Displaying Statistical Values of Elements in Vector
mb = c(12,54,23, 45,32)
mi = mean(mb)
mi
mj = var(mb)
mj
mk = sum(mb)
mk
ml = prod(mb)
ml
mm=sqrt(mb)
mm
mn = length(mb)
mn
mo = min(mb)
mp = max(mb)
mq = sort(mb)
Working on logical and relational operators with vectors
> a = c(1:5)
> b = a > 3
> b
> a == 3
> a != 3
> TRUE & TRUE
> TRUE & FALSE
> FALSE & TRUE
> FALSE & FALSE
> TRUE | TRUE
> TRUE | FALSE
> FALSE | TRUE
> FALSE | FALSE
> ! FALSE
> ! TRUE
> ! (TRUE & FALSE)
Mode, Length and Attribute of Vectors

To display the type of elements in vector use 'mode(vector_name)'
To display the number of elements in vector use 'length(vector_name)'

Lists
Introduction
A vector contains elements of same type. A list is similar to vector, but it may contain elements of
different type.
Creating List
E.g.
> a1 = list("Rabi", 23, 54.5)
> a1
A list may contain vectors, e.g.
> aa = list(c(2, 3, 4), 21.4, TRUE)
> aa
A list may also contain the results of function calls (and even functions themselves). E.g.
> bb = list( c(2, 3, 4), 21.4, sqrt(c(23, 45, 32)))
> bb
A list may also contain another list. E.g.
> cc = list(c("Rara", "Phewa", "Begnas"), aa)
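The difference between a list and a vector shows up when values of different types are mixed. A minimal sketch, where types_in_list and v1 are names invented for this illustration:

```r
a1 <- list("Rabi", 23, 54.5)

# Each list element keeps its own type
types_in_list <- sapply(a1, class)
print(types_in_list)        # "character" "numeric" "numeric"

# c() builds a vector, so all values are coerced to one type
v1 <- c("Rabi", 23, 54.5)
print(class(v1))            # "character"

# A numeric list element can still be used in arithmetic
print(a1[[2]] + 1)          # 24
```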
Factors
Introduction
A factor is an R data type that stores categorical variables. Factors are used extensively in
statistical modeling.
A variable is said to be categorical if its values are not all different, but each falls into one of two
or more categories.
For example, a gender variable may take only two values: male and female.
A blood group variable may take any one of four values: A, B, AB and O.
A GPA grade variable may take any one of the values A, B, C, D, E and F.

Here the gender and blood group variables have no intrinsic ordering, whereas the GPA grade has an
intrinsic ordering. A categorical variable that has no intrinsic ordering is called a nominal
variable.
Creating Factors of Nominal Category

Suppose the genders of 6 consecutive customers entering a restaurant are observed to be "male, male,
female, male, female, male".
To store these values as a factor:
> factor(c("Male", "Male", "Female", "Male", "Female", "Male"))
Or,
gen_fact = factor(c("Male", "Male", "Female", "Male", "Female", "Male"))
Or,
gen = c("Male", "Male", "Female", "Male", "Female", "Male")
gen_fact = factor(gen)
Interpreting Values in Factors

The distinct values that occur in the data used to create a factor are called 'levels'. The names of these
levels are displayed when the factor is printed.
By default, the levels are sorted alphabetically.
For example, the blood groups of 11 patients admitted to a hospital on one day are recorded and
converted into a factor below
> bg = factor(c("A","B","A","AB","A","O","O","A","AB","B","B"))
> bg
R stores a factor as a vector of integer codes together with the names of the levels. The integer codes
are assigned to the levels in alphabetical order.
To display the integer codes corresponding to the elements of a factor, the structure function
'str()' is used. E.g.
> str(bg)
R uses these integers internally to look up the textual label of each element.
By default, the codes follow the alphabetical ordering of the levels. However, we can choose our own
ordering, and hence the integer codes, by using the 'levels' argument of the 'factor()' function.
E.g.
> bg = factor(c("A","B","A","AB","A","O","O","A","AB","B","B"), levels = c("O","A","B","AB"))
> str(bg)
Here, O group is given value 1, A group 2, B group 3 and AB group 4.
The number of levels in a factor can be accessed with 'nlevels' function. E.g.
print(nlevels(gen_fact))
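The integer codes described above can also be inspected with 'as.integer()' and 'levels()'. The sketch below uses the blood group data from this section; the names bg_values, bg_default and bg_custom are made up for the illustration.

```r
bg_values <- c("A","B","A","AB","A","O","O","A","AB","B","B")

# Default: levels are sorted alphabetically
bg_default <- factor(bg_values)
print(levels(bg_default))       # "A" "AB" "B" "O"
print(as.integer(bg_default))   # 1 3 1 2 1 4 4 1 2 3 3

# Custom order supplied through the 'levels' argument
bg_custom <- factor(bg_values, levels = c("O","A","B","AB"))
print(as.integer(bg_custom))    # 2 3 2 4 2 1 1 2 4 3 3

# Counts per level, reported in level order
print(table(bg_custom))
```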
The complete syntax of factor() function is as follows-
factor(x, levels = sort(unique(x), na.last = TRUE), labels = levels, exclude = NA, ordered =
is.ordered(x))
Here, levels specifies the possible levels of the factor (by default the unique values of the vector x),
labels defines the names of the levels, exclude defines the values of x to exclude from the levels, and
ordered is a logical argument specifying whether the levels of the factor are ordered. Recall that x is of
mode numeric or character.
Types of Categorical Variables

The categorical variables encountered so far do not have any intrinsic ordering. They are
called nominal variables.
In some categorical variables, the levels have a specific ordering. For example, the
economic status of citizens can be categorized as low, medium, high. Here the levels have an
ordering. Such categorical variables are said to be 'ordinal'.
In fact, variables are commonly classified into four measurement types, of which the first two are categorical:
a) Nominal Variable
b) Ordinal Variable
c) Interval Variable
d) Ratio Variable
Ordinal Variables
To create an ordinal variable, the 'ordered' argument is set to TRUE while creating the factor. E.g.
> tshirt_size = c("Large","Small","Large","Large","Medium","Small","Large")

> ts_fact = factor(tshirt_size, ordered = TRUE)
If one views the structure of this factor by using the 'str()' function, the levels follow alphabetical order:
"Large" is given value 1, "Medium" is given value 2 and "Small" is given value 3.
To give the values 1, 2 and 3 to "Small", "Medium" and "Large", one can use the 'levels' argument of the 'factor()'
function, as
> ts_fact = factor(tshirt_size, ordered = TRUE, levels = c("Small","Medium", "Large"))
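Once the levels are ordered, the elements of the factor can be compared with relational operators. A minimal sketch using the T-shirt data above; n_medium_up and largest are names invented here.

```r
tshirt_size <- c("Large","Small","Large","Large","Medium","Small","Large")
ts_fact <- factor(tshirt_size, ordered = TRUE,
                  levels = c("Small","Medium","Large"))

# Second element (Small) is below the first element (Large)
print(ts_fact[2] < ts_fact[1])              # TRUE

# Count the shirts that are Medium or larger
n_medium_up <- sum(ts_fact >= "Medium")
print(n_medium_up)                          # 5

# max() works because the levels have an order
largest <- as.character(max(ts_fact))
print(largest)                              # "Large"
```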
Categorical variables that are defined over ranges of values are called interval variables. For example:
(a) age groups (0-10, 10-20, 20-30, etc.) (b) income groups ($100-500, $600-1000, etc.)
Descriptions of interval variables and ratio variables are left for later.
Accessing Elements in Factors

To access a specific element in the factor created, one can use 'factor_name[position]'. E.g.
> ts_fact[2]
> ts_fact[c(1,3,4)]
Generating Levels of Factor

Regular levels of factors can be created by using 'gl()' function. For example,
> gl(3, 5)   # [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 Levels: 1 2 3
> gl(3, 5, length=30)   # [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 Levels: 1 2 3
> gl(2, 6, label=c("Male", "Female"))
[1] Male Male Male Male Male Male
[7] Female Female Female Female Female Female
Levels: Male Female
> gl(2, 10)   # [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 Levels: 1 2
> gl(2, 1, length=20)   # [1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 Levels: 1 2
> gl(2, 2, length=20)   # [1] 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 Levels: 1 2

Matrix
A matrix is a two dimensional rectangular data set in which data values are arranged into rows and
columns.
The function which is used to create a matrix is 'matrix()'. The syntax of this function is as follows-
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
The option byrow indicates whether the values given by data fill the columns successively (the
default) or the rows (if TRUE). The option dimnames allows names to be given to the rows and columns.
Creating Matrices
Creating matrix of integers
> m2 = matrix(c(12, 43, 43, 23,34, 26), nrow = 2, ncol= 3)
> m2 = matrix(c(12, 43, 43, 23,34, 26), nrow = 2, ncol= 3, byrow=TRUE)
Creating matrix of texts

> m3 =matrix(c("Kavre","Kathmandu","Nuwakot","Sunsari","Morang","Jhapa"), nrow=3,ncol=2)
Creating matrix of sequence of integers

> m = matrix(1:10, nrow=2, ncol=5)
> m1 = matrix(1 : 10, nrow = 2, ncol = 5, byrow = TRUE)
Creating matrix of same element repeatedly

> m4 = matrix(0, nrow=2, ncol =3)
> m5 = matrix(c(2, 5), nrow = 2, ncol=3)
Specifying Columns and Row Headings

Row and column headings are given to a matrix with the 'dimnames' argument of the 'matrix()' function,
specified as a list of two vectors. E.g.
The marks of two students, "Rajan" and "Hari", in three subjects, "Math", "Science" and "Computer", are
stored in a matrix with headings below-
> mm1 = matrix(c(34, 57,54,76,57,87), nrow=2, ncol = 3, dimnames = list(c("Rajan","Hari"),c("Math","Science","Computer")))
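Once a matrix has row and column headings, its elements can be picked by name as well as by position. The sketch below rebuilds the marks matrix (filled column by column, the default) and looks up a few values; hari_science, totals and subject_means are names invented for the illustration.

```r
mm1 <- matrix(c(34, 57, 54, 76, 57, 87), nrow = 2, ncol = 3,
              dimnames = list(c("Rajan", "Hari"),
                              c("Math", "Science", "Computer")))
print(mm1)

hari_science <- mm1["Hari", "Science"]   # same as mm1[2, 2]
print(hari_science)                      # 76

totals <- rowSums(mm1)                   # total marks per student
print(totals)                            # Rajan 145, Hari 220

subject_means <- colMeans(mm1)           # average mark per subject
print(subject_means)                     # Math 45.5, Science 65, Computer 72
```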
Alternative method of creating matrix

> x=1:15
> dim(x) = c(5, 3)

>x
Accessing Elements in Matrices

Accessing particular element
> m2 = matrix(c(12, 43, 43, 23,34, 26), nrow = 2, ncol= 3)
> m2[2, 3]
> n1 = m2[2, 3]
Accessing elements in a row(s), column(s)

> m2[1, ]
> m2[2, ]
> m2[ , 2]
Accessing diagonal elements

> m3 = matrix(1 : 25, nrow = 5, ncol = 5)
> diag(m3)
Matrix Manipulation
> a = matrix(1:10, nrow=2, ncol=5)
> a
> b = matrix(-5:4, nrow = 2, ncol = 5)
> b
> a + b   # addition of matrices
> c = a - b   # subtraction of matrices
> 5 * a   # multiplication by a scalar
> a * b   # product of corresponding elements
> a / b   # element by element division
> 1 / a
Real matrix multiplication
> a = matrix(1:10, nrow=2, ncol=5)
> b = matrix(-5: 4, nrow=5, ncol=2)
> a %*% b   # real matrix multiplication
> sqrt(a)
> sum(a)
> mean(a)
> sum(a[1, ])
> sum(a[2, ])
> mean(a[2, ])
> c = matrix(1 : 25, nrow = 5, ncol = 5)
> sum(diag(c))
> mean(diag(c))
> p1 = diag(c)
> mean(p1)
> eigen(c)   # eigenvalues and eigenvectors of matrix 'c'
> eigen(c)$values
> eigen(c)$vectors
> det(c)   # displays determinant of matrix 'c' (zero up to rounding error, since this 'c' is singular)
> solve(c)   # displays inverse of a matrix; it fails for this 'c' because a singular matrix has no inverse
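Since 'solve()' needs an invertible matrix, the sketch below uses a small hypothetical 2 x 2 matrix A (not taken from the tutorial) to contrast element-wise and real matrix multiplication and to verify the inverse.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # hypothetical invertible matrix

elementwise <- A * A       # squares each element: 4 1 / 1 9
matprod     <- A %*% A     # real matrix product: 5 5 / 5 10
print(elementwise)
print(matprod)

print(det(A))              # 5, non-zero, so the inverse exists

A_inv <- solve(A)
check <- A %*% A_inv       # should be the 2 x 2 identity matrix
print(round(check, 10))
```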

Data Frame
Introduction
R is a statistical programming language, and in statistics we work with datasets. A dataset typically
consists of observations. Each observation consists of variables, which may be of different types.
For example - .........
In a dataset, different observations are stored in different rows. Each of these observations
has specific attributes, e.g. name, age, gender, score, etc. Since there are usually many observations, a
particular attribute is placed in the same column for all of them.
So, a dataset is similar to a matrix, since it is a two-dimensional structure consisting of rows and columns.
However, a matrix can contain data of only one type, whereas a dataset needs each observation to contain
data of one or more different types.
In fact, a list can represent a single observation (row) of a dataset, and a dataset could also be built as a
list of lists.
However, R provides a special object for storing a dataset: the data frame.
A data frame is the fundamental data structure that stores datasets.
In a data frame, each column contains elements of the same data type and represents one attribute
of the observations. Data representing a common attribute of different observations are placed in a particular
column of the data frame. In the same way, each row contains the elements belonging to a particular
observation.
Creating data frame

A data frame is created by using the 'data.frame()' function.
Let us create a data frame containing three columns (vectors) named name, age and male, each
containing five observations.
> name = c("Roni", "Rabi", "Sunita", "Arjun", "Mani")
> age = c(34, 65, 45, 23, 34)
> male = c(TRUE, TRUE,FALSE,TRUE,TRUE)
> df = data.frame(name, age, male)
> df
Labeling Variables of Data Frame

To provide clear descriptive labels to the variables, i.e., columns, 'names()' function is used as
> names(df) = c("Name of Student","Age","Male")
An alternative method is
> df = data.frame(Name = name, Age = age, Male = male)
Or,
> df = data.frame("Name-of-Student" = name, Age = age, Male = male, check.names = FALSE)   # check.names = FALSE keeps the hyphens in the column name
In the same way, different rows of observations can also be named. (Later)
To View Structure of Data Frame

To view the structure of the data frame, 'str()' function is used.
E.g.
> str(df)
To Access Elements of Data Frame

a) By Treating Data Frame as Matrix
To access the age of the third person (age is in the second column of the data frame)
> df[3, 2]
> df[3, "Age"]   # 'Age' is the column name of the variable 'age'
To display all records of third person
> df[ 3 , ]
To display names of all students, i.e., first column-
> df [ , 1]
> df[ , "Name-of-Student"]
To display the data in third and fifth row
> df[ c(3,5), ]
To display the data from second to fourth row
> df[ 2:4, ]
To display all observations in the first and third columns, i.e., name and male columns
> df[ , c(1,3)]
To display all observations in the second to third columns
> df[ , 2:3]
b) By Treating Data Frame as List of Vectors

All the above commands treat the data frame as a matrix.
Alternatively, a data frame can be viewed as a list of vectors, where each vector corresponds to a
particular variable, i.e., column.
In this method, the columns of the data frame can be accessed as follows
To display all data in the 'Age' column (or, similarly, the 'Male' column)
> df $ Age
> df $ Male
> df [["Age"]]
> df[[ 2]]
To Add New Row(s) and Column(s) to Data Frame

To add new column of name 'height' with values 132, 143, 214, 245, 243
> ht = c(132, 143, 214, 245, 243)   # creating vector for heights
> df $ height = ht
Or,
> df[["height"]] = ht
Another equivalent is to use 'cbind()' function as follows-
> wt = c(35, 56,54,46,54)
> cbind(df, wt)
To add a new row to the data frame, we create another data frame containing rows to be added and use
'rbind()' function as follows-
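A minimal sketch of adding a row with 'rbind()'. The data frame is a shortened version of the one used in this section, and the new observation ("Gita", 29, FALSE) is hypothetical. The column names of the new data frame must match those of the existing one.

```r
df <- data.frame(Name = c("Roni", "Rabi"),
                 Age  = c(34, 65),
                 Male = c(TRUE, TRUE))

# One-row data frame holding the observation to be added (hypothetical values)
new_row <- data.frame(Name = "Gita", Age = 29, Male = FALSE)

# rbind() returns a new data frame; assign it to keep the result
df2 <- rbind(df, new_row)
print(df2)
print(nrow(df2))     # 3
```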
Sorting and Ordering in Data Frame

Sorting and Ordering observations in data frame
To sort ages in ascending order

> sort(df $ Age)
To get the ordering of ages in ascending order
> order(df $ Age)
Or,
> rnk = order(df $ Age)
The result gives the position (row number) of the smallest value first, then the position of the next smallest, and so on.
To display the entire data in the order given by 'rnk'
> df[rnk, ]
To sort in descending order and display the entire observations-
> df[ order(df $ Age, decreasing = TRUE), ]
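The functions 'sort()', 'order()' and 'rank()' are easy to confuse, so the sketch below applies all three to the ages used in this section. 'rank()' is not used elsewhere in this tutorial and is shown only for contrast.

```r
name <- c("Roni", "Rabi", "Sunita", "Arjun", "Mani")
age  <- c(34, 65, 45, 23, 34)

sorted <- sort(age)     # the values themselves: 23 34 34 45 65
ord    <- order(age)    # positions that would sort them: 4 1 5 3 2
rnk    <- rank(age)     # rank of each value, ties averaged: 2.5 5 4 1 2.5
print(sorted); print(ord); print(rnk)

# order() is the one used to rearrange other vectors or data frame rows
print(name[ord])        # "Arjun" "Roni" "Mani" "Sunita" "Rabi"
```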
Creating data frame from combinations of vectors, factors, etc.

The function 'expand.grid()' creates a data frame with all possible combinations of vectors or factors
given as arguments.
> expand.grid(h=c(60,80), w=c(100, 300), sex=c("Male", "Female"))
   h   w    sex
1 60 100   Male
2 80 100   Male
3 60 300   Male
4 80 300   Male
5 60 100 Female
6 80 100 Female
7 60 300 Female
8 80 300 Female
Data from External Files
In practice, the data required for statistical analysis are not entered into R directly, as we have done
with data frames so far, but are usually imported from other sources, such as text files, spreadsheets
(e.g. Excel), databases (e.g. SQL, Access), SPSS, etc.
Reading Data From Text Files/ Spreadsheets

There are two common ways of creating data files in a text editor: first, by separating the values in a row
with commas " , ", and second, by separating them with "Tab". In both cases, different rows or observations
in the data file are separated by pressing the "Enter" key.
Data files created by using commas are commonly called comma-separated files (or .CSV files) and those
using "Tab" are called tab-delimited text files (or .TXT files).
Other functions that can be used to import a dataset into R are scan(), read.fwf(), etc.
Opening '.csv' file -
> read.csv(file = ..........)
Another way is to use 'file.choose()' function to display list of files to open as
> read.csv( file.choose())
To open '.txt' file -
> read.delim(file.choose())   # or: read.delim(file = .......... )
The general way to open a data file is to use the 'read.table()' function as -
> read.table(file=......., header = TRUE, sep = ",")
> read.table(file=......., header = TRUE, sep = "\t")
Or,
> read.table( file.choose(), header = TRUE, sep = "," )
> read.table( file.choose(), header = TRUE, sep = "\t" )
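To try these functions without an existing data file, one can first write a small file to a temporary location. In the sketch below the file contents and the names tmp, marks and same are made up for the illustration.

```r
# Create a small comma separated file in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name,Math,Science",
             "Roni,45,67",
             "Rabi,78,56"), tmp)

# read.csv() assumes a header line and "," as separator
marks <- read.csv(tmp)
print(marks)

# The same file read with the general function read.table()
same <- read.table(tmp, header = TRUE, sep = ",")
print(mean(same$Math))     # 61.5

unlink(tmp)                # remove the temporary file
```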
Complete structure of 'read.table' function is as follows-
read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", row.names, col.names, as.is =
FALSE, na.strings = "NA", colClasses = NA, nrows = -1, skip = 0, check.names = TRUE, fill =
!blank.lines.skip, strip.white = FALSE, blank.lines.skip = TRUE, comment.char = "#")
'scan()' function to open data sets
The function scan is more flexible than read.table. A difference is that it is possible to specify the mode
of the variables, for example:
> mydata <- scan("data.dat", what = list("", 0, 0))
reads in the file 'data.dat' three variables, the first is of mode character and the next two are of mode
numeric.
Another important distinction is that scan() can be used to create different objects, vectors, matrices,
data frames, lists, . . .
In the above example, mydata is a list of three vectors. By default, that is if what is omitted, scan()
creates a numeric vector. If the data read do not correspond to the mode(s) expected (either by default,
or specified by what), an error message is returned.
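The behaviour of 'what' can be tried on a small temporary file. The file contents below are hypothetical and stand in for 'data.dat'.

```r
# Write a small whitespace separated file (hypothetical contents)
tmp <- tempfile()
writeLines(c("Roni 34 45.5",
             "Rabi 65 60"), tmp)

# what = list("", 0, 0): first field character, next two numeric
mydata <- scan(tmp, what = list("", 0, 0), quiet = TRUE)

print(class(mydata))       # "list"
print(mydata[[1]])         # "Roni" "Rabi"
print(mydata[[2]])         # 34 65
print(mydata[[3]])         # 45.5 60.0

unlink(tmp)
```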
The options are the followings.
scan(file = "", what = double(0), nmax = -1, n = -1, sep = "", quote = if (sep=="\n") "" else "'\"",
dec = ".", skip = 0, nlines = 0, na.strings = "NA", flush = FALSE, fill = FALSE, strip.white =
FALSE, quiet = FALSE, blank.lines.skip = TRUE, multi.line = TRUE, comment.char = "",
allowEscapes = TRUE)
Using 'read.fwf' function
The function read.fwf can be used to read in a file some data in fixed width format:
read.fwf(file, widths, header = FALSE, sep = "\t", as.is = FALSE, skip = 0, row.names,
col.names, n = -1, buffersize = 2000, ...)
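As a sketch of how 'widths' splits each line, the example below writes three fixed width records to a temporary file. The records are invented for the illustration: one character, then four characters, then three characters per line.

```r
tmp <- tempfile()
writeLines(c("A1.501.2",
             "A1.551.3",
             "B1.601.4"), tmp)

# widths = c(1, 4, 3): columns of 1, 4 and 3 characters
fw <- read.fwf(tmp, widths = c(1, 4, 3))
print(fw)        # columns are named V1, V2, V3 by default

unlink(tmp)
```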
Manipulating Data in Data Frames

Once a dataset is imported into R, there are different ways of accessing the data in it. If a dataset named
"data.dat" is imported into a data frame named "mydata" by using the function
> mydata <- read.table("data.dat")

then by default the variables in it are named "V1", "V2", ....... and so on. The different variables can then be
accessed by the following methods-
i) mydata$V1, mydata$V2, ..........

// in this form data are accessed by treating them as vector objects.
ii) mydata["V1"], mydata["V2"], .................

// in this form data are accessed by treating them as data frame object.
iii) mydata[ , 1], mydata[ , 2] , ..............

// in this method data are accessed by treating them as matrix
To work with imported data table
> bb = read.table( file.choose(), header = TRUE, sep = "," )
> mean(bb $ Math)
> sum(bb $ Math)
> sqrt(bb $ Math)
> mean(bb)   # does not work: mean() expects a single vector, not a whole data frame, so it returns NA with a warning
To refer to the variables of a data frame by name alone, one can use
> attach(bb)   # adds the data frame to R's search path; the data are already in memory, so nothing is imported
Once attach() has been used on a data frame, it is not necessary to use the '$' operator to refer
to any variable in it. E.g.
> mean(Math)
> dim(bb)   # displays the number of rows and columns in the data frame
> bb[c(1,3), ]   # displays the first and the third observations
> bb[2:3, ]
> bb[-(2:3), ]   # all except second and third row
> bb[ , 4]
> bb[ , "Math"]
> bb[ , c("Math","Science")]
> bb [ , c(4,5)]
To display a summary of the data frame
> summary(bb)
For numeric data it displays the minimum, quartiles, median, mean and maximum. If there are categorical data, i.e.,
factors, it displays the counts of the different categories.
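When only the column means are needed, they can be computed for all numeric columns at once. The sketch below uses a small hypothetical data frame in place of the imported 'bb'; its columns and values are invented for the illustration.

```r
bb <- data.frame(Gender = c("male", "female", "female", "male"),
                 Age    = c(21, 25, 23, 30),
                 Math   = c(45, 78, 66, 52))    # hypothetical data

# Pick the numeric columns, then average each of them
num_cols  <- sapply(bb, is.numeric)
col_means <- colMeans(bb[ , num_cols])
print(col_means)                    # Age 24.75, Math 60.25

# with() evaluates an expression using the columns of bb,
# without attaching the data frame to the search path
math_mean <- with(bb, mean(Math))
print(math_mean)                    # 60.25
```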
Filtering Data in Data Frames

One common way of filtering data in a data frame is to split the data frame.
To split a data frame into two or more data frames according to some categorical variable, e.g. Gender
> maledata = bb[bb$Gender=="male", ]
> femdata = bb[bb$Gender=="female", ]
> maledata
> femdata
> dim(maledata)
> summary(maledata)
> femdata[1:3,]
To display the mean age of females only in the above data frame
> mean(bb$Age[bb$Gender=="female"])
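The same filtering can be written with 'subset()', 'split()' and 'tapply()', which avoid repeating the logical condition for every group. A minimal sketch on a hypothetical data frame standing in for 'bb':

```r
bb <- data.frame(Gender = c("male", "female", "female", "male"),
                 Age    = c(21, 25, 23, 30))    # hypothetical data

# Rows for one category only
females <- subset(bb, Gender == "female")
print(females)

# One data frame per category, returned as a named list
groups <- split(bb, bb$Gender)
print(names(groups))                  # "female" "male"

# Mean age for every category in one call
mean_age_by_gender <- tapply(bb$Age, bb$Gender, mean)
print(mean_age_by_gender)             # female 24.0, male 25.5
```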
Array
While matrices are confined to two dimensions, arrays can have any number of dimensions. In fact,
a vector can be regarded as a one-dimensional array. Similarly, a matrix is a two-dimensional array.
Creating arrays
Marks of 4 students in 3 subjects recorded for two terminal examinations can be presented in the form
of a 3-dimensional array consisting of two 4 x 3 matrices, as follows:
a) > ar1 = array(c(24,65,76,54,34,56,67,67,78,78,76,56,47,84,57,63,35,45,67,89,87,56,34,23), dim=c(4,3,2))
b) > term1 = matrix(c(24,65,76,54,34,56,67,67,78,78,76,56), nrow=4, ncol=3)
> term2 = matrix(c(47,84,57,63,35,45,67,89,87,56,34,23), nrow = 4, ncol = 3)
> ar2 = array(c(term1, term2), dim = c(4,3,2))
c) > sub11 = c(24,65,76,54)
> sub21 = c(34,56,67,67)
> sub31 = c(78,78,76,56)
> sub12 = c(47,84,57,63)
> sub22 = c(35,45,67,89)
> sub32 = c(87,56,34,23)
> ar3 = array(c(sub11, sub21, sub31, sub12, sub22, sub32), dim=c(4,3,2))
Manipulating Arrays
To provide names for the rows, columns and matrices (terms) of the above array:
> ar4 = array(c(24,65,76,54,34,56,67,67,78,78,76,56,47,84,57,63,35,45,67,89,87,56,34,23), dim=c(4,3,2), dimnames=list(c("Stud1", "Stud2", "Stud3","Stud4"),c("Sub1", "Sub2", "Sub3"),c("Term1", "Term2")))
To display marks of "Stud2" in "Sub3" in "Term1"
ar4[2, 3, 1]
To display marks of "Stud2" in all subjects in the first term
ar4[2, , 1]
To display average mark of "Stud2" (of all subjects) in first term
mean(ar4[2, , 1])
To display marks of all students in "Sub3" in "Term2"
ar4[ , 3, 2]
To display average mark of all students in "Sub3" in "Term2"
mean(ar4[ , 3, 2])
To display all marks of "Term1"

ar4[ , , 1]
To display sum of all marks in "Term1"
sum(ar4[ , , 1])
To display average marks of each student over all subjects in both terms
apply(ar4, c(1), mean)
To display average marks in each subject over all students in both terms
apply(ar4, c(2), mean)
To display grand average marks of all students in all subjects in both terms
mean(ar4)
To display average marks of each student over all subjects in "Term1"
apply(ar4[ , , 1], c(1), mean)
To display average marks in each subject over all students in "Term1"
apply(ar4[ , , 1], c(2), mean)
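The second argument of 'apply()' names the dimension(s) to keep: 1 for students, 2 for subjects, 3 for terms. To make the results easy to verify by hand, the sketch below uses a hypothetical array filled with the numbers 1 to 24 instead of the marks above.

```r
x <- array(1:24, dim = c(4, 3, 2),
           dimnames = list(c("Stud1", "Stud2", "Stud3", "Stud4"),
                           c("Sub1", "Sub2", "Sub3"),
                           c("Term1", "Term2")))

per_student <- apply(x, 1, mean)     # 11 12 13 14
per_subject <- apply(x, 2, mean)     # 8.5 12.5 16.5
term_totals <- apply(x, 3, sum)      # 78 222
print(per_student); print(per_subject); print(term_totals)

# Keeping two dimensions gives a matrix: students by terms
per_student_term <- apply(x, c(1, 3), mean)
print(per_student_term)

grand <- mean(x)                     # 12.5
print(grand)
```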
Saving Objects into File

The function write.table writes an object to a file, typically a data frame but possibly another
kind of object (vector, matrix, . . . ). The arguments and options are:
write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ", eol = "\n", na = "NA", dec = ".",
row.names = TRUE, col.names = TRUE, qmethod = c("escape", "double"))
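A minimal round-trip sketch: a small hypothetical data frame is written to a temporary tab-delimited file and read back with 'read.table()'. The choice of sep = "\t", row.names = FALSE and quote = FALSE is only one reasonable combination.

```r
df <- data.frame(Name = c("Roni", "Rabi"),
                 Age  = c(34, 65))          # hypothetical data

out <- tempfile(fileext = ".txt")

# Tab separated, without row names and without quotes around text
write.table(df, file = out, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the file back to check what was written
back <- read.table(out, header = TRUE, sep = "\t")
print(back)

unlink(out)                                 # remove the temporary file
```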
