The document discusses various data structures in R including vectors, matrices, arrays, lists, and data frames. It provides examples of how to create and manipulate each type of data structure. Vectors can be integer, character, logical, or a combination. Matrices and arrays represent multi-dimensional data. Lists allow storing different data types. Data frames combine data into a tabular form with named rows and columns that can be accessed and manipulated.

DATA STRUCTURES IN R

Pavan Kumar A
DATA STRUCTURES IN R
 Types of data structures in R
 Vector : A one-dimensional structure that holds one or more values of a single
type (for example integer, character, or logical). If values of different types are
combined, R coerces them to one common type.

 Matrices : A matrix is a 2-dimensional representation of data.
 Arrays : An array can represent data in more than two dimensions.
 Lists : A list is a generic vector that is allowed to include different types of
objects.
 Data Frames : A data frame is a rectangular, 2-dimensional representation of data.
DATA STRUCTURES IN R- INTEGER VECTORS
 The following functions are used to create integer vectors
 c() : Concatenate (joining items end to end)
 seq() : Sequence (Generating equidistant series of numbers)
 rep() : Replicate (used to generate repeated values)

 c() examples
> c(42,57,12,39,1,3,4)
[1] 42 57 12 39 1 3 4
 You can also concatenate vectors of more than one element
> x <- c(1, 2, 3)
> y <- c(10, 20)
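The two vectors defined above can themselves be passed to c(). The sketch below is a minimal illustration; the name z and the extra value 5 are assumptions added for the example.

```r
# Concatenate two existing vectors and one extra value into a single vector
x <- c(1, 2, 3)
y <- c(10, 20)
z <- c(x, y, 5)   # z and the value 5 are illustrative
print(z)          # elements: 1 2 3 10 20 5
length(z)         # 6
```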
DATA STRUCTURES IN R- INTEGER VECTORS
 seq() : Generates a series of equidistant numbers
 It accepts three arguments

 Start element (from)
 Stop element (to)
 Step size (by)

> seq(4,9) # Generates the numbers from 4 to 9; only 2 arguments are given
[1] 4 5 6 7 8 9

> seq(4,10,2) # Three arguments are given; the step size is 2
[1]  4  6  8 10
DATA STRUCTURES IN R- INTEGER VECTORS
 A vector created with seq() is often used for the x- and y-axis values in
graphical analysis.
 For example:
 If the x-axis coordinates are needed as
c(1.65,1.70,1.75,1.80,1.85,1.90)
 then the following command creates the same vector
Syntax :
seq(from, to, by)
seq(1.65,1.90,0.05)
> 4:9 #exactly the same as seq(4,9)
[1] 4 5 6 7 8 9
> sum(1:10)
[1] 55
DATA STRUCTURES IN R- INTEGER VECTORS
 Another example of the seq() command. Here we add the length.out
argument, which sets the number of elements to generate
from = starting element
to = ending element
by = ((to - from)/(length.out - 1))
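As a small sketch of the relationship above, the following call asks seq() for four elements between 4 and 10 and then recomputes the implied step size by hand. The variable names s and by_value are assumptions chosen for illustration.

```r
# Ask for exactly 4 equally spaced values from 4 to 10
s <- seq(from = 4, to = 10, length.out = 4)
print(s)                          # elements: 4 6 8 10

# The step size implied by length.out, using the formula above
by_value <- (10 - 4) / (4 - 1)    # 2
print(by_value)
```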
DATA STRUCTURES IN R- INTEGER VECTORS
 rep(), is used to generate repeated values.
 It is used in two variants, depending on whether the second argument is a
vector or a single number
> oops <- c(7,9,13)
> rep(oops,3) # It repeats the entire vector oops 3 times
[1] 7 9 13 7 9 13 7 9 13
> rep(oops,1:3)
[1] 7 9 9 13 13 13
Here, each element of oops is repeated according to the matching value in 1:3,
so 7 is repeated once, 9 twice, and 13 three times
DATA STRUCTURES IN R- INTEGER VECTORS
Look at the following examples
> rep(oops,1:4) # Fails: oops has 3 elements but 4 repeat counts are given
Error in rep(oops, 1:4) : invalid 'times' argument
> rep(1:2,c(10,15))
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> rep(1:2,each=10)
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
> rep(1:2,c(10,10)) # Same result as each=10
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
DATA STRUCTURES IN R- INTEGER VECTORS
 Integer vectors : Indexing
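The original indexing example is not available here, so the following is only a sketch of common ways to index a numeric vector in R. The vector reuses the values from the earlier c() example; the result names are assumptions.

```r
v <- c(42, 57, 12, 39, 1, 3, 4)

first     <- v[1]         # R indexes from 1, so this is 42
picked    <- v[c(2, 4)]   # elements 2 and 4: 57 39
ranged    <- v[2:4]       # elements 2 to 4: 57 12 39
dropped   <- v[-1]        # every element except the first
large_val <- v[v > 10]    # logical indexing: 42 57 12 39

print(large_val)
```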
DATA STRUCTURES IN R- CHARACTER VECTORS
 Character Vector: A character vector is a vector of text strings, whose elements
are specified and printed in quotes
> c("Huey","Dewey","Louie")
[1] "Huey"  "Dewey" "Louie"

 Single quotes or double quotes can be used for strings
> c('Huey','Dewey','Louie')
[1] "Huey"  "Dewey" "Louie"

 "Huey" is a string of four characters, not six.
 The quotes are not actually part of the string; they are just there so that the
system can tell the difference between a string and a variable name.
DATA STRUCTURES IN R- CHARACTER VECTORS
 If you print a character vector, it usually comes out with quotes added to each
element. There is a way to avoid this, namely to use the cat() function.
 For instance,
> cat(c("Huey","Dewey","Louie"))
Huey Dewey Louie
DATA STRUCTURES IN R- CHARACTER VECTORS
 Quoting and escape sequences
 If a string itself contains quotation marks or newline characters, they must be
written using escape sequences
 Here, \n is an example of an escape sequence; it inserts a newline.
 The backslash (\) is known as the escape character
 To insert quotes within the string, \" is used. For example
> cat("What is \"R\"?\n")
What is "R"?
DATA STRUCTURES IN R- LOGICAL VECTORS
 Logical vectors can take the value TRUE or FALSE
 In input, you may use the convenient abbreviations T and F
> c(T,T,F,T)
[1] TRUE TRUE FALSE TRUE
DATA STRUCTURES IN R- CHARACTER VECTORS
 Example of Character Vector: Indexing
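The original example is not available here; the sketch below shows how a character vector might be indexed. The name ducks and the result names are assumptions for illustration.

```r
ducks <- c("Huey", "Dewey", "Louie")

second   <- ducks[2]                 # "Dewey"
ends     <- ducks[c(1, 3)]           # "Huey" "Louie"
not_huey <- ducks[ducks != "Huey"]   # "Dewey" "Louie"
lens     <- nchar(ducks)             # characters per string: 4 5 5

print(not_huey)
```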
DATA STRUCTURES IN R- MISSING VALUES
 Missing values
 Many data sets contain missing values.
 We need some method to deal with them.
 R allows vectors to contain a special NA value.

 The result of a computation done on NA will be NA
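A brief sketch of how NA propagates through computations, and one common way to skip it. The vector ages and its values are made up for illustration.

```r
ages <- c(23, NA, 31)                     # illustrative data with one missing value

missing   <- is.na(ages)                  # FALSE TRUE FALSE
plain_avg <- mean(ages)                   # NA, because one input is NA
clean_avg <- mean(ages, na.rm = TRUE)     # 27, NA values are dropped first
plus_one  <- NA + 1                       # arithmetic on NA gives NA

print(clean_avg)
```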


DATA STRUCTURES IN R- COMBINATION OF INT AND CHAR
 Example of c()

 It is also possible to assign names to the elements
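The original example is not available here. The sketch below, with made-up values and names, shows what c() does when types are mixed, and how names can be attached to elements.

```r
# Mixing types: R coerces every element to one common type (character here)
mixed <- c(1, "a", TRUE)
print(mixed)          # "1" "a" "TRUE"
print(class(mixed))   # "character"

# Named elements: names are given inside c() (values are illustrative)
scores <- c(math = 90, physics = 85)
print(names(scores))  # "math" "physics"
print(scores["math"]) # access an element by its name
```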


DATA STRUCTURES IN R- MATRIX
 Matrix: It is a 2-dimensional representation of numbers.
 Matrices and arrays are represented as vectors with dimensions
> x <- 1:12
> # The dim assignment function sets or changes the dimension attribute of x,
> # causing R to treat the vector of 12 numbers as a 3 × 4 matrix
> dim(x) <- c(3,4)
DATA STRUCTURES IN R- MATRIX
 Another way to create a matrix is to use the matrix() function
 Syntax
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE)
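As a short sketch of the syntax above, the following builds the same 12 numbers into a 3 × 4 matrix twice, once filled by column (the default) and once by row. The names m_col and m_row are assumptions.

```r
# Default: values fill the matrix column by column
m_col <- matrix(1:12, nrow = 3, ncol = 4)
print(m_col[1, ])   # first row: 1 4 7 10

# byrow = TRUE: values fill the matrix row by row
m_row <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)
print(m_row[1, ])   # first row: 1 2 3 4
```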
DATA STRUCTURES IN R- MATRIX
 You can “glue” vectors together, columnwise or rowwise, using the cbind and
rbind functions.

 The cbind() : Column bind


 The rbind() : Row bind

 Arrays are similar to matrices but can have more than two dimensions.
See help(array) for details
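A minimal sketch of gluing two vectors together with cbind() and rbind(); the vectors a and b are made up for illustration.

```r
a <- 1:3
b <- 4:6

by_col <- cbind(a, b)   # a and b become the columns: 3 rows, 2 columns
by_row <- rbind(a, b)   # a and b become the rows: 2 rows, 3 columns

print(by_col)
print(by_row)
```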
DATA STRUCTURES IN R- MATRIX
 Subsetting a matrix
 We can extract elements from the matrix (matrix subsetting).
 Since it is a two-dimensional representation of numbers, we can access it with
the two-dimensional accessor [,]
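The original example is not available here; the sketch below shows typical uses of the [row, column] accessor on an illustrative 3 × 4 matrix. All names are assumptions.

```r
m <- matrix(1:12, nrow = 3)   # filled column by column

one_cell <- m[2, 3]           # row 2, column 3: 8
one_row  <- m[2, ]            # whole second row: 2 5 8 11
one_col  <- m[, 3]            # whole third column: 7 8 9
block    <- m[1:2, c(1, 4)]   # rows 1-2 of columns 1 and 4 (a 2 x 2 matrix)

print(block)
```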
DATA STRUCTURES IN R- MATRIX
 Matrix Operations
 Addition
 Subtraction
 Exp
 Element-wise *
 Mat Mult %*%
 rowSums()
 rowMeans()
 colSums()
 colMeans()
 t()
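The operations listed above can be tried on two small matrices. This is only a sketch: the matrices A and B are made up, and reading "Exp" as the element-wise exp() function is an assumption.

```r
A <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
B <- matrix(5:8, nrow = 2)   # columns are (5, 6) and (7, 8)

add_ab  <- A + B       # element-wise addition
sub_ab  <- A - B       # element-wise subtraction
exp_a   <- exp(A)      # element-wise exponential (assumed meaning of "Exp")
elem_ab <- A * B       # element-wise multiplication
mat_ab  <- A %*% B     # true matrix multiplication

row_tot <- rowSums(A)  # 4 6
row_avg <- rowMeans(A) # 2 3
col_tot <- colSums(A)  # 3 7
col_avg <- colMeans(A) # 1.5 3.5
a_t     <- t(A)        # transpose

print(mat_ab)
```

Note that the row and column helpers are spelled with a capital S or M; R function names are case-sensitive.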
DATA STRUCTURES IN R - ARRAYS
 Arrays
 It is a vector that is represented and accessible in a given number of
dimensions (mostly more than two dimensions).
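A minimal sketch of a three-dimensional array built with array(); the dimensions 2 × 3 × 4 and the names are chosen only for illustration.

```r
# 24 values arranged as 2 rows x 3 columns x 4 layers
arr <- array(1:24, dim = c(2, 3, 4))

print(dim(arr))           # 2 3 4
last_cell <- arr[2, 3, 4] # the final element: 24
layer_one <- arr[, , 1]   # the first layer is an ordinary 2 x 3 matrix
print(layer_one)
```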
DATA STRUCTURES IN R-LISTS

 Lists: A list is a collection of related objects, which may be of different types.


 A list is not fixed in length and can contain other lists.
DATA STRUCTURES IN R – ACCESSING LISTS

 There are various ways to access the elements of a list.


 The most common way is to use a dollar-sign $ to extract the value of a list
element by name
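The sketch below creates a small list and accesses it in a few ways. All element names and values are made up for illustration.

```r
# A list can hold objects of different types and lengths
person <- list(name = "Jane", age = 41, books = c("Emma", "Persuasion"))

by_dollar <- person$name        # extract by name with $
by_double <- person[["age"]]    # [[ ]] also extracts the element itself
by_single <- person["age"]      # [ ] returns a list of length 1

# Lists are not fixed in length and can contain other lists
person$address <- list(city = "Bath")
print(length(person))           # 4
print(person$address$city)
```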
DATA STRUCTURES IN R-DATA FRAMES
 A data frame is also a 2-dimensional object, just like a matrix, used for storing
data tables.
 Here, different columns can have different modes (numeric, character, factor,
etc).
 All data frames are rectangular, and R will fill out any "short" columns using NA

 Creating Data Frame
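The original example is not available here. The sketch below builds a data frame called mydata from three vectors d, e and f; the values are assumptions, and the second call shows what may happen when one vector is shorter than the others.

```r
d <- c(1, 2, 3, 4)                            # numeric
e <- c("red", "white", "red", "blue")         # character
f <- factor(c("yes", "no", "yes", "yes"))     # factor

mydata <- data.frame(d, e, f)
print(mydata)

# With a 3-element vector in place of e, data.frame() reports an error
e_short <- c("red", "white", "blue")
result <- tryCatch(
  data.frame(d, e_short, f),
  error = function(err) conditionMessage(err)  # capture the message instead of stopping
)
print(result)
```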


DATA STRUCTURES IN R-DATA FRAMES
 Error: Here the second vector, 'e', is a 3-element vector, while 'd' and 'f' are
4-element vectors.
 A data frame is a collection of vectors (integer/character) of equal lengths

 Each column in the data frame can be a separate type of data. In the previous
example, the 'mydata' data frame is a combination of numerical, character and
factor data types.
ACCESSING DATA FRAMES
 There are a variety of ways to identify the elements of a data frame.
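A sketch of some common ways to pick out parts of a data frame. The column names ID, Color and Passed and all values are hypothetical.

```r
mydata <- data.frame(
  ID     = 1:4,
  Color  = c("red", "white", "red", "blue"),
  Passed = c(TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

col_dollar <- mydata$Color               # one column as a vector
col_double <- mydata[["ID"]]             # same idea, using [[ ]]
second_row <- mydata[2, ]                # one row (still a data frame)
two_cols   <- mydata[c("ID", "Passed")]  # several columns by name
one_cell   <- mydata[2, "Color"]         # a single value: "white"

print(two_cols)
```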

BUILT-IN DATA FRAMES IN R
 R has some built-in datasets; 'mtcars' is one of them
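A short sketch of inspecting the built-in mtcars data frame, which ships with R in the datasets package. The result names are assumptions.

```r
first_rows <- head(mtcars, 3)   # first three rows
print(first_rows)

size <- dim(mtcars)             # 32 rows and 11 columns
print(size)

avg_mpg <- mean(mtcars$mpg)     # average of the miles-per-gallon column
print(avg_mpg)
```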
CREATING DATA SUBSETS
 R often deals with large data sets, not all of which are useful.
 Therefore, the first step is to sort out the data containing the relevant information.

 The extracted data sets can be further divided into smaller subsets of data.

 The function used for extracting the data is subset().

 The following operators are also used to subset the data.

 $ (Dollar) : Used to select a single element (such as one column) of the data.
 [ ] (Single Square Brackets) : Used to extract multiple elements of data.
CREATING DATA SUBSETS
 We can extract (subset) part of a data table based on some condition
using the subset() function
 Syntax subset(dataset, condition)
 Example

# Writers who died at age 40 or younger and started writing at 18 or older
writer_names_df <- subset(writers_df, Age.At.Death <= 40 & Age.As.Writer >= 18)

# Writers named Jane
writer_names_df <- subset(writers_df, Name == "Jane")
##   Age.At.Death Age.As.Writer Name Surname Gender      Death
## 1           22            16 Jane     Doe FEMALE 2015-05-10
## 4           41            36 Jane  Austen FEMALE 1817-07-18

# Male writers; the column must be referenced through the data frame
male_writers <- writers_df[writers_df$Gender == "MALE", ]
# Mark a single cell as missing; use NA, since assigning NULL to a cell is an error
writers_df[1,3] <- NA
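The writers_df data frame itself is not defined in the text. The sketch below builds a stand-in so the commands can be tried: rows 1 and 4 follow the output shown above, while rows 2 and 3 are placeholder values invented for illustration.

```r
writers_df <- data.frame(
  Age.At.Death  = c(22, 40, 72, 41),
  Age.As.Writer = c(16, 18, 36, 36),
  Name          = c("Jane", "John", "Richard", "Jane"),   # rows 2-3 are placeholders
  Surname       = c("Doe", "Roe", "Miles", "Austen"),     # rows 2-3 are placeholders
  Gender        = c("FEMALE", "MALE", "MALE", "FEMALE"),
  Death         = c("2015-05-10", "1900-01-01", "1950-06-15", "1817-07-18"),
  stringsAsFactors = FALSE
)

janes <- subset(writers_df, Name == "Jane")                  # rows 1 and 4
young <- subset(writers_df, Age.At.Death <= 40 & Age.As.Writer >= 18)
males <- writers_df[writers_df$Gender == "MALE", ]

edited <- writers_df      # work on a copy so the original stays intact
edited[1, 3] <- NA        # mark the first Name as missing
print(janes)
```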
CREATING SUBSETS IN VECTORS
 To create subsets of vectors, subset() or [] can be used
## A simple vector
v <- c(1,5,6,4,2,4,2)

# Using the subset() function: creates the subset of numbers less than 4
subset(v, v<4)

# Using square brackets: creates the same subset of numbers less than 4
v[v<4]

# Another vector
t <- c("one", "one", "two", "three", "four", "two")

# Remove "one" entries using the subset() function
subset(t, t!="one")

# Remove "one" entries using [] brackets
t[t!="one"]
CREATING SUBSETS IN VECTORS
 Execution of code on R console
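The console output is not available here. As a sketch, running the vector subsetting commands should give results like those noted in the comments below; the name words is an assumption, used in place of t to avoid masking R's transpose function.

```r
v <- c(1, 5, 6, 4, 2, 4, 2)
small_a <- subset(v, v < 4)   # 1 2 2
small_b <- v[v < 4]           # same result with brackets
print(small_a)

words <- c("one", "one", "two", "three", "four", "two")
kept_a <- subset(words, words != "one")   # "two" "three" "four" "two"
kept_b <- words[words != "one"]           # same result with brackets
print(kept_a)
```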
CREATING SUBSETS IN DATA FRAMES
 Subsets of data frames can also be created using the subset() function and [] brackets
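A sketch of both approaches on the built-in mtcars data frame. The threshold of 30 miles per gallon and the result names are chosen only for illustration.

```r
# Rows where mpg is above 30, using subset()
by_subset <- subset(mtcars, mpg > 30)

# The same rows using [] brackets; note the comma and the mtcars$ prefix
by_bracket <- mtcars[mtcars$mpg > 30, ]

# subset() can also keep only some columns through its select argument
narrow <- subset(mtcars, cyl == 4 & mpg > 30, select = c(mpg, cyl, hp))

print(by_subset)
print(narrow)
```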
THANK YOU !!!
