R Programming
Environment
Whenever we define a new variable or reassign an existing
one in RStudio, it is stored as an object in the
workspace and displayed, together with its value,
on the Environment tab in the top-right area of the
RStudio window. Try running greeting <- "Hello,
World!" in the console and see what happens on
the Environment tab.
In the example below, we created two variables in the
console: greeting <- "Hello, World!" and my_vector <-
c(1, 2, 3, 4). Note how they are displayed on
the Environment tab:
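The workspace can also be inspected from the console. The following is a minimal sketch using the base R functions ls(), exists(), and rm(); the name workspace_names is introduced here only for illustration.

```r
# Create the two variables from the example
greeting <- "Hello, World!"
my_vector <- c(1, 2, 3, 4)

# ls() returns the names of the objects in the current environment
workspace_names <- ls()
print(workspace_names)

# exists() checks whether a name is currently bound to an object
exists("greeting", inherits = FALSE)   # TRUE

# rm() removes an object; in RStudio it then disappears from the Environment tab
rm(greeting)
exists("greeting", inherits = FALSE)   # FALSE
```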
In this article, we will take a look at the data structures in R. We will learn
what they are and what they are used for. We will also explore a few features
and functions of these structures.
Introduction to Data Structures in R
R has six basic types of data structures. We can organize these data
structures according to their dimensions (1d, 2d, nd). We can also
classify them as homogeneous or heterogeneous, depending on whether
their contents can be of different types.
Homogeneous data structures are ones that can only store a single type of
data (numeric, integer, character, etc.).
Heterogeneous data structures are ones that can store more than one type
of data at the same time.
1. Vector
2. List
3. Matrix
4. Data frame
5. Array
6. Factor
1. Vectors
Vectors are single-dimensional, homogeneous data structures. To
create a vector, use the c() function.
For example:
> vec <- c(1,2,3) # creates a vector named vec
> vec
Output:
[1] 1 2 3
The assign() function, which binds a value to a name given as a string, is another way to create a vector.
For example:
> assign("vec2", c(4,5,6))
> vec2
Output:
[1] 4 5 6
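Because assign() takes the variable name as a string, the name can be built at run time, which the <- operator cannot do. A small sketch, assuming hypothetical names of the form vec_1, vec_2, vec_3 that do not appear in the original article:

```r
# Create three vectors whose names are computed in a loop
for (i in 1:3) {
  assign(paste0("vec_", i), c(i, i * 2))
}

# get() is the counterpart of assign(): it looks a value up by name
vec_2          # 2 4
get("vec_3")   # 3 6
```

For everyday code, a plain <- assignment is usually clearer; assign() is mainly useful when names are not known in advance.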
Vectors can hold values of a single data type. Thus, they can be numeric,
logical, character, integer or complex vectors.
For example:
> numeric_vec <- c(1,2,3,4,5)
> integer_vec <- c(1L,2L,3L,4L,5L)
> logical_vec <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
> complex_vec <- c(12+2i, 3i, 4+1i, 5+12i, 6i)
> character_vec <- c("techvidvan", "this", "is", "a", "character vector")
> numeric_vec
> integer_vec
> logical_vec
> complex_vec
> character_vec
Output:
[1] 1 2 3 4 5
[1] 1 2 3 4 5
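Since a vector is homogeneous, mixing types inside c() does not raise an error; R silently coerces every element to one common type. The sketch below illustrates this with typeof(); the names mixed_num and mixed_chr are made up for the example.

```r
# numeric_vec and integer_vec print the same but have different storage types
typeof(c(1, 2, 3))      # "double"
typeof(c(1L, 2L, 3L))   # "integer"

# Mixed input is coerced to the most flexible type present
# (logical -> integer -> double -> character)
mixed_num <- c(1L, 2.5, TRUE)
mixed_chr <- c(1, "a", TRUE)

typeof(mixed_num)       # "double"
typeof(mixed_chr)       # "character"
mixed_chr               # "1" "a" "TRUE"
```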
3. Matrix
Matrices are two-dimensional, homogeneous data
structures with rows and columns. All values in a matrix must be of
the same type; if the input contains more than one type, R coerces
the values to a common type.
By default, matrix() fills a matrix column by column. The basic
syntax to create a matrix is:
matrix(data, nrow, ncol, byrow, dimnames)
Where data is the vector of input values for the matrix,
nrow is the number of rows,
ncol is the number of columns,
byrow is a logical value that tells the function to fill the matrix row by row
(it is FALSE by default),
dimnames is a list of the row and column names.
The following code creates a matrix with 3 columns (and therefore 3 rows),
filled with the values 1 to 9 in column-wise order.
For example:
> test_matrix1 <- matrix(c(1:9), ncol = 3)
> test_matrix1
Output:
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
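The byrow and dimnames arguments are not used in the example above. The following sketch shows both; the matrix name test_matrix2 and the labels r1, r2, a, b, c are hypothetical and chosen only for illustration.

```r
# Fill row by row and label the rows and columns
test_matrix2 <- matrix(
  1:6,
  nrow = 2,
  byrow = TRUE,
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))
)

test_matrix2
dim(test_matrix2)         # 2 3

# With dimnames set, elements can be selected by name as well as by position
test_matrix2["r2", "b"]   # 5
test_matrix2[1, 3]        # 3
```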
4. Data Frames
Data frames are two-dimensional,
heterogeneous data structures. They are lists of vectors of
equal length. Data frames have the following constraints
placed upon them:
1. A data frame must have column names, and
each row must have a unique name.
2. Each column should have the same number of
items.
3. Each item in a single column should be of the same
type.
4. Different columns can have different data types.
To create a data frame, use the data.frame() function.
For example:
> student_id <- c(1:5)
> student_name <- c("raj", "jacob", "iqbal", "shawn", "hitesh")
> student_rank <- c("third", "fifth", "second", "fourth", "first")
> student.data <- data.frame(student_id, student_name, student_rank)
> student.data
Output:
  student_id student_name student_rank
1          1          raj        third
2          2        jacob        fifth
3          3        iqbal       second
4          4        shawn       fourth
5          5       hitesh        first
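To see the constraints in practice, the sketch below rebuilds the same data frame and inspects the type of each column. The stringsAsFactors = FALSE argument and the name column_types are additions made here so that the result is the same across R versions; they are not part of the original example.

```r
student.data <- data.frame(
  student_id = 1:5,
  student_name = c("raj", "jacob", "iqbal", "shawn", "hitesh"),
  student_rank = c("third", "fifth", "second", "fourth", "first"),
  stringsAsFactors = FALSE  # keep text columns as character vectors
)

# str() summarises the dimensions and the type of every column
str(student.data)

# Each column is homogeneous, but columns may differ from one another
column_types <- sapply(student.data, class)
column_types

# Columns are accessed with $, rows by logical or positional indexing
student.data$student_name[student.data$student_id == 3]   # "iqbal"
nrow(student.data)   # 5
ncol(student.data)   # 3
```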
5. Arrays
Arrays are homogeneous data structures that can have any number of
dimensions. A three-dimensional array can be pictured as a collection of
matrices stacked one on top of the other in layers.
You can create an array using the array() function. Its
syntax is:
array_name <- array(data, dim, dimnames)
Where array_name is the name of the array,
data is the data that is filled inside the array,
dim is a vector containing the dimensions of the array,
and dimnames is a list containing the names of the rows,
columns, and matrices inside the array.
Here is an example of the array() function:
> arr1 <- array(c(1:18),dim=c(2,3,3))
> arr1
Output:
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18
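Elements of an array are selected with one index per dimension, in the order row, column, layer. A short sketch using the same arr1; the name second_layer is introduced only for this example.

```r
arr1 <- array(1:18, dim = c(2, 3, 3))

# dim() returns the size of each dimension
dim(arr1)         # 2 3 3

# Row 2, column 3 of the first layer
arr1[2, 3, 1]     # 6

# Leaving an index empty selects everything along that dimension,
# so this extracts the whole second layer as a 2 x 3 matrix
second_layer <- arr1[, , 2]
second_layer
```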
6. Factors
Factors are vectors that can only store predefined values. They
are useful for storing categorical data. Factors have two
attributes:
Class – which has the value “factor” and makes a factor
behave differently from a normal vector.
Levels – which is the set of allowed values.
You can create a factor using the factor() function.
For example:
> fac <- factor(c("a", "b", "a", "b", "b"))
> fac
Output:
[1] a b a b b
Levels: a b
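The two attributes can be examined directly. The sketch below also shows what "predefined values" means in practice: assigning a value that is not among the levels produces NA with a warning. The copy fac2 is a hypothetical name, and suppressWarnings() is used here only to keep the output quiet.

```r
fac <- factor(c("a", "b", "a", "b", "b"))

class(fac)        # "factor"
levels(fac)       # "a" "b"

# Internally, a factor stores integer codes that index into the levels
as.integer(fac)   # 1 2 1 2 2

# table() counts how often each level occurs
table(fac)

# "c" is not an allowed level, so the element becomes NA
fac2 <- fac
suppressWarnings(fac2[1] <- "c")
is.na(fac2[1])    # TRUE
```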