R Programming

Visualization is an excellent medium to analyze, comprehend and share information because it allows large amounts of data to be interpreted visually through graphs or maps, making trends, patterns and outliers easier for the human mind to understand compared to raw data. Data visualization takes raw data, models it visually and delivers insights that enable conclusions to be reached. It communicates information universally, quickly and effectively to positively impact decision making.

Q.1 Visualization is an excellent medium to analyze, comprehend and
share information. Justify this statement.
● With so much information being collected through data analysis
in the business world today, we must have a way to paint a
picture of that data so we can interpret it.
● Data visualization gives us a clear idea of what the information
means by giving it visual context through maps or graphs.
● This makes the data more natural for the human mind to
comprehend and therefore makes it easier to identify trends,
patterns, and outliers within large data sets.
● As one of the essential steps in the business intelligence process,
data visualization takes the raw data, models it, and delivers the
data so that conclusions can be reached.
● Specifically, data visualization uses visual data to communicate
information in a manner that is universal, fast, and effective.
● Data visualization positively affects an organization’s
decision-making process with interactive visual representations of
data. Businesses can now recognize patterns more quickly
because they can interpret data in graphical or pictorial forms.

Q.2 List and discuss basic features of R.


● R is a well-developed, simple and effective programming
language that includes conditionals, loops, user-defined
recursive functions, and input and output facilities.
● R has an effective data handling and storage facility.
● R provides a suite of operators for calculations on arrays, lists,
vectors and matrices.
● R provides a large, coherent and integrated collection of tools for
data analysis.
● R provides graphical facilities for data analysis and display, either
directly on screen or as printed output.
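
The language features listed above can be sketched in a few lines. This is only an illustrative example; the names fact, total, v and m are made up here and do not come from the question.

```r
# A user-defined recursive function (the name `fact` is illustrative).
fact <- function(n) {
  if (n <= 1) {
    return(1)          # conditional: base case of the recursion
  }
  n * fact(n - 1)      # recursive call
}

# A loop that accumulates a running total.
total <- 0
for (i in 1:5) {
  total <- total + i
}

# Arithmetic operators work element-wise on vectors and matrices.
v <- c(1, 2, 3) * 2
m <- matrix(1:4, nrow = 2) + 10

print(fact(5))
print(total)
print(v)
print(m)
```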

Q.3 The following table shows the number of units of different products
sold on different days:
Create five sample numeric vectors from this data.
Product <- c("Bread","Milk","Cola Cans","Chocolate Bars","Detergent")
Monday <- c(12,21,10,6,5)
Tuesday <- c(3,27,1,7,8)
Wednesday <- c(5,18,33,4,12)
Thursday <- c(11,20,6,13,20)
Friday <- c(9,15,12,12,23)

# Combine the product names and the five numeric vectors into one table.
df <- data.frame(Product, Monday, Tuesday, Wednesday, Thursday, Friday)
print(df)

Q.4 Which function is used to concatenate text values in R? Write a
script to concatenate text and numerical values in R.
Text 1: Ram has scored
Text 2: 89
Text 3: marks
Text 4: in Mathematics
The paste() function concatenates values into a single string; numeric
values are converted to text automatically.
t1 <- "Ram has scored"
t2 <- 89
t3 <- "marks"
t4 <- "in Mathematics"

tp <- paste(t1, t2, t3, t4, sep = " ")
print(tp)
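
Q.5 Which function is used to construct a vector in R? Write a script to
generate the following list of numerical values with spaces: 3 5 6 9 11 34
The c() function combines values into a vector.
num_vec <- c(3, 5, 6, 9, 11, 34)
print(num_vec)
# cat() prints the values separated by spaces, without the [1] index.
cat(num_vec, "\n")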

Q.6 List and explain operators used to form data subsets in R.


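R has three subsetting operators: [ ] selects one or more elements and
keeps the structure, [[ ]] extracts a single element, and $ extracts a
named element such as a data frame column.
Product <- c("Bread","Milk","Cola Cans","Chocolate Bars","Detergent")
Monday <- c(12,21,10,6,5)
Tuesday <- c(3,27,1,7,8)
Wednesday <- c(5,18,33,4,12)
Thursday <- c(11,20,6,13,20)
Friday <- c(9,15,12,12,23)
df <- data.frame(Product, Monday, Tuesday, Wednesday, Thursday, Friday)

# Extract the Product, Monday and Friday columns with $.
result <- data.frame(df$Product, df$Monday, df$Friday)
print(result)

# Extract the first two rows.
result1 <- df[1:2, ]
print(result1)

# Extract the 3rd and 5th rows with the 2nd and 4th columns.
result2 <- df[c(3, 5), c(2, 4)]
print(result2)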
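
The subsetting operators can also be compared side by side. The sketch below is illustrative only: it uses a smaller data frame named sales (a name assumed here) and also shows logical conditions and the subset() function, which the answer above does not cover.

```r
Product <- c("Bread", "Milk", "Cola Cans", "Chocolate Bars", "Detergent")
Monday  <- c(12, 21, 10, 6, 5)
Friday  <- c(9, 15, 12, 12, 23)
sales   <- data.frame(Product, Monday, Friday)

col_dollar <- sales$Monday                        # $ extracts one column by name
col_double <- sales[["Friday"]]                   # [[ ]] extracts one column as a vector
sub_single <- sales[1:2, c("Product", "Monday")]  # [ ] returns a smaller data frame

# A logical condition inside [ ] keeps only the rows where it is TRUE.
busy_monday <- sales[sales$Monday > 10, ]

# subset() expresses the same row filter and picks columns by name.
busy_subset <- subset(sales, Monday > 10, select = c(Product, Monday))

print(busy_monday)
print(busy_subset)
```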

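Q.7 List the functions provided by R to combine different sets of data.

R combines data sets with cbind() (adds columns), rbind() (adds rows)
and merge() (joins on common columns). New columns can also be attached
to a data frame with $, as shown here.
Product <- c("Bread","Milk","Cola Cans","Chocolate Bars","Detergent")
Monday <- c(12,21,10,6,5)
Tuesday <- c(3,27,1,7,8)
Wednesday <- c(5,18,33,4,12)
Thursday <- c(11,20,6,13,20)
Friday <- c(9,15,12,12,23)
df1 <- data.frame(Product, Monday, Tuesday, Wednesday, Thursday, Friday)

# Attach the weekend sales as two new columns.
Saturday <- c(18,3,23,9,13)
Sunday <- c(20,4,6,23,16)
df1$Saturday <- Saturday
df1$Sunday <- Sunday

print(df1)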
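
As a rough sketch of the three base R functions for combining data, cbind(), rbind() and merge(), the example below uses small made-up data frames. The object names and the Price values are hypothetical and chosen only for illustration.

```r
weekday_df <- data.frame(Product = c("Bread", "Milk"), Monday = c(12, 21))
weekend_df <- data.frame(Saturday = c(18, 3), Sunday = c(20, 4))

# cbind(): place columns side by side (row counts must match).
by_col <- cbind(weekday_df, weekend_df)

# rbind(): stack rows (column names must match).
more_rows <- data.frame(Product = "Detergent", Monday = 5)
by_row <- rbind(weekday_df, more_rows)

# merge(): join two data frames on a shared key column.
prices <- data.frame(Product = c("Milk", "Bread"), Price = c(2.5, 1.8))  # hypothetical prices
joined <- merge(weekday_df, prices, by = "Product")

print(by_col)
print(by_row)
print(joined)
```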

Q.8 Suppose you have two datasets A and B. Dataset A has the
following data: 1 2 4 5. Dataset B has the following data: 6 7 8 9.
Which function is used to combine the data from both datasets into
dataset C? Demonstrate the function with the input values and write
the output.
The rbind() function appends the rows of B to the rows of A.
a1 <- c(1,2,4,5)
a <- data.frame(a1)

b1 <- c(6,7,8,9)
b <- data.frame(b1)

# rbind() on data frames requires matching column names,
# so rename b's column before combining.
colnames(b) <- colnames(a)
print(b)

dataset_c <- rbind(a, b)
print(dataset_c)
# Output: a single column a1 holding the eight values 1 2 4 5 6 7 8 9.

Q.9 What are the advantages of using functions over scripts?


● In R, a function is essentially a piece of code that is executed
as a unit, from start to finish, each time it is called.
● An R script is just a plain text file in which you save R code.
● Functions can work with variable input, so you can use them with
different data.
● Functions return their output as an object, so you can work with
the result of that function.
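
The last two points can be illustrated with a small sketch. The function name total_units is hypothetical; the input values are the Monday and Tuesday sales figures from Q.3.

```r
# Hypothetical helper: total units sold across a vector of daily sales.
total_units <- function(units) {
  sum(units)   # the last evaluated expression is returned
}

# The same function works with different input data.
monday_total  <- total_units(c(12, 21, 10, 6, 5))
tuesday_total <- total_units(c(3, 27, 1, 7, 8))

# The returned objects can be used in further calculations.
difference <- monday_total - tuesday_total
print(difference)
```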

Q.10 List and discuss various types of data visualizations.


A pie chart represents values as slices of a circle with
different colors. The slices are labeled, and the number corresponding
to each slice is also shown in the chart.
In R, a pie chart is created using the pie() function, which takes
a vector of positive numbers as input. Additional parameters
control labels, colors, title, etc.
CODE
#Pie Chart
Product <- c("Bread","Milk","Cola Cans","Chocolate Bars","Detergent")
Monday <- c(12,21,10,6,5)
pie(Monday,Product)

A bar chart represents data in rectangular bars with length of the bar
proportional to the value of the variable. R uses the function barplot()
to create bar charts.
CODE
Monday <- c(12,21,10,6,5)
barplot(Monday)

A boxplot summarizes how the values in a data set are distributed. It
shows the minimum, first quartile, median, third quartile and maximum;
the three quartile values divide the data into four equal parts. It is
also useful for comparing the distribution of data across data sets by
drawing a boxplot for each of them.
#BoxPlot
# Compare the Monday and Tuesday distributions (df1 is the data frame from Q.7).
boxplot(df1$Monday, df1$Tuesday, names = c("Monday", "Tuesday"))

A histogram represents the frequencies of values of a variable
bucketed into ranges. A histogram is similar to a bar chart, but it
groups the values into continuous ranges. The height of each bar
represents the number of values present in that range.
hist(Monday)

A line chart is a graph that connects a series of points by drawing line
segments between them. The points are ordered by one of their
coordinates (usually the x-coordinate).

Scatterplots show many points plotted in the Cartesian plane. Each
point represents the values of two variables. One variable is plotted on
the horizontal axis and the other on the vertical axis.
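
Line charts and scatterplots can both be drawn with the base plot() function. The sketch below reuses the Monday and Tuesday vectors from the earlier questions; the axis labels and titles are assumptions added for readability.

```r
Monday  <- c(12, 21, 10, 6, 5)
Tuesday <- c(3, 27, 1, 7, 8)

# Line chart: type = "o" draws the points and joins them with line segments.
plot(Monday, type = "o", col = "blue",
     xlab = "Product index", ylab = "Units sold", main = "Monday sales")

# Scatterplot: one variable per axis, one point per product.
plot(x = Monday, y = Tuesday,
     xlab = "Monday units", ylab = "Tuesday units", main = "Monday vs Tuesday")
```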

Q.11 Discuss any five applications of data visualizations.

Q.12 List and explain various functions that allow users to handle data
in R workspace with appropriate examples.
In R, we can read data from files stored outside the R environment.
We can also write data into files which will be stored and accessed by
the operating system. R can read and write into various file formats
like csv, excel, xml etc.
● CSV
○ data <- read.csv("input.csv")
● Excel
○ install.packages("xlsx")
○ library("xlsx")
○ data <- read.xlsx("input.xlsx", sheetIndex = 1)
○ print(data)
● JSON
○ install.packages("rjson")
○ library("rjson")
○ result <- fromJSON(file = "input.json")
○ print(result)
● Web Data
○ https://www.tutorialspoint.com/r/r_web_data.htm
● Database
○ install.packages("RMySQL")
○ library("RMySQL")
○ mysqlconnection <- dbConnect(MySQL(), user = 'root',
password = '', dbname = 'mynewdb', host = 'localhost')
○ dbListTables(mysqlconnection)
○ result <- dbSendQuery(mysqlconnection, "select * from actor")
○ data <- fetch(result, n = -1)
○ print(data)
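
Writing data out is mentioned above but not shown. A minimal sketch of a CSV round trip with base R follows; the data frame and the temporary file path are assumptions made for the example, and any writable path would do.

```r
sales <- data.frame(Product = c("Bread", "Milk"), Monday = c(12, 21))

# tempfile() gives a throwaway path, so no real file name is assumed.
csv_path <- tempfile(fileext = ".csv")

# Write the data frame to disk without the row-name column.
write.csv(sales, csv_path, row.names = FALSE)

# Read it back into a new data frame.
loaded <- read.csv(csv_path)
print(loaded)

# Remove the temporary file.
unlink(csv_path)
```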

Q.13 List and discuss various types of data structures in R.


● Vectors
○ Logical
○ Numeric
○ Integer
○ Complex
○ Character
○ Raw
■ Tuesday <- c(3,27,1,7,8)
● Lists
○ Can contain many different types of elements inside it like
vectors, functions and even another list inside it.
■ list1 <- list(c(2,5,3),21.3,sin)
● Matrices
○ A two-dimensional rectangular data set. It can be created
by passing a vector to the matrix() function.
■ M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3,
byrow = TRUE)
■ print(M)
● Arrays
○ The array() function takes a dim argument that sets the
required number of dimensions.
■ a <- array(c('green','yellow'), dim = c(3,3,2))
■ print(a)
● Factors
○ Factors are R objects created from a vector. A factor
stores the vector along with the distinct values of the
elements in the vector as labels.
○ The labels are always character, irrespective of whether the
input vector is numeric, character, Boolean, etc.
○ Factors are created using the factor() function.
■ apple_colors <- c('green','green','yellow','red','red','red','green')  # sample values
■ factor_apple <- factor(apple_colors)
● Data Frames
○ Data frames are tabular data objects. Unlike in a matrix, each
column of a data frame can contain a different mode of
data.
○ The first column can be numeric while the second column
can be character and the third column can be logical. It is a list
of vectors of equal length.
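
A data frame with one numeric, one character and one logical column might look like the sketch below. The column names and values are invented for illustration.

```r
# Hypothetical employee table with three columns of different modes.
emp <- data.frame(
  id     = c(1, 2, 3),
  name   = c("Asha", "Ravi", "Meera"),
  active = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep text as character in older R versions too
)

# str() shows the type of each column.
str(emp)

# Collect the class of every column into a named character vector.
col_classes <- sapply(emp, class)
print(col_classes)
```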

Q.14 Discuss the syntax of defining a function in R


An R function is created by using the keyword function. The basic
syntax of an R function definition is as follows −

function_name <- function(arg_1, arg_2, ...) {
   Function body
}

The different parts of a function are −


● Function Name − This is the actual name of the function. It is
stored in R environment as an object with this name.
● Arguments − An argument is a placeholder. When a function is
invoked, you pass a value to the argument. Arguments are
optional; that is, a function may contain no arguments. Also
arguments can have default values.
● Function Body − The function body contains a collection of
statements that defines what the function does.
● Return Value − The return value of a function is the last
expression in the function body to be evaluated.
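# Create a function to print squares of numbers in sequence.
# (Named print_squares rather than new, which would mask methods::new.)
print_squares <- function(a) {
  for (i in seq_len(a)) {
    b <- i^2
    print(b)
  }
}
print_squares(6)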
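
Default argument values and the return-value rule described above can be sketched as follows. The function name power_of and its parameters are hypothetical.

```r
# Hypothetical function: raise x to a power, squaring by default.
power_of <- function(x, p = 2) {
  x^p   # the last evaluated expression is the return value
}

sq   <- power_of(4)          # uses the default p = 2
cube <- power_of(2, p = 3)   # overrides the default

print(sq)
print(cube)
```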
