Statistics With R-Programming Lab Manual

RAGHU ENGINEERING COLLEGE
Dakamarri(vill), Bheemunipatnam Mandal,

Visakhapatnam Dist, Andhra Pradesh, PIN 531162
(Approved by AICTE, New Delhi, and Affiliated to Jawaharlal Nehru Technological
University: Kakinada (AP), Accredited by NBA and NAAC ‘A’ Grade)
2022-23
IV B.Tech. - I-Semester (AR17)
STUDENT LABORATORY MANUAL
for
STATISTICS WITH R-PROGRAMMING LAB
Prepared by
Dr. P. Mallikharjuna Rao,
Professor, B.S.&H.
Statistics with R - Programming Lab Manual(AR17)

RAGHU ENGINEERING COLLEGE
IV-B.Tech.,I-Sem.,(ME)-AR17
Lab Programs for Statistics with R-Programming
Week  Sub Exp.  Name of the Experiment
1               Installation of R in Windows and Linux environments
2     a         Write an R program to find measures of central tendency
      b         Write an R program to perform different operations on matrices
3     a         Write an R program to store data in a list and perform different operations
      b         Write an R program to store data in a data frame and perform different operations
4     a         Write an R program to find the biggest of three elements
      b         Write an R program to find the roots of a quadratic equation
      c         Write an R program to find the sum of the elements of a vector
5     a         Write an R program to find the factorial of a number using recursion
      b         Write an R program to find the gcd of two numbers using recursion
6     a         Write an R program to find the mean, variance and standard deviation of the given discrete probability distribution
      b         Write an R program to find the mean, variance and standard deviation of the given continuous probability distribution
7               Write an R program to represent the given data in the form of graphs using built-in functions
8     a         Write an R program to fit a Binomial distribution to the given data
      b         Write an R program to fit a Poisson distribution to the given data
9     a         Write an R program for Z test
      b         Write an R program for t test
      c         Write an R program for F test
      d         Write an R program for Chi-square test
10    a         Write an R program to fit a linear regression
      b         Write an R program to fit a multiple linear regression

WEEK #1
Experiment #1:
Installation of R software in Windows and Linux environments
Requirements Analysis:
Installation of R in Windows OS: The Comprehensive R Archive Network (CRAN)
is a network of websites that host the R program and mirror the original R website.
The benefit of having this network of mirrors is improved download speed. For all
intents and purposes, CRAN is the R website; it holds downloads (including old
versions of the software) and documentation. R can be installed on Windows Vista/7/8/10.
R releases before 4.2.0 support both 32-bit and 64-bit Windows; from R 4.2.0 onward only
64-bit builds are provided. Go to the CRAN website, select the latest installer
(R 4.2.1 for Windows at the time of writing) and download the .exe file. Right-click the
downloaded file and select ‘Run as administrator’ from the popup menu. Select the
language to be used for installation and follow the directions. By default, R is installed
under C:\Program Files\R. The steps for installing R:
1. Click on the link https://cran.r-project.org/bin/windows/base/ which takes
you to the download page.
2. Select the latest installer R-4.2.1 and download it. After the
download, clicking on the setup file opens the setup dialog box.
3. Clicking the ‘Next’ button starts the installation process and takes you to
the license window; select ‘Next’ again.
4. The installer then asks for the installation folder path.
Select the desired folder for installation; it is advisable to select the
C drive for smooth running of the program.
5. Next, select only the components you need for your
operating system, to avoid unwanted use of disk space.
6. In the next dialog box, select the Start menu folder. Here, it is better
to go with the default option given by the installer.
7. After setting up the Start menu folder, check the additional options for
completing the setup.
8. After clicking ‘Next’ in the previous step, the installation procedure ends and
the final window is displayed. Click ‘Finish’ to exit the installation window.
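Once the installer has finished, a quick check from the R console can confirm that the installation works. The lines below are a minimal sketch using only base R; the version string and installation path that are printed depend on the machine, and the variable names ver and check are chosen here only for illustration.

```r
# Version of R that is currently running (the text depends on the installation)
ver <- R.version.string
print(ver)

# Directory where R is installed
print(R.home())

# A trivial computation to confirm that expressions are evaluated
check <- sum(1:10)
print(check)   # 55
```

If all three lines print without an error, R is installed and ready for the experiments that follow.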

Installing and Configuring RStudio in Windows: RStudio is an Integrated Development
Environment (IDE) for R. It provides a variety of features such as an
editor with direct code execution and syntax highlighting, a console, tools for plotting
graphs, history lookup, debugging, and workspace management. RStudio
can be installed on Windows Vista/7/8/10
and can be configured within a few minutes. The basic requirement is R version 2.11.1
or later. The following are the steps to set up RStudio:
1) Download the latest version of RStudio from
https://www.rstudio.com/products/rstudio/download/ which takes you to the
download page. There are two versions of RStudio available, desktop and
server. Based on your usage and comfort, select the appropriate version to
initiate your download.
2) Download the .exe file and double click on it to initiate the installation.
3) Click on the ‘Next’ button, which takes you to the installation folder selection.
Select ‘C:\’ as your installation directory; keeping R and RStudio on the same
drive helps avoid path issues when running R programs.
4) Click ‘Next’ to continue; a dialog box asking you to select the Start menu
folder opens. It is advisable to create your own folder to avoid any possible
confusion. Then click on the ‘Install’ button to install RStudio.
When the installation completes, the final window is displayed. Click ‘Finish’ to exit the
installation window.
Installation of R in Ubuntu: Go to software center and search for R Base and install.
Then open terminal and enter R to get R command prompt in terminal.
Installation of R-studio in Ubuntu: Open terminal and type the following commands

WEEK #2
Experiment #2-A:
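Question: Write an R program to find the measures of central tendency (mean, median,
and mode).
Aim: R program to find mean of the given data
Code:
m=function()
{
  print("Enter the elements of vector:")
  x=scan()                 # reads numbers until a blank line is entered
  n=length(x)
  total=0                  # running sum (avoids reusing the name of sum())
  for(i in seq_len(n))     # seq_len(n) is empty when n is 0, unlike 1:n
  {
    total=total+x[i]
  }
  mean1=total/n
  cat("Mean of the vector is ",mean1)
}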
Output:
> m()
[1] "Enter the elements of vector:"
1: 1
2: 4
3: 6
4: 3
5: 5
6:
Read 5 items
Mean of the vector is 3.8
Experiment #2-B:
Aim: R program to find mean of the frequency distribution
Code:
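m1=function()
{
  x=scan()
  print(table(x))            # frequency table of the data
  f=as.numeric(table(x))     # frequencies
  x1=sort(unique(x))         # distinct values, in the same order as table(x)
  mean1=sum(f*x1)/sum(f)     # mean of the frequency distribution
  cat("Mean of the vector is ",mean1)
}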
Output:
> m1()
1: 1
2: 1
3: 1
4: 2
5: 3
6: 4
7: 5
8: 4
9: 6
10: 7
11:
Read 10 items
x
1 2 3 4 5 6 7
3 1 1 2 1 1 1
Mean of the vector is 3.4
Experiment #2-C:
Aim: R program to find median of the given data
Code:
med=function()
{
  x=scan()
  n=length(x)
  x1=sort(x)
  print(x1)
  if(n%%2==0)
  {
    # even n: average of the two middle values
    me=(x1[n/2]+x1[n/2+1])/2
  } else {
    # odd n: the single middle value
    me=x1[(n+1)/2]
  }
  cat("Median of the vector is ",me,"\n")
}
Output:
> med()
1: 1
2: 2
3: 3
4: 3
5: 4
6: 5
7: 6
8:
Read 7 items
[1] 1 2 3 3 4 5 6
Median of the vector is 3
Experiment #2-D:
Aim: R program to find mode of the given data
Code:
mod=function()
{
x=scan()
print(table(x))
f=as.numeric(table(x))
x1=sort(unique(x))
mf=max(f)
for(i in 1:length(f))
{
if(f[i]==mf)
cat("\nMode is ",x1[i])
}
}
Output:
> mod()
1: 1
2: 1
3: 2
4: 3
5: 4
6: 5
7: 5
8: 6
9: 6
10: 7
11: 7
12:
Read 11 items
x
1 2 3 4 5 6 7
2 1 1 1 2 2 2
Mode is 1
Mode is 5
Mode is 6
Mode is 7
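The results of Experiments 2-A, 2-C and 2-D can be cross-checked against R's built-in functions. The following is a minimal sketch: mean() and median() are part of base R, but base R has no function for the statistical mode, so the helper stat_mode is a hypothetical name introduced here for illustration only.

```r
# Data entered in Experiments 2-A, 2-C and 2-D
x_mean   <- c(1, 4, 6, 3, 5)
x_median <- c(1, 2, 3, 3, 4, 5, 6)
x_mode   <- c(1, 1, 2, 3, 4, 5, 5, 6, 6, 7, 7)

# Hypothetical helper: returns every value that attains the highest frequency
stat_mode <- function(x) {
  freq <- table(x)
  as.numeric(names(freq)[freq == max(freq)])
}

print(mean(x_mean))       # 3.8
print(median(x_median))   # 3
print(stat_mode(x_mode))  # 1 5 6 7
```

Returning all values with the highest frequency mirrors the loop in Experiment 2-D, which prints every mode when the data are multimodal.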
Experiment #2-E:
Aim: R program to perform different operations on matrices
Code:
read=function()
{
A=matrix(c(1:9),nrow=3,ncol=3,byrow=T)
B=matrix(c(10:18),nrow=3,ncol=3,byrow=T)
m1=nrow(A)
n1=ncol(A)
m2=nrow(B)
n2=ncol(B)
cat("Matrix A:\n")
print(A)
cat("Matrix B:\n")
print(B)
if(m1==m2 && n1==n2)
{
cat("Sum of the matrices is A+B=\n")
print(A+B)
} else
cat("\n Addition of matrices is not possible")
if(n1==m2)
{
cat("Product of the matrices is A*B=\n")
print(A%*%B)
} else
cat("\n Multiplication of matrices is not possible")
cat("Transpose of the Matrix A is:\n")
print(t(A))
cat("Transpose of the Matrix B is:\n")
print(t(B))
}
Output:
read()
Matrix A:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
Matrix B:
[,1] [,2] [,3]
[1,] 10 11 12
[2,] 13 14 15
[3,] 16 17 18

Sum of the matrices is A+B=
[,1] [,2] [,3]
[1,] 11 13 15
[2,] 17 19 21
[3,] 23 25 27
Product of the matrices is A*B=
[,1] [,2] [,3]
[1,] 84 90 96
[2,] 201 216 231
[3,] 318 342 366
Transpose of the Matrix A is:
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
Transpose of the Matrix B is:
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18

WEEK#3
Experiment #3-A:
Aim: R Program to create a list containing a vector, a matrix and a list and write
a code for the following.
# 1) Give names to the elements in the list
# 2) Add element at the end of the list
# 3) Remove the second element
Code:
# Creating a list
a=c(23,4,5,56)
b=matrix(data=1:9,nrow=3)
c=list(35,"ravi","Male")
lst=list(a,b,c)
print(lst)
# Giving names to the elements

print("Give names to the elements:")
names(lst)=c("vector","matrix","info")
print(lst)
#Adding element at the end of the list

print("Add element at the end of the list:")
lst[[4]]=c(1,2,3)
print(lst)
# Removing the second element of the list

cat("After removing the second element the list is:\n")
lst[[2]]=NULL
print(lst)
Output:
[[1]]
[1] 23 4 5 56
[[2]]
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
[[3]]
[[3]][[1]]
[1] 35
[[3]][[2]]
[1] "ravi"
[[3]][[3]]
[1] "Male"
[1] "Give names to the elements:"

$vector
[1] 23 4 5 56
$matrix
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
$info
$info[[1]]
[1] 35
$info[[2]]
[1] "ravi"
$info[[3]]
[1] "Male"
[1] "Add element at the end of the list:"

$vector
[1] 23 4 5 56
$matrix
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
$info
$info[[1]]
[1] 35
$info[[2]]
[1] "ravi"
$info[[3]]
[1] "Male"
[[4]]
[1] 1 2 3
After removing the second element the list is:

$vector
[1] 23 4 5 56
$info
$info[[1]]
[1] 35
$info[[2]]
[1] "ravi"
$info[[3]]
[1] "Male"
[[3]]
[1] 1 2 3
Experiment #3-B:
Aim: R program to create a data frame of student with four given vectors and write a
code
# 1) to get the structure of a given data frame.
# 2) to get the statistical summary and nature of the data of a given data frame.
# 3) to extract specific column from a data frame using column name.
# 4) to extract first two rows from a given data frame.
# 5) to extract 3rd and 5th rows with 1st and 3rd columns from a given data frame.
# 6) to add a new column in a given data frame.
# 7) to add new row(s) to an existing data frame.
# 8) to drop column(s) by name from a given data frame.
# 9) to drop row(s) by number from a given data frame.
# 10) to extract the records whose grade is greater than 9.
Code:
# creating a data frame
r.no=c("17981A0461","17981A0462","17981A0463","17981A0464","17981A0465","17981A0466")
name=c("ramu","ahmed","samuel","singh","begum","prasanthi")
grade=c(8.4,9.9,7.5,8.7,9.1,6.8)
sex=c("M","M","M","M","F","F")
# stringsAsFactors=TRUE reproduces the factor columns shown in the output;
# since R 4.0.0 the default is FALSE, which gives character columns instead
df_stud=data.frame(r.no,name,grade,sex,stringsAsFactors=TRUE)
print(df_stud)
# 1) Getting the structure of data frame
print("The structure of the data frame is :")
# str() prints the structure itself and returns NULL, which print() then shows
print(str(df_stud))
# 2) Statistical summary and nature of the data
print("The statistical summary and nature of the data is :")
print(summary(df_stud))
# 3) Extracting the column heading "name"
print("The list of names in the column 'name' are :")
print(df_stud$name)
# 4) Extracting first two rows of data frame
print("The first two rows of the data frame are:")
print(df_stud[1:2,])
# 5) Extracting 3rd and 5th rows with 1st and 3rd columns
print("The 3rd and 5th rows with 1st and 3rd columns are:")
print(df_stud[c(3,5),c(1,3)])
# 6) Adding new column to data frame
print("Adding new column named 'Date of Birth':")
df_stud$dob=c("14-01-1999","4-6-1999","8-12-1998","25-7-1999","20-9-1998","1-2-1999")
print(df_stud)
# 7) Adding new row to the existing data frame
print("Adding new row to the data frame:")
new_df_stud=data.frame(r.no="17981A467",name="lavanya",grade=8.9,sex="F",dob="4-6-7-1999")
# rbind() returns a new data frame; df_stud itself is left unchanged
print(rbind(df_stud,new_df_stud))
# 8) Dropping a column from the data frame
print("Dropping a column named r.no:")
df_stud$r.no=NULL
print(df_stud)
# 9) Dropping a row by number from the data frame
print("Dropping a row number 4 from the data frame:")
print(df_stud[-4,])
# 10) Subset of data frame
print("Data frame with grade>9")
print(subset(df_stud,grade>9))
Output:
r.no name grade sex
1 17981A0461 ramu 8.4 M
2 17981A0462 ahmed 9.9 M
3 17981A0463 samuel 7.5 M
4 17981A0464 singh 8.7 M
5 17981A0465 begum 9.1 F
6 17981A0466 prasanthi 6.8 F
[1] "The structure of the data frame is :"
'data.frame': 6 obs. of 4 variables:
$ r.no : Factor w/ 6 levels "17981A0461","17981A0462",..: 1 2 3 4 5 6
$ name : Factor w/ 6 levels "ahmed","begum",..: 4 1 5 6 2 3
$ grade: num 8.4 9.9 7.5 8.7 9.1 6.8
$ sex : Factor w/ 2 levels "F","M": 2 2 2 2 1 1
NULL
[1] "The statistical summary and nature of the data is :"
r.no name grade sex
17981A0461:1 ahmed :1 Min. :6.800 F:2
17981A0462:1 begum :1 1st Qu.:7.725 M:4
17981A0463:1 prasanthi:1 Median :8.550
17981A0464:1 ramu :1 Mean :8.400
17981A0465:1 samuel :1 3rd Qu.:9.000
17981A0466:1 singh :1 Max. :9.900
[1] "The list of names in the column 'name' are :"
[1] ramu ahmed samuel singh begum prasanthi
Levels: ahmed begum prasanthi ramu samuel singh
[1] "The first two rows of the data frame are:"
r.no name grade sex
1 17981A0461 ramu 8.4 M
2 17981A0462 ahmed 9.9 M
[1] "The 3rd and 5th rows with 1st and 3rd columns are:"
r.no grade
3 17981A0463 7.5
5 17981A0465 9.1
[1] "Adding new column named 'Date of Birth':"
r.no name grade sex dob
1 17981A0461 ramu 8.4 M 14-01-1999
2 17981A0462 ahmed 9.9 M 4-6-1999
3 17981A0463 samuel 7.5 M 8-12-1998
4 17981A0464 singh 8.7 M 25-7-1999
5 17981A0465 begum 9.1 F 20-9-1998
6 17981A0466 prasanthi 6.8 F 1-2-1999
[1] "Adding new row to the data frame:"
r.no name grade sex dob
1 17981A0461 ramu 8.4 M 14-01-1999
2 17981A0462 ahmed 9.9 M 4-6-1999
3 17981A0463 samuel 7.5 M 8-12-1998
4 17981A0464 singh 8.7 M 25-7-1999
5 17981A0465 begum 9.1 F 20-9-1998
6 17981A0466 prasanthi 6.8 F 1-2-1999
7 17981A467 lavanya 8.9 F 4-6-7-1999
[1] "Dropping a column named r.no:"
name grade sex dob
1 ramu 8.4 M 14-01-1999
2 ahmed 9.9 M 4-6-1999
3 samuel 7.5 M 8-12-1998
4 singh 8.7 M 25-7-1999
5 begum 9.1 F 20-9-1998
6 prasanthi 6.8 F 1-2-1999
[1] "Dropping a row number 4 from the data frame:"
name grade sex dob
1 ramu 8.4 M 14-01-1999
2 ahmed 9.9 M 4-6-1999
3 samuel 7.5 M 8-12-1998
5 begum 9.1 F 20-9-1998
6 prasanthi 6.8 F 1-2-1999
[1] "Data frame with grade>9"
name grade sex dob
2 ahmed 9.9 M 4-6-1999
5 begum 9.1 F 20-9-1998

WEEK#4
Experiment #4-A:
Aim: R program to find biggest of 3 numbers
Code:
big=function()
{
  x=as.numeric(readline("Enter x value:"))
  y=as.numeric(readline("Enter y value:"))
  z=as.numeric(readline("Enter z value:"))
  t=0
  # t holds the larger of x and y
  if(x>y)
    t=x else
    t=y
  # compare that value with z
  if(t>z)
    cat(t," is big") else
    cat(z," is big")
}
big()
Output:
Enter x value:2
Enter y value:1
Enter z value:5
5 is big
Experiment #4-B:
Aim: R program to find roots of a quadratic equation
Code:
roots=function()
{
  a=as.numeric(readline("Enter a value:"))
  b=as.numeric(readline("Enter b value:"))
  c=as.numeric(readline("Enter c value:"))
  t=b^2-(4*a*c)     # discriminant
  if(t<0)
  {
    cat("Roots are imaginary and roots are ",(-b/(2*a)),"+i",
        ((sqrt(-t))/(2*a)),"and",(-b/(2*a)),"-i",((sqrt(-t))/(2*a)))
  } else
  if(t==0)
  {
    cat("Roots are real and equal and root is ",(-b/(2*a)))
  } else
  {
    cat("Roots are real and unequal\n")
    cat("Root1=",(-b+sqrt(t))/(2*a),"\nRoot2=",(-b-sqrt(t))/(2*a))
  }
}
Output:
> roots()
Enter a value:1
Enter b value:4
Enter c value:1
Roots are real and unequal
Root1= -0.2679492
Root2= -3.732051
> roots()
Enter a value:1
Enter b value:2
Enter c value:1
Roots are real and equal and root is -1
> roots()
Enter a value:1
Enter b value:1
Enter c value:1
Roots are imaginary and roots are
-0.5 +i 0.8660254 and
-0.5 -i 0.8660254
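The roots printed above can be cross-checked with the base R function polyroot(), which finds the zeros of a polynomial numerically. This is only a sketch for verification, not a replacement for the program in the experiment; note that polyroot() expects the coefficients in increasing order of power, and the names r1 and r2 are chosen here for illustration.

```r
# polyroot() takes coefficients in increasing order of power: c + b*x + a*x^2
r1 <- polyroot(c(1, 4, 1))   # x^2 + 4x + 1 : real and unequal roots
r2 <- polyroot(c(1, 1, 1))   # x^2 + x + 1  : complex conjugate roots

print(r1)   # returned as complex numbers with (numerically) zero imaginary part
print(r2)
```

The real parts of r1 agree with Root1 and Root2 in the first run, and r2 agrees with -0.5 ± 0.8660254i in the third run.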
Experiment #4-C:
Aim: R program to find sum of elements of vector and to find minimum and maximum
elements of vectors
Code:
vec=function()
{
  x=scan()
  n=length(x)
  sum=0
  for(i in 1:n)
    sum=sum+x[i]
  max=min=x[1]      # start both from the first element
  for(i in 1:n)
  {
    if(x[i]<min)
      min=x[i]
    if(x[i]>max)
      max=x[i]
  }
  cat(" sum of vector elements=",sum,"\n","Minimum element of vector is:",min,"\n","Maximum element of vector is:",max,"\n")
}
Output:
1: -5
2: -4
3: 0
4: 1
5: 5
6:
Read 5 items
sum of vector elements= -3
Minimum element of vector is: -5
Maximum element of vector is: 5

WEEK#5
Experiment #5-A:
Aim: R program to find Factorial of a number using recursive function
Code:
fact=function()
{
  n=as.numeric(readline("Enter n value:"))
  f=fact1(n)
  if(n>=0)
    cat("Factorial of ",n," is ",f,"\n")
}
fact1=function(n)
{
  if(n>=0)
  {
    if(n==0)
      return(1) else
      return(n*fact1(n-1))   # recursive step: n! = n*(n-1)!
  } else
    print("Factorial of negative number is not possible to compute")
}
Output:
> fact()
Enter n value:-1
[1] "Factorial of negative number is not possible to compute"
> fact()
Enter n value:0
Factorial of 0 is 1
> fact()
Enter n value:8
Factorial of 8 is 40320
Experiment #5-B:
Aim: R program to find GCD of two numbers using recursion
Code:
gcd=function()
{
  x=as.numeric(readline("Enter x value:"))
  y=as.numeric(readline("Enter y value:"))
  g=gcd1(x,y)
  cat("GCD of ",x," and ",y," is ",g,"\n")
}
gcd1=function(x,y)
{
  # Euclid's algorithm: gcd(x,y) = gcd(y, x mod y), and gcd(x,0) = x
  if(y!=0)
    return(gcd1(y,x%%y))
  else
    return(x)
}
Output:
> gcd()
Enter x value:5
Enter y value:7
GCD of 5 and 7 is 1
> gcd()
Enter x value:125
Enter y value:35
GCD of 125 and 35 is 5
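For comparison with the recursive version, the same Euclidean algorithm can be written with a loop. The sketch below is an illustrative alternative, not part of the prescribed experiment; the function name gcd_iter is an assumption made here.

```r
# Iterative form of Euclid's algorithm (gcd_iter is an illustrative name)
gcd_iter <- function(x, y) {
  while (y != 0) {
    r <- x %% y   # remainder of x divided by y
    x <- y
    y <- r
  }
  x
}

print(gcd_iter(5, 7))      # 1
print(gcd_iter(125, 35))   # 5
```

Both calls reproduce the results of the recursive gcd1() shown in the output above.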

WEEK#6
Experiment #6-A:
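Aim: R program to find the mean, variance and standard deviation of the given discrete
probability distribution.
Code:
# DiscreteDistribution() comes from the distr package; E() and the var()/sd()
# methods for distribution objects come from distrEx, which also attaches distr
library(distrEx)
discrete=function()
{
  print("Enter the values of x")
  x=scan()
  print("Enter the values of p")
  p=scan()
  y=DiscreteDistribution(supp=x,prob=p)
  cat("Mean of the probability distribution is ",E(y))
  cat("\nVariance of the probability distribution is ",var(y))
  cat("\nStandard Deviation of the probability distribution is ",sd(y))
  # cumsum(p) gives the distribution function F(x) at each support point
  cat("\n The Distribution function is \n","x ",x,sep="\t","\n","F(x) ",cumsum(p))
}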
Output:
> discrete()
[1] "Enter the values of x"
1: 0
2: 1
3: 2
4:
Read 3 items
[1] "Enter the values of p"
1: 0.3
2: 0.5
3: 0.2
4:
Read 3 items
Mean of the probability distribution is 0.9
Variance of the probability distribution is 0.49
Standard Deviation of the probability distribution is 0.7
The Distribution function is
x 0 1 2
F(x) 0.3 0.8 1
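The same quantities can be obtained in base R directly from the definitions E(X) = Σ x·p(x) and Var(X) = E(X²) − [E(X)]², without any additional package. This is a minimal sketch using the values entered above; the variable names are chosen here for illustration.

```r
x <- c(0, 1, 2)          # values of the random variable
p <- c(0.3, 0.5, 0.2)    # corresponding probabilities

# A valid probability distribution must sum to 1
stopifnot(isTRUE(all.equal(sum(p), 1)))

mu     <- sum(x * p)              # E(X)
sigma2 <- sum(x^2 * p) - mu^2     # Var(X) = E(X^2) - [E(X)]^2
sigma  <- sqrt(sigma2)            # standard deviation

cat("Mean =", mu, " Variance =", sigma2, " SD =", sigma, "\n")
```

The values agree with the output of discrete(): mean 0.9, variance 0.49 and standard deviation 0.7.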
Experiment #6-B:
Aim: R program to find the mean, variance and standard deviation of the given continuous
probability distribution
# for the given probability density function f(x)=3*x^2,0<x<1
Code:
# AbscontDistribution() comes from the distr package; E() and the var()/sd()
# methods for distribution objects come from distrEx, which also attaches distr
library(distrEx)
contin=function()
{
  f=function(x) 3*x^2
  p=integrate(f,lower=0.14,upper=0.71)
  print("The probability of x lies between 0.14 to 0.71 is ")
  print(p)
  x=AbscontDistribution(d=f,low1 =0,up1 =1)
  cat("Mean of the probability distribution is ",E(x))
  cat("\nVariance of the probability distribution is ",var(x))
  cat("\nStandard Deviation of the probability distribution is ",sd(x))
}
Output:
> contin()
[1] "The probability of x lies between 0.14 to 0.71 is "
0.355167 with absolute error < 3.9e-15
Mean of the probability distribution is 0.7496337
Variance of the probability distribution is 0.03768305
Standard Deviation of the probability distribution is 0.1941212
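As a cross-check, the moments of f(x) = 3x² on (0, 1) can also be computed with integrate() from base R, using E(X) = ∫ x f(x) dx and Var(X) = E(X²) − [E(X)]². This is a sketch under the stated density; the exact values are 3/4 and 3/80 = 0.0375, so the slightly different figures in the output above presumably reflect the numerical approximation used when the distribution object is built.

```r
f <- function(x) 3 * x^2     # density on (0, 1)

total <- integrate(f, 0, 1)$value                       # total probability, should be 1
m1 <- integrate(function(x) x   * f(x), 0, 1)$value     # E(X)   = 3/4
m2 <- integrate(function(x) x^2 * f(x), 0, 1)$value     # E(X^2) = 3/5
v  <- m2 - m1^2                                         # Var(X) = 3/80

cat("Mean =", m1, " Variance =", v, " SD =", sqrt(v), "\n")
```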

WEEK#7
Experiment #7:
Aim: R Program to print data in different graph formats
Code:
#Scatter plot
plot(iris$Sepal.Length,iris$Sepal.Width,type="p")
#Histogram
par(mfrow=c(1,2))   # two plots side by side on one page
hist(iris$Sepal.Length,main="First")
hist(iris$Sepal.Width,main="Second")
par(mfrow=c(1,2))
hist(iris$Petal.Length,main="Third")
hist(iris$Petal.Width,main="Fourth")
par(mfrow=c(1,1))   # back to one plot per page
#Pie-chart
pie(table(iris$Species))
#Box-plot
boxplot(iris$Sepal.Length,iris$Sepal.Width,iris$Petal.Length,iris$Petal.Width)
#Bar-plot
barplot(head(iris$Sepal.Length),xlab="Sepal length")
Output:
[Plots not reproduced: scatter plot, histograms, pie chart, box plot and bar plot of the iris data]

WEEK#8
Experiment #8-A:
Aim: R program to fit Binomial distribution to the given data
Code:
fit_binom=function()
{
  n=as.integer(readline("enter the no. of coins tossed: "))
  x=0:n
  print("enter the values of f ")
  f=scan()
  ex_freq=0
  cat("\n The given Distribution is \n x",x,sep="\t","\n f",f,"\n")
  N=sum(f)
  if(length(x)==length(f))
  {
    a=as.logical(readline(prompt="Is the coin unbiased ? :Enter T for TRUE or F for FALSE \n"))
    if(a==T)
      p=0.5
    else
    {
      meen=sum(x*f)/N   # sample mean
      p=meen/n          # estimate of p, since the binomial mean is n*p
    }
    for(i in 1:(n+1))
      ex_freq[i]=N*(dbinom(i-1,n,p))
    cat("The expected frequencies are \n",ex_freq)
    cat("\n The fitted Binomial Distribution is \n x",x,sep="\t","\n f",round(ex_freq),"\n")
  }else
    print("No. of observations in x and f must be equal")
}
Output:
> fit_binom()
enter the no. of coins tossed: 3
[1] "enter the values of f "
1: 2
2: 4
3: 5
4: 6
5:
Read 4 items
The given Distribution is

x 0 1 2 3
f 2 4 5 6
Is the coin unbiased ? :Enter T for TRUE or F for FALSE
T
The expected frequencies are
2.125 6.375 6.375 2.125
The fitted Binomial Distribution is
x 0 1 2 3
f 2 6 6 2
Experiment #8-B:
Aim: R program to fit Poisson distribution to the given data
Code:
fit_poisson=function()
{
  print("enter the values of x:")
  x=scan()
  print("enter the values of f:")
  f=scan()
  cat("\n The Given Distribution is \nx:",x,sep="\t","\nf:",f,"\n")
  ex_freq=0
  N=sum(f)
  meen=sum(x*f)/N                    # sample mean, used as the Poisson parameter
  for(i in 1:length(x))
    ex_freq[i]=N*(dpois(x[i],meen))  # use the observed value x[i], not the index
  cat("The expected frequencies are \n",ex_freq)
  cat("\n The fitted Poisson Distribution is \nx:",x,sep="\t","\nf:",round(ex_freq),"\n")
}
Output:
[1] "enter the values of x:"
1: 0
2: 1
3: 2
4: 3
5: 4
6:
Read 5 items
[1] "enter the values of f:"
1: 10
2: 9
3: 8
4: 7
5: 6
6:
Read 5 items
The Given Distribution is

x: 0 1 2 3 4
f: 10 9 8 7 6
The expected frequencies are
6.950958 12.16418 10.64365 6.208798 2.716349
The fitted Poisson Distribution is
x: 0 1 2 3 4
f: 7 12 11 6 3
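Because dpois() is vectorised over its first argument, the expected frequencies can also be computed without a loop. The sketch below reuses the data entered above; the names lambda and expected are chosen here for illustration.

```r
x <- 0:4
f <- c(10, 9, 8, 7, 6)

N      <- sum(f)            # total frequency
lambda <- sum(x * f) / N    # sample mean, the estimate of the Poisson parameter

# dpois() accepts the whole vector x at once
expected <- N * dpois(x, lambda)

print(expected)
print(round(expected))
```

The two printed vectors should match the expected frequencies and the fitted distribution shown in the output above.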

WEEK#9
9-a Z-test
Experiment #9-A:
A manufacturer claims that the mean lifetime of a light bulb is more than 10,000 hrs. In
a sample of 30 light bulbs, it was found that they only last 9,900 hrs on average. Assume
that the population standard deviation is 120 hrs. At the 0.05 significance level, can we reject
the claim by the manufacturer?
Aim: To test the claim
H0: mu=10000
H1: mu>10000
Alpha=0.05=5%
Critical value from the z-table is 1.645
Code:
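xbar=9900
mu=10000
n=30
sigma=120
z=(xbar-mu)/(sigma/sqrt(n))
z
Output:
> z
[1] -4.564355
Conclusion: Since z=-4.56<1.645, the test statistic does not fall in the rejection region, so we do not reject the null hypothesis H0; the sample does not support the manufacturer's claim that the mean lifetime is more than 10,000 hrs.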
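Instead of reading the critical value from the z-table, it can be obtained in R with qnorm(), and the corresponding p-value with pnorm(). The sketch below follows the hypotheses exactly as stated in this experiment (H1: mu > 10000, a right-tailed test); the names z_crit, p_val and reject are chosen here for illustration.

```r
xbar <- 9900; mu0 <- 10000; n <- 30; sigma <- 120; alpha <- 0.05

z      <- (xbar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit <- qnorm(1 - alpha)                   # right-tail critical value, about 1.645
p_val  <- pnorm(z, lower.tail = FALSE)       # right-tail p-value for H1: mu > mu0

cat("z =", z, " critical value =", z_crit, " p-value =", p_val, "\n")

reject <- z > z_crit    # TRUE only if z falls in the right-tail rejection region
print(reject)           # FALSE
```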
Experiment #9-B:
Suppose the mean weight of king penguins found in an Antarctic colony last year was
15.4 kg. In a sample of 35 penguins at the same time this year in the same colony, the mean
penguin weight is 14.5 kg. Assume that the population standard deviation is 2.5 kg. At the
0.05 significance level, can we reject the null hypothesis that the mean penguin weight
does not differ from last year?
Aim: To test the claim at given level of significance using z-test

H0:mu=15.4
H1:mu≠15.4
Alpha=0.05=5%
Critical values from the z-table are ±1.96
Code:
xbar=14.5
mu=15.4
n=35
sigma=2.5
z=(xbar-mu)/(sigma/sqrt(n))
z
Output:
> z
[1] -2.129789
Conclusion: Since |z|=2.13>1.96 we reject null hypothesis H0
Experiment #9-C:
t-test
Consider the following data from the immer data set (first six rows shown):

  Loc Var    Y1    Y2
1  UF   M  81.0  80.7
2  UF   S 105.4  82.3
3  UF   V 119.7  80.4
4  UF   T 109.7  87.2
5  UF   P  98.3  84.2
6   W   M 146.6 100.4
Assume that the above data follow the normal distribution; find the 95% confidence
interval estimate of the difference between the mean barley yields of the years 1931
and 1932.
Aim: To test whether there is any significant difference between the mean barley yields of the
years 1931 and 1932
H0: μd = 0
H1: μd ≠ 0 (μd is the mean of the paired differences Y1 - Y2)
Alpha=0.05=5%
Critical value from the t-table is 2.045 (two-tailed, for the 29 degrees of freedom reported in the output)
Note: To get the critical value of t from the R console, type the following command
qt(1-(alpha/2),df=n-1)
Code:
# The immer data set (30 observations) ships with the MASS package;
# the six rows listed above are only its first rows, i.e. head(immer)
library(MASS)
t.test(immer$Y1,immer$Y2,paired=TRUE)
Output:
Paired t-test
data: immer$Y1 and immer$Y2
t = 3.324, df = 29, p-value = 0.002413
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
6.121954 25.704713
sample estimates:
mean of the differences
15.91333
Experiment #9-D:
Five measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4,
14.3, 14.6 milligrams per cigarette. Show that the difference between the mean of this
sample and the average tar claimed by the manufacturer, μ=14.0 mg/cigarette, is
significant at α=0.05.
Aim: To test claim using t-test

H0: μ = 14.0
H1: μ ≠ 14.0
Level of significance: Appropriate level of significance is 5% (given)
Inference: The tabulated value of t at 5% level of significance for 4 degrees of freedom
in a two tailed test is 2.776 [t(α/2, n-1) = t(0.05/2, 5-1) = t(0.025, 4) = 2.776]
Here, tcal = 5.6569 > t(α/2, n-1) = 2.776. So, we reject H0. Hence we conclude that μ ≠ 14.0
Code:
data<-c(14.5, 14.2, 14.4, 14.3, 14.6)
t.test(data,mu=14.0)
Output:
One Sample t-test
data: data
t = 5.6569, df = 4, p-value = 0.004813
alternative hypothesis: true mean is not equal to 14
95 percent confidence interval:
14.20368 14.59632
sample estimates:
mean of x
14.4
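The statistic reported by t.test() can be reproduced from the formula t = (x̄ − μ0) / (s / √n), and the tabulated value with qt(). The following sketch uses the five measurements of this experiment; the names tar, t_cal and t_crit are chosen here for illustration.

```r
tar <- c(14.5, 14.2, 14.4, 14.3, 14.6)   # measurements from the experiment
mu0 <- 14.0                              # value claimed by the manufacturer
n   <- length(tar)

t_cal  <- (mean(tar) - mu0) / (sd(tar) / sqrt(n))   # calculated t statistic
t_crit <- qt(1 - 0.05 / 2, df = n - 1)              # two-tailed critical value, 4 df

cat("t =", t_cal, " critical value =", t_crit, "\n")
print(abs(t_cal) > t_crit)   # TRUE, so H0 is rejected
```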
Experiment #9-E:
The heights of 6 randomly chosen sailors are 63,65,68,69,71,72 inches. Those of 10
randomly chosen soldiers are 61,62,65,66,69,69,70,71,72,73 inches. Discuss whether
this data gives a suggestion that the sailors are taller than soldiers.
Aim: To test the claim that sailors are taller than soldiers
H0: μx = μy
H1: μx > μy
Level of significance: Appropriate level of significance is 5% (chosen)
The tabulated value of t at 5% level of significance for 14 degrees of freedom in a right
tailed test is 1.761. [t(α, n1+n2-2) = t(0.05, 14) = 1.761]
Code:
sailors<-c(63,65,68,69,71,72)
soldiers<-c(61,62,65,66,69,69,70,71,72,73)
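# t.test() performs Welch's test by default, hence the fractional df in the output;
# add var.equal=TRUE for the pooled-variance test with n1+n2-2 = 14 df
t.test(sailors,soldiers, alternative = "greater", conf.level = 0.95)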
Output:
Welch Two Sample t-test
data: sailors and soldiers
t = 0.10388, df = 12.228, p-value = 0.4595
alternative hypothesis: true difference in means is greater than 0
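95 percent confidence interval:
-3.226071 Inf
sample estimates:
mean of x mean of y
68.0 67.8
Conclusion: Since the p-value 0.4595 > 0.05, we do not reject H0; the data do not suggest that sailors are taller than soldiers.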
Experiment #9-F:
Random samples from two normal populations are given below
Sample 1 16 26 27 23 24 22
Sample 2 33 42 35 32 28 31
Do the population variances differ significantly?
Aim: To check whether the population variances differ significantly

H0: σx² = σy²
H1: σx² ≠ σy²
Level of significance: Appropriate level of significance is 5% (chosen)
The table value of F at 5% L.O.S for (5,5) d.f is 5.05.
Code:
data1<-c(16,26,27,23,24,22)
data2<-c(33,42,35,32,28,31)
F<-var.test(data1,data2)
F
Output:
F test to compare two variances
data: data1 and data2
F = 0.6696, num df = 5, denom df = 5, p-value = 0.6706
alternative hypothesis: true ratio of variances is not equal to 1
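95 percent confidence interval:
0.09369826 4.78524246
sample estimates:
ratio of variances
0.6696035
Conclusion: Since the p-value 0.6706 > 0.05, we do not reject H0; the population variances do not differ significantly.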
Experiment #9-G:
In a large manufacturing factory, a survey was conducted regarding three types of bonus
schemes. The employees were divided into four categories, namely laborers, clerks,
technicians and executives. The results of the opinion survey are presented
in the form of a contingency table as given below. Test the goodness of fit at 5% level of
significance.

EMPLOYEES          BONUS SCHEMES
CATEGORY           Type 1   Type 2   Type 3
Labour               190      243      197
Clerks                82       44       44
Technicians           23       78       34
Executives             5       12        8
Aim: To test goodness of fit at given level of significance.

H0: Factors in the contingency table are independent.
H1: Factors in the contingency table are dependent.
Level of significance is 5% (chosen)

EMPLOYEES          BONUS SCHEMES
CATEGORY           Type 1   Type 2   Type 3   TOTAL
Labour               190      243      197     630
  Expected Count     196.9    247.4    185.7
Clerks                82       44       44     170
Technicians           23       78       34     135
Executives             5       12        8      25
Total                300      377      283     960
Code:
M<-as.table(rbind(c(190,243,197),c(82,44,44),c(23,78,34),c(5,12,8)))
dimnames(M)<-list(empcategory=c("labour","clerks","technicians","executives"),
bonuschemes=c("type 1","type 2","type 3"))
xsq<-chisq.test(M)
xsq
Output:
Pearson's Chi-squared test
data: M
X-squared = 48.101, df = 6, p-value = 1.128e-08
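Conclusion:
The calculated value of χ² = 48.101
The table value of χ²(0.05, 6) = 12.59
Since 48.101 > 12.59, we reject H0 and conclude that the factors in the contingency table are dependent.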
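The expected counts in the table (for example 196.9 for Labour, Type 1) follow from the rule expected = (row total × column total) / grand total. The sketch below computes all twelve expected counts and the χ² statistic by hand in base R, and obtains the table value with qchisq(); the names expected, chi_sq and crit are chosen here for illustration.

```r
M <- rbind(c(190, 243, 197),   # Labour
           c( 82,  44,  44),   # Clerks
           c( 23,  78,  34),   # Technicians
           c(  5,  12,   8))   # Executives

# Expected count = (row total x column total) / grand total
expected <- outer(rowSums(M), colSums(M)) / sum(M)

# Pearson chi-square statistic and the 5% critical value with (4-1)*(3-1) = 6 df
chi_sq <- sum((M - expected)^2 / expected)
crit   <- qchisq(0.95, df = (nrow(M) - 1) * (ncol(M) - 1))

print(round(expected, 1))
cat("Chi-square =", chi_sq, " critical value =", crit, "\n")
```

The statistic computed this way should agree with the X-squared value reported by chisq.test(M).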

WEEK#10
Experiment #10-A:
Find the Karl Pearson’s correlation coefficient for the given data
X 16 21 26 23 28 24 17 22 21
Y 33 38 50 39 52 47 35 43 41
Aim:To find the correlation coefficient for the given data
Code:
x<-c(16,21,26,23,28,24,17,22,21)
y<-c(33,38,50,39,52,47,35,43,41)
cor(x,y)
Output:
[1] 0.9471715
Experiment #10-B:
Find the Karl Pearson’s correlation coefficient for the following data on
heights(inches) of fathers (x) and their sons(y)
X 65 66 67 67 68 69 70 72
Y 67 68 65 68 72 72 69 71
Aim:To find the correlation coefficient for the given data
Code:
x<-c(65,66,67,67,68,69,70,72)
y<-c(67,68,65,68,72,72,69,71)
cor(x,y)
Output:
[1] 0.6030227
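The value returned by cor() can be verified from Karl Pearson's formula r = Σ(x − x̄)(y − ȳ) / √[Σ(x − x̄)² · Σ(y − ȳ)²]. The sketch below applies it to the father and son heights of this experiment; the names dx, dy and r are chosen here for illustration.

```r
x <- c(65, 66, 67, 67, 68, 69, 70, 72)   # heights of fathers
y <- c(67, 68, 65, 68, 72, 72, 69, 71)   # heights of sons

dx <- x - mean(x)   # deviations from the mean of x
dy <- y - mean(y)   # deviations from the mean of y

r <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
print(r)
```

Here Σdx·dy = 24, Σdx² = 36 and Σdy² = 44, so r = 24 / √1584, which matches the output of cor(x,y) above.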
Experiment #10-C:
Fit a linear regression of y on x for the following data
X 1 2 3 4 5 6 7 8 9
y 11 12 13 14 15 16 17 18 19
Aim:To fit a linear regression equation of y on x
Code:
x=c(1:9)
y=c(11:19)
lm(y~x)
summary(lm(y~x))
Output:
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
10 1
> summary(lm(y~x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-9.006e-16 -2.472e-16 -2.031e-16 -1.370e-16 1.724e-15
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.000e+01 5.784e-16 1.729e+16 <2e-16 ***
x 1.000e+00 1.028e-16 9.729e+15 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.962e-16 on 7 degrees of freedom

Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 9.465e+31 on 1 and 7 DF, p-value: < 2.2e-16
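Once the model is fitted, the coefficients can be extracted with coef() and used for prediction with predict(). This is a minimal sketch on the same data; the value x = 10 used for prediction is an assumed example and is not part of the experiment.

```r
x <- 1:9
y <- 11:19

fit <- lm(y ~ x)    # regression of y on x
b   <- coef(fit)    # intercept and slope
print(b)

# Predicted y for a new x (x = 10 is an assumed example value)
pred <- predict(fit, newdata = data.frame(x = 10))
print(pred)
```

Since the data lie exactly on the line y = 10 + x, the fitted equation is y = 10 + 1·x and the predicted value at x = 10 is 20.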
Experiment #10-D:
Fit a multiple linear regression using the iris data, taking Sepal.Length as the response
variable
Code:
lm(iris$Sepal.Length~(iris$Sepal.Width+iris$Petal.Length+iris$Petal.Width))
Output:
Call:
lm(formula = iris$Sepal.Length ~ (iris$Sepal.Width + iris$Petal.Length +
iris$Petal.Width))
Coefficients:
(Intercept) iris$Sepal.Width iris$Petal.Length
1.8560 0.6508 0.7091
iris$Petal.Width
-0.5565