Statistics With R-Programming Lab Manual

1) R programs were written to calculate measures of central tendency (mean, median, mode) for a dataset. Functions were defined to calculate the mean, median and mode of a vector as well as the mean of a frequency distribution. 2) An R program was written to perform basic operations on matrices such as defining matrices, printing matrices, and obtaining their dimensions. Matrices A and B were defined and printed, and their row and column dimensions were obtained. 3) Functions were used to modularize the code and make it reusable.

Uploaded by

vijay nagireddy

RAGHU ENGINEERING COLLEGE

Dakamarri(vill), Bheemunipatnam Mandal,


Visakhapatnam Dist, Andhra Pradesh, PIN 531162
(Approved by AICTE, New Delhi, and Affiliated to Jawaharlal Nehru Technological
University: Kakinada (AP), Accredited by NBA and NAAC ‘A’ Grade)

2022-23

IV B.Tech. - I-Semester (AR17)

STUDENT LABORATORY MANUAL

for

STATISTICS WITH R-PROGRAMMING LAB

Prepared by

Dr. P. Mallikharjuna Rao,

Professor, B.S.&H.

Statistics with R - Programming Lab Manual(AR17)


RAGHU ENGINEERING COLLEGE
IV-B.Tech.,I-Sem.,(ME)-AR17
Lab Programs for Statistics with R-Programming

Week  Sub Exp.  Name of the Experiment
1               Installation of R in Windows and Linux environments
2     a         Write an R program to find measures of central tendency
      b         Write an R program to perform different operations on matrices
3     a         Write an R program to store data in a list and perform different operations
      b         Write an R program to store data in a data frame and perform different operations
4     a         Write an R program to find the biggest of three elements
      b         Write an R program to find the roots of a quadratic equation
      c         Write an R program to find the sum of the elements of a vector
5     a         Write an R program to find the factorial of a number using recursion
      b         Write an R program to find the GCD of two numbers using recursion
6     a         Write an R program to find the mean, variance, and standard deviation of a given discrete probability distribution
      b         Write an R program to find the mean, variance, and standard deviation of a given continuous probability distribution
7               Write an R program to represent the given data in the form of graphs using built-in functions
8     a         Write an R program to fit a Binomial distribution to the given data
      b         Write an R program to fit a Poisson distribution to the given data
9     a         Write an R program for the Z test
      b         Write an R program for the t test
      c         Write an R program for the F test
      d         Write an R program for the Chi-square test
10    a         Write an R program to fit a linear regression
      b         Write an R program to fit a multiple linear regression

WEEK #1
Experiment #1:
Installation of R software in Windows and Linux environments
Requirements Analysis:
Installation of R in Windows OS: The Comprehensive R Archive Network (CRAN)
is a network of websites that host the R program and mirror the original R website.
The benefit of this network is improved download speeds. For all intents and
purposes, CRAN is the R website: it holds downloads (including old versions of the
software) and documentation. R can be installed on Windows Vista/7/8/10 and
supports both the 32-bit and 64-bit versions. Go to the CRAN website, select the
latest installer (R 4.2.1 for Windows), and download the .exe file. Right-click the
downloaded file and select 'Run as administrator' from the popup menu. Select the
language to be used for installation and follow the directions. By default, the
installation folder for R is C:\Program Files\R. The steps for installing R are:
1. Open https://cran.r-project.org/bin/windows/base/ to reach the download page.
2. Select the latest installer, R-4.2.1, and download it. After the download,
click the setup file to open the installer dialog box.
3. Click the ‘Next’ button to start the installation. The license window appears;
click ‘Next’ again.
4. The installer then asks for the installation folder path. Select the desired
folder for installation; it is advisable to select the C drive for smooth running
of the program.
5. Next, select the components to install based on the requirements of your
operating system, to avoid unnecessary use of disk space.
6. In the next dialog box, select the Start menu folder. Here, it is better
to go with the default option given by the installer.
7. After setting up the Start menu folder, check the additional options for
completing the setup.
8. Click ‘Next’. When the installation procedure ends, the completion window is
displayed. Click ‘Finish’ to exit the installation window.

Installing and Configuring R-Studio in Windows: R Studio is the Integrated
Development Environment (IDE) for R. It provides a variety of features such as an
editor with direct code execution and syntax highlighting, a console, tools for plotting
graphs, history lookup, debugging, and an environment for workspace creation. R
Studio can be installed on any of the Windows platforms such as Windows Vista/7/8/10
and can be configured within a few minutes. It requires R version 2.11.1 or later.
The following are the steps involved in setting up R Studio:

1) Download the latest version of R Studio from
https://www.rstudio.com/products/rstudio/download/, which takes you to the
download page. There are two versions of R Studio available: desktop and
server. Based on your usage and comfort, select the appropriate version to
initiate your download.
2) Download the .exe file and double click on it to initiate the installation.
3) Click on the ‘Next’ button; the installer asks you to select the installation folder.
Select ‘C:\’ as your installation directory; installing R and R Studio in the same
location helps avoid path issues when running R programs.
4) Click ‘Next’ to continue; a dialog box opens asking you to select the Start menu
folder. It is advisable to create your own folder to avoid any possible
confusion. Then click on the Install button to install R Studio.
When the installation completes, the final window is displayed. Click ‘Finish’ to exit
the installation window.

Installation of R in Ubuntu: Open the Software Center, search for R Base, and install it.
Then open a terminal and enter R to get the R command prompt.
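
Once the R prompt appears, a few lines typed at the prompt can confirm that the installation works. This is only a minimal sketch; the variable name ok is illustrative and not part of the manual.

```r
# Report which R is running and where it is installed
print(R.version.string)
print(R.home())

# The base package 'stats' ships with every R installation,
# so it should always be available
ok <- requireNamespace("stats", quietly = TRUE)
print(ok)
```

If the last line prints TRUE, the core packages are in place and the experiments that follow can be run.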
Installation of R-studio in Ubuntu: Open terminal and type the following commands

WEEK #2
Experiment #2-A:
Question: Write an R program to find the measures of central tendency (mean, median,
and mode).

Aim: R program to find mean of the given data

Code:
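m=function()
{
  print("Enter the elements of vector:")
  x=scan()                   # read numbers until a blank line is entered
  n=length(x)
  total=0                    # running sum (avoids masking the built-in sum())
  for(i in seq_len(n))
  {
    total=total+x[i]
  }
  mean1=total/n
  cat("Mean of the vector is ",mean1)
}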

Output:
> m()
[1] "Enter the elements of vector:"
1: 1
2: 4
3: 6
4: 3
5: 5
6:
Read 5 items
Mean of the vector is 3.8

Experiment #2-B:

Aim: R program to find mean of the frequency distribution

Code:
m1=function()
{
  print("Enter the elements of vector:")
  x=scan()
  print(table(x))            # frequency table of the distinct values
  f=as.numeric(table(x))     # frequencies
  x1=sort(unique(x))         # distinct values, in the same order as the table
  mean1=sum(f*x1)/sum(f)     # mean of a frequency distribution
  cat("Mean of the vector is ",mean1)
}

Output:
> m1()
[1] "Enter the elements of vector:"
1: 1
2: 1
3: 1
4: 2
5: 3
6: 4
7: 5
8: 4
9: 6
10: 7
11:
Read 10 items
x
1 2 3 4 5 6 7
3 1 1 2 1 1 1
Mean of the vector is 3.4
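
The result above can be cross-checked with built-in functions. The following is a minimal sketch using the same ten values; the names freq, values and fd_mean are illustrative and do not appear in the manual.

```r
x <- c(1, 1, 1, 2, 3, 4, 5, 4, 6, 7)   # values from the sample run

freq <- table(x)                        # frequency of each distinct value
values <- as.numeric(names(freq))       # the distinct values themselves

# Mean of the frequency distribution: sum(f * x) / sum(f)
fd_mean <- weighted.mean(values, as.numeric(freq))
print(fd_mean)

# It agrees with the ordinary mean of the raw data
print(mean(x))
```

Both lines print the same value because grouping equal observations does not change their total.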

Experiment #2-C:

Aim: R program to find median of the given data

Code:
med=function()
{
  print("Enter the elements of vector:")
  x=scan()
  n=length(x)
  x1=sort(x)
  print(x1)
  if(n%%2==0)
  {
    me=(x1[n/2]+x1[n/2+1])/2   # even n: average of the two middle values
  } else {
    me=x1[(n+1)/2]             # odd n: the single middle value
  }
  cat("Median of the vector is ",me,"\n")
}

Output:
> med()
[1] "Enter the elements of vector:"
1: 1
2: 2
3: 3
4: 3
5: 4
6: 5
7: 6
8:
Read 7 items
[1] 1 2 3 3 4 5 6
Median of the vector is 3

Experiment #2-D:
Aim: R program to find mode of the given data

Code:
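mod=function()
{
  print("Enter the elements of vector:")
  x=scan()
  print(table(x))
  f=as.numeric(table(x))     # frequencies
  x1=sort(unique(x))         # distinct values
  mf=max(f)                  # highest frequency
  for(i in 1:length(f))
  {
    if(f[i]==mf)             # every value with the highest frequency is a mode
      cat("\nMode is ",x1[i])
  }
}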

Output:
mod()
[1] "Enter the elements of vector:"
1: 1
2: 1
3: 2
4: 3
5: 4
6: 5
7: 5
8: 6
9: 6
10: 7
11: 7
12:
Read 11 items
x
1 2 3 4 5 6 7
2 1 1 1 2 2 2

Mode is 1
Mode is 5
Mode is 6
Mode is 7
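
The loop in mod() can also be written without an explicit loop. The sketch below uses the same eleven values; freq and modes are illustrative names.

```r
x <- c(1, 1, 2, 3, 4, 5, 5, 6, 6, 7, 7)   # values from the sample run

freq <- table(x)
# Keep every distinct value whose frequency equals the highest frequency
modes <- as.numeric(names(freq)[freq == max(freq)])
print(modes)
```

Like the loop version, this reports all modes when the data are multimodal.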

Experiment #2-E:
Aim: R program to perform different operations on matrices

Code:
read=function()
{
  A=matrix(c(1:9),nrow=3,ncol=3,byrow=T)
  B=matrix(c(10:18),nrow=3,ncol=3,byrow=T)
  m1=nrow(A)
  n1=ncol(A)
  m2=nrow(B)
  n2=ncol(B)
  cat("Matrix A:\n")
  print(A)
  cat("Matrix B:\n")
  print(B)
  # Addition needs matrices of the same order
  if(m1==m2 && n1==n2)
  {
    cat("Sum of the matrices is A+B=\n")
    print(A+B)
  } else
    cat("\n Addition of matrices is not possible")
  # Multiplication needs columns of A equal to rows of B
  if(n1==m2)
  {
    cat("Product of the matrices is A*B=\n")
    print(A%*%B)
  } else
    cat("\n Multiplication of matrices is not possible")
  cat("Transpose of the Matrix A is:\n")
  print(t(A))
  cat("Transpose of the Matrix B is:\n")
  print(t(B))
}

Output:

read()
Matrix A:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
Matrix B:
[,1] [,2] [,3]
[1,] 10 11 12
[2,] 13 14 15
[3,] 16 17 18
Sum of the matrices is A+B=
[,1] [,2] [,3]
[1,] 11 13 15
[2,] 17 19 21
[3,] 23 25 27
Product of the matrices is A*B=
[,1] [,2] [,3]
[1,] 84 90 96
[2,] 201 216 231
[3,] 318 342 366
Transpose of the Matrix A is:
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
Transpose of the Matrix B is:
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18

WEEK#3

Experiment #3-A:

Aim: R Program to create a list containing a vector, a matrix and a list and write
a code for the following.
# 1) Give names to the elements in the list
# 2) Add element at the end of the list
# 3) Remove the second element

Code:
# Creating a list
a=c(23,4,5,56)
b=matrix(data=1:9,nrow=3)
c=list(35,"ravi","Male")
lst=list(a,b,c)
print(lst)

# Giving names to the elements


print("Give names to the elements:")
names(lst)=c("vector","matrix","info")
print(lst)

#Adding element at the end of the list


print("Add element at the end of the list:")
lst[[4]]=c(1,2,3)
print(lst)

# Removing the second element of the list


cat("After removing the second element the list is:\n")
lst[[2]]=NULL
print(lst)

Output:
[[1]]
[1] 23 4 5 56

[[2]]
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

[[3]]
[[3]][[1]]
[1] 35

[[3]][[2]]
[1] "ravi"

[[3]][[3]]
[1] "Male"

[1] "Give names to the elements:"


$vector
[1] 23 4 5 56

$matrix
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

$info
$info[[1]]
[1] 35

$info[[2]]
[1] "ravi"

$info[[3]]
[1] "Male"

[1] "Add element at the end of the list:"


$vector
[1] 23 4 5 56

$matrix
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

$info
$info[[1]]
[1] 35

$info[[2]]
[1] "ravi"

$info[[3]]
[1] "Male"

[[4]]
[1] 1 2 3

After removing the second element the list is:


$vector
[1] 23 4 5 56

$info
$info[[1]]
[1] 35

$info[[2]]
[1] "ravi"

$info[[3]]
[1] "Male"

[[3]]
[1] 1 2 3

Experiment #3-B:
Aim: R program to create a data frame of student with four given vectors and write a
code
# 1) to get the structure of a given data frame.
# 2) to get the statistical summary and nature of the data of a given data frame.
# 3) to extract specific column from a data frame using column name.
# 4) to extract first two rows from a given data frame.
# 5) to extract 3rd and 5th rows with 1st and 3rd columns from a given data frame.
# 6) to add a new column in a given data frame.
# 7) to add new row(s) to an existing data frame.
# 8) to drop column(s) by name from a given data frame.
# 9) to drop row(s) by number from a given data frame.
# 10) to extract the records whose grade is greater than 9.

Code:
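# creating a data frame
r.no=c("17981A0461","17981A0462","17981A0463","17981A0464","17981A0465","17981A0466")
name=c("ramu","ahmed","samuel","singh","begum","prasanthi")
grade=c(8.4,9.9,7.5,8.7,9.1,6.8)
sex=c("M","M","M","M","F","F")
# stringsAsFactors=TRUE reproduces the factor columns shown in the output;
# since R 4.0 the default is FALSE and text columns stay as character
df_stud=data.frame(r.no,name,grade,sex,stringsAsFactors=TRUE)
print(df_stud)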

# 1) Getting the structure of data frame


print("The structure of the data frame is :")
print(str(df_stud))

# 2) Statistical summary and nature of the data


print("The statistical summary and nature of the data is :")
print(summary(df_stud))

# 3) Extracting the column heading "name"
print("The list of names in the column 'name' are :")
print(df_stud$name)

# 4) Extracting first two rows of data frame


print("The first two rows of the data frame are:")
print(df_stud[1:2,])

# 5) Extracting 3rd and 5th rows with 1st and 3rd columns
print("The 3rd and 5th rows with 1st and 3rd columns are:")
print(df_stud[c(3,5),c(1,3)])

# 6) Adding new column to data frame


print("Adding new column named 'Date of Birth':")
df_stud$dob=c("14-01-1999","4-6-1999","8-12-1998","25-7-1999","20-9-1998","1-2-1999")
print(df_stud)

# 7) Adding new row to the existing data frame


print("Adding new row to the data frame:")
new_df_stud=data.frame(r.no="17981A467",name="lavanya",grade=8.9,sex="F",dob="4-6-7-1999")
print(rbind(df_stud,new_df_stud))

# 8) Dropping a column from the data frame


print("Dropping a column named r.no:")
df_stud$r.no=NULL
print(df_stud)

# 9) Dropping a row by number from the data frame


print("Dropping a row number 4 from the data frame:")
print(df_stud[-4,])

# 10) Subset of data frame


print("Data frame with grade>9")
print(subset(df_stud,grade>9))

Output:
r.no name grade sex
1 17981A0461 ramu 8.4 M
2 17981A0462 ahmed 9.9 M
3 17981A0463 samuel 7.5 M
4 17981A0464 singh 8.7 M
5 17981A0465 begum 9.1 F
6 17981A0466 prasanthi 6.8 F
[1] "The structure of the data frame is :"
'data.frame': 6 obs. of 4 variables:
$ r.no : Factor w/ 6 levels "17981A0461","17981A0462",..: 1 2 3 4 5 6
$ name : Factor w/ 6 levels "ahmed","begum",..: 4 1 5 6 2 3
$ grade: num 8.4 9.9 7.5 8.7 9.1 6.8
$ sex : Factor w/ 2 levels "F","M": 2 2 2 2 1 1
NULL
[1] "The statistical summary and nature of the data is :"
r.no name grade sex
17981A0461:1 ahmed :1 Min. :6.800 F:2
17981A0462:1 begum :1 1st Qu.:7.725 M:4
17981A0463:1 prasanthi:1 Median :8.550
17981A0464:1 ramu :1 Mean :8.400
17981A0465:1 samuel :1 3rd Qu.:9.000
17981A0466:1 singh :1 Max. :9.900
[1] "The list of names in the column 'name' are :"
[1] ramu ahmed samuel singh begum prasanthi
Levels: ahmed begum prasanthi ramu samuel singh
[1] "The first two rows of the data frame are:"
r.no name grade sex
1 17981A0461 ramu 8.4 M
2 17981A0462 ahmed 9.9 M
[1] "The 3rd and 5th rows with 1st and 3rd columns are:"
r.no grade
3 17981A0463 7.5
5 17981A0465 9.1
[1] "Adding new column named 'Date of Birth':"
r.no name grade sex dob
1 17981A0461 ramu 8.4 M 14-01-1999
2 17981A0462 ahmed 9.9 M 4-6-1999
3 17981A0463 samuel 7.5 M 8-12-1998
4 17981A0464 singh 8.7 M 25-7-1999
5 17981A0465 begum 9.1 F 20-9-1998
6 17981A0466 prasanthi 6.8 F 1-2-1999
[1] "Adding new row to the data frame:"
r.no name grade sex dob
1 17981A0461 ramu 8.4 M 14-01-1999
2 17981A0462 ahmed 9.9 M 4-6-1999
3 17981A0463 samuel 7.5 M 8-12-1998
4 17981A0464 singh 8.7 M 25-7-1999
5 17981A0465 begum 9.1 F 20-9-1998
6 17981A0466 prasanthi 6.8 F 1-2-1999
7 17981A467 lavanya 8.9 F 4-6-7-1999
[1] "Dropping a column named r.no:"
name grade sex dob
1 ramu 8.4 M 14-01-1999
2 ahmed 9.9 M 4-6-1999
3 samuel 7.5 M 8-12-1998
4 singh 8.7 M 25-7-1999
5 begum 9.1 F 20-9-1998
6 prasanthi 6.8 F 1-2-1999
[1] "Dropping a row number 4 from the data frame:"
name grade sex dob
1 ramu 8.4 M 14-01-1999
2 ahmed 9.9 M 4-6-1999
3 samuel 7.5 M 8-12-1998
5 begum 9.1 F 20-9-1998
6 prasanthi 6.8 F 1-2-1999
[1] "Data frame with grade>9"
name grade sex dob
2 ahmed 9.9 M 4-6-1999
5 begum 9.1 F 20-9-1998

WEEK#4

Experiment #4-A:
Aim:R program to find biggest of 3 numbers

Code:
big=function()
{
  x=as.numeric(readline("Enter x value:"))
  y=as.numeric(readline("Enter y value:"))
  z=as.numeric(readline("Enter z value:"))
  t=0
  # t holds the bigger of x and y
  if(x>y)
    t=x else
    t=y
  # compare that with z
  if(t>z)
    cat(t," is big") else
    cat(z," is big")
}
big()

Output:
Enter x value:2
Enter y value:1
Enter z value:5
5 is big

Experiment #4-B:
Aim: R program to find roots of a quadratic equation

Code:
roots=function()
{
  a=as.numeric(readline("Enter a value:"))
  b=as.numeric(readline("Enter b value:"))
  c=as.numeric(readline("Enter c value:"))
  t=b^2-(4*a*c)                    # discriminant
  if(t<0)
  {
    # complex conjugate roots: -b/(2a) +/- i*sqrt(-t)/(2a)
    cat("Roots are imaginary and roots are ",(-b/(2*a)),"+i",
        ((sqrt(-t))/(2*a)),"and",(-b/(2*a)),"-i",((sqrt(-t))/(2*a)))
  } else
  if(t==0)
  {
    cat("Roots are real and equal and root is ",(-b/(2*a)))
  } else
  {
    cat("Roots are real and unequal\n")
    cat("Root1=",(-b+sqrt(t))/(2*a),"\nRoot2=",(-b-sqrt(t))/(2*a))
  }
}

Output:
> roots()
Enter a value:1
Enter b value:4
Enter c value:1
Roots are real and unequal
Root1= -0.2679492
Root2= -3.732051
> roots()
Enter a value:1
Enter b value:2
Enter c value:1
Roots are real and equal and root is -1
> roots()
Enter a value:1
Enter b value:1
Enter c value:1
Roots are imaginary and roots are -0.5 +i 0.8660254 and -0.5 -i 0.8660254
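
The three sample runs can be cross-checked with the built-in polyroot() function, which returns the roots as complex numbers. This is a sketch for verification only; r1 and r2 are illustrative names.

```r
# polyroot() takes the coefficients in increasing order of power: c, b, a
r1 <- polyroot(c(1, 4, 1))   # x^2 + 4x + 1, real and unequal roots
r2 <- polyroot(c(1, 1, 1))   # x^2 + x + 1, complex roots

print(r1)
print(r2)

# Real parts of r1 are the two real roots; their imaginary parts are ~0
print(sort(Re(r1)))
```

The real and imaginary parts of r2 match the -0.5 and 0.8660254 printed by roots().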

Experiment #4-C:
Aim: R program to find sum of elements of vector and to find minimum and maximum
elements of vectors

Code:
vec=function()
{
  print("Enter the elements of vector:")
  x=scan()
  n=length(x)
  sum=0
  for(i in 1:n)
    sum=sum+x[i]
  max=min=x[1]               # start both from the first element
  for(i in 1:n)
  {
    if(x[i]<min)
      min=x[i]
    if(x[i]>max)
      max=x[i]
  }
  cat(" sum of vector elements=",sum,"\n","Minimum element of vector is:",min,"\n","Maximum element of vector is:",max,"\n")
}

Output:
[1] "Enter the elements of vector:"
1: -5
2: -4
3: 0
4: 1
5: 5
6:
Read 5 items
sum of vector elements= -3
Minimum element of vector is: -5
Maximum element of vector is: 5

WEEK#5

Experiment #5-A:
Aim: R program to find Factorial of a number using recursive function

Code:
fact=function()
{
  n=as.numeric(readline("Enter n value:"))
  f=fact1(n)
  if(n>=0)
    cat("Factorial of ",n," is ",f,"\n")
}
fact1=function(n)
{
  if(n>=0)
  {
    if(n==0)
      return(1) else          # base case: 0! = 1
      return(n*fact1(n-1))    # recursive case: n! = n*(n-1)!
  } else
    print("Factorial of negative number is not possible to compute")
}

Output:
> fact()
Enter n value:-1
[1] "Factorial of negative number is not possible to compute"
> fact()
Enter n value:0
Factorial of 0 is 1
> fact()
Enter n value:8
Factorial of 8 is 40320
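
The recursive definition can be checked against the built-in factorial() function. The sketch below uses a non-interactive helper, fact_rec, which is an illustrative name and not part of the manual; it assumes a non-negative whole number as input.

```r
# n! = n * (n-1)!, with 0! = 1 as the base case
fact_rec <- function(n) {
  if (n == 0) return(1)
  n * fact_rec(n - 1)
}

values <- sapply(0:8, fact_rec)
print(values)
print(factorial(0:8))   # built-in function gives the same values
```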

Experiment #5-B:
Aim: R program to find GCD of two numbers

Code:
gcd=function()
{
  x=as.numeric(readline("Enter x value:"))
  y=as.numeric(readline("Enter y value:"))
  g=gcd1(x,y)
  cat("GCD of ",x," and ",y," is ",g,"\n")
}
gcd1=function(x,y)
{
  # Euclid's algorithm: gcd(x,y) = gcd(y, x mod y), and gcd(x,0) = x
  if(y!=0)
    return(gcd1(y,x%%y))
  else
    return(x)
}

Output:
> gcd()
Enter x value:5
Enter y value:7
GCD of 5 and 7 is 1
> gcd()
Enter x value:125
Enter y value:35
GCD of 125 and 35 is 5

WEEK#6

Experiment #6-A:
Aim: R program to find the mean, variance, and standard deviation of the given discrete
probability distribution.

Code:
library(distrEx)   # provides E(), var(), sd() for distributions; also loads 'distr'
discrete=function()
{
  print("Enter the values of x")
  x=scan()
  print("Enter the values of p")
  p=scan()
  y=DiscreteDistribution(supp=x,prob=p)
  cat("Mean of the probability distribution is ",E(y))
  cat("\nVariance of the probability distribution is ",var(y))
  cat("\nStandard Deviation of the probability distribution is ",sd(y))
  cat("\n The Distribution function is \n","x ",x,sep="\t","\n","F(x) ",cumsum(p))
}

Output:
> discrete()
[1] "Enter the values of x"
1: 0
2: 1
3: 2
4:
Read 3 items
[1] "Enter the values of p"
1: 0.3
2: 0.5
3: 0.2
4:
Read 3 items
Mean of the probability distribution is 0.9
Variance of the probability distribution is 0.49
Standard Deviation of the probability distribution is 0.7
The Distribution function is
x 0 1 2
F(x) 0.3 0.8 1
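
The same quantities follow directly from the definitions E(X) = sum(x*p) and Var(X) = sum((x - E(X))^2 * p), so they can be checked in base R without any extra package. A minimal sketch with the values of the sample run; mu, sigma2, sigma and cdf are illustrative names.

```r
x <- c(0, 1, 2)          # support of the distribution
p <- c(0.3, 0.5, 0.2)    # probabilities, which must add up to 1
stopifnot(isTRUE(all.equal(sum(p), 1)))

mu <- sum(x * p)                  # mean
sigma2 <- sum((x - mu)^2 * p)     # variance
sigma <- sqrt(sigma2)             # standard deviation
cdf <- cumsum(p)                  # distribution function F(x)

print(c(mean = mu, variance = sigma2, sd = sigma))
print(cdf)
```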

Experiment #6-B:
Aim: R program to find the mean, variance, and standard deviation of the given continuous
probability distribution
# for the given probability density function f(x)=3*x^2,0<x<1

Code:
library(distrEx)   # provides E(), var(), sd(); also loads 'distr' for AbscontDistribution()
contin=function()
{
  f=function(x) 3*x^2
  p=integrate(f,lower=0.14,upper=0.71)
  print("The probability of x lies between 0.14 to 0.71 is ")
  print(p)
  x=AbscontDistribution(d=f,low1=0,up1=1)
  cat("Mean of the probability distribution is ",E(x))
  cat("\nVariance of the probability distribution is ",var(x))
  cat("\nStandard Deviation of the probability distribution is ",sd(x))
}

Output:
> contin()
[1] "The probability of x lies between 0.14 to 0.71 is "
0.355167 with absolute error < 3.9e-15
Mean of the probability distribution is 0.7496337
Variance of the probability distribution is 0.03768305
Standard Deviation of the probability distribution is 0.1941212
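
For this density the moments can also be obtained directly with integrate() in base R, since E(X) is the integral of x*f(x) and E(X^2) the integral of x^2*f(x) over (0,1). The sketch below is one possible cross-check; total, m1, m2 and variance are illustrative names. The exact values are 3/4 for the mean and 3/80 = 0.0375 for the variance, so the figures printed above are numerical approximations.

```r
f <- function(x) 3 * x^2                       # density on 0 < x < 1

total <- integrate(f, 0, 1)$value              # should be 1 for a valid density
m1 <- integrate(function(x) x * f(x), 0, 1)$value      # E(X)
m2 <- integrate(function(x) x^2 * f(x), 0, 1)$value    # E(X^2)
variance <- m2 - m1^2                          # Var(X) = E(X^2) - E(X)^2

print(c(total = total, mean = m1, variance = variance, sd = sqrt(variance)))
```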

WEEK#7
Experiment #7:
Aim: R Program to print data in different graph formats

Code:
#Scatter plot
plot(iris$Sepal.Length,iris$Sepal.Width,type="p")
#Histograms, two per page
par(mfrow=c(1,2))
hist(iris$Sepal.Length,main="First")
hist(iris$Sepal.Width,main="Second")
par(mfrow=c(1,2))
hist(iris$Petal.Length,main="Third")
hist(iris$Petal.Width,main="Fourth")
par(mfrow=c(1,1))   # back to one plot per page
#Pie-chart
pie(table(iris$Species))
#Box-plot
boxplot(iris$Sepal.Length,iris$Sepal.Width,iris$Petal.Length,iris$Petal.Width)
#Bar-plot
barplot(head(iris$Sepal.Length),xlab="Sepal length")

Output:
[Plots: scatter plot, histograms, pie chart, box plot, and bar plot of the iris data]

WEEK#8
Experiment #8-A:
Aim: R program to fit Binomial distribution to the given data

Code:
fit_binom=function()
{
  n=as.integer(readline("enter the no. of coins tossed: "))
  x=0:n
  print("enter the values of f ")
  f=scan()
  ex_freq=0
  cat("\n The given Distribution is \n x",x,sep="\t","\n f",f,"\n")
  N=sum(f)
  if(length(x)==length(f))
  {
    a=as.logical(readline(prompt="Is the coin unbiased ? :Enter T for TRUE or F for FALSE \n"))
    if(a==T)
      p=0.5
    else
    {
      meen=sum(x*f)/N      # sample mean; for a binomial, mean = n*p
      p=meen/n
    }
    for(i in 1:(n+1))
      ex_freq[i]=N*(dbinom(i-1,n,p))
    cat("The expected frequencies are \n",ex_freq)
    cat("\n The fitted Binomial Distribution is \n x",x,sep="\t","\n f",round(ex_freq),"\n")
  }else
    print("No. of observations in x and f must be equal")
}

Output:
> fit_binom()
enter the no. of coins tossed: 3
[1] "enter the values of f "
1: 2
2: 4
3: 5
4: 6
5:
Read 4 items

The given Distribution is


x 0 1 2 3
f 2 4 5 6
Is the coin unbiased ? :Enter T for TRUE or F for FALSE
T
The expected frequencies are
2.125 6.375 6.375 2.125
The fitted Binomial Distribution is
x 0 1 2 3
f 2 6 6 2
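
The sample run covers only the unbiased case. The sketch below computes both cases for the same frequencies using vectorized dbinom(): p = 0.5 for an unbiased coin, and p estimated from the data as mean/n otherwise. The names p_hat, expected_fair and expected_est are illustrative.

```r
n <- 3                    # number of coins tossed
x <- 0:n                  # possible number of heads
f <- c(2, 4, 5, 6)        # observed frequencies from the sample run
N <- sum(f)

# Unbiased coin: p is known
expected_fair <- N * dbinom(x, n, 0.5)

# Biased coin: estimate p from the sample mean, since mean = n*p
p_hat <- sum(x * f) / (N * n)
expected_est <- N * dbinom(x, n, p_hat)

print(expected_fair)
print(round(expected_est, 3))
```

In both cases the expected frequencies add up to N, because the binomial probabilities over 0..n add up to 1.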

Experiment #8-B:
Aim: R program to fit Poisson distribution to the given data

Code:
fit_poisson=function()
{
  print("enter the values of x:")
  x=scan()
  print("enter the values of f:")
  f=scan()
  cat("\n The Given Distribution is \nx:",x,sep="\t","\nf:",f,"\n")
  N=sum(f)
  meen=sum(x*f)/N            # sample mean estimates the Poisson parameter
  ex_freq=N*dpois(x,meen)    # expected frequency at each entered value of x
  cat("The expected frequencies are \n",ex_freq)
  cat("\n The fitted Poisson Distribution is \nx:",x,sep="\t","\nf:",round(ex_freq),"\n")
}

Output:
[1] "enter the values of x:"
1: 0
2: 1
3: 2
4: 3
5: 4
6:
Read 5 items
[1] "enter the values of f:"
1: 10
2: 9
3: 8
4: 7
5: 6
6:
Read 5 items

The Given Distribution is


x: 0 1 2 3 4
f: 10 9 8 7 6
The expected frequencies are
6.950958 12.16418 10.64365 6.208798 2.716349
The fitted Poisson Distribution is
x: 0 1 2 3 4
f: 7 12 11 6 3
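
The fitted frequencies above can be reproduced non-interactively. A minimal sketch with the data of the sample run; lambda and expected are illustrative names.

```r
x <- 0:4
f <- c(10, 9, 8, 7, 6)     # observed frequencies
N <- sum(f)

lambda <- sum(x * f) / N           # sample mean, used as the Poisson parameter
expected <- N * dpois(x, lambda)   # expected frequencies

print(lambda)
print(expected)
print(round(expected))
```

Unlike the binomial case, the expected frequencies do not add up exactly to N here, because a Poisson variable can also take values beyond the largest x in the table.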

WEEK#9
Experiment #9-A:
Z-test

A manufacturer claims that the mean lifetime of a light bulb is more than 10,000 hrs. In
a sample of 30 light bulbs, it was found that they only last 9,900 hrs on average. Assume
that the population standard deviation is 120 hrs. At 0.05 significance level, can we reject
the claim by the manufacturer?
Aim: To test the claim
H0: mu=10000
H1: mu>10000
Alpha=0.05=5%
Critical value from the z-table is 1.645

Code:
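xbar=9900
mu=10000
n=30
sigma=120
z=(xbar-mu)/(sigma/sqrt(n))   # test statistic
z

Output:
> z
[1] -4.564355

Conclusion: Since z = -4.56 < 1.645, we do not reject the null hypothesis H0. The sample
gives no evidence that the mean lifetime exceeds 10,000 hrs, so the manufacturer's claim
is not supported.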
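
The critical value and the p-value need not be read from a table; qnorm() and pnorm() give them directly. The sketch below follows the right-tailed setup stated in the Aim (H1: mu > 10000); mu0, z_crit and p_right are illustrative names.

```r
xbar <- 9900; mu0 <- 10000; n <- 30; sigma <- 120
alpha <- 0.05

z <- (xbar - mu0) / (sigma / sqrt(n))     # test statistic
z_crit <- qnorm(1 - alpha)                # right-tail critical value
p_right <- pnorm(z, lower.tail = FALSE)   # P(Z > z) when H0 is true

print(c(z = z, critical = z_crit, p_value = p_right))
# H0 is rejected only when z > z_crit, i.e. when p_right < alpha
print(z > z_crit)
```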

Experiment #9-B:
Suppose the mean weight of king penguins found in an Antarctic colony last year was
15.4 kg. In a sample of 35 penguins taken at the same time this year in the same colony,
the mean penguin weight is 14.5 kg. Assume that the population standard deviation is
2.5 kg. At 0.05 significance level, can we reject the null hypothesis that the mean penguin
weight does not differ from last year?

Aim: To test the claim at given level of significance using z-test


H0:mu=15.4
H1:mu≠15.4
Alpha=0.05=5%
Critical values from the z-table are ±1.96

Code:
xbar=14.5
mu=15.4
n=35
sigma=2.5
z=(xbar-mu)/(sigma/sqrt(n))   # test statistic
z

Output:
[1] -2.129789

Conclusion: Since |z| = 2.13 > 1.96, we reject the null hypothesis H0
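
For a two-tailed test the p-value is twice the tail area beyond |z|. A minimal sketch using the values from the code above (xbar = 14.5); mu0, z_crit and p_two are illustrative names.

```r
xbar <- 14.5; mu0 <- 15.4; n <- 35; sigma <- 2.5
alpha <- 0.05

z <- (xbar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit <- qnorm(1 - alpha / 2)          # two-tailed critical value
p_two <- 2 * pnorm(-abs(z))             # two-tailed p-value

print(c(z = z, critical = z_crit, p_value = p_two))
# H0 is rejected when |z| > z_crit, i.e. when p_two < alpha
print(abs(z) > z_crit)
```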

Experiment #9-C:
t-test

Consider the immer data set from the MASS package; its first six rows are

  Loc Var    Y1    Y2
1  UF   M  81.0  80.7
2  UF   S 105.4  82.3
3  UF   V 119.7  80.4
4  UF   T 109.7  87.2
5  UF   P  98.3  84.2
6   W   M 146.6 100.4

Assume that the data follow the normal distribution; find the 95% confidence
interval estimate of the difference between the mean barley yields of years 1931
and 1932.
Aim: To test whether there is any significant difference between the mean barley yields
of years 1931 and 1932
H0 : μd = 0 (μd is the mean of the paired differences Y1 - Y2)
H1 : μd ≠ 0
Alpha=0.05=5%
Critical value from the t-table is 2.045 (29 degrees of freedom)
Note: To get the critical value of t from the R console, type the following command
qt(1-(alpha/2),df=n-1)

Code:
# the output below is based on all 30 rows of the immer data set
library(MASS)
t.test(immer$Y1,immer$Y2,paired=TRUE)

Output:
Paired t-test
data: immer$Y1 and immer$Y2
t = 3.324, df = 29, p-value = 0.002413
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
6.121954 25.704713
sample estimates:
mean of the differences
15.91333

Experiment #9-D:
Five measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4,
14.3, and 14.6 milligrams per cigarette. Show that the difference between the mean of this
sample and the average tar claimed by the manufacturer, μ=14.0 mg/cigarette, is
significant at α=0.05.

Aim: To test the claim using t-test

H0: μ = 14.0
H1: μ ≠ 14.0
Level of significance: Appropriate level of significance is 5% (given)
Inference: The tabulated value of t at 5% level of significance for 4 degrees of freedom
in a two tailed test is 2.776 [t(α/2, n-1) = t(0.05/2, 5-1) = t(0.025, 4) = 2.776]
Here, tcal = 5.6569 > t(α/2, n-1) = 2.776. So, we reject H0. Hence we conclude that μ ≠ 14.0

Code:
data<-c(14.5, 14.2, 14.4, 14.3, 14.6)
t.test(data,mu=14.0)
Output:
One Sample t-test
data: data
t = 5.6569, df = 4, p-value = 0.004813
alternative hypothesis: true mean is not equal to 14
95 percent confidence interval:
14.20368 14.59632
sample estimates:
mean of x
14.4
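
The statistic reported by t.test() is t = (mean - μ0)/(s/sqrt(n)), and it can be computed by hand to see where the numbers come from. A minimal sketch; tar, mu0, t_cal, t_crit and p_val are illustrative names.

```r
tar <- c(14.5, 14.2, 14.4, 14.3, 14.6)
mu0 <- 14.0
n <- length(tar)

t_cal <- (mean(tar) - mu0) / (sd(tar) / sqrt(n))   # calculated t
t_crit <- qt(1 - 0.05 / 2, df = n - 1)             # tabulated t for 4 d.f.
p_val <- 2 * pt(-abs(t_cal), df = n - 1)           # two-tailed p-value

print(c(t = t_cal, critical = t_crit, p_value = p_val))
```

The hand-computed t agrees with the value in the t.test() output above.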

Experiment #9-E:
The heights of 6 randomly chosen sailors are 63,65,68,69,71,72 inches. Those of 10
randomly chosen soldiers are 61,62,65,66,69,69,70,71,72,73 inches. Discuss whether
this data gives a suggestion that the sailors are taller than soldiers.
Aim: To test the claim that sailors are taller than soldiers
H0: μx = μy
H1: μx > μy
Level of significance: Appropriate level of significance is 5% (chosen)
The tabulated value of t at 5% level of significance for 14 degrees of freedom in a right
tailed test is 1.761. [t(α, n1+n2-2) = t(0.05, 14) = 1.761]
Code:
sailors<-c(63,65,68,69,71,72)
soldiers<-c(61,62,65,66,69,69,70,71,72,73)
# t.test() runs the Welch test by default, so the output reports df = 12.228;
# add var.equal=TRUE for the pooled test with n1+n2-2 = 14 degrees of freedom
t.test(sailors,soldiers, alternative = "greater", conf.level = 0.95)

Output:
Welch Two Sample t-test
data: sailors and soldiers
t = 0.10388, df = 12.228, p-value = 0.4595
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-3.226071 Inf
sample estimates:
mean of x mean of y
68.0 67.8

Experiment #9-F:
Random samples from two normal populations are given below

Sample 1 16 26 27 23 24 22
Sample 2 33 42 35 32 28 31
Do the population variances differ significantly?

Aim: To check whether the population variances differ significantly


H0: σx² = σy²
H1: σx² ≠ σy²
Level of significance: Appropriate level of significance is 5% (chosen)
The table value of F at 5% L.O.S for (5,5) d.f is 5.05.

Code:
data1<-c(16,26,27,23,24,22)
data2<-c(33,42,35,32,28,31)
F<-var.test(data1,data2)
F

Output:
F test to compare two variances
data: data1 and data2
F = 0.6696, num df = 5, denom df = 5, p-value = 0.6706
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.09369826 4.78524246
sample estimates:
ratio of variances
0.6696035
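
The F statistic is simply the ratio of the two sample variances, so it can be verified by hand. A minimal sketch; s1, s2, F_cal, F_table and p_val are illustrative names.

```r
s1 <- c(16, 26, 27, 23, 24, 22)
s2 <- c(33, 42, 35, 32, 28, 31)

F_cal <- var(s1) / var(s2)              # ratio of sample variances
df1 <- length(s1) - 1
df2 <- length(s2) - 1
F_table <- qf(0.95, df1, df2)           # table value of F at 5% for (5,5) d.f.

# two-sided p-value: twice the smaller tail area
tail_low <- pf(F_cal, df1, df2)
p_val <- 2 * min(tail_low, 1 - tail_low)

print(c(F = F_cal, table_value = F_table, p_value = p_val))
```

Since the p-value is well above 0.05, these data give no evidence that the population variances differ.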

Experiment #9-G:
In a large manufacturing factory, a survey was conducted regarding three types of bonus
schemes. The employees were divided into four categories, namely laborers, clerks,
technicians and executives. The results of the opinion survey are presented
in the form of a contingency table as given below. Test the goodness of fit at 5% level of
significance.

                      BONUS SCHEMES
EMPLOYEES CATEGORY    Type 1   Type 2   Type 3
Labour                   190      243      197
Clerks                    82       44       44
Technicians               23       78       34
Executives                 5       12        8

Aim: To test goodness of fit at given level of significance.


H0: Factors in the contingency table are independent.
H1: Factors in the contingency table are dependent.
level of significance is 5%(chosen)

                      BONUS SCHEMES
EMPLOYEES CATEGORY    Type 1   Type 2   Type 3   TOTAL
Labour                   190      243      197     630
  Expected Count       196.9    247.4    185.7
Clerks                    82       44       44     170
  Expected Count        53.1     66.8     50.1
Technicians               23       78       34     135
  Expected Count        42.2     53.0     39.8
Executives                 5       12        8      25
  Expected Count         7.8      9.8      7.4
Total                    300      377      283     960

Code:
M<-as.table(rbind(c(190,243,197),c(82,44,44),c(23,78,34),c(5,12,8)))
dimnames(M)<-list(empcategory=c("labour","clerks","technicians","executives"),
bonuschemes=c("type 1","type 2","type 3"))
xsq<-chisq.test(M)
xsq

Output:
Pearson's Chi-squared test
data: M
X-squared = 48.101, df = 6, p-value = 1.128e-08

Conclusion:
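The calculated value of χ² = 48.101
The table value of χ²(0.05, 6) = 12.59
Since the calculated value exceeds the table value, we reject H0 and conclude that the
factors in the contingency table are dependent.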
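
The expected counts in the table and the statistic reported by chisq.test() can be reproduced from the definition: expected count = (row total × column total)/grand total, and χ² = sum((O - E)²/E). A minimal sketch; expected, chi_sq, dof and crit are illustrative names.

```r
M <- rbind(c(190, 243, 197),
           c( 82,  44,  44),
           c( 23,  78,  34),
           c(  5,  12,   8))

# expected count for each cell
expected <- outer(rowSums(M), colSums(M)) / sum(M)
print(round(expected, 1))

chi_sq <- sum((M - expected)^2 / expected)   # calculated chi-square
dof <- (nrow(M) - 1) * (ncol(M) - 1)         # (4-1)*(3-1) = 6
crit <- qchisq(0.95, dof)                    # table value at 5%

print(c(chi_square = chi_sq, df = dof, table_value = crit))
```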

WEEK#10
Experiment #10-A:
Find the Karl Pearson’s correlation coefficient for the given data
X 16 21 26 23 28 24 17 22 21
Y 33 38 50 39 52 47 35 43 41
Aim: To find the correlation coefficient for the given data

Code:
x<-c(16,21,26,23,28,24,17,22,21)
y<-c(33,38,50,39,52,47,35,43,41)
cor(x,y)

Output:
[1] 0.9471715
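
Karl Pearson's coefficient is r = sum((x - mean(x))(y - mean(y))) / sqrt(sum((x - mean(x))²) · sum((y - mean(y))²)). The sketch below evaluates this formula step by step so that the result of cor() can be verified; dx, dy and r_manual are illustrative names.

```r
x <- c(16, 21, 26, 23, 28, 24, 17, 22, 21)
y <- c(33, 38, 50, 39, 52, 47, 35, 43, 41)

dx <- x - mean(x)     # deviations from the mean of x
dy <- y - mean(y)     # deviations from the mean of y

r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
print(r_manual)
print(cor(x, y))      # built-in function gives the same value
```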

Experiment #10-B:
Find the Karl Pearson’s correlation coefficient for the following data on
heights(inches) of fathers (x) and their sons(y)

X 65 66 67 67 68 69 70 72
Y 67 68 65 68 72 72 69 71

Aim:To find the correlation coefficient for the given data

Code:
x<-c(65,66,67,67,68,69,70,72)
y<-c(67,68,65,68,72,72,69,71)
cor(x,y)

Output:
[1] 0.6030227

Experiment #10-C:
Fit a linear regression of y on x for the following data
X 1 2 3 4 5 6 7 8 9

y 11 12 13 14 15 16 17 18 19

Aim:To fit a linear regression equation of y on x

Code:
x=c(1:9)
y=c(11:19)
lm(y~x)
summary(lm(y~x))

Output:
Call:
lm(formula = y ~ x)

Coefficients:
(Intercept) x
10 1

> summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
Min 1Q Median 3Q Max
-9.006e-16 -2.472e-16 -2.031e-16 -1.370e-16 1.724e-15

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.000e+01 5.784e-16 1.729e+16 <2e-16 ***
x 1.000e+00 1.028e-16 9.729e+15 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.962e-16 on 7 degrees of freedom


Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 9.465e+31 on 1 and 7 DF, p-value: < 2.2e-16
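
The fitted line can be read off the coefficients and used for prediction. A minimal sketch with the same data; fit, b and new_y are illustrative names, and x = 10 is an arbitrary new value chosen for the example.

```r
x <- 1:9
y <- 11:19

fit <- lm(y ~ x)
b <- coef(fit)          # b[1] is the intercept, b[2] is the slope
cat("Regression line of y on x: y =", round(b[1], 4), "+", round(b[2], 4), "* x\n")

# predicted y for a new value of x
new_y <- predict(fit, newdata = data.frame(x = 10))
print(new_y)
```

Because the data lie exactly on the line y = 10 + x, the residuals in the summary above are zero up to rounding error.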

Experiment #10-D:
Fit a multiple linear regression using the iris data by considering the response variable as
Sepal.Length

Code:
lm(iris$Sepal.Length~(iris$Sepal.Width+iris$Petal.Length+iris$Petal.Width))

Output:
Call:
lm(formula = iris$Sepal.Length ~ (iris$Sepal.Width + iris$Petal.Length +
iris$Petal.Width))

Coefficients:
(Intercept) iris$Sepal.Width iris$Petal.Length
1.8560 0.6508 0.7091
iris$Petal.Width
-0.5565
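
The same model can be written with the data argument of lm(), which keeps the formula and the coefficient names shorter. This is an alternative sketch, not the form used in the manual; fit and fit_dollar are illustrative names.

```r
# same model, with the column names looked up in the iris data frame
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
print(round(coef(fit), 4))

# the form used in the manual gives identical estimates
fit_dollar <- lm(iris$Sepal.Length ~ (iris$Sepal.Width + iris$Petal.Length + iris$Petal.Width))
print(all.equal(unname(coef(fit)), unname(coef(fit_dollar))))
```

Passing data = iris also makes predict() usable with new data frames that have the same column names.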
