R Programming Tutorial

The document discusses R programming including basic operations, vectors, matrices, functions, and graphics. R can be used for statistical analysis, data exploration and modelling. Key functions and operations in R like assigning objects, arithmetic operations, building vectors and matrices, applying functions, plotting and writing custom functions are explained through examples.

Uploaded by

gunthejagan
R programming Tutorial

R is a software language for carrying out complicated (and simple) statistical analyses. It includes
routines for data summary and exploration, graphical presentation and data modelling. The aim of this
document is to provide you with a basic fluency in the language. It is suggested that you work through this document at the computer, having
started an R session. Type in all of the commands that are printed, and check that you understand how
they operate. Then try the simple exercises at the end of each section.

Basic operations
> 4+6
The result should be
[1] 10

 We can assign objects values for subsequent use. For example:


x<-6
y<-4
z<-x+y

 At any time we can list the objects which we have created:


> ls()
[1] "x" "y" "z"

 A function operates on an object, for example:


> sqrt(16)
[1] 4

 Objects can be removed from the current workspace with the rm function:
> rm(x,y)

 R has a built-in help facility: type help(functionname) to see the documentation for a function.

 Vectors can be created in R in a number of ways. The function c (short for concatenate) joins its arguments into a vector:

> z<-c(5,9,1,0)

 Sequences can be generated as follows:


> x<-1:10

 more general sequences can be generated using the seq command. For example:
> seq(1,9,by=2)
[1] 1 3 5 7 9

 The example below generates an evenly spaced sequence of a given length (here 6 values from 8 to 20):


> seq(8,20,length=6)
[1] 8.0 10.4 12.8 15.2 17.6 20.0
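
The call seq(8,20,length=6) relies on R matching the abbreviated argument name. A minimal sketch with the full argument name length.out, which also checks that the values are evenly spaced (the variable names s and step are chosen here for illustration):

```r
# Six evenly spaced values from 8 to 20; length.out is the full argument name
s <- seq(8, 20, length.out = 6)
print(s)

# diff() returns the gaps between consecutive values; here they are all equal
step <- diff(s)
print(step)
```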

 Another useful function for building vectors is the rep command for repeating things. For
example:
> rep(1:3,6)
[1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
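
The rep command can repeat the whole vector or each element in turn. A short sketch of the difference, using the times and each arguments (the variable names are illustrative only):

```r
# Repeat the whole vector twice
r_times <- rep(1:3, times = 2)
print(r_times)   # 1 2 3 1 2 3

# Repeat each element twice before moving on to the next
r_each <- rep(1:3, each = 2)
print(r_each)    # 1 1 2 2 3 3
```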
 R uses componentwise arithmetic on vectors.
> x<-c(6,8,9)
> x+2
[1] 8 10 11

> x<-c(6,8,9)
> y<-c(1,2,4)
> x+y
[1] 7 10 13

 length() calculates the length of a vector and sum() the sum of its elements. Function names are case-sensitive, so both are written in lower case.
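
As a small illustration of these two functions, the sketch below recomputes the mean of a vector by hand. The variable names n, total and avg are chosen here for illustration:

```r
x <- c(6, 8, 9)

n <- length(x)     # number of elements: 3
total <- sum(x)    # 6 + 8 + 9 = 23
avg <- total / n   # the same value that mean(x) returns

print(c(n, total, avg))
```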

Summaries and Subscripting


 Some simple summary statistics of these data can be produced:
> x<-c(7.5,8.2,3.1,5.6,8.2,9.3,6.5,7.0,9.3,1.2,14.5,6.2)

> mean(x)

[1] 7.216667
> var(x)
[1] 11.00879
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.200   6.050   7.250   7.217   8.475  14.500

 Summaries could be generated for sub vectors as well.


> summary(x[1:6])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  3.100   6.075   7.850   6.983   8.200   9.300
> summary(x[7:12])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.200   6.275   6.750   7.450   8.725  14.500
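
Besides ranges such as 1:6, the [ ] notation also accepts logical conditions, negative indices and arbitrary index vectors. A minimal sketch on the same data (the variable names large, without_first and picked are illustrative):

```r
x <- c(7.5, 8.2, 3.1, 5.6, 8.2, 9.3, 6.5, 7.0, 9.3, 1.2, 14.5, 6.2)

# Logical subscript: keep only the values greater than 8
large <- x[x > 8]
print(large)

# Negative subscript: drop the first element
without_first <- x[-1]
print(length(without_first))

# Index vector: pick the first and the last of the 12 elements
picked <- x[c(1, 12)]
print(picked)
```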
Matrices
Matrices can be created in R in a variety of ways. Perhaps the simplest is to create the columns and
then glue them together with the command cbind. For example,
> x<-c(5,7,9)
> y<-c(6,3,4)
> z<-cbind(x,y)
> z
     x y
[1,] 5 6
[2,] 7 3
[3,] 9 4

 The dimension of a matrix can be checked with the dim command:


> dim(z)
[1] 3 2

 There is a similar command, rbind, for building matrices by gluing rows together.

> rbind(z,z)
     x y
[1,] 5 6
[2,] 7 3
[3,] 9 4
[4,] 5 6
[5,] 7 3
[6,] 9 4

 Matrices can also be built by explicit construction via the function matrix.
> z<-matrix(c(5,7,9,6,3,4),nrow=3)

 Equivalently, we could have specified the number of columns with the argument ncol=2. Note that the matrix is filled
up column-wise. If instead you wish to fill up row-wise, add the option byrow=TRUE.

> z<-matrix(c(5,7,9,6,3,4),nrow=3,byrow=TRUE)
> z
[,1] [,2]
[1,] 5 7
[2,] 9 6
[3,] 3 4
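
To see the two filling orders side by side, the sketch below builds the same six numbers into a matrix both ways. The names by_col and by_row are chosen here for illustration:

```r
# Default: values run down the first column, then the second
by_col <- matrix(1:6, nrow = 3)
print(by_col)

# byrow = TRUE: values run along the first row, then the second, and so on
by_row <- matrix(1:6, nrow = 3, byrow = TRUE)
print(by_row)

# Filling by row is the transpose of filling a 2-row matrix by column
same <- identical(by_row, t(matrix(1:6, nrow = 2)))
print(same)
```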

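 Matrix multiplication is expressed using the notation %*%. The definitions of x and y used here are not given in the text; the matrices below are the ones consistent with the printed product and with the inverse shown further down (they replace the vectors x and y defined earlier):

> x<-matrix(c(3,4,-2,6),nrow=2,byrow=TRUE)
> y<-matrix(c(1,3,0,9,5,-1),nrow=3,byrow=TRUE)
> y%*%x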
[,1] [,2]
[1,] -3 22
[2,] -18 54
[3,] 17 14

 Other useful functions on matrices are t to calculate a matrix transpose and solve to
calculate inverses:
> t(z)
[,1] [,2] [,3]
[1,] 5 9 3
[2,] 7 6 4
and
> solve(x)
[,1] [,2]
[1,] 0.23076923 -0.1538462
[2,] 0.07692308 0.1153846
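
A quick way to check an inverse is to multiply it back. In the sketch below, x is assumed to be the 2 by 2 matrix with rows (3, 4) and (-2, 6), which reproduces the inverse printed above; the vector v and the names x_inv, identity_check and b are illustrative only:

```r
# Assumption: this x reproduces the inverse printed above
x <- matrix(c(3, 4, -2, 6), nrow = 2, byrow = TRUE)

x_inv <- solve(x)

# A matrix times its inverse should be the identity (up to rounding error)
identity_check <- x %*% x_inv
print(round(identity_check, 10))

# With a second argument, solve() solves the linear system x %*% b = v
v <- c(1, 2)
b <- solve(x, v)
print(x %*% b)
```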

 As before, the [ ] notation is used to subscript, here to extract sub-components of a matrix:

> z[1,1]
[1] 5

> z[,2]
[1] 7 6 4

> z[1:2,]
[,1] [,2]
[1,] 5 7
[2,] 9 6

Attaching to objects
 R includes a number of datasets that it is convenient to use for examples. You can get a
description of what's available by typing

> data()

 To access any of these datasets, you then type data(dataset) where dataset is the name of
the dataset you wish to access. For example,

> data(trees)

 To work easily with the columns of a dataset, we can attach it. R then puts the column names on its search path,
so we can refer to them directly. For example:

> trees[1:2,]
Girth Height Volume
1 8.3 70 10.3
2 8.6 65 10.3

> attach(trees)

> mean(Height)

[1] 76

> mean(trees[,2])
[1] 76

 Alternatively, we can use the $ syntax to access a column without attaching the dataset:

> trees$Height
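
A minimal sketch of working with a column without attaching the dataset, first with $ and then with the base function with(), which evaluates an expression inside the data frame. The variable names are illustrative:

```r
data(trees)

# Column access with $, no attach() needed
mean_height <- mean(trees$Height)
print(mean_height)

# with() gives the same result and leaves the search path untouched
mean_height2 <- with(trees, mean(Height))
print(mean_height2)
```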
 A common situation is where we want to apply the same function to every row or column of
a matrix.

 Find the mean of every column in the data set (dimension 2)


> apply(trees,2,mean)
Girth Height Volume
13.24839 76.00000 30.17097

 Find the mean of every row in the data set (dimension 1)


> apply(trees, 1,mean)
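
For means in particular, base R also provides colMeans and rowMeans. The sketch below compares them with the apply calls above (the names col_means and row_means are illustrative):

```r
data(trees)

col_means <- apply(trees, 2, mean)   # one value per column
row_means <- apply(trees, 1, mean)   # one value per row (31 rows)

# The dedicated helpers return the same numbers
cols_agree <- isTRUE(all.equal(unname(col_means), unname(colMeans(trees))))
rows_agree <- isTRUE(all.equal(unname(row_means), unname(rowMeans(trees))))
print(c(cols_agree, rows_agree))
```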

Statistical Computation and Simulation


 Many of the tedious statistical computations that once had to be done
from statistical tables can be easily carried out in R. For the normal distribution, the relevant functions are dnorm
(density function), pnorm (distribution function) and qnorm (quantile function).
 N(3, 2^2) denotes the normal distribution with mean 3 and variance σ² = 4, i.e. standard deviation σ = 2. These functions take the standard deviation, not the variance, as their third argument:

> x<-seq(-5,10,by=.1)
> dnorm(x,3,2)
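
The distribution and quantile functions are used in the same way. A short sketch for the same normal distribution with mean 3 and standard deviation 2 (the variable names are chosen here for illustration):

```r
# Probability of a value below the mean: exactly one half
p_below_mean <- pnorm(3, mean = 3, sd = 2)

# qnorm is the inverse of pnorm: the 50% quantile is the mean
q_median <- qnorm(0.5, mean = 3, sd = 2)

# Probability of falling within one standard deviation of the mean
p_within_1sd <- pnorm(5, 3, 2) - pnorm(1, 3, 2)

# Height of the density at its peak, 1 / (sd * sqrt(2 * pi))
d_peak <- dnorm(3, 3, 2)

print(c(p_below_mean, q_median, p_within_1sd, d_peak))
```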

 Similarly, dt, pt and qt are available for the t-distribution, though in this case it is necessary to give the degrees of
freedom rather than the mean and standard deviation.
 Other distributions available include the binomial, exponential, Poisson and gamma, though
care is needed interpreting the functions for discrete variables.
 R enables simulation from a wide range of distributions, using a syntax similar to the above. For example, to simulate 100 observations from the N(3, 4)
distribution (mean 3, standard deviation 2) we write

> rnorm(100,3,2)

 Similarly, rt and rpois simulate from the t and Poisson distributions, etc.
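
Simulated values change on every run unless the random number generator is seeded. The sketch below fixes a seed (the value 1 is arbitrary) and checks that the sample mean and standard deviation are close to the values requested; the sample size 1000 and the Poisson mean 4 are illustrative choices:

```r
set.seed(1)   # arbitrary seed, only to make the run reproducible

# 1000 draws from the normal distribution with mean 3 and standard deviation 2
sims <- rnorm(1000, mean = 3, sd = 2)
print(mean(sims))   # should be close to 3
print(sd(sims))     # should be close to 2

# Five draws from a Poisson distribution with mean 4: non-negative whole numbers
counts <- rpois(5, lambda = 4)
print(counts)
```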

Graphics
R has many facilities for producing high quality graphics. A useful facility is to divide a page into
smaller pieces so that more than one figure can be displayed. For example:

> par(mfrow=c(2,2))

creates a window of graphics with 2 rows and 2 columns. With this choice the windows are filled up
row-wise. Use mfcol instead of mfrow to fill up column-wise. The function par is a general function
for setting graphical parameters. There are many options: see help(par).

> par(mfrow=c(2,2))
> hist(Height)
> boxplot(Height)
> hist(Volume)
> boxplot(Volume)
> par(mfrow=c(1,1))

 We can also plot one variable against another using the function plot:
> plot(Height,Volume)
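 To join the points with lines, use type='l'. In the example below, temp is assumed to be a vector of 60 yearly temperature values for 1912 to 1971; it is not defined in this document (R's built-in nhtemp series covers these years):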

> plot(1912:1971,temp,type='l')

 To get points and lines, use type='b' instead.

> plot(1912:1971,temp,type='b')

 R can also produce a scatterplot matrix (a matrix of scatterplots for each pair of variables)
using the function pairs:

> pairs(trees)

Writing functions
 An important feature of R is the facility to extend the language by writing your own
functions.
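 The following defines a function named several.plots. It draws a histogram of each of the first two columns of x and a scatterplot of one against the other, then returns a summary of every column.

several.plots<-function(x){
  par(mfrow=c(3,1))    # three plots stacked in one column
  hist(x[,1])
  hist(x[,2])
  plot(x[,1],x[,2])
  par(mfrow=c(1,1))    # restore the single-plot layout
  apply(x,2,summary)   # the last expression is the returned value
}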

 This is how you call the function:

> several.plots(faithful)

Defining a Function
Let’s start by defining a function fahrenheit_to_celsius that converts temperatures
from Fahrenheit to Celsius:
fahrenheit_to_celsius <- function(temp_F) {
  temp_C <- (temp_F - 32) * 5 / 9
  return(temp_C)
}

> fahrenheit_to_celsius(37)
[1] 2.777778
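
Because R arithmetic works componentwise, the same function also converts a whole vector at once. The sketch below repeats the definition so that it runs on its own; the inverse function celsius_to_fahrenheit is a hypothetical helper added here only to show a round trip:

```r
fahrenheit_to_celsius <- function(temp_F) {
  temp_C <- (temp_F - 32) * 5 / 9
  return(temp_C)
}

# Freezing and boiling points of water, converted in one call
converted <- fahrenheit_to_celsius(c(32, 212))
print(converted)   # 0 100

# Hypothetical inverse, for illustration only
celsius_to_fahrenheit <- function(temp_C) {
  temp_C * 9 / 5 + 32
}

# Converting there and back returns the starting values
round_trip <- celsius_to_fahrenheit(fahrenheit_to_celsius(c(32, 98.6, 212)))
print(round_trip)
```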

Other things
There are many other facilities in R. These include:
1. Functions for fitting statistical models such as linear and generalized linear models.
2. Functions for fitting curves to smooth data.
3. Functions for optimisation and equation solving.
4. Facilities to program using loops and conditional statements such as if and while.
5. Plotting routines to view 3-dimensional data.
There is also the facility to 'bolt-on' additional libraries of functions that have a specific utility.
Typing
> library()
will give a list and short description of the libraries available. Typing
> library(libraryname)
where libraryname is the name of the required library will give you access to the functions in that
library.
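
Item 4 in the list above mentions loops and conditional statements. A minimal sketch of for, if and while follows; the two small computations are illustrative and not taken from the tutorial:

```r
# for + if: add up the even numbers between 1 and 10
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) {      # %% is the remainder operator
    total <- total + i
  }
}
print(total)   # 2 + 4 + 6 + 8 + 10 = 30

# while: find the smallest k with 2^k >= 100
k <- 0
while (2^k < 100) {
  k <- k + 1
}
print(k)       # 7, since 2^6 = 64 and 2^7 = 128
```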
Task 1: The data y<-c(33,44,29,16,25,45,33,19,54,22,21,49,11,24,56) contain sales of milk in litres for
5 days in three different shops (the first 3 values are for shops 1,2 and 3 on Monday, etc.) Produce a
statistical summary of the sales for each day of the week and also for each shop.

Task 2: Create in R the matrices

x = |3 2|
|1 1|

y = |1 4 0|
|0 1 -1|

Calculate the following and check your answers in R:


(a) 2*x
(b) x*x
(c) x%*%x
(d) x%*%y
(e) t(y)
(f) solve(x)

Task 3: Attach the dataset mtcars and find the mean weight and mean fuel consumption for
vehicles in the dataset (type help(mtcars) for a description of the variables available).

Task 4: Write a function that takes as its arguments two vectors, x and y, produces a scatterplot, and
calculates the correlation coefficient (using cor(x,y)).

Task 5: Write a function that takes a vector $(x_1, \ldots, x_n)$ and calculates both $\sum x_i$ and $\sum x_i^2$. (Remember the use of the function sum.)
