2019 International Training Workshop on Scientific Big Data
Data analysis in R
Dr. Lai Jiangshan
[email protected]
Field of research
Forest ecology
R workshop in China
Training thousands of participants from universities and institutes
R International workshop
2016 International Training Workshop on Big Data for Science
R International workshop
2017 International Training Workshop on Big Data for Science
Nepal
We’ll Cover
1. Introduction to R
2. Basic statistical analyses in R
3. Graphics in R
What is R?
• The R statistical programming language is a free, open-source
package based on the S language developed at Bell
Labs.
• The language is very powerful for writing programs.
• Many statistical functions are already built in.
• Contributed packages expand the functionality to cutting
edge research.
• Because R is a programming language, tasks are completed
by writing code.
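As a minimal sketch of the built-in statistical functions mentioned above, the lines below compute a mean, a standard deviation and a one-sample t-test. The `heights` values are hypothetical numbers invented for illustration, not data from the workshop.

```r
# Hypothetical sample data, invented for illustration only
heights <- c(12.1, 15.3, 14.8, 13.5, 16.2)

mean_height <- mean(heights)              # arithmetic mean
sd_height   <- sd(heights)                # sample standard deviation
test_result <- t.test(heights, mu = 14)   # one-sample t-test against a mean of 14

print(mean_height)
print(sd_height)
print(test_result$p.value)
```

All three functions ship with a standard R installation, so no extra package is needed here.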
Creators of R
Ross Ihaka and Robert Gentleman
University of Auckland, New Zealand
Since 1997, a core team has been able to modify
the source code of R.
The creators of R
Name: Ross Ihaka
Born: 1954
He retired as an associate professor of statistics at
the University of Auckland in 2017.
Name: Robert Gentleman
Born: 1959
Currently vice president of computational biology at 23andMe.
Why Use R?
• It's free! Escape from commercial software
• It runs on different platforms including Windows, Linux and
MacOS.
• It provides a platform for programming new statistical
methods in an easy and straightforward manner.
• It contains advanced and recent statistical methods not
yet available in other software.
• It has state-of-the-art graphics capabilities.
• Among peer-reviewed articles published in 30 top ecology
journals (IF>3) during a 10-year period (2008-2017),
20,395 articles (33.5% of 60,902) explicitly listed R as the
statistical software used for analyses. The share rose from
11.4% in 2008 to 58.0% in 2017.
Getting Started
• Where to get R?
• Go to www.r-project.org
• Downloads: CRAN
• Set your mirror: any mirror in the world is fine.
• Select base.
• Double-click R-3.6.1-win.exe
Obtaining R
• Current Version: R-3.6.1
• Comprehensive R Archive Network:
http://cran.r-project.org
Getting Started
• The R GUI?
R Working Area
This is the area where all
commands are issued and where
non-graphical output
appears when R is run
interactively
A convenient interface for R: RStudio
• RStudio: a powerful graphical user interface for R
• Allows the user to run R in a more user-friendly environment
( http://www.rstudio.com/ide/download/ )
• Install RStudio
Basics
• Highly Functional
– Everything done through functions
– Strict named arguments
– Abbreviations in arguments OK (e.g. T for TRUE)
• Object Oriented
– Everything is an object
– “<-” or “=” is an assignment operator
– “X <- 5”: X GETS the value 5
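The points above can be sketched in a few lines. This is only an illustration: the object names (`X`, `Y`, `s1`, `s2`, `m`) are made up here, and `seq` and `mean` were picked simply because they are base R functions.

```r
# Assignment: both forms store the value 5 in an object
X <- 5
Y = 5

# Function call with arguments matched by name
s1 <- seq(from = 1, to = 10, by = 3)

# An argument name may be abbreviated when the abbreviation is unambiguous:
# 'length' partially matches the full argument name 'length.out'
s2 <- seq(from = 1, to = 10, length = 4)

# T is accepted as shorthand for TRUE (TRUE is safer, because T can be reassigned)
m <- mean(c(1, NA, 3), na.rm = T)

# Everything is an object, functions included
print(class(X))
print(class(mean))
```

Writing `TRUE` in full and spelling out argument names tends to make scripts easier to read, even though the shorter forms work.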
Getting Help in R
• From Documentation:
– ?t.test
– help(t.test)
– example(t.test)
– help.start()
• Documents: “Introduction to R”
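Besides opening a help page, two other base R helpers can be useful when the exact function name or its arguments are not known. The following is a small sketch; `matches` is just an illustrative variable name.

```r
# Show the arguments of a function without opening its help page
print(args(t.test))

# List objects on the search path whose names match "t.test"
matches <- apropos("t.test")
print(matches)
```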
Help website
• A typical user wiki.
• http://www.statmethods.net/
– Also called Quick-R. Gives very productive,
direct help, also for users coming from other
programming languages.
• http://mathesaurus.sourceforge.net/
– Dictionary for programming languages (e.g. R for
Matlab users).
• Just using Google (type e.g. "R rnorm" in the
search field) can also be very productive.
Data Structures
• Supports virtually any type of data
• Numbers, characters, logicals (TRUE/FALSE)
• Simplest: Vectors and Matrices
• Data Frame: Rectangular Data Set
• Lists: Can Contain mixed type variables
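A minimal sketch of the four structures listed above. All values and names (`v`, `m`, `df`, `lst`, the species and heights) are hypothetical and chosen only for illustration.

```r
# Vector: elements of a single type
v <- c(2.5, 3.1, 4.8)

# Matrix: a two-dimensional arrangement of a single type
m <- matrix(1:6, nrow = 2)

# Data frame: rectangular data set whose columns may differ in type
df <- data.frame(
  species   = c("oak", "pine", "fir"),
  height    = v,
  evergreen = c(FALSE, TRUE, TRUE)
)

# List: can hold objects of mixed type and size
lst <- list(name = "plot A", counts = 1:3, data = df)

str(df)           # compact overview of the data frame
print(dim(m))     # number of rows and columns of the matrix
print(names(lst)) # names of the list components
```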
Packages
install.views("Environmetrics")
https://cran.r-project.org/web/packages/
Practice in Adding Packages
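#Adding Packages
rda                        # error: object 'rda' not found (vegan is not loaded yet)
#install packages
install.packages("vegan")  # download and install vegan from CRAN
library(vegan)             # load the package into the current session
rda                        # now prints the definition of vegan's rda function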
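Before installing, it can be handy to check whether a package is already available. The sketch below uses `requireNamespace` from base R; the helper `is_installed` is a hypothetical name introduced here, and the block deliberately only prints a hint instead of installing anything.

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

print(is_installed("stats"))   # stats ships with every R installation

if (!is_installed("vegan")) {
  message('vegan is not installed; run install.packages("vegan") first')
} else {
  message("vegan is available; load it with library(vegan)")
}
```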
Input and output in R
Function             Description
read.table           Read a text file into a data frame
read.csv             Like read.table with defaults appropriate for comma-separated data
write.table          Write a data frame (or matrix or table) to a text file
write.csv            Like write.table with defaults appropriate for comma-separated data
read.spss (foreign)  Read a file in SPSS .sav format
sink                 Write R output to a file
savePlot             Write the current plot to a file (formats include: "png", "jpeg", "bmp", "ps")
pdf                  Open a graphics device to write a PDF file
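A short round trip through some of these functions might look as follows. This is a sketch under stated assumptions: the data frame `df` and its values are invented, and temporary files are used so that nothing is written into the working folder.

```r
# Hypothetical data, invented for illustration
df <- data.frame(site = c("A", "B", "C"), count = c(10, 25, 7))

# Write the data frame to a CSV file and read it back
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
df2 <- read.csv(csv_file)
print(df2)

# Open a PDF graphics device, draw a plot into it, then close the device
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
barplot(df$count, names.arg = df$site)
dev.off()
```

The plot is only written to disk once `dev.off()` closes the device.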
The current working folder
• Unless you provide filenames as absolute
pathnames (like C:\workspace\myproject\myfile), all
input and output is relative to the current working
folder.
• getwd() # What is the current working folder?
• The working folder can be set using
the setwd function, or using the RGui menu: File >
Change dir...
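The effect of the working folder on relative filenames can be sketched like this. The temporary folder and the file name `example.txt` are assumptions made for the example.

```r
old_dir <- getwd()     # remember the current working folder
new_dir <- tempdir()   # a temporary folder, used here only for illustration

setwd(new_dir)                       # change the working folder
writeLines("hello", "example.txt")   # relative name: the file lands in new_dir
print(file.exists(file.path(new_dir, "example.txt")))

setwd(old_dir)                       # restore the original working folder
```

On Windows, write paths inside R strings with forward slashes or doubled backslashes (for example "C:/workspace/myproject"), because a single backslash starts an escape sequence.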