
Statistical computing by using R

Examples from the textbook of the Biometrics Lab (2011), Department of Life Science, Tunghai University.

Chen-Pan Liao
PhD student, Department of Life Science, Tunghai University, Taiwan
E-mail: andrew.43@gmail.com
March 10, 2013

Download a new version

The newest version of this article is available at http://www.scribd.com/doc/70943527/Statistical-computing-by-using-R.

License

"Statistical computing by using R" by Chen-Pan Liao is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License (http://creativecommons.org/licenses/by-sa/3.0/). You are free to share or remix this article under several conditions; read the license to learn more.

Contents

1 Introduction
  1.1 How to apply the examples in R?
2 Descriptive statistics
  2.1 Normal distribution and z-value
  2.2 Normality, skewness and kurtosis
3 One-/two-sample test
  3.1 One-sample t-test
  3.2 Two-sample t-test, F-test or Bartlett's test
  3.3 Paired t-test
  3.4 Two-sample Mann-Whitney U test
4 One-factor experimental design
  4.1 Balanced/unbalanced one-way ANOVA
  4.2 Kruskal-Wallis test
5 Two-way ANOVA & experimental designs
  5.1 Completely randomized design (CRD)
  5.2 Randomized complete block design (RCBD)
  5.3 RCBD (unbalanced design)
  5.4 Latin square design
  5.5 Nested design
6 Regression & correlation
  6.1 Simple linear regression
  6.2 Replicated simple linear regression
  6.3 Simple linear correlation
  6.4 Multiple regression
7 Count data
  7.1 Goodness of fit test
  7.2 Test of independence
  7.3 Test of homogeneity
A Free resources for learning R

1 Introduction

R is a free language and environment for statistical computing and graphics; anyone can download and install it from the official website of R, http://cran.r-project.org. This article demonstrates several common statistical tests computed in R. Most of the questions in these demos are from the textbook of the Biometrics Lab (2011), Department of Life Science, Tunghai University.

1.1 How to apply the examples in R?

All of the R code in this article was tested under R version 2.13. In general, a reader can simply copy and paste the code to reproduce the results, with two kinds of exceptions.

The first exception happens when the code loads a package with library(package_name), where package_name is the name of the package. If that package is not installed, R returns an error saying the package is unavailable. The solution is to install the package before loading it:

install.packages("package_name")
library(package_name)

It is also quite usual to read data from a plain-text file (usually a CSV file) with read.csv("filename.csv"). The second exception is that R cannot find or load that file, meaning the file does not exist or an incorrect path was given. One option is to generate the CSV file somewhere and load it by its full path and filename (including the filename extension). Another common technique is to change the working directory to wherever the data file lives with setwd("dir"), where dir is the path of the directory containing the CSV file. For example, if a file called data.csv is located in D:/your/own/path/, either give the full path,

read.csv("D:/your/own/path/data.csv")

or change the working directory and then read the file:

setwd("D:/your/own/path")
read.csv("data.csv")

2 Descriptive statistics

2.1 Normal distribution and z-value

The mean and standard deviation of the intelligence quotient (IQ) are 100 and 15, respectively. Find (1) the z-value for someone with IQ = 147, (2) the p-value Pr(IQ > 147), and (3) the value IQy such that Pr(IQ > IQy) = 0.01.

(147 - 100) / 15          # (1) returns 3.133333
1 - pnorm(147, 100, 15)   # (2) returns 0.0008641652
qnorm(1 - 0.01, 100, 15)  # (3) returns 134.8952
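As a quick cross-check of the same arithmetic outside R (an illustrative addition, not part of the original material), Python's standard library offers `statistics.NormalDist` (Python 3.8+):

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

z = (147 - 100) / 15      # z-value for IQ = 147
p = 1 - iq.cdf(147)       # Pr(IQ > 147)
q = iq.inv_cdf(1 - 0.01)  # IQ_y such that Pr(IQ > IQ_y) = 0.01

print(z, p, q)            # ~3.1333, ~0.000864, ~134.8952
```

The three values agree with the pnorm/qnorm results above.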

2.2 Normality, skewness and kurtosis

The following table shows ranges of IQ and the number of persons in each range. Test the normality of IQ, and find the skewness and kurtosis of the IQ distribution.

IQ range   Frequency
160-169    3
150-159    16
140-149    55
130-139    120
120-129    330
110-119    610
100-109    719
90-99      592
80-89      338
70-79      130
60-69      48

## input class midpoints & frequencies
iq <- seq(165, 65, -10)
freq <- c(3, 16, 55, 120, 330, 610, 719, 592, 338, 130, 48)
## build raw data
for (i in 1:length(iq)) {
  yy <- rep(iq[i], freq[i])
  if (i == 1) {y <- yy} else {y <- c(y, yy)}
}
## several estimations
length(y)   # number of values
mean(y)     # mean
median(y)   # median
library(modeest); mlv(y, method = "mfv")  # mode
names(sort(-table(y)))[1]                 # mode; not always correct
min(y)      # minimum value
max(y)      # maximum value
range(y)    # range
summary(y)  # combination of estimations
var(y)      # sample variance
sd(y)       # sample standard deviation
sd(y)/length(y)^0.5  # standard error of the sample mean
## skewness & kurtosis
library(psych); skew(y); kurtosi(y)
## normality
library(nortest); lillie.test(y)  # Lilliefors test
shapiro.test(y)                   # Shapiro-Wilk test
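The same expand-the-frequency-table trick can be sketched in Python (an illustrative addition; the moment-based skewness/kurtosis below is a simple population version and is not guaranteed to match psych::skew's exact formula):

```python
from statistics import mean, median, pstdev

# expand the frequency table into raw data, as the R loop does
iq_mid = list(range(165, 55, -10))  # class midpoints 165, 155, ..., 65
freq = [3, 16, 55, 120, 330, 610, 719, 592, 338, 130, 48]
y = [m for m, f in zip(iq_mid, freq) for _ in range(f)]

n = len(y)                          # total number of persons
mode = max(set(y), key=y.count)     # most frequent midpoint

# population-moment skewness and excess kurtosis
m, s = mean(y), pstdev(y)
skewness = sum((v - m) ** 3 for v in y) / (n * s ** 3)
kurtosis = sum((v - m) ** 4 for v in y) / (n * s ** 4) - 3

print(n, m, median(y), mode, skewness, kurtosis)
```

With 2961 observations the mean is a little above the modal class of 105, so the distribution is close to normal with only mild skew.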

3 One-/two-sample test

3.1 One-sample t-test

Test H0: the mean of Y is equal to 24.3, where Y = {25.8, 24.6, 26.1, 22.9, 25.1, 27.3, 24.0, 24.5, 23.9, 26.2, 24.3, 24.6, 23.3, 25.5, 28.1, 24.8, 23.5, 26.3, 25.4, 25.5, 23.9, 27.0, 24.8, 22.9, 25.4}.

## input data
y <- c(
  25.8, 24.6, 26.1, 22.9, 25.1, 27.3, 24.0, 24.5, 23.9, 26.2,
  24.3, 24.6, 23.3, 25.5, 28.1, 24.8, 23.5, 26.3, 25.4, 25.5,
  23.9, 27.0, 24.8, 22.9, 25.4
)
## normality test
shapiro.test(y)
## t-test
t.test(y, mu=24.3)                   # for HA: mu != 24.3
t.test(y, mu=24.3, alternative="g")  # for HA: mu > 24.3
t.test(y, mu=24.3, alternative="l")  # for HA: mu < 24.3
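The t statistic that t.test reports is just the standardized distance of the sample mean from 24.3; as an illustrative cross-check in Python:

```python
from statistics import mean, stdev
from math import sqrt

y = [25.8, 24.6, 26.1, 22.9, 25.1, 27.3, 24.0, 24.5, 23.9, 26.2,
     24.3, 24.6, 23.3, 25.5, 28.1, 24.8, 23.5, 26.3, 25.4, 25.5,
     23.9, 27.0, 24.8, 22.9, 25.4]

n = len(y)
# t = (ybar - mu0) / (s / sqrt(n)), the same statistic t.test computes
t = (mean(y) - 24.3) / (stdev(y) / sqrt(n))
print(t)
```

The statistic comes out near 2.7 with 24 degrees of freedom, matching what t.test(y, mu=24.3) reports.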

3.2 Two-sample t-test, F-test or Bartlett's test

Test H0: the means of Y1 and Y2 are equal, where Y1 = {8.8, 8.4, 7.9, 8.7, 9.1, 9.6} and Y2 = {9.9, 9.0, 11.1, 9.6, 8.7, 10.4, 9.5}.

## input data
y.1 <- c(8.8, 8.4, 7.9, 8.7, 9.1, 9.6)
y.2 <- c(9.9, 9.0, 11.1, 9.6, 8.7, 10.4, 9.5)
## normality test
shapiro.test(y.1)
shapiro.test(y.2)
## F-test for equal variances
var.test(y.1, y.2)
## Bartlett's test
y <- c(y.1, y.2)
group <- c(rep("a", length(y.1)), rep("b", length(y.2)))
bartlett.test(y ~ group)
## t-test
t.test(y.1, y.2, var.equal=T)  # if equal variances
t.test(y.1, y.2, var.equal=F)  # if unequal variances
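The F ratio from var.test and the pooled-variance t statistic can both be computed by hand; an illustrative Python sketch (not part of the original R material):

```python
from statistics import mean, variance
from math import sqrt

y1 = [8.8, 8.4, 7.9, 8.7, 9.1, 9.6]
y2 = [9.9, 9.0, 11.1, 9.6, 8.7, 10.4, 9.5]

F = variance(y1) / variance(y2)  # the ratio var.test(y.1, y.2) reports

# pooled-variance t statistic (the var.equal=T case)
n1, n2 = len(y1), len(y2)
sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
t = (mean(y1) - mean(y2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
print(F, t)
```

F is near 0.5 (the first sample is less variable) and t is near -2.5, so Y2's mean is clearly larger.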

3.3 Paired t-test

Test H0: the means of Y1i and Y2i are equal. Note that the observations are paired. Y1i = {142, 140, 144, 144, 142, 146, 149, 150, 142, 148}; Y2i = {138, 136, 147, 139, 143, 141, 143, 145, 136, 146}.

## input data
y.1 <- c(142, 140, 144, 144, 142, 146, 149, 150, 142, 148)
y.2 <- c(138, 136, 147, 139, 143, 141, 143, 145, 136, 146)
## normality test on the differences
shapiro.test(y.1 - y.2)
## paired t-test
t.test(y.1, y.2, paired=T)                   # HA: mu.1 != mu.2
t.test(y.1, y.2, paired=T, alternative="g")  # HA: mu.1 > mu.2
t.test(y.1, y.2, paired=T, alternative="l")  # HA: mu.1 < mu.2
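A paired t-test is simply a one-sample t-test on the pairwise differences; as an illustrative Python cross-check:

```python
from statistics import mean, stdev
from math import sqrt

y1 = [142, 140, 144, 144, 142, 146, 149, 150, 142, 148]
y2 = [138, 136, 147, 139, 143, 141, 143, 145, 136, 146]

d = [a - b for a, b in zip(y1, y2)]      # paired differences
t = mean(d) / (stdev(d) / sqrt(len(d)))  # one-sample t on the differences
print(t)
```

The statistic is about 3.4 with 9 degrees of freedom, the same value t.test(y.1, y.2, paired=T) reports.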

3.4 Two-sample Mann-Whitney U test

Test H0: the distributions of Y1 and Y2 are equal, using the Mann-Whitney U test. Y1 = {3, 3, 3, 6, 10, 10, 13.5, 13.5, 16.5, 16.5, 19.5}; Y2 = {3, 3, 7.5, 7.5, 10, 12, 16.5, 16.5, 19.5, 22.5, 22.5, 22.5, 22.5, 25}.

## input data
y.1 <- c(3, 3, 3, 6, 10, 10, 13.5, 13.5, 16.5, 16.5, 19.5)
y.2 <- c(
  3, 3, 7.5, 7.5, 10, 12, 16.5, 16.5, 19.5,
  22.5, 22.5, 22.5, 22.5, 25
)
## Mann-Whitney U test
wilcox.test(y.1, y.2)                   # HA: y.1 != y.2
wilcox.test(y.1, y.2, alternative="g")  # HA: y.1 > y.2
wilcox.test(y.1, y.2, alternative="l")  # HA: y.1 < y.2
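The U statistic itself (which wilcox.test reports as W) counts, over all cross-sample pairs, how often a Y1 value exceeds a Y2 value, with ties counting one half; an illustrative Python sketch:

```python
y1 = [3, 3, 3, 6, 10, 10, 13.5, 13.5, 16.5, 16.5, 19.5]
y2 = [3, 3, 7.5, 7.5, 10, 12, 16.5, 16.5, 19.5, 22.5, 22.5, 22.5, 22.5, 25]

# U1: number of (a, b) pairs with a > b, plus half a point per tie
U = sum(1.0 if a > b else 0.5 if a == b else 0.0
        for a in y1 for b in y2)
print(U)
```

By construction U1 + U2 = n1 * n2 = 154 for the two directions; since Y1's values sit lower overall, U1 is well below half of 154.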

4 One-factor experimental design

4.1 Balanced/unbalanced one-way ANOVA

Test H0: the means of Yj, j = 1, 2, 3, 4, are all equal, where Y1 = {60.8, 57.0, 65.0, 58.6, 61.7}, Y2 = {68.7, 67.7, 74.0, 66.3, 69.8}, Y3 = {102.6, 102.1, 100.2, 96.5}, and Y4 = {87.9, 84.2, 83.1, 85.7, 90.3}. (Group 3 has only four observations, so the design is unbalanced.)

## input data
weight <- c(
  60.8, 57.0, 65.0, 58.6, 61.7,
  68.7, 67.7, 74.0, 66.3, 69.8,
  102.6, 102.1, 100.2, 96.5,
  87.9, 84.2, 83.1, 85.7, 90.3
)
groups <- as.factor(c(rep(1,5), rep(2,5), rep(3,4), rep(4,5)))
## description
tapply(weight, groups, mean)  # mean of each group
tapply(weight, groups, sd)    # sd of each group
boxplot(weight ~ groups)      # box plot
## normality tests for each group
tapply(weight, groups, shapiro.test)
## test for equal variances
bartlett.test(weight ~ groups)
## ANOVA
m <- aov(weight ~ groups); summary(m)
## post-hoc
TukeyHSD(m)                                         # Tukey's
library(laercio); LDuncan(m, "groups")              # Duncan's
pairwise.t.test(weight, groups, p.adj="none")       # LSD
library(agricolae); LSD.test(m, "groups", group=F)  # LSD
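The one-way ANOVA F statistic is the ratio of between-group to within-group mean squares; as an illustrative Python cross-check of what summary(m) reports:

```python
from statistics import mean

groups = [
    [60.8, 57.0, 65.0, 58.6, 61.7],
    [68.7, 67.7, 74.0, 66.3, 69.8],
    [102.6, 102.1, 100.2, 96.5],
    [87.9, 84.2, 83.1, 85.7, 90.3],
]

grand = mean(v for g in groups for v in g)

# between- and within-group sums of squares
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
df_b = len(groups) - 1
df_w = sum(len(g) for g in groups) - len(groups)

F = (ss_between / df_b) / (ss_within / df_w)
print(F)
```

The group means (roughly 60.6, 69.3, 100.4, 86.2) are far apart relative to the within-group scatter, so F is enormous (over 150 on 3 and 15 degrees of freedom).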

4.2 Kruskal-Wallis test

Test H0: the medians of Yi, i = 1, 2, 3, 4, are all equal, where Y1 = {2, 2, 3.5, 3.5, 8, 10, 10, 17}, Y2 = {6, 10, 13.5, 13.5, 20, 23.5, 23.5, 26}, Y3 = {13.5, 16, 18, 20, 23.5, 26, 28}, and Y4 = {6, 6, 13.5, 22, 26, 29, 30, 31}.

## input data
y <- c(
  2, 2, 3.5, 3.5, 8, 10, 10, 17,
  6, 10, 13.5, 13.5, 20, 23.5, 23.5, 26,
  13.5, 16, 18, 20, 23.5, 26, 28,
  6, 6, 13.5, 22, 26, 29, 30, 31
)
groups <- as.factor(c(rep(1,8), rep(2,8), rep(3,7), rep(4,8)))
## Kruskal-Wallis test
kruskal.test(y ~ groups)
## post-hoc
library(pgirmess); kruskalmc(y, groups)
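Treating the values in this example as ranks, the Kruskal-Wallis statistic is H = 12/(N(N+1)) * sum(Ri^2/ni) - 3(N+1), where Ri is the rank sum of group i. An illustrative Python sketch (note: this omits the tie correction that kruskal.test applies, so its statistic will differ slightly):

```python
# the values in this example are treated as (tie-averaged) ranks
ranks = [
    [2, 2, 3.5, 3.5, 8, 10, 10, 17],
    [6, 10, 13.5, 13.5, 20, 23.5, 23.5, 26],
    [13.5, 16, 18, 20, 23.5, 26, 28],
    [6, 6, 13.5, 22, 26, 29, 30, 31],
]

N = sum(len(g) for g in ranks)
# H without tie correction
H = 12 / (N * (N + 1)) * sum(sum(g) ** 2 / len(g) for g in ranks) - 3 * (N + 1)
print(H)
```

With N = 31 the uncorrected H lands around 13.5, well above the 3-degree-of-freedom critical value of 7.81, consistent with kruskal.test rejecting H0.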

5 Two-way ANOVA & experimental designs

5.1 Completely randomized design (CRD)

Determine the fixed effects of hormone (horm), sex, and their interaction on calcareous concentration (plconc).

data.csv:
horm,sex,plconc
0,f,16.5
0,f,18.4
0,f,12.7
0,f,14
0,m,11
0,m,10.8
0,m,14.3
0,m,10
1,f,26.2
1,f,21.3
1,f,35.8
1,f,40.2
1,m,23.8
1,m,28.8
1,m,25
1,m,29.3

## load data
rawdata <- read.csv("data.csv")
## model I ANOVA
mod <- aov(plconc ~ factor(horm) * factor(sex), data=rawdata)
summary(mod)        # type I
library(car)
Anova(mod, type=2)  # type II
Anova(mod, type=3)  # type III
5.2 Randomized complete block design (RCBD)

Determine the fixed effects of four different diets (diet) on pigs' weight (wt). Four different places in which the pigs were fed are set as the block factor (block).

data.csv:
block,diet,wt
1,3,4.9
1,4,8.8
1,1,7
1,2,5.3
2,1,9.9
2,3,7.6
2,2,5.7
2,4,8.9
3,3,5.5
3,2,4.7
3,4,8.1
3,1,8.5
4,4,3.3
4,2,3.5
4,1,5.1
4,3,2.8

## load data
rawdata <- read.csv("data.csv")
## model I ANOVA
mod <- aov(wt ~ factor(diet) + factor(block), data=rawdata)
summary(mod)
## post-hoc
TukeyHSD(mod)                                               # Tukey's
library(laercio); LDuncan(mod, "diet")                      # Duncan's
pairwise.t.test(rawdata$wt, rawdata$diet, p.adj="none")     # LSD

5.3 RCBD (unbalanced design)

Determine the effect of treat on the variable obs, with block as a blocking factor. One observation is missing, making the design unbalanced.

data.csv:
block,treat,obs
1,a,10
2,a,15
3,a,16
4,a,15
1,b,16
2,b,20
3,b,25
4,b,22
1,c,14
2,c,20
3,c,16
4,c,12
1,d,16
2,d,18
3,d,15

## load data
rawdata <- read.csv("data.csv")
## model I ANOVA
mod <- aov(obs ~ treat + factor(block), data=rawdata)
summary(mod)                      # type I
library(car); Anova(mod, type=3)  # type III
## post-hoc
TukeyHSD(mod)                                               # Tukey's
library(laercio); LDuncan(mod, "treat")                     # Duncan's
pairwise.t.test(rawdata$obs, rawdata$treat, p.adj="none")   # LSD
library(agricolae); LSD.test(mod, "treat", group=F)         # LSD alternative

5.4 Latin square design

Determine the effects of factor row, factor col, and factor treat on the variable obs, where the factors are arranged as a Latin square.

data.csv:
row,col,treat,obs
1,1,a,16
1,2,d,10
1,3,c,14
1,4,b,12
2,1,b,16
2,2,a,18
2,3,d,15
2,4,c,20
3,1,c,20
3,2,b,25
3,3,a,18
3,4,d,16
4,1,d,14
4,2,c,20
4,3,b,22
4,4,a,16

## load data
rawdata <- read.csv("data.csv")
## model I ANOVA
mod <- aov(obs ~ treat + factor(row) + factor(col), data=rawdata)
summary(mod)                      # type I (fixed model)
library(car); Anova(mod, type=3)  # type III
## post-hoc
TukeyHSD(mod)                                               # Tukey's
library(laercio); LDuncan(mod, "treat")                     # Duncan's
pairwise.t.test(rawdata$obs, rawdata$treat, p.adj="none")   # LSD
library(agricolae); LSD.test(mod, "treat", group=F)         # LSD alternative

5.5 Nested design

Determine the effects of factor A and factor B on the variable obs, where factor B is nested within factor A.

data.csv:
A,B,obs
1,1,11
1,1,9
1,1,10
1,2,8
1,2,7
1,2,6
1,3,8
1,3,10
1,3,11
1,4,11
1,4,14
1,4,10
2,1,11
2,1,8
2,1,7
2,2,10
2,2,14
2,2,12
2,3,9
2,3,10
2,3,8
2,4,10
2,4,13
2,4,12
3,1,12
3,1,14
3,1,10
3,2,8
3,2,10
3,2,12
3,3,11
3,3,9
3,3,12
3,4,13
3,4,12
3,4,11

## load data
rawdata <- read.csv("data.csv")
## fixed B
mod.1 <- aov(obs ~ factor(A) + factor(B) %in% factor(A), data=rawdata)
mod.1 <- aov(obs ~ factor(A) / factor(B), data=rawdata)  # alternative
summary(mod.1)
## random B (the usual case)
mod.2 <- aov(obs ~ factor(A) + Error(factor(A)/factor(B)), data=rawdata)
summary(mod.2)
ms.A   <- summary(mod.2)[[1]][[1]][[3]]  # get 7.527778
df.A   <- summary(mod.2)[[1]][[1]][[1]]  # get 2
ms.res <- summary(mod.2)[[2]][[1]][[3]]  # get 7.768519
df.res <- summary(mod.2)[[2]][[1]][[1]]  # get 9
f.A <- ms.A / ms.res
p.A <- pf(f.A, df.A, df.res, lower.tail=F); p.A
6 Regression & correlation

6.1 Simple linear regression

Test H0: β = 0 in the model Yi = α + βXi + εi, using the following sampling data.

data.csv:
X,Y
22,16
26,17
45,26
37,24
28,22
50,21
56,32
34,18
60,30
40,20

## load data
rawdata <- read.csv("data.csv")
## linear regression
mod <- lm(Y ~ X, data=rawdata)
summary(mod)
## prediction for X = c(14, 22.6)
newdata <- data.frame(X=c(14, 22.6))
## prediction interval for a single observation
predict(mod, newdata, interval="prediction")
## confidence interval for the mean
predict(mod, newdata, interval="confidence", se.fit=T)
## residuals
resid(mod)               # ordinary residuals
rstandard(mod)           # standardized residuals
influence.measures(mod)  # Cook's D
plot(mod)                # diagnostic plotting
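The least-squares estimates that summary(mod) reports can be checked directly from the textbook formulas (slope = Sxy/Sxx, intercept = ybar - slope * xbar); an illustrative Python sketch:

```python
from statistics import mean

X = [22, 26, 45, 37, 28, 50, 56, 34, 60, 40]
Y = [16, 17, 26, 24, 22, 21, 32, 18, 30, 20]

xbar, ybar = mean(X), mean(Y)
sxx = sum((x - xbar) ** 2 for x in X)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))

slope = sxy / sxx                # least-squares estimate of beta
intercept = ybar - slope * xbar  # least-squares estimate of alpha
print(slope, intercept)
```

The fitted line is roughly Y = 8.5 + 0.35 X, matching the coefficients from lm(Y ~ X).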

6.2 Replicated simple linear regression

Test H0: β = 0 in the model Yi = α + βXi + εi, and also test the lack of fit, using the following sampling data.

data.csv:
Y,X
108,30
110,30
106,30
125,40
120,40
118,40
119,40
132,50
137,50
134,50
148,60
151,60
146,60
147,60
144,60
162,70
156,70
164,70
158,70
159,70

## load data
rawdata <- read.csv("data.csv")
## linear regression
mod <- lm(Y ~ X, data=rawdata); summary(mod)
## lack of fit in a regression with replicated data
mod.aov <- lm(Y ~ factor(X), data=rawdata)
anova(mod, mod.aov)

6.3 Simple linear correlation

Determine the correlation coefficient, and test H0: the coefficient equals 0, for the following data using Pearson's, Kendall's, and Spearman's correlations.

data.csv:
wingL,tailL
10.4,7.4
10.8,7.6
11.1,7.9
10.2,7.2
10.3,7.4
10.2,7.1
10.7,7.4
10.5,7.2
10.8,7.8
11.2,7.7
10.6,7.8
11.4,8.3

## load data
rawdata <- read.csv("data.csv")
attach(rawdata)
## Pearson's
cor.test(wingL, tailL)                   # HA: rho != 0
cor.test(wingL, tailL, alternative="g")  # HA: rho > 0
cor.test(wingL, tailL, alternative="l")  # HA: rho < 0
## Kendall's
cor.test(wingL, tailL, method="kendall")
## Spearman's
cor.test(wingL, tailL, method="spearman")
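Pearson's r is the cross-product sum normalized by the two sums of squares, r = Sxy / sqrt(Sxx * Syy); an illustrative Python cross-check:

```python
from math import sqrt
from statistics import mean

wingL = [10.4, 10.8, 11.1, 10.2, 10.3, 10.2, 10.7, 10.5, 10.8, 11.2, 10.6, 11.4]
tailL = [7.4, 7.6, 7.9, 7.2, 7.4, 7.1, 7.4, 7.2, 7.8, 7.7, 7.8, 8.3]

mx, my = mean(wingL), mean(tailL)
sxy = sum((x - mx) * (y - my) for x, y in zip(wingL, tailL))
sxx = sum((x - mx) ** 2 for x in wingL)
syy = sum((y - my) ** 2 for y in tailL)

r = sxy / sqrt(sxx * syy)  # Pearson's correlation coefficient
print(r)
```

The coefficient comes out strongly positive (around 0.87), matching what cor.test(wingL, tailL) reports.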

6.4 Multiple regression

Apply a multiple linear regression and subsequent model selection to the following data.

data.csv:
J,A,B,C,D,E
1,6,9.9,5.7,1.6,2.12
2,1,9.3,6.4,3.0,3.39
3,-2,9.4,5.7,3.4,3.61
4,11,9.1,6.1,3.4,1.72
5,-1,6.9,6.0,3.0,1.80
6,2,9.3,5.7,4.4,3.21
7,5,7.9,5.9,2.2,2.59
8,1,7.4,6.2,2.2,3.25
9,1,7.3,5.5,1.9,2.86
10,3,8.8,5.2,0.2,2.32
11,11,9.8,5.7,4.2,1.57
12,9,10.5,6.1,2.4,1.50
13,5,9.1,6.4,3.4,2.69
14,-3,10.1,5.5,3.0,4.06
15,1,7.2,5.5,0.2,1.98
16,8,11.7,6.0,3.9,2.29
17,-2,8.7,5.5,2.2,3.55
18,3,7.6,6.2,4.4,3.31
19,6,8.6,5.9,0.2,1.83
20,10,10.9,5.6,2.4,1.69

## load data
rawdata <- read.csv("data.csv")
## full model
mod.full <- lm(E ~ A + B + C + D, data=rawdata)
summary(mod.full)
## stepwise selection
mod.step <- step(mod.full, direction="both")
summary(mod.step)
## backward elimination
mod.backward <- step(mod.full, direction="backward")
summary(mod.backward)
## forward selection (starting from the full model, no term can be added)
mod.forward <- step(mod.full, direction="forward")
summary(mod.forward)

7 Count data

7.1 Goodness of fit test

Test whether the observed frequencies 4 : 4 fit the expected ratio 3 : 1.

## input observed frequencies and expected proportions
o <- c(4, 4)
e <- c(3, 1) / (3 + 1)
chisq.test(o, p=e)                              # Pearson's
chisq.test(o, p=c(3, 1), rescale.p=T)           # Pearson's alternative
chisq.test(o, p=e, simulate.p.value=T, B=4000)  # simulated p-value based on 4000 replicates
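The Pearson statistic is the familiar sum of (observed - expected)^2 / expected; with 1 degree of freedom its p-value can even be obtained from the normal distribution, since Pr(X^2 > x) = 2(1 - Phi(sqrt(x))). An illustrative Python cross-check:

```python
from math import sqrt
from statistics import NormalDist

obs = [4, 4]
p_exp = [3 / 4, 1 / 4]
n = sum(obs)
exp = [n * p for p in p_exp]  # expected counts 6 and 2

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
# for 1 degree of freedom: Pr(chi2 > x) = 2 * (1 - Phi(sqrt(x)))
pval = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
print(chi2, pval)
```

The statistic is 8/3 with p near 0.10, so the 3 : 1 hypothesis is not rejected (note that R will warn about the small expected count of 2).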

7.2 Test of independence

Test the independence between factor sex and factor hair.

data.csv:
sex,hair,freq
M,Black,32
M,Blond,16
M,Brown,43
M,Red,9
F,Black,55
F,Blond,64
F,Brown,65
F,Red,16

## load data
rawdata <- read.csv("data.csv")
## Pearson's chi-square
crosstable <- xtabs(freq ~ sex + hair, data=rawdata)
summary(crosstable)
chisq.test(crosstable)                            # alternative
library(MASS); loglm(freq ~ sex + hair, rawdata)  # alternative
## G-test
source("http://www.psych.ualberta.ca/~phurd/cruft/g.test.r")
g.test(crosstable)
## Fisher's exact test
fisher.test(crosstable)
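For a table larger than 2x2, chisq.test applies no continuity correction, so the plain Pearson statistic built from the row and column margins matches its output; an illustrative Python sketch:

```python
# observed counts: rows = sex (M, F); columns = Black, Blond, Brown, Red
table = [
    [32, 16, 43, 9],
    [55, 64, 65, 16],
]

row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
n = sum(row_tot)

# expected count of cell (i, j) is row_i * col_j / n
chi2 = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(len(table)) for j in range(len(table[0])))
df = (len(table) - 1) * (len(table[0]) - 1)
print(chi2, df)
```

The statistic is close to 9 on 3 degrees of freedom, so independence of sex and hair color is rejected at the 5% level.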

7.3 Test of homogeneity

A researcher randomly sampled students in different grades of an elementary school, recorded each student's hair color, and summarized the following table. Test whether the frequency ratios of hair colors among the different grades are all 0.125 : 0.375 : 0.375 : 0.125.

data.csv:
grade,Black,Blond,Brown,Red
1,23,48,54,13
2,24,56,64,21
3,19,58,48,20
4,21,56,57,22
5,23,39,57,21
6,16,48,55,13

## load data
rawdata <- read.csv("data.csv")
rawdata$sum <- apply(rawdata[2:5], 1, sum)
rawdata["sum",] <- apply(rawdata, 2, sum)
## chi-square of each grade
e <- c(0.125, 0.375, 0.375, 0.125)
chi.each <- rep(NA, 6); df.each <- rep(3, 6)
for (i in 1:6) {
  chi.each[i] <- as.numeric(chisq.test(rawdata[i, 2:5], p=e)$statistic)
}
## chi-square of all grades pooled
chi.all <- as.numeric(chisq.test(rawdata["sum", 2:5], p=e)$statistic)
df.all <- 3
## testing homogeneity
chi.homo <- sum(chi.each) - chi.all
df.homo <- sum(df.each) - df.all
pchisq(chi.homo, df.homo, lower.tail=F)
## since p = 0.8362961, H0 is not rejected,
## so the data from different grades are merged
chisq.test(rawdata["sum", 2:5], p=e)

Following the previous example, if you test the homogeneity between grades without specifying the expected ratio, the calculation is mathematically equivalent to a test of independence.

## load data
rawdata <- read.csv("data.csv")
freq <- rawdata[1:6, 2:5]
## test
chisq.test(freq)

A Free resources for learning R

- Using R for psychological research, by William Revelle. http://www.personality-project.org/r/r.guide.html
- R Graph Gallery, by Romain François. http://addictedtor.free.fr/graphiques/
- R Wiki, by the R Wiki community. http://rwiki.sciviews.org/doku.php?id=start
- An Introduction to R, by Bill Venables and David M. Smith. http://cran.r-project.org/doc/manuals/R-intro.html
- R Data Import/Export, by Douglas Bates, Saikat DebRoy and Brian Ripley. http://cran.r-project.org/doc/manuals/R-data.html
- Resources to help you learn and use R, by UCLA Academic Technology Services. http://www.ats.ucla.edu/stat/R/
- Statistics with R, by Vincent Zoonekynd. http://zoonek2.free.fr/UNIX/48_R/all.html
- R: Statistical Computing and Programming Language, by Chien-Fu Jeff Lin; in traditional Chinese. http://web.ntpu.edu.tw/~cflin/Teach/R/Rproj.htm
- R, by Taiwan's National Applied Research Laboratories; in traditional Chinese. https://sites.google.com/site/rprojectnotes/ or http://statlab.nchc.org.tw/rnotes/
- R (videos), by Chen-Pan Liao; in traditional Chinese. http://apansharing.blogspot.tw/p/r-demo.html or http://www.youtube.com/playlist?list=PL5AC0ADBF65924EAD
