A BEGINNER'S
GUIDE TO
t-test and ANOVA
(Analysis of Variance)
in R Programming
Made by Habbee
Overview
T-test:
Independent t-test.
Paired t-test.
F-test:
One-way Analysis of Variance (ANOVA).
Two-way Analysis of Variance (ANOVA).
1. t-test:
Used to test the difference in means between two small samples (n < 30) drawn from
populations that are approximately normal.
(The two small samples are representative of their parent
populations.)
[Diagram: are Mean A and Mean B different?]
The test also assumes the two small samples are unrelated/independent:
one group's values do not depend on the other's.
1.1. Independent samples t-test
is applied when we want to test differences between the
means/averages of two completely independent groups (one does
not affect the other).
For instance, Ty goes on a three-mile run with his kids every
morning. He wants to test whether his son's running time (in minutes) is
significantly lower than his daughter's, meaning the boy can run
faster. To test the theory, he recorded their running times every day
for a week, as given in the following table:
First step: create the running-time records in RStudio.
We name the variables "Son" and "Daughter". Since R orders group labels
alphabetically, the daughter's data is processed before the son's, as the letter
D comes before S in the alphabet; thus our alternative hypothesis Ha
becomes μ (daughter) > μ (son), which is still equivalent to Ty's theory that
"his son's running time is significantly lower than his daughter's".
H0: μ (daughter) = μ (son)
Ha: μ (daughter) > μ (son)
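The slide's R code and data table are not reproduced here, so the values below are hypothetical, chosen only so the group means match the reported 22.29 and 18.14 minutes; the t-statistic and p-value will therefore differ from the slide's:

```r
# Hypothetical running times in minutes (one week, n = 7 each);
# chosen so the means match the slide: daughter 22.29, son 18.14
daughter <- c(21, 23, 22, 24, 21, 23, 22)
son      <- c(19, 17, 18, 20, 16, 19, 18)

# One-sided test of Ha: mu(daughter) > mu(son); with two vectors,
# the first argument plays the role of the "greater" group
res <- t.test(daughter, son, alternative = "greater")
res$statistic  # t-statistic
res$p.value    # one-sided p-value
```

Note the `alternative = "greater"` argument: it encodes the one-sided Ha above.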
From the result, the t-statistic is 2.0337 and the p-value is 0.03485, which is less
than 0.05 (using the 0.05 significance level); therefore, H0 is rejected. There is
sufficient evidence to support Ha that the daughter has a higher mean
running time than the son.
In addition, R also reports the mean running times for the daughter
(22.29 minutes) and the son (18.14 minutes); hence we can conclude that Ty's son
is faster on the three-mile route! Let's visualize it!
Last but not least, the sample sizes of the two groups are not always equal.
For example, what if Ty's daughter got busy one morning and could not
join the morning run with her brother and father that week? The sample
size for her running data would be 6 instead of 7!
If the group sizes differ greatly and the variances are also unequal (the
homogeneity-of-variance assumption is violated), the null hypothesis can be
falsely rejected (a Type I error: rejecting H0 when it is in fact true). This is
why R's t.test defaults to Welch's t-test, which does not assume equal variances.
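As a sketch of the unequal-sample-size case (again with hypothetical running times), note that `t.test` defaults to Welch's t-test, which does not assume equal variances; passing `var.equal = TRUE` gives the pooled Student's t-test instead:

```r
# Hypothetical week where the daughter missed one run: n = 6 vs n = 7
daughter <- c(21, 23, 22, 24, 21, 23)
son      <- c(19, 17, 18, 20, 16, 19, 18)

# Default: Welch's t-test (var.equal = FALSE), no equal-variance assumption
welch  <- t.test(daughter, son, alternative = "greater")
# Pooled Student's t-test: assumes equal variances
pooled <- t.test(daughter, son, alternative = "greater", var.equal = TRUE)

unname(pooled$parameter)  # pooled df: 6 + 7 - 2 = 11
unname(welch$parameter)   # Welch df: usually fractional, at most 11
```

Welch's version is the safer default when group sizes or spreads differ.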
1.2. Paired t-test:
is applied when we have two dependent (paired) samples from
just one population and want to see if they are significantly
different; useful for "before and after" situations.
[Diagram: is Mean A (before) different from Mean A (after)?]
For example, Ty wants to test the difference in means of his kids'
heart rates before and after the three-mile run.
Import the dataset into R for our paired t-test analysis.
H0: μ (before) = μ (after)
Ha: μ (before) ≠ μ (after)
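The original heart-rate dataset is not reproduced, so the bpm values below are hypothetical, chosen so the mean difference is near the reported 16.5 bpm while still spread enough that p > 0.05:

```r
# Hypothetical heart rates (bpm) for 7 paired measurements; the real
# dataset from the slide is not reproduced, so these are illustrative
before <- c(75, 80, 72, 78, 70, 74, 76)
after  <- before + c(1, 38, 4, 42, 2, 28, 1)  # noisy post-run increases

# Paired t-test: each "after" value is matched to its own "before"
res <- t.test(after, before, paired = TRUE)
res$p.value   # > 0.05 here, so we fail to reject H0
res$estimate  # mean of the differences (after - before)
```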
With p-value = 0.05772 (greater than 0.05), we fail to reject H0, as we
do not have sufficient evidence that Ty's kids' heart rates differ
significantly (statistically) before and after the three-mile run.
However, the result also shows that the mean of the differences is 16.5 bpm, and
if we visualize our paired t-test, we can see the mean bpm from “after” running is
higher than “before”. Our hearts tend to beat faster per minute after we exercise!
As a final point, the sample sizes of the two measurements in a paired t-test are
always identical (every subject is measured twice), unlike in the independent t-test.
2. F-test:
Analysis of Variance (ANOVA)
works like the t-test but compares more than two groups.
H0: μ (1) = μ (2) = μ (3)= … = μ (n)
Ha: at least two means are different.
[Diagram: are Means A, B, and C different from one another?]
Assumptions of ANOVA: each group of samples is normally
distributed, the groups have equal variances, and the groups are independent.
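A minimal sketch of checking these assumptions in R, on hypothetical data (`shapiro.test` for per-group normality, `bartlett.test` for equal variances):

```r
# Hypothetical data: three groups drawn from normal distributions
set.seed(1)
g1 <- rnorm(10, mean = 20, sd = 2)
g2 <- rnorm(10, mean = 25, sd = 2)
g3 <- rnorm(10, mean = 30, sd = 2)

values <- c(g1, g2, g3)
groups <- factor(rep(c("g1", "g2", "g3"), each = 10))

shapiro.test(g1)              # normality check, one group at a time
bartlett.test(values, groups) # homogeneity-of-variance check
```

Large p-values on these checks mean the assumptions look reasonable.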
2.1. One-way ANOVA
is used to analyze the difference between the means of more than
two groups.
Assume the dependent variable (DV) is how many miles a car
can travel per gallon of fuel (mpg), and the independent variable (IV)
is the car brand. Apply an analysis of variance to test whether
the means differ significantly between brands.
Let’s let R read our mpg data.
H0: μ (Toyota) = μ (Subaru) = μ (Lexus)
Ha: at least two means are different.
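The slide's mpg data are not reproduced here, so the readings below are hypothetical (loosely based on typical figures for these models); the resulting F and p will differ from the reported 77.17 and 2.1e-10:

```r
# Hypothetical mpg readings, five per brand (the slide's data not shown)
mpg_data <- data.frame(
  brand = factor(rep(c("Toyota", "Subaru", "Lexus"), each = 5)),
  mpg   = c(17, 18, 16, 17, 18,   # Toyota 4Runner
            29, 30, 31, 29, 30,   # Subaru Crosstrek
            22, 23, 21, 22, 23)   # Lexus RX350
)

model <- aov(mpg ~ brand, data = mpg_data)
summary(model)  # F value and Pr(>F) for the brand effect
```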
With an F-statistic of 77.17 and a p-value less than 0.05 (2.1e-10), we reject the
null hypothesis: there is enough evidence to claim that at least two means are
different.
.... but you may ask: which means are different? The "TukeyHSD(model)" syntax
helps us clarify that. Since the "p adj" values for each pair of brands are < 0.05,
we can state that there is a significant difference in average mpg between
Subaru and Lexus, Toyota and Lexus, and Toyota and Subaru, with the Toyota 4Runner
and the Subaru differing the most ("diff" = 12.0).
Last but not least, if a confidence interval does not contain 0, there is
a significant difference between the two group averages.
For example, the lower bound (lwr) and upper bound (upr) of the Subaru-Lexus
confidence interval are (5.0402, 9.9598), which does not include 0.
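A self-contained sketch of the Tukey step with hypothetical mpg values (the slide's data are not shown, so the diff, lwr, upr, and p adj numbers here will differ from those reported):

```r
# Self-contained sketch: hypothetical mpg values, five readings per brand
mpg_data <- data.frame(
  brand = factor(rep(c("Toyota", "Subaru", "Lexus"), each = 5)),
  mpg   = c(17, 18, 16, 17, 18, 29, 30, 31, 29, 30, 22, 23, 21, 22, 23)
)
model <- aov(mpg ~ brand, data = mpg_data)

# One row per brand pair: diff, lwr/upr CI bounds, and adjusted p-value
tk <- TukeyHSD(model)
tk$brand
plot(tk)  # intervals that cross 0 indicate no significant difference
```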
2.2. Two-way ANOVA:
is applied when we want to analyze how two independent
variables (IVs), in combination, affect a dependent variable (DV),
and in particular whether there is an interaction between the
two IVs' effects on the DV.
For instance, we want to know if the cars' mpg values mentioned
above differ when driven on the highway versus in the city.
The IVs now are car brand (Toyota, Subaru, and Lexus) and
where the car is driven (in the city or on the highway), with
mpg as our DV.
Here is how to create a two-way ANOVA data frame in R.
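The slide's data-frame code is not shown, so here is a hedged sketch with hypothetical city/highway mpg values; `brand * where` fits both main effects plus their interaction:

```r
# Hypothetical mpg values: 3 brands x 2 locations x 5 readings each
mpg_data <- data.frame(
  brand = factor(rep(c("Toyota", "Subaru", "Lexus"), each = 10)),
  where = factor(rep(rep(c("city", "highway"), each = 5), times = 3)),
  mpg   = c(16, 17, 16, 17, 16,  19, 20, 19, 20, 19,   # Toyota
            27, 28, 27, 28, 27,  31, 32, 31, 32, 31,   # Subaru
            20, 21, 20, 21, 20,  24, 25, 24, 25, 24)   # Lexus
)

# brand * where expands to brand + where + brand:where (the interaction)
model <- aov(mpg ~ brand * where, data = mpg_data)
summary(model)  # one row per main effect plus one for the interaction
```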
We now have three different hypotheses to test, the first being:
H0: μ (Toyota) = μ (Subaru) = μ (Lexus)
Ha: at least two means are different.
The p-value for "brand" is 4.12e-10, so we can claim a significant difference in
mpg between at least two of the three brands (Toyota 4Runner, Subaru Crosstrek,
and Lexus RX350).
Next, our second hypothesis is:
H0: μ (city) = μ (highway)
Ha: μ (city) ≠ μ (highway)
Similar to our variable "brand", "where" we drive our cars is another factor
that has a significant effect on mean miles per gallon, because its
p-value is less than 0.05 (2.28e-07).
In fact, the majority of cars on the market get higher mpg on the highway
than in the city.
Last but not least, our last hypothesis is:
H0: there is no interaction between what brand of car you drive and
where you drive it.
Ha: there is an interaction between what brand of car you drive and
where you drive it.
Our test statistic is 0.0304 and the p-value is 0.743. We fail to reject the
null hypothesis: there is not enough evidence to support the claim that
there is an interaction between car brand and where you drive.
Furthermore, the Tukey test helps us pinpoint where the differences lie,
i.e. which specific group means differ; it compares every possible pair
of means.
A ggplot graph of the group means also helps us understand the results better!
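The slide's ggplot code is not included, so as a stand-in here is base R's `interaction.plot` on hypothetical mpg values; roughly parallel lines are what "no interaction" looks like:

```r
# Hypothetical mpg values (same layout: 3 brands x 2 locations)
mpg_data <- data.frame(
  brand = factor(rep(c("Toyota", "Subaru", "Lexus"), each = 10)),
  where = factor(rep(rep(c("city", "highway"), each = 5), times = 3)),
  mpg   = c(16, 17, 16, 17, 16, 19, 20, 19, 20, 19,
            27, 28, 27, 28, 27, 31, 32, 31, 32, 31,
            20, 21, 20, 21, 20, 24, 25, 24, 25, 24)
)

# One line per brand; roughly parallel lines suggest no interaction
interaction.plot(mpg_data$where, mpg_data$brand, mpg_data$mpg,
                 xlab = "where driven", ylab = "mean mpg",
                 trace.label = "brand")
```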
Key Takeaways:
Independent t-test: if samples are from two populations.
Paired t-test: if samples are from one population, useful in the
“before-after” scenario.
One-way ANOVA: compare means for more than two groups.
Two-way ANOVA: compare means for each factor and test if
there is an interaction between factors for more than two
groups.