Statistics for Data Science - 2
Week 11 Notes
Hypothesis testing
1. Null hypothesis:
The null hypothesis is a statement about a population parameter that the experimental
data are used to test. It is denoted by H0 . It is the default hypothesis: it is assumed
to be true unless the data provide sufficient evidence against it.
2. Alternative hypothesis:
The alternative hypothesis is the competing statement about the population parameter.
It contradicts the null hypothesis and is denoted by HA or H1 .
3. Test statistic:
A test statistic is a numerical quantity computed from the values in a sample and used
in statistical hypothesis testing.
4. Type I error:
A type I error occurs when the null hypothesis is rejected even though it is true.
5. Type II error:
A type II error occurs when the null hypothesis is accepted (not rejected) even though
it is false (HA is true).
6. Significance level (Size):
Significance level (also called size) of a test, denoted α, is the probability of type I
error.
α = P (Type I error)
7. β = P (Type II error)
8. Power of a test:
Power = 1 − β
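The definitions of α, β and power can be checked numerically. The following is a minimal
sketch, not part of the original notes. It assumes i.i.d. Normal samples with known σ,
the test "reject H0 if the sample mean exceeds c", and a simple alternative µ = µ1. The
function name and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def right_tailed_error_rates(mu0, mu1, sigma, n, c):
    """Return (alpha, beta, power) for the test 'reject H0 if sample mean > c'.

    Assumptions (illustrative, not from the notes): samples are i.i.d.
    Normal with known sigma, H0: mu = mu0, and a simple alternative
    HA: mu = mu1 with mu1 > mu0.
    """
    se = sigma / n ** 0.5                      # standard deviation of the sample mean
    alpha = 1 - NormalDist(mu0, se).cdf(c)     # P(T > c | H0): type I error
    beta = NormalDist(mu1, se).cdf(c)          # P(T <= c | HA): type II error
    return alpha, beta, 1 - beta               # power = 1 - beta


# Hypothetical numbers: mu0 = 0, mu1 = 1, sigma = 2, n = 16, threshold c = 0.8
alpha, beta, power = right_tailed_error_rates(mu0=0.0, mu1=1.0, sigma=2.0, n=16, c=0.8)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Raising c in this sketch lowers α but raises β, which is the usual trade-off between the
two error types for a fixed sample size.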
9. Types of hypothesis:
(a) Simple hypothesis: A hypothesis that completely specifies the distribution of
the samples is called a simple hypothesis.
(b) Composite hypothesis: A hypothesis that does not completely specify the
distribution of the samples is called a composite hypothesis.
10. Standard testing method: z-test:
Consider a sample X1 , X2 , . . . , Xn ∼ i.i.d. X.
• Test statistic, denoted T , is some function of the samples. For example: the sample
mean X̄.
• Acceptance and rejection regions are specified through T .
(a) Right-tailed z-test:
• H0 : µ = µ0 , HA : µ > µ0
• Test: reject H0 if T > c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (T > c|H0 )
• Fix α and find c.
(b) Left-tailed z-test:
• H0 : µ = µ0 , HA : µ < µ0
• Test: reject H0 if T < c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (T < c|H0 )
• Fix α and find c.
(c) Two-tailed z-test:
• H0 : µ = µ0 , HA : µ ≠ µ0
• Test: reject H0 if |T | > c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (|T | > c|H0 )
• Fix α and find c.
Note: In the test for mean (σ² known), T = X̄, and when the null is true,
(X̄ − µ0 )/(σ/√n) ∼ Normal(0, 1).
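The "fix α and find c" step can be sketched in code. The example below is an
illustration under stated assumptions, not the notes' own implementation: it works on
the standardized scale Z = (X̄ − µ0 )/(σ/√n), which is Normal(0, 1) under H0 when σ is
known, so c is a standard Normal quantile. The function names and the sample numbers
are hypothetical.

```python
from statistics import NormalDist


def z_test_threshold(alpha, tail):
    """Critical value c on the standardized scale Z ~ Normal(0, 1) under H0."""
    z = NormalDist()  # standard Normal distribution
    if tail == "right":
        return z.inv_cdf(1 - alpha)        # alpha = P(Z > c | H0)
    if tail == "left":
        return z.inv_cdf(alpha)            # alpha = P(Z < c | H0)
    if tail == "two":
        return z.inv_cdf(1 - alpha / 2)    # alpha = P(|Z| > c | H0)
    raise ValueError("tail must be 'right', 'left' or 'two'")


def z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z_stat, c, reject) for a z-test of the mean with known sigma."""
    z_stat = (sample_mean - mu0) / (sigma / n ** 0.5)
    c = z_test_threshold(alpha, tail)
    if tail == "right":
        reject = z_stat > c
    elif tail == "left":
        reject = z_stat < c
    else:
        reject = abs(z_stat) > c
    return z_stat, c, reject


# Hypothetical data: sample mean 10.5 from n = 64 samples, sigma = 2, H0: mu = 10
z_stat, c, reject = z_test(10.5, mu0=10.0, sigma=2.0, n=64, alpha=0.05, tail="right")
print(f"z = {z_stat:.3f}, c = {c:.3f}, reject H0: {reject}")
```

Working with the standardized statistic keeps one threshold rule for all three tests.
The equivalent threshold on the sample mean itself is µ0 + c·σ/√n for the one-tailed
cases.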
11. P -value:
Suppose the test statistic takes the value T = t for an observed sample. The lowest
significance level α at which the null would be rejected for T = t is called the
P -value of the sample.
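For the z-test, the P -value can be computed directly from the standard Normal
distribution. The sketch below assumes the observed statistic has already been
standardized to z = (x̄ − µ0 )/(σ/√n) with σ known. The function name is hypothetical
and the input value is for illustration only.

```python
from statistics import NormalDist


def z_test_p_value(z_stat, tail):
    """P-value of an observed standardized statistic under H0 (Z ~ Normal(0, 1))."""
    z = NormalDist()  # standard Normal distribution
    if tail == "right":
        return 1 - z.cdf(z_stat)                 # P(Z > z_stat | H0)
    if tail == "left":
        return z.cdf(z_stat)                     # P(Z < z_stat | H0)
    if tail == "two":
        return 2 * (1 - z.cdf(abs(z_stat)))      # P(|Z| > |z_stat| | H0)
    raise ValueError("tail must be 'right', 'left' or 'two'")


# Hypothetical observed statistic z = 2.0
for tail in ("right", "two"):
    print(tail, round(z_test_p_value(2.0, tail), 4))
```

The null is rejected at significance level α exactly when the P -value is at most α,
so reporting the P -value lets a reader apply any α they choose.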