S4251512 Biostatistics for 2nd year Pharmacy students S.K.
Timmer
Important formulas Biostatistics WBFA011-05 2021-2022
Day 1: Introduction (chapter 1) & Descriptive Statistics (Chapter 2)
Arithmetic Mean
Geometric Mean
Dispersion width (range): difference between largest & smallest observation
Sum of Squares for a population
Mean Sum of Squares = POPULATION VARIANCE
Variance (s²)
Day 2: Probability & Probability Distributions (Uniform, Binomial, Poisson) Ch. 3/4
Addition rule: Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
Pr(A ∪ B) = Probability that A occurs OR B occurs (or both)
Product rule: Pr(A ∩ B) = Pr(A) · Pr(B|A) = Pr(B) · Pr(A|B)
Pr(A ∩ B) = Pr(A) · Pr(B) → when A and B are mutually independent (then Pr(B|A) = Pr(B) and Pr(A|B) = Pr(A))
When NOT mutually independent → Conditional Probability:
Conditional probability Pr(A|B) is the probability of A if B occurs
Conditional probability Pr(B|A) is the probability of B if A occurs
Pr(A|B) = Pr(A ∩ B) / Pr(B)    Pr(B|A) = Pr(A ∩ B) / Pr(A)
Diagnostic Test
Pr(positive | diseased) = a/(a+b) = sensitivity of the test
Pr(negative | healthy) = d/(c+d) = specificity of the test
Pr(diseased | positive) = a/(a+c) = predictive value given a positive test
Pr(healthy | negative) = d/(b+d) = predictive value given a negative test
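The four test characteristics above follow directly from the 2×2 counts. A minimal Python sketch (function name and example counts are mine; the table convention assumed is a = diseased & test-positive, b = diseased & test-negative, c = healthy & test-positive, d = healthy & test-negative):

```python
def diagnostic_metrics(a, b, c, d):
    """Diagnostic-test quantities from a 2x2 table (convention as above)."""
    return {
        "sensitivity": a / (a + b),   # Pr(positive | diseased)
        "specificity": d / (c + d),   # Pr(negative | healthy)
        "ppv": a / (a + c),           # Pr(diseased | positive) = predictive value, positive test
        "npv": d / (b + d),           # Pr(healthy | negative) = predictive value, negative test
    }

# Illustrative counts: 100 diseased, 900 healthy subjects
m = diagnostic_metrics(a=90, b=10, c=30, d=870)
```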
Uniform: μ = Σ(pi · xi), σ² = Σ pi (xi − μ)² = Σ (xi − μ)²/n = Σ pixi² − μ², pi = Pr(X=xi) = Pr(x) = 1/n
Binomial: μ = Σ(pi · xi) = n · π, σ² = Σ pi (xi − μ)² = n · π · (1 − π), Pr(X=x) = [n!/((n−x)!x!)] · π^x · (1−π)^(n−x)
π < 0.5: distribution with top to the left; π > 0.5: distribution with top to the right; π = 0.5: distribution is symmetric; n >> 30 (rather n · π >> 15): distribution approximates the normal distribution
Poisson: σ² = μ (variance); σ = √μ (SD); Pr(X=x) = e^(−μ) · μ^x / x!
→ Probability that exactly x events occur per time or space unit; the mean need not be an integer. If μ > 15, the distribution approximates the normal distribution.
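The probability functions above can be sketched in Python with the standard formulas (function names are mine):

```python
import math

def binomial_pmf(x, n, pi):
    """Pr(X = x) for Binomial(n, pi): n!/((n-x)! x!) * pi^x * (1-pi)^(n-x)."""
    return math.comb(n, x) * pi**x * (1 - pi)**(n - x)

def poisson_pmf(x, mu):
    """Pr(X = x) for Poisson(mu): exactly x events per time/space unit."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Mean and variance per the formulas above (illustrative n and pi):
n, pi = 10, 0.3
mean_binom = n * pi               # mu = n * pi
var_binom = n * pi * (1 - pi)     # sigma^2 = n * pi * (1 - pi)
```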
Day 3: Normal Distribution (4.4); Estimations & Tests (chapter 5)
Probability Density Function of a Standard Normal Distribution
Probability of Outcomes → Surface under pdf
Linear Transformation: X ~ N(μ, σ2 )
Y = aX + b → Y ~ N(aμ + b, a2σ2) Y = aX → Y ~ N(aμ, a2σ2)
Linear combination of X1 & X2: X1 ~ N(μ1, σ1²); X2 ~ N(μ2, σ2²); if INDEPENDENT:
Y = aX1 + bX2 → Y ~ N(aμ1 + bμ2 , a²σ1² + b²σ2²)    Y = aX1 − bX2 → Y ~ N(aμ1 − bμ2 , a²σ1² + b²σ2²)
General form: μY = a1μ1 + a2μ2 + … + ajμj + … + akμk
σy2 = a12σ12 + a22σ22 + ... + aj2σj2 + ... + ak2σk2
Z = distance from mean expressed in SD → Z = (X − μ)/σ for X ~ N(μ, σ²), i.e. a = 1/σ; b = −μ/σ
TABLE 2A for one-sided; SYMMETRICAL & LIM → 2-sided
Continuity correction: μ >> 15, Binomial → Normal. E.g. Pr(X > 30) = Pr(X ≥ 31) → take Pr(X > 30.5)
Central Limit Theorem: If X ~ N(μ, σ²), then X̄ ~ N(μ, σ²/n);
if n is large, X̄ ~ N(μ, σ²/n) holds independently of the distribution type of X:
• for a symmetric distribution n ≥ 10 • for an asymmetric distribution n ≥ 30 (how large n has to be)
Variance of SAMPLE mean (X̄) = Var (X̄)
SD of the sample mean = Standard Error
Standard Error of difference 2 samples IF X1 and X2 are INDEPENDENT
Standard Error Proportion
Standard Error, difference of 2 samples proportion p1 & p2
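The standard errors named above follow the usual textbook formulas; a sketch in Python (helper names are mine):

```python
import math

def se_mean(sigma, n):
    """SD of the sample mean (standard error): sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_diff_means(s1, n1, s2, n2):
    """SE of the difference of two INDEPENDENT sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def se_prop(p, n):
    """SE of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_props(p1, n1, p2, n2):
    """SE of the difference of two independent sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
```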
Day 4: Reliability of the Mean; Two-sided, One-sided, Decision Error
Central Limit Theorem: If 𝑋 ~ 𝑁(𝜇, 𝜎²) then X̄ ~ 𝑁(𝜇, 𝜎²/𝑛)
When 𝑛 is large, X̄ ~ 𝑁(𝜇, 𝜎²/𝑛) still holds, independent of the type of distribution in the population
If x̄ is close to 𝜇, then 𝜇 is close to x̄!!!
Z-transformation: Normal Distr. → Standard Normal Distribution
α = probability that μ is not in the confidence interval (exceedance probability)
Confidence Interval = quantification of expected deviation between estimate and population mean
𝐻0: 𝜇 = some hypothesized value in the population (𝜇0) 𝐻A: a negation of 𝐻0, i.e. 𝜇 ≠ 𝜇0
If P > α: do NOT reject H0    If P ≤ α: reject H0 & accept HA
Influence of n: if n increases, SE gets smaller, Z becomes bigger, P and the CI become smaller → PRECISION increases
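The two-sided test and CI logic above, sketched in Python (σ assumed known; names and the example numbers are illustrative; z = 1.96 corresponds to α = 0.05):

```python
import math

def z_test_ci(xbar, mu0, sigma, n, z_crit=1.96):
    """Two-sided z-test of H0: mu = mu0 (sigma known) plus the matching CI."""
    se = sigma / math.sqrt(n)            # standard error of the mean
    z = (xbar - mu0) / se                # Z-transformation
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    reject_h0 = abs(z) > z_crit          # equivalent to P < alpha
    return z, ci, reject_h0

z, ci, reject = z_test_ci(xbar=103, mu0=100, sigma=10, n=64)
# se = 1.25, z = 2.4 -> |z| > 1.96, so H0 rejected
```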
One-sided test: Min/Max; only a deviation on 1 side of the hypothesized μ is taken as evidence against H0 (1 side relevant)
𝐻A: 𝜇 < 𝜇0 → if Z in the rejection area, REJECT H0 → 95% CI: X̄ + 𝑍𝛼 · 𝜎/√n (upper bound)
𝐻A: 𝜇 > 𝜇0 → if Z in the rejection area, REJECT H0 → 95% CI: X̄ − 𝑍𝛼 · 𝜎/√n (lower bound)
Sample size (n) → 𝑛 = (𝜎/𝛥)² ⋅ (𝑧𝛼 + 𝑧𝛽)²
Decision error: Type I (α) – 𝐻0 is rejected, while 𝐻0 is true (false positive)
Type II (𝛽) – 𝐻0 is not rejected, while 𝐻0 is not true (false negative)
Decision Power ➔ 1 – β
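The sample-size formula above as a sketch (the example z-values 1.96 and 0.84 are the usual two-sided α = 0.05 and 80%-power choices, not values from the notes):

```python
import math

def sample_size(sigma, delta, z_alpha, z_beta):
    """n = (sigma/Delta)^2 * (z_alpha + z_beta)^2, rounded up to a whole subject.
    delta = smallest difference worth detecting."""
    n = (sigma / delta) ** 2 * (z_alpha + z_beta) ** 2
    return math.ceil(n)

# Illustrative: sigma = 10, delta = 5, alpha = 0.05 two-sided, power 80%
n = sample_size(sigma=10, delta=5, z_alpha=1.96, z_beta=0.84)
```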
Day 5: Goodness-of-Fit, Chi-square tests (5.4); Comparing Groups of data (6.1-6.3)
χ² (Chi-squared) test of Goodness-of-fit (does the variable come from a specified distribution or not?)
X² = Σ (Oj − Ej)²/Ej; Oj = observed; Ej = expected; df = k − 1
If the obtained X²-value < table value (TABLE 5): P > 0.05 → H0 NOT rejected (e.g. no significant difference)
Conditions: all expected frequencies > 1; 80% of expected frequencies > 5
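The goodness-of-fit statistic above, sketched in Python with a fair-die example (function name and data are mine; the result is compared against TABLE 5 by hand):

```python
def chi2_gof(observed, expected):
    """Chi-squared goodness-of-fit: X^2 = sum (O - E)^2 / E, df = k - 1."""
    assert len(observed) == len(expected)
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return x2, df

# A fair die rolled 60 times: each face expected 10 times.
x2, df = chi2_gof([8, 9, 12, 11, 10, 10], [10] * 6)
```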
Student’s t-distribution (describes the standardized distances of sample means to the population mean when the population standard deviation is not known) → use the sample SD (s) instead of σ
X̄ ~ N(μ; σ²/n) → Z-transformation, but σ unknown → 𝑇 = (𝑋̄ − 𝜇)/(𝑠/√𝑛) (within 1−α: between −t_df;½α & +t_df;½α)
USE TABLE 3
Paired data: T = (d̄ − Δ) / (s_d/√n); T ~ t_df, df = n − 1 → d̄ ~ N(Δ, σ_d²/n); d = before − after
H0 → Δ = μ1 − μ2 = 0, then CI
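The paired-t statistic above, sketched in Python (function name and example data are mine; d = before − after as in the notes):

```python
import math

def paired_t(before, after, delta0=0.0):
    """Paired t: T = (dbar - Delta) / (s_d / sqrt(n)), df = n - 1."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    dbar = sum(d) / n
    s_d = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))  # sample SD of d
    t = (dbar - delta0) / (s_d / math.sqrt(n))
    return t, n - 1

t, df = paired_t(before=[5, 6, 7, 9], after=[4, 5, 6, 6])
```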
Two-sample t-test (2 mutually independent samples)
H0: μ1 − μ2 = 0    HA: μ1 − μ2 ≠ 0
Do the variances differ? → F-test: df1 = n1 − 1, df2 = n2 − 1 (t-test df = n1 + n2 − 2)
H0: F = s1²/s2² = 1    HA: F = s1²/s2² ≠ 1    If F_df1;df2;α (TABLE 4) > F → P > 0.05; H0 NOT rejected; variances pooled
(assumption check); if the variances are pooled, fill in:
When |T| < T_TABLE → P > 0.05, thus H0 NOT rejected    H0 → μ1 − μ2 = 0    HA → μ1 − μ2 ≠ 0
Wilcoxon matched-pairs test (used to test differences in the mean of 2 paired observations) Paired data, Non-Parametric
If the mean of the differences (d̄) is NOT normally distributed, use this non-parametric test
Step 1: Determine Di (difference) between each pair of observations
Step 2: Rank the differences by absolute value → give rank 1 to the smallest difference
(Di = 0: do NOT include); tied ranks → give each the mean of the ranks
Step 3: Reassign the Di to their ranks (Ri)
Step 4: Calculate S+ = ΣRi (where Di was +) & S− = ΣRi (where Di was −)
Step 5: 2-tailed → reject H0 if S+ or S− is less than or equal to the critical value in the table for n!!
1-tailed → Median popul. 1 > popul. 2 → reject H0 IF S− ≤ S_table
Median popul. 1 < popul. 2 → reject H0 IF S+ ≤ S_table
Wilcoxon-matched pair test for large n → n>25; μ = n(n+1)/4 σ2 = (n(n+1) (2n+1))/24
TABLE 6
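Steps 1–4 above can be sketched as follows (function name is mine; zero differences are dropped and tied |d| get the mean rank, as the notes prescribe; S+ or S− is then compared against TABLE 6 by hand):

```python
def wilcoxon_signed_rank(before, after):
    """Return (S+, S-) for the Wilcoxon matched-pairs test."""
    d = [b - a for b, a in zip(before, after) if b != a]   # drop Di = 0
    order = sorted(d, key=abs)
    ranks = {}
    i = 0
    while i < len(order):                                  # mean rank for tied |d|
        j = i
        while j < len(order) and abs(order[j]) == abs(order[i]):
            j += 1
        mean_rank = (i + 1 + j) / 2                        # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks.setdefault(abs(order[k]), mean_rank)
        i = j
    s_plus = sum(ranks[abs(x)] for x in d if x > 0)
    s_minus = sum(ranks[abs(x)] for x in d if x < 0)
    return s_plus, s_minus

s_plus, s_minus = wilcoxon_signed_rank([1, 0, 3, 0, 5], [0, 2, 0, 4, 0])
```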
Mann-Whitney test → data of 2 INDEPENDENT groups
H0: μ̃1 − μ̃2 = 0 and HA: μ̃1 − μ̃2 ≠ 0
Step 1: Sort all data
Step 2: Give each value a rank number
Step 3: Sum the rank numbers per sample; T = sum of rank numbers of the sample with the smallest n
Step 4: T' = n1(n1 + n2 + 1) − T
Step 5: TABLE 7: find the P-value with the larger of T and T'; if T' < T_TABLE → P > 0.10 → H0 NOT rejected
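Steps 1–4 above as a sketch (function name is mine; tied values get the mean rank; the resulting T and T' are looked up in TABLE 7 by hand):

```python
def mann_whitney_T(sample1, sample2):
    """Rank-sum T for the smaller sample and T' = n1*(n1 + n2 + 1) - T."""
    small, large = (sample1, sample2) if len(sample1) <= len(sample2) else (sample2, sample1)
    pooled = sorted(small + large)
    rank_of = {}                       # mean rank per value handles ties
    for v in set(pooled):
        idxs = [i + 1 for i, x in enumerate(pooled) if x == v]
        rank_of[v] = sum(idxs) / len(idxs)
    T = sum(rank_of[v] for v in small)
    n1, n2 = len(small), len(large)
    return T, n1 * (n1 + n2 + 1) - T

T, T_prime = mann_whitney_T([1, 2, 3], [4, 5, 6, 7])
```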
Day 6: One-way ANOVA (Chapter 6.4)
M1; M2; M3;…; Mn → Average per column MT → overall average
H0 → μ1 = μ2 = μ3 HA → at least 1 is different
Measurements Xij independent! Variances are equal! Mi ~ N(μi, σ²/ni)
F-ratio under H0: s²between / s²within (this is tested in a 1-way ANOVA)
Step 1: Calculate the Sums of Squares (SSTOTAL = SSBETWEEN + SSWITHIN)
Step 2: Calculate the Mean Squares (MS); √MS(within) = s(res)
Σ(X²) → all values squared and added up; Σ ni(Mi²) → sum of (number of measurements in column × squared column average)
If 𝑝 > 0.05: null hypothesis 𝐻0 (means are not different) is not rejected
If 𝑝 ≤ 0.05: null hypothesis 𝐻0 is rejected & 𝐻𝐴 is accepted: at least one mean is different
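The SS decomposition and F-ratio above, sketched in Python (function name and data are mine; the F-ratio is compared against the F-table by hand):

```python
def one_way_anova(groups):
    """SS_between, SS_within and F-ratio for a one-way ANOVA.
    groups: list of lists of measurements (the columns)."""
    all_data = [x for g in groups for x in g]
    N, k = len(all_data), len(groups)
    grand = sum(all_data) / N                                  # overall average M_T
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)                          # df_between = k - 1
    ms_within = ss_within / (N - k)                            # df_within = N - k
    return ss_between, ss_within, ms_between / ms_within

ssb, ssw, F = one_way_anova([[1, 2, 3], [2, 3, 4]])
```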
CI for a sample mean    CI for the Δ between 2 means
Scheffé Confidence-Intervals: (applies to set of estimates of ALL possible contrast among factor level means)
Critical value → √((k−1) · F_k−1;N−k;α)
Tukey-Kramer-Test:
(q-distribution; if k INDEPENDENT samples obtained from population with N-distribution of the same mean and variance; used to find if
means are significantly different from each other)
For Critical-q-value (qalpha; df; k) → TABLE 8
If the qCALC > critical q-value for k and df = N-k, then reject 𝐻0 and accept 𝐻A !!
Confidence-Interval Tukey-Kramer:
Honest-Significant Difference (HSD):
Kruskal-Wallis-Test: (same H0 and HA; at least 5 observations per group; Non-Parametric alternative for ANOVA; Wilcoxon XL)
If average ranks Ṝi between groups are similar, 𝐻0 accepted.
If Ṝi between groups are very different, we reject 𝐻0!
TABLE 5: χ²
IF H > χ²_df;α → REJECT H0 !!
Day 7: Groups of Categorical Data (Chapter 7-1 t/m 7-4); Outliers (Chapter 7-5)
Binomial data, 1 sample:
Population: Pr(x) = [n!/((n−x)!x!)] · π^x (1 − π)^(n−x);    μ = nπ;    π = μ/n;    σ² = nπ(1 − π)
Sample: P = x/n = (1/n) · x;    VAR(P) = (1/n)² · n · π(1 − π) = π(1 − π)/n;    se(P) = √(p(1−p)/n)
From Binomial to N-distribution: n large, π ≈ 0.5 and μ >> 15
p ~ N(π; π(1−π)/n);    z = (p − π)/√(p(1−p)/n) ~ N(0; 1)
Confidence interval → π = P ± [Z_½α · √(p(1−p)/n) + 1/(2n)]
Binomial distribution – 2 independent groups
Test: Z = [(P1 − P2) − (π1 − π2)] / √(P1(1−P1)/n1 + P2(1−P2)/n2) ~ N(0, 1);    for P see TABLE 2B
With continuity correction: z_c = [|(P1 − P2) − (π1 − π2)| − ½(1/n1 + 1/n2)] / √(P1(1−P1)/n1 + P2(1−P2)/n2)
Confidence interval: π1 − π2 = p1 − p2 ± [½(1/n1 + 1/n2) + Z_½α · √(p1(1−p1)/n1 + p2(1−p2)/n2)]
p1 ~ N(π1, Var(p1));    p2 ~ N(π2, Var(p2));    p1 − p2 ~ N(π1 − π2, Var(p1) + Var(p2))
Pay attention to the continuity correction!!
Poisson data, 1 sample:
X ~ N(μ; μ);    Z = (|X − μ| − 0.5)/√μ ~ N(0; 1);    CI: μ = X ± [½ + Z_½α · √x] (use the observed/measured value x)
Approximation from Poisson to N-distribution: μ >> 15!
Poisson distribution – 2 independent samples
Test: H0: μ1 = μ2    HA: μ1 ≠ μ2    Z_c = (|X1 − X2| − 1)/√(X1 + X2);    for P see TABLE 2B (e.g. P < 0.001 → H0 rejected)
McNemar’s Test: proportions – 2 matched groups (two paired Binomial distributions)
e.g. 2 painkillers in the same person → 4 groups: ++(a), +−(b), −+(c), −−(d)
p1(Analg A +) = (a+b)/n    p2(Analg B +) = (a+c)/n    p1 − p2 = (a+b)/n − (a+c)/n = (b−c)/n
p1 − p2 ~ N(π1 − π2; (1/n²) · (b + c − (b−c)²/n))
z = [(P1 − P2) − (π1 − π2)] / [(1/n) · √(b + c − (b−c)²/n)] ~ N(0, 1)
Confidence interval: π1 − π2 = P1 − P2 ± [1/n + Z_½α · (1/n) · √(b + c − (b−c)²/n)]
Test statistic under H0: π1 − π2 = 0, with continuity correction: z_c = (|b − c| − 1)/√(b + c)
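The continuity-corrected McNemar statistic as a one-liner (function name and counts are mine; b and c are the discordant-pair counts):

```python
import math

def mcnemar_zc(b, c):
    """McNemar test statistic with continuity correction:
    z_c = (|b - c| - 1) / sqrt(b + c)."""
    return (abs(b - c) - 1) / math.sqrt(b + c)

zc = mcnemar_zc(b=20, c=8)   # compare |z_c| against 1.96 for alpha = 0.05
```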
Grubbs’ Test (outliers): FOR 1 SUSPECT VALUE; ONLY CONTINUOUS; X ~ N(μ, σ²)
G = |suspect value − x̄| / s
x̄ → mean; s → SD; G_LIM for α = 0.05: TABLE 9
If G > G_LIM (TABLE 9) → H0 rejected, so the suspect value IS an outlier
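The Grubbs statistic sketched in Python (function name and data are mine; G is compared against the TABLE 9 critical value by hand):

```python
import math

def grubbs_G(data, suspect):
    """G = |suspect - mean| / s, with s the sample SD (n - 1 denominator)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return abs(suspect - mean) / s

G = grubbs_G([1, 2, 3, 4, 10], suspect=10)
```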
Dixon’s test (outliers): FOR 1 SUSPECT VALUE; ONLY CONTINUOUS; X ~ N(μ, σ²)
Sort the measurement values from low to high and index them accordingly: for a sample of size n, x1 is the lowest and xn the highest measurement
Q_LIM value for α = 0.05 and given n from TABLE 10
If Q > Q_LIM (TABLE 10) → H0 REJECTED, so the suspect value IS an outlier
Day 9: Categorical data; Tests of Independence (Chapter 7-6)
R x C Frequency Table: cross table that matches the values of 2 VARIABLES and counts the number of occasions that pairs occur
Oi j = observed freq/nr in cell i j
Oi. = Σⱼ Oij (sum over the c columns)    O.j = Σᵢ Oij (sum over the r rows) → margins of the crosstab are sums
Mutually Independent: Product rule: Pr(𝐴 ∩ 𝐵) = Pr(𝐴) · Pr(𝐵); Pr(𝐴 ∩ 𝐵) = Pr(𝐴) · Pr(𝐵|𝐴) → Pr(𝐵|𝐴) = Pr(𝐵)
Dependent: Pr(𝐵|𝐴) = Pr(𝐴 ∩ 𝐵) / Pr(𝐴) Pr(𝐴|𝐵) = Pr(𝐴 ∩ 𝐵) / Pr(𝐵)
Pr(𝐴 ∩ 𝐵) = Pr(𝐴) · Pr(𝐵|𝐴) = Pr(𝐵) · Pr(𝐴|𝐵)
H0: Variables I and II are INDEPENDENT 𝜋𝑖𝑗 = 𝜋𝑖. · 𝜋.𝑗 [Product rule]
HA: Variables I and II are DEPENDENT At least one cell 𝜋𝑖𝑗 ≠ 𝜋𝑖. · 𝜋.j
𝜋𝑖. = true probability of being in row I 𝜋.𝑗 = true probability of being in column j
Under H0 the expected cell frequency is: Eij = (Oi. · O.j)/N (put these values in a new table)
Test statistic: X² = Σ (Oij − Eij)²/Eij; df = (r − 1)(c − 1); TABLE 5
If X²_CALC > χ²_df;α then P < 0.05 (α); REJECT H0 & accept HA → variables DEPENDENT
Conditions for Pearson-Chi-squared test: All expected freq. Eij > 1; 80% of Eij > 5
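The expected-frequency and chi-squared computation above, sketched for an r x c table (function name and data are mine; conditions and the TABLE 5 lookup are checked by hand):

```python
def chi2_independence(table):
    """Pearson chi-squared test of independence on an r x c table.
    E_ij = (row total * column total) / N; df = (r-1)(c-1)."""
    r, c = len(table), len(table[0])
    row = [sum(t) for t in table]
    col = [sum(table[i][j] for i in range(r)) for j in range(c)]
    N = sum(row)
    x2 = sum((table[i][j] - row[i] * col[j] / N) ** 2 / (row[i] * col[j] / N)
             for i in range(r) for j in range(c))
    return x2, (r - 1) * (c - 1)

x2, df = chi2_independence([[10, 20], [20, 10]])
```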
2 x 2 Crosstabs: special case of r x c crosstabs → r=2 c=2 → df = 1
Df → nr of values in calc. of test statistics that are free to vary
Yates Correction: approximates χ² better when 𝑁 is small
Conditions → at least 3 cells with Eij > 5 and all Eij ≥ 1; replace (O − E) by |O − E| − ½
TABLE 5
Fisher’s Exact Test: if fewer than 3 cells have Eij ≥ 5
H0: Var I/II independent; HA: Var I/II dependent P < α → REJECT H0
2 x C crosstabs: column variable = ordinal variable with categories i = 1, …, c
REWRITTEN AS: X² = [Σᵢ (ri²/ni) − R²/N] / [p(1 − p)]
ni = number of objects in category i
ri = number of objects from ni with the characteristic of interest
R = sum of ri; N = sum of ni and p = R/N
df = (r − 1)(c − 1);    if X² > χ²_df;α → H0 REJECTED
Day 10: Relation between 2 Continuous Variables; Correlation Calculation;
Calculation of the Best Fitted Straight Line (chapter 8-1; 8-2)
Covariance → S_XY = Σ(X − X̄)(Y − Ȳ) = ΣXY − nX̄Ȳ
Confidence interval of the correlation coefficient → Fisher Z-transformation
CI of ρ (population value of r); n > 10!!
Correlation-Coefficient-test:
Look up value in T-table (TABLE 3); P < 0,05 → H0 Rejected
Spearman Rank-Order Correlation:
H0: ρ = 0    HA: ρ ≠ 0    Condition: n > 30
Calculation of best fitted straight line:
Y = a + b·X;    Σ(Y − Y_LINE)² minimal → Σ(Y − a − b·X)² minimal
Derivation of this formula:
Regression line for Correlation Calculation:
Line Y from X:
Line X from Y:
With →
Day 11: Regression Analysis (Chapter 8-3)
X → INDEPENDENT VARIABLE Y → DEPENDENT VARIABLE
Conditions: Y-values at each X-value N-distributed: 𝑌 ~ 𝑁(µ𝑦, 𝜎²); measurement error only in the Y-direction;
constant variance (HOMOSCEDASTICITY); Y-values mutually independent; relation between X & Y linear: 𝑦 = 𝛼 + 𝛽𝑥 + 𝑒
Homoscedasticity → equal variances; Heteroscedasticity → unequal variances
Calculation Regression Line:
MAKE TABLE WITH: X Y X2 Y2 XY
SUM:
𝑋̅ = ΣX/n    𝑌̅ = ΣY/n    𝒔𝒙𝒙 = (ΣX²) − n𝑋̅²    𝒔𝒚𝒚 = (ΣY²) − n𝑌̅²    𝒔𝒙𝒚 = (ΣXY) − n·𝑋̅·𝑌̅
b = 𝒔𝒙𝒚/𝒔𝒙𝒙    a = 𝑌̅ − b·𝑋̅
𝒀̂ = 𝒂 + 𝒃𝑿 (Y-hat means prediction)    𝑒 = 𝑌 − 𝑌̂ → residual (distance between observation and regression line)
𝑌̂ is the estimate for µ𝑌; 𝑎 is the estimate for 𝛼; 𝑏 is the estimate for 𝛽; 𝑠² is the estimate for 𝜎²
CI: 1-α% for β: Hypothesis test for β:
CI: 1-α% for α: Hypo-test for α:
(If α = 0 is in the CI, the deviation from the origin may be due to chance (sampling fluctuation))
CI of μY for value X0:
Where do 95% of (new) data points Y fall?; → Prediction Interval (PI) of Y0 for X0
Test of no regression (ANOVA) → SStot = SSregr + SSres → Σ(𝑌 − 𝑌̅)² = Σ(𝑌̂ − 𝑌̅)² + Σ(𝑌 − 𝑌̂)²
Determination coefficient → 1 − 𝑟² = SSresidual/SStotal
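The least-squares fit and the determination coefficient above, sketched in Python (function name and data are mine; b = s_xy/s_xx and a = Ȳ − bX̄ as in the standard derivation):

```python
def fit_line(xs, ys):
    """Least-squares line Y-hat = a + bX, plus r^2 = 1 - SSres/SStot."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    s_xx = sum(x * x for x in xs) - n * xbar ** 2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    b = s_xy / s_xx
    a = ybar - b * xbar
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

a, b, r2 = fit_line([1, 2, 3], [2, 4, 6])   # perfectly linear data
```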
Lack-Of-Fit Test (LOF); back-reading X0 from the calibration line with response Y0
Day 12: Two-Way Analysis-Of-Variance (2-Way ANOVA)
2-independent samples → T-test; >2-independent samples → 1-way ANOVA
2 matched samples → paired T-test; >2-matched samples → 2-way ANOVA
Two-way ANOVA
N = rows (r) * columns (c)
H0 (for both A and B): means are same; HA: at least 1 not
Confidence Interval according to Scheffé
(Difference between Means in the columns) (Difference in means between the rows)
Two-Way ANOVA with interaction (At least 2 measurement values per cell (n))
N=c*r*n
sum-XIJK = ALL DATA; sum-MIJ = Mean Cells; Extra Hypothesis: H0: NO INTERACTION Ha: INTERACTION
IF THERE’S NO INTERACTION → SS and df of interaction and residual combined!!
Conclusions: A: there is a difference between batches
B: there is a difference between analysts. There is no interaction: In table SSINT and
SSRES can be combined, just like dfINT and dfRES.
Post-Hoc Contrasts According to Scheffé
Difference in mean between the Columns Differences in mean between the rows
Two-Way ANOVA → One-Way ANOVA
- If interaction and the row factor are not significant in the two-way ANOVA →
perform a one-way ANOVA with column as factor.
- If interaction and the column factor are not significant in the two-way ANOVA →
perform a one-way ANOVA with row as factor.
Decision FlowChart:
Day 13: Multiple Linear Regression (Chapter 9)
Recapture Simple Regression Analysis:
X: adjusted value (Independent); Y: Response (observed value; Dependent)
Linear regression model → µ𝑌 = α + βX;    Y = α + βX + e;    e ~ N(0, σ²)
See Page 8 for Confidence Intervals
Multiple Linear Regression (MLR) (more factors; 1 response)
Y: response; dependent; continuous    Xj: factor; independent; predictor; covariate
Linear model: µy = β0 + β1·X1 + … + βj·Xj + … + βk·Xk    (k factors, j = 1, …, k)
Y = β0 + β1·X1 + … + βj·Xj + … + βk·Xk + e    e = error, residual; e ~ N(0, σ²)
Sample: n independent observations Yi, X1i, X2i, …, Xji, …, Xki
Yfit = b0 + b1·X1 + … + bj·Xj + … + bk·Xk → Ŷ
H0: βj = 0    HA: βj ≠ 0
Covariance Analysis: Categorical variables as factor
Use k − 1 dummy variables (for a categorical factor with k levels)!!!
Interaction in MLR:
add 1 extra interaction variable → e.g. X3 = X1 · X2: interaction between dose and age (product!)
ANOVA in MLR: Determin. Coeff.