Module 4
Parametric Tests
Introduction:
In hypothesis testing, it is important to select the statistical tool best suited to
detecting a significant difference or relationship between parameters or values. To help
you understand the inferential statistical tools used in hypothesis testing, this module
offers brief discussions of the nature and assumptions of each tool, which you will
then put into practice through some exercises.
OBJECTIVES:
After going through the module, you are expected to be able to do the
following with at least an 80% proficiency level:
PARAMETRIC TESTS
To test a hypothesis, we need to set a significance level (also called alpha) that
allows us to either reject or fail to reject the null hypothesis. Most commonly, this
value is set at 0.05.
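The decision rule can be written as a short Python sketch. The function name `decide` is hypothetical and only illustrates the comparison of a p-value with the chosen significance level.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the decision about the null hypothesis for a given p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Reject H0 only when the p-value is smaller than the significance level.
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Example calls with made-up p-values.
print(decide(0.03))
print(decide(0.20))
```

A smaller alpha (for example, 0.01) makes the test stricter, so the same p-value may no longer lead to rejection.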
Having learned the basic assumptions to check before selecting a t-test, let us
apply those principles by running a t-test with the aid of Microsoft Excel. If you have
a laptop or any Android device with this program, turn it on, and let us see how a
t-test is run.
The following example teaches you how to perform a t-test in Excel. But first, we
need to install the Data Analysis ToolPak in Excel before we can successfully run
the test.
How to Install the Data Analysis ToolPak in Excel?
Let us continue:
4. Click in the Variable 1 Range box and select the range A2:A7.
5. Click in the Variable 2 Range box and select the range B2:B6.
6. Click in the Hypothesized Mean Difference box and type 0 (H0: μ1 - μ2 = 0).
7. Click in the Output Range box and select cell E1.
8. Click OK.
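For readers who want to check the arithmetic outside Excel, here is a minimal Python sketch of a two-sample t-test. The steps above do not say which t-test variant is selected in the dialog, so this sketch assumes the unequal-variances (Welch) version. The function name `welch_t` and the data values are hypothetical; the two lists simply mirror a range of six cells and a range of five cells.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2, hypothesized_diff=0.0):
    """Return (t statistic, degrees of freedom) for a two-sample t-test
    that does not assume equal variances (Welch's t-test)."""
    n1, n2 = len(sample1), len(sample2)
    # Sample variance of each group divided by its size.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2) - hypothesized_diff) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data standing in for ranges A2:A7 and B2:B6.
variable_1 = [31, 27, 40, 35, 22, 45]
variable_2 = [24, 29, 20, 26, 27]
t_stat, dof = welch_t(variable_1, variable_2, hypothesized_diff=0.0)
print(f"t = {t_stat:.4f}, df = {dof:.2f}")
```

The `hypothesized_diff` argument plays the role of the Hypothesized Mean Difference box: a value of 0 tests H0: μ1 - μ2 = 0.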
After performing the test, Microsoft Excel will give you this result:
Does the dependent t-test test for "changes" or "differences" between related
groups?
The dependent t-test can be used to test either a "change" or a "difference" in
means between two related groups, but not both at the same time. Whether you are
measuring a "change" or "difference" between the means of the two related groups
depends on your study design.
What are the assumptions of the dependent t-test?
(1) Samples that are randomly selected;
(2) Data that are expressed on an interval or ratio scale;
(3) Data that are normally distributed (only the differences between the
groups need to be);
(4) Data with equal variances.
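Because the dependent t-test works on the differences between paired measurements, its test statistic can be sketched in a few lines of Python. The function name `paired_t` and the before/after scores below are hypothetical and serve only as an illustration.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a dependent (paired) t-test."""
    if len(before) != len(after):
        raise ValueError("both lists must have one value per case")
    # The test is carried out on the within-pair differences.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores of the same five participants measured twice.
before_scores = [72, 65, 80, 70, 68]
after_scores = [75, 70, 82, 74, 67]
t_stat, dof = paired_t(before_scores, after_scores)
print(f"t = {t_stat:.4f}, df = {dof}")
```

This also shows why the normality assumption concerns only the differences: the original two columns never enter the formula separately.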
If the p-value is less than your significance level (usually 0.05), reject the null
hypothesis. Your sample data support the hypothesis that the mean of at least one
population is different from the other population means.
The Summary table indicates that the mean strengths range from a low of
8.837952 for supplier 4 to a high of 11.20252 for supplier 1. Our sample means are
different. However, we need to determine whether our data support the notion that
the population means are not equal. The differences we see in our samples might be
the result of random sampling error.
In the ANOVA table, the p-value is 0.031054. Because this value is less than
our significance level of 0.05, we reject the null hypothesis. Our sample data provide
strong enough evidence to conclude that the four population means are not equal.
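The F statistic behind an ANOVA table can be reproduced with a short Python sketch. The function name `one_way_anova_f` and the four small groups below are hypothetical; they are not the supplier data discussed above.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical strength measurements for four groups.
samples = [
    [11.2, 10.8, 11.9, 10.5],
    [9.6, 10.1, 9.9, 10.4],
    [10.0, 9.2, 10.7, 9.8],
    [8.9, 9.1, 8.4, 9.0],
]
f_stat, df_b, df_w = one_way_anova_f(samples)
print(f"F = {f_stat:.4f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F means the group means differ more than the within-group scatter would suggest; the p-value in the ANOVA table is the probability of an F at least this large when all population means are equal.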
The stronger the association of the two variables, the closer the Pearson
correlation coefficient, r, will be to either +1 or -1 depending on whether the
relationship is positive or negative, respectively. Achieving a value of +1 or -1 means
that all your data points are included on the line of best fit – there are no data points
that show any variation away from this line. Values for r between +1 and -1 (for
example, r = 0.8 or -0.4) indicate that there is variation around the line of best fit. The
closer the value of r is to 0, the greater the variation around the line of best fit.
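The behavior of r described above can be checked with a small Python sketch. The function name `pearson_r` and the data are hypothetical; the formula is the usual one, the sum of the cross-products of deviations divided by the square root of the product of the two sums of squared deviations.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two paired lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two cases")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# All points on a rising line, all points on a falling line,
# and points that scatter around a rising line.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
print(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))
```

The first two calls give values at the extremes (+1 and -1, up to rounding), while the third gives a value between them because the points do not all fall on one line.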
Can you use any type of variable for Pearson's correlation coefficient?
No, the two variables have to be measured on either an interval or a ratio scale.
However, the two variables do not need to be measured on the same scale (e.g., one
variable can be ratio and the other can be interval).
1. Your two variables should be measured on a continuous scale (i.e., they are
measured at the interval or ratio level). Examples of continuous variables include
revision time (measured in hours), intelligence (measured using IQ score), exam
performance (measured from 0 to 100), weight (measured in kg), driving speed
(measured in km/h) and so forth.
2. Your two continuous variables should be paired, which means that each case
(e.g., each participant) has two values: one for each variable. These "values" are
also referred to as "data points".
For example, imagine that you had collected the revision times (measured in
hours) and exam results (measured from 0 to 100) from 100 randomly sampled
students at a university (i.e., you have two continuous variables: "revision time" and
"exam performance"). Each of the 100 students would have a value for revision time
(e.g., "student #1" studied for "23 hours") and an exam result (e.g., "student #1"
scored "81 out of 100"). Therefore, you would have 100 paired values.
3. There should be independence of cases, which means that the two observations
for one case (e.g., the scores for revision time and exam performance for "student
#1") should be independent of the two observations for any other case (e.g., the
scores for revision time and exam performance for "student #2", or "student #3", or
"student #50", for example).
6. There should be homoscedasticity, which means that the variances along the line
of best fit remain similar as you move along the line. If the variances are not similar,
there is heteroscedasticity.
1. In Excel, click on an empty cell where you want the correlation coefficient to be
entered. Then enter the following formula:
=PEARSON(array1, array2)
Simply replace 'array1' with the range of cells containing the first variable and
replace 'array2' with the range of cells containing the second variable.
For the example above, the Pearson correlation coefficient (r) is 0.76.
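Next, calculate the t statistic from the correlation coefficient with the following
formula:
=(r*SQRT(n-2))/(SQRT(1-r^2))
Simply replace 'r' with the correlation coefficient value and replace 'n'
with the number of observations in the analysis.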
For the example in this guide, the formula used in Excel can be seen below.
Note, if your coefficient value is negative, then use the following formula:
=(ABS(r)*SQRT(n-2))/(SQRT(1-ABS(r)^2))
The ABS function converts the coefficient value to an absolute (positive) number.
Otherwise, a negative coefficient produces a negative t statistic, which causes an
error in the TDIST function used in the next step.
The final step in the process of calculating the p-value for a Pearson
correlation test in Excel is to convert the t-statistic to a p-value.
Before this can be done, we just need to calculate a final piece of information:
the number of degrees of freedom (DF). The DF can be found by subtracting 2 from
n (n – 2).
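The conversion from r to a t statistic, together with the degrees of freedom, can be sketched in Python as follows. The function name `t_from_r` and the sample size in the example are hypothetical; the sketch follows the ABS version of the Excel formula so that the result is never negative.

```python
import math


def t_from_r(r, n):
    """Return (t statistic, degrees of freedom) for a Pearson correlation.

    Uses the absolute value of r, like the ABS version of the Excel formula.
    """
    if n < 3:
        raise ValueError("at least three observations are needed")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and +1")
    t = abs(r) * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t, n - 2


# Example with r = 0.76 and an assumed (hypothetical) sample size of 11.
t_stat, dof = t_from_r(0.76, 11)
print(f"t = {t_stat:.4f}, df = {dof}")
```

Because the absolute value is taken, r = 0.6 and r = -0.6 give the same t statistic for the same n.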
Now we are ready to calculate the p-value. To do this, simply use
the =TDIST function in Excel.
=TDIST(x, deg_freedom, tails)
Replace 'x' with the t statistic created previously and replace 'deg_freedom' with the
DF. Finally, for the tails, enter the number 1 for a one-tailed analysis or 2 for a
two-tailed analysis. If you are unsure about which to use, use a two-tailed analysis (2).
The accompanying figure is a screenshot of how this looks in Excel for the example.
In the example, the p-value is 0.006. Therefore, there is a significant positive
correlation (r = 0.76) between participant ages and their BMI.
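To see what this step computes, here is a rough Python sketch that approximates the tail probability of the t distribution by numerically integrating its density with Simpson's rule. This is an illustration under stated assumptions, not how Excel implements TDIST; the function names `t_pdf` and `t_dist_p` and the `steps` parameter are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_dist_p(t, df, tails=2, steps=2000):
    """Approximate the one- or two-tailed p-value for a non-negative t."""
    if t < 0:
        raise ValueError("t must not be negative (take the absolute value first)")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    if df < 1:
        raise ValueError("df must be at least 1")
    # Simpson's rule for the area under the density between 0 and t
    # (steps must be an even number).
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # Half of the distribution lies above 0; what remains above t is one tail.
    one_tail = max(0.0, 0.5 - area)
    return one_tail * tails


# Example call with a made-up t statistic and degrees of freedom.
print(t_dist_p(3.0, 16, tails=2))
```

The two-tailed value is simply twice the one-tailed value, which is why the two-tailed analysis is the more conservative default.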
EXERCISES
LINK: https://www.stthomas.edu/media/collegeofartsandsciences/biology/statspages/FishCreekseedlings2015.xlsx
3. ANOVA