PSEM: Non-Parametric Analyses
What does the highlighting mean?
You will notice that some phrases and sentences are highlighted throughout this document.
Yellow highlighting indicates important information, such as key words or important
details.
Red highlighting indicates information that may be a bit too detailed. You don't
necessarily need to completely understand it right now.
Green highlighting indicates an action to perform.
At the start of this class, your tutor will talk you through a short experiment to test the effects
of environmental noise on word recall. In this study participants are randomly allocated to one
of two conditions. Both conditions are given a list of 50 words, chosen randomly from the
dictionary, to study for 10 minutes. In both conditions participants wear noise-cancelling
headphones. The Silence condition has nothing playing in their headphones, while the Static
condition has white noise played through their headphones. After the 10-minute study period,
all participants have the list of words removed and have to write down as many words as they
can remember. We are going to look over four different versions of this data, illustrating the
different results that can be found if we ignore the assumptions of parametric tests.
Why do we even need non-parametric tests? Is it because statisticians are sadistic and
enjoy inflicting confusion on research methods students?
Although it is a possible explanation, non-parametric tests do not only exist so that we can
torture research design and statistics students. (Notice I said not only). The main reason we
need non-parametric tests is because the data we collect in human behaviour research rarely
(if ever) actually conform to the distributional assumptions required for parametric
tests to function properly.
Types of non-parametric tests
There are non-parametric alternatives to all of the parametric tests we have looked at in
this unit so far.
If a paired-samples t-test was called for, but the data didn't meet the necessary
assumptions, the data could be analysed using Wilcoxon's matched-pairs signed-ranks test.
An independent-samples t-test could be replaced with a Mann-Whitney U test.
A one-way, between-groups ANOVA can be replaced with a Kruskal-Wallis test.
Regardless of the scale of measurement in your design, non-parametric tests transform the
DV into ordinal data.
All non-parametric tests function by converting data onto an ordinal scale. The tests do
this internally. They rank the scores from smallest to largest, assigning 'ranks' to each
value and ignoring the gaps in scale between data points.
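If you want to see what this ranking step does, here is a minimal sketch in Python (not part of the SPSS exercise; the scores are made up for illustration). It uses scipy.stats.rankdata, which assigns average ranks to tied values, to show how an extreme score becomes just another rank.

```python
# Illustrative only: rank transformation flattens the gaps between scores.
# scipy.stats.rankdata assigns average ranks to tied values.
from scipy.stats import rankdata

scores = [12, 14, 15, 15, 47]   # hypothetical recall scores; 47 is an extreme outlier
print(rankdata(scores))         # ranks 1, 2, 3.5, 3.5, 5 -- the outlier is simply rank 5
```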
Mann-Whitney U test
The Mann-Whitney U test is used instead of an independent-samples t-test. We will use
the "Independent Samples" data set to illustrate how to determine when to use a Mann-Whitney
U test instead of an independent-samples t-test, and how results can differ when we
violate the assumptions of parametric tests.
Task 1: Checking the distributions
The data file
The first variable (Group) is a dichotomous IV identifying the group each participant is
assigned to. The next five variables are different versions of the DV (number of words
recalled). The first version of the DV, Recall, is the original. The remaining four versions
have been modified in some way, either to make them unsuitable for a standard parametric
test, or to illustrate a difference between parametric and non-parametric tests more clearly.
Before describing the different sets of scores, make a boxplot of the first three to
compare their distributions.
Select Graphs->Legacy Dialogs->Boxplot. Select Clustered and Summaries of
separate variables, then click Define. Move Group into the Category Axis box and the first
three data variables into the Boxes Represent: slot. The dialog should look like the one shown
below. Click OK to create the plots.
The figure will show three box-and-whisker plots for each group (Silence and Static). The
different colours indicate which variable each boxplot is associated with.
Inspect the boxplots for each variable.
The blue boxplots show the data from the original data set that does not violate the
assumptions and could be analysed using an independent-samples t-test. The next two DVs
violate assumptions required for a t-test.
The green boxplots have one observation in each condition that is an extreme outlier
(case 3 in the Silence condition and case 13 in the Static condition). Notice that the medians
(the horizontal lines) are unchanged.
The red boxplots don’t meet the assumption of homogeneity of variance. Notice that
the boxplot for silence is much wider than that for static.
Task 2: Running a Mann-Whitney U test
Select Analyze->Nonparametric Tests->Independent Samples.
This will bring up the following dialog box.
To run the test you need to work through each tab (Objective, Field, Settings) and set up
the options on each page. Notice the description at the bottom of the dialog box tells you what
this analysis can be used for. It includes both the Mann-Whitney U test and Kruskal-Wallis
test (non-parametric one-way ANOVA). We will look at the Kruskal-Wallis test shortly.
Objective tab
The default options are fine.
Fields tab
On this tab you choose the variables you want to include in the analysis; these get
moved into the Test Variables box. Although you could select all the variables at
once, we will start simpler and choose only the first two, RecallPara and
RecallOutliers. Move these to the Test Variables box and move Group to the Groups
box. Then select the Settings tab.
Settings tab
On this tab you can select the types of tests to conduct. This is shown in the dialog box
below. We will simply leave the defaults since they are appropriate for what we want, but, as
you can see, there are plenty of options available (explore in your own time).
Task 3: Interpreting the output
We are now ready to run the analysis, so click Run. This will produce the
results shown in the table below.
The tables produced by non-parametric tests are among the easiest things in SPSS to interpret.
They explicitly tell you what test you have done, they provide a statement of the null
hypothesis for the test, they tell you the p-value for the test, and they give the decision based
on that p-value. (You're probably wondering why this sort of output wasn't available for any of
the other tests. Honestly, I agree!).
There is more information available for the analysis. Scroll down to find the rest of the
information for the test.
The information you need to report for this test can be found in the table.
Like any other test, we need to report the necessary information from the table, including
the test statistic (U = 24.5), the standardised test statistic (z = -1.95) and the p-value (there are
two options here; Exact is the most accurate, so we will report that one, p = .052). Finally, you
need to report the appropriate descriptive statistics, and for non-parametric tests we report the
median (abbreviated Mdn). You can also calculate an effect size (r) for this test, which is
obtained by dividing the standardised test statistic (z) by the square root of the total
sample size (N = 20).
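For those curious, here is a rough sketch of the same calculation outside SPSS using Python's scipy.stats (the scores below are made up for illustration; they are not the prac data). It computes U, a normal-approximation z (without a tie correction), and the effect size r = z / √N described above.

```python
# Sketch with hypothetical data: Mann-Whitney U, an approximate z, and r = z / sqrt(N).
import numpy as np
from scipy.stats import mannwhitneyu

silence = np.array([19, 21, 18, 20, 17, 22, 19, 18, 20, 21])   # hypothetical recall scores
static  = np.array([14, 13, 15, 12, 16, 13, 14, 15, 12, 13])   # hypothetical recall scores

u, p = mannwhitneyu(silence, static, alternative="two-sided")

n1, n2 = len(silence), len(static)
mu_u    = n1 * n2 / 2                               # mean of U under the null hypothesis
sigma_u = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)     # SD of U under the null (no tie correction)
z = (u - mu_u) / sigma_u

r = z / np.sqrt(n1 + n2)                            # effect size: standardised statistic / sqrt(N)
print(f"U = {u:.1f}, z = {z:.2f}, p = {p:.3f}, r = {r:.2f}")
```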
Scroll down to look over RecallOutliers * Group (Test 2) to see the results for the data
set that includes outliers.
Task 4: Writing up the results for the data set that includes outliers. To get the medians
you will need to run Analyze->Descriptive Statistics->Explore. Click the 'Statistics'
radio button in the bottom left; we don't need plots today.
When reporting Medians as the measure of central tendency, the interquartile range
(IQR) is the appropriate statistic for variability. This is given in the Explore output
table.
The median number of words recalled for the Silence group was 19.0 (IQR =
3.25), while the Static group had a median recall of 13.5 words (IQR = 4.5). A
Mann-Whitney test revealed that the advantage for the Silence group was
statistically significant, U (N = 20) = 12.00, z = -2.88, p = .003, r = -.64.
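If you ever need to double-check medians and IQRs outside SPSS, a minimal sketch with numpy looks like this (again with made-up scores; note that SPSS's Explore may use a slightly different percentile rule, so small differences in the IQR are possible).

```python
# Hypothetical scores only: median and interquartile range for one group.
import numpy as np

silence = np.array([19, 21, 18, 20, 17, 22, 19, 18, 16, 23])
q1, med, q3 = np.percentile(silence, [25, 50, 75])
print(f"Mdn = {med:.1f}, IQR = {q3 - q1:.2f}")
```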
Task 5: Conduct an independent-samples t-test on the same data set (RecallOutliers).
Compare the results of the t-test with the Mann-Whitney U test. Which test is more
powerful in this situation (i.e., when the assumptions for a parametric test are violated)?
The independent-samples t-test yields a non-significant result (p = .427). From the
previous task it is clear that the non-parametric test is more powerful in this
situation. If the assumptions for a parametric test are met, the parametric test will be
more powerful (the variable Number of words parametric will demonstrate this). However,
if the assumptions for a parametric test are violated, a non-parametric test can be much
more powerful.
The non-parametric test ranks the individual scores. As a result, the two extreme
outliers are brought much closer to the rest of the scores in their groups. If the other
scores are consistently lower in one group, their ranks will also be consistently lower
and the Mann-Whitney test will yield a significant result.
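To see this ranking effect in action, here is an illustrative comparison in Python (the numbers are invented so that each group contains one extreme outlier; they are not the prac data).

```python
# Hypothetical data: one extreme outlier per group washes out the t-test,
# while the rank-based Mann-Whitney U test still detects the group difference.
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

silence = np.array([19, 20, 18, 21, 19, 22, 18, 20, 21, 2])    # 2 is an extreme low outlier
static  = np.array([13, 14, 12, 15, 13, 14, 12, 15, 13, 60])   # 60 is an extreme high outlier

t, p_t = ttest_ind(silence, static)
u, p_u = mannwhitneyu(silence, static, alternative="two-sided")
print(f"t-test: p = {p_t:.3f}    Mann-Whitney U: p = {p_u:.3f}")
```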
Task 6: Conduct and compare the results of independent-samples t-tests and Mann-
Whitney U tests for the remaining variables.
The variable RecallHetero violates the homogeneity of variance assumption;
RecallPara satisfies the assumptions for the t-test, but has a small effect size;
RecallRank is the rank transformation of the RecallOutliers variable, so it
technically violates the requirement for normally distributed interval/ratio
data.
RecallHetero violates the homogeneity of variance assumption (Levene's test is
significant, p = .002). In addition, the t-test is non-significant regardless of which
variance assumption we choose to look at (equal variances assumed, p = .073; not
assumed, p = .088). But remember that it was significant when analysed using the
Mann-Whitney U test (p = .023).
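As an aside, the same checks can be sketched in Python with invented unequal-variance data (this is only to show which functions correspond to which parts of the SPSS output, not to reproduce the prac results).

```python
# Hypothetical data with very unequal spreads: Levene's test, pooled and Welch t-tests,
# and the Mann-Whitney U test side by side.
import numpy as np
from scipy.stats import levene, ttest_ind, mannwhitneyu

silence = np.array([10, 30, 15, 28, 12, 27, 11, 29, 14, 26])   # widely spread (hypothetical)
static  = np.array([14, 15, 13, 14, 16, 15, 13, 14, 15, 16])   # tightly clustered (hypothetical)

print(levene(silence, static))                       # homogeneity of variance check
print(ttest_ind(silence, static))                    # equal variances assumed
print(ttest_ind(silence, static, equal_var=False))   # equal variances not assumed (Welch)
print(mannwhitneyu(silence, static, alternative="two-sided"))
```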
The t-test conducted on RecallPara satisfies Levene's test (p = .466) and is significant (p
= .043). Importantly, it is not significant when analysed using a Mann-Whitney U
test (p = .052). The take-home message here is that when the assumptions for
parametric tests are met, a parametric test is more powerful than a non-parametric
test.
RecallRank is the RecallOutliers data transformed to rank scores. Rank-transformed
data is ordinal and isn't technically appropriate for a t-test, since ordinal data is not
normally distributed. However, t-tests are quite robust to violations of normality, much
more so than to violations of homogeneity of variance and to outliers. So running a
t-test on these data isn't too egregious an error. The t-test on RecallRank is
significant (p = .002). Note that the p-value is very close to the p-value
obtained with the Mann-Whitney U test (p = .003). The message here is that running a
parametric test on rank-transformed data is very similar to conducting a
non-parametric test.
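To make that last point concrete, here is a toy demonstration in Python (invented scores, the same idea as RecallRank): rank the combined data, run an ordinary t-test on the ranks, and compare the p-value with the Mann-Whitney U test on the raw scores.

```python
# Hypothetical data: a t-test on rank-transformed scores gives a p-value close to
# the Mann-Whitney U test on the raw scores.
import numpy as np
from scipy.stats import rankdata, ttest_ind, mannwhitneyu

silence = np.array([19, 20, 18, 21, 19, 22, 18, 20, 21, 2])
static  = np.array([13, 14, 12, 15, 13, 14, 12, 15, 13, 60])

ranks = rankdata(np.concatenate([silence, static]))    # rank all 20 scores together
t, p_rank = ttest_ind(ranks[:10], ranks[10:])          # t-test on the ranks
u, p_mwu  = mannwhitneyu(silence, static, alternative="two-sided")
print(f"t-test on ranks: p = {p_rank:.3f}    Mann-Whitney U: p = {p_mwu:.3f}")
```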
Other types of non-parametric tests
You can follow the same basic steps to complete non-parametric versions of paired-
samples t-tests and one-way ANOVA. The final parts of the prac will be to conduct these tests
and briefly summarise the results.
Task 7: Conduct a Wilcoxon matched-pair signed-rank test using the data set
"Related samples" (paired-samples t-test used in Lab 6 t-tests).
To run this test you need to select Analyze->Non-parametric Tests->Related samples.
In the dialog, select the Fields tab and move Normal_0 score and Normal_90 score into
the Test Fields section.
Make sure the Settings tab is set out as below.
The name of this test statistic is T. Report the results following the same layout used for the
Mann-Whitney test. To get the medians for these conditions you can use the Explore function
(without a grouping variable). This will also provide the IQRs.
See slides.
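If you want a non-SPSS cross-check for Task 7, scipy.stats.wilcoxon runs the same family of test. The sketch below uses invented paired scores (the names Normal_0 and Normal_90 are just borrowed as labels); the statistic it returns corresponds to the smaller sum of signed ranks, conventionally reported as T, and the loop prints the medians and IQRs you would need for a write-up.

```python
# Hypothetical paired scores: Wilcoxon signed-rank test plus medians and IQRs.
import numpy as np
from scipy.stats import wilcoxon

normal_0  = np.array([24, 27, 22, 25, 28, 23, 26, 24, 25, 27])   # invented condition scores
normal_90 = np.array([22, 24, 18, 20, 22, 16, 18, 15, 15, 28])   # invented condition scores

T, p = wilcoxon(normal_0, normal_90)   # statistic = smaller sum of signed ranks
print(f"T = {T:.1f}, p = {p:.3f}")

for name, x in (("Normal_0", normal_0), ("Normal_90", normal_90)):
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    print(f"{name}: Mdn = {med:.1f}, IQR = {q3 - q1:.2f}")
```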
Task 8: Conduct a Kruskal-Wallis test on the data set "More than 2 groups" (the
Lab 7 data, ANOVA). The test statistic is H and you need to report the degrees of
freedom (number of groups – 1), the value of the test statistic, and the p-value.
When running this test you need to select the Customize tests option and select Kruskal-
Wallis (see below). Leave the Multiple comparisons option at its default (All pairwise).
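For reference only, an equivalent analysis outside SPSS can be sketched with scipy.stats.kruskal (three invented groups here; the degrees of freedom are simply the number of groups minus one).

```python
# Hypothetical data for three groups: Kruskal-Wallis H with df = k - 1.
from scipy.stats import kruskal

group_a = [23, 25, 21, 27, 24]   # invented scores
group_b = [19, 18, 22, 20, 17]
group_c = [15, 14, 16, 13, 18]

H, p = kruskal(group_a, group_b, group_c)
df = 3 - 1                       # number of groups minus 1
print(f"H({df}) = {H:.2f}, p = {p:.3f}")
```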
Once the test is run, double-click the output table to see the results shown below.
The top table and figure show the results of the omnibus test. Scroll down to the
pairwise comparisons to see the results of the post-hoc tests (see the screenshot below).
The pairwise comparisons reported in the table should be pretty straightforward to interpret.
Summarise the results below.
See slides.