Example: One Way Analysis of Variance + Post Hoc Tests: WWW - Stars.ac - Uk
ONE WAY ANALYSIS OF VARIANCE + POST HOC TESTS
This example uses material from the STARS project (www.stars.ac.uk).
This example uses data from a telephone survey of fast food consumers. The objective
of the analysis is to compare the number of times per month people in different age
groups buy fast food. The dependent variable is the number of times per month the
respondent buys fast food. The interviewees are divided into 5 age groups: 15-17,
18-24, 25-35, 36-54, 55-70.
The analysis involves comparing data between 5 age groups. The dependent variable is
number of purchases per month and the independent variable, whose influence on the
dependent variable we want to study, is age group.
Null hypothesis (H0): Mean number of purchases of fast food per month is the same for
all age groups
Alternative hypothesis (H1): Mean number of purchases of fast food per month is not the
same for all age groups
One-way analysis of variance is the first choice of technique because:
(1) 5 independent samples are being compared
(2) the dependent variable is quantitative
Using SPSS to carry out a one-way analysis of variance produces the following
ANOVA table:
ANOVA
Purchases
                  Sum of Squares    df   Mean Square       F    Sig.
Between Groups           147.000     4        36.750   8.013    .000
Within Groups           1751.935   382         4.586
Total                   1898.935   386
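The mean squares and the F statistic follow directly from the sums of squares and
degrees of freedom in the table. The short Python sketch below (Python is not part of
the original SPSS workflow) reproduces that arithmetic. The P-value line is an
assumption-laden shortcut: it uses a closed form for the upper tail of the F distribution
that holds only when the between-groups degrees of freedom equal 4, as they do here, so
it should be read as an illustration rather than a general routine.

```python
# Values copied from the SPSS ANOVA table.
ss_between, df_between = 147.000, 4
ss_within, df_within = 1751.935, 382

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the between-groups to the within-groups mean square.
f_stat = ms_between / ms_within

# Upper-tail probability P(F > f_stat).
# Assumption: this closed form is valid ONLY for df_between == 4.
# With x = df2 / (df2 + 4 * f) and a = df2 / 2:  P = x**a * (1 + a * (1 - x))
x = df_within / (df_within + df_between * f_stat)
a = df_within / 2
p_value = x ** a * (1 + a * (1 - x))

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_stat:.3f}")
print(f"P-value    = {p_value:.2e}")
```

The computed F agrees with the table to three decimal places, and the P-value is far
below 0.001, which is consistent with SPSS displaying it as .000.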
The values needed from the table are the test statistic (F), the degrees of freedom (the
first two values in the column labelled df) and the P-value (which SPSS labels ‘Sig.’).
The P-value is used to decide the conclusion of the test. SPSS displays the P-value to 3
decimal places, so very small P-values appear as 0.000, which we report as
‘P < 0.001’. As the P-value is below 0.05, we reject the null hypothesis.
The hypothesis test allows you to say:
‘The data provides statistically significant evidence that mean purchases per month of
fast food are not the same for all age groups (One-way ANOVA, F = 8.013, df = 4, 382,
P < 0.001).’
Having found statistically significant evidence that the mean number of purchases per
month is not the same for all age groups, the next step is to explore where the
differences between age groups are found. There are several possibilities, for example,
four age groups might be similar, with just one group having a different mean, or there
could be differences between all five groups. If the ANOVA produces a statistically
significant test, we can carry out post hoc tests to see where differences between
groups occur. SPSS provides a number of post hoc tests; here the Student-Newman-Keuls
test has been used, with results shown below:
Purchases
Student-Newman-Keuls (a,b)
                     Subset for alpha = .05
Age        N         1         2         3
55-70     43    1.4186
36-54    164              2.3476
25-35     99              2.6162
18-24     50                        3.4800
15-17     31                        3.7097
Sig.             1.000      .513      .576
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 54.518.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I
error levels are not guaranteed.
The footnotes give details of the calculations that SPSS has used in carrying out
the tests; these do not need to be interpreted.
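Although the footnotes need no interpretation, the harmonic mean sample size in footnote
a is easy to reproduce from the N column. A minimal Python sketch, using only the
standard library:

```python
from statistics import harmonic_mean

# Group sizes (N) from the post hoc table, ordered youngest to oldest age group.
group_sizes = [31, 50, 99, 164, 43]

# Harmonic mean = number of groups / sum of reciprocals of the group sizes.
h = harmonic_mean(group_sizes)
print(f"Harmonic mean sample size = {h:.3f}")
```

The result matches the value SPSS reports (54.518).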
What do the post hoc tests show?
The post hoc tests compare the age groups two at a time. The results are shown in a
table, with age groups listed in order of their mean value for the dependent
variable. Here, the 55-70 age group is shown first as this group has the lowest mean
purchases, and the 15-17 age group is shown last as it has the highest mean
purchases. Each column, or subset, lists the mean purchases of the groups that belong
to it. The arrangement of the mean values in columns shows which age groups differ,
and which do not differ, significantly in terms of their mean purchases of fast food.
If the means for two groups are shown in different columns, this indicates that there is
statistically significant evidence of a difference between their mean values. For
example, the mean for purchasers aged 55-70 is shown in a column by itself and does
not appear in any other column. This shows that the mean number of purchases made
by consumers aged 55-70 is significantly different from the mean number of purchases in
all the other age groups.
If the means for two groups are shown in the same column, this indicates that there is
no statistically significant evidence that their mean values differ. For example, the mean
for purchasers aged 18-24 and the mean for purchasers aged 15-17 are shown together
in column (subset) 3. This shows that the mean numbers of purchases made by
consumers in these two age groups do not differ significantly.
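The reading rule above can be applied mechanically. The Python sketch below encodes the
three subsets as read from the post hoc table and lists which pairs of age groups share a
subset and which do not. The names `subsets` and `share_subset` are illustrative choices,
not anything produced by SPSS.

```python
from itertools import combinations

# Homogeneous subsets as read from the Student-Newman-Keuls table.
subsets = {
    1: {"55-70"},
    2: {"36-54", "25-35"},
    3: {"18-24", "15-17"},
}
groups = ["15-17", "18-24", "25-35", "36-54", "55-70"]


def share_subset(g1, g2):
    """Return True if both groups appear together in at least one subset."""
    return any(g1 in members and g2 in members for members in subsets.values())


# Pairs in the same column: no significant evidence of a difference.
similar = [(a, b) for a, b in combinations(groups, 2) if share_subset(a, b)]
# Pairs never in the same column: significant evidence of a difference.
different = [(a, b) for a, b in combinations(groups, 2) if not share_subset(a, b)]

print("No significant difference:", similar)
print("Significant difference:", different)
```

Of the ten possible pairs, only two (15-17 with 18-24, and 25-35 with 36-54) share a
subset; the remaining eight pairs differ significantly.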
Assumptions underlying the test(s)
One-way ANOVA is a parametric method that assumes:
(1) The dependent variable is Normally distributed within each category being
compared.
(2) The dependent variable has the same variance in each of the categories being
compared.
The method is reasonably robust to small departures from these assumptions but is
sensitive to outliers.
The output below provides some information related to the assumptions. The box plot
shows that in each age group, the distribution of the number of purchases of fast food
has a slight skew, with one or two individuals who are shown as outliers. These features
are incompatible with Normally distributed data. Note that the spread or variation in
purchases seems to be greater in age groups with higher median purchases.
[Figure: box plot of Purchases (vertical axis, 0.00 to 10.00) for each age group.]
Purchases
Age        Mean      N    Std. Deviation
15-17    3.7097     31           2.61015
18-24    3.4800     50           2.14989
25-35    2.6162     99           2.09814
36-54    2.3476    164           2.17514
55-70    1.4186     43           1.67946
Total    2.5685    387           2.21800
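As a consistency check, the sums of squares in the ANOVA table can be recovered from the
group means, sizes and standard deviations listed above. The Python sketch below does
this; small discrepancies are expected because the tabulated means and standard
deviations are rounded.

```python
# (mean, n, standard deviation) for each age group, from the descriptive table.
groups = {
    "15-17": (3.7097, 31, 2.61015),
    "18-24": (3.4800, 50, 2.14989),
    "25-35": (2.6162, 99, 2.09814),
    "36-54": (2.3476, 164, 2.17514),
    "55-70": (1.4186, 43, 1.67946),
}

n_total = sum(n for _, n, _ in groups.values())
grand_mean = sum(mean * n for mean, n, _ in groups.values()) / n_total

# Between-groups SS: group size times squared distance of group mean from grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, n, _ in groups.values())
# Within-groups SS: (n - 1) times each group's sample variance.
ss_within = sum((n - 1) * sd ** 2 for _, n, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"N = {n_total}, grand mean = {grand_mean:.4f}")
print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}, F = {f_stat:.2f}")
```

The recovered values agree with the ANOVA table (147.000, 1751.935 and F = 8.013) to
within rounding.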
Overall the data does not seem to conform to the assumptions required for the
analyses. This means that the P-value may not be accurate. In this situation, some
steps need to be taken to ensure that the conclusions produced are ‘safe’. Potential
strategies would be to switch to a non-parametric method (Kruskal-Wallis analysis of
ranks) or to look for a transformation that allows the assumptions underlying the
one-way ANOVA to be met.
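To give a feel for the non-parametric alternative, the Python sketch below computes the
tie-corrected Kruskal-Wallis H statistic from raw observations. The survey's raw data
are not reproduced in this example, so the sketch runs on a small made-up data set that
is purely illustrative; the function name `kruskal_wallis_h` is likewise a hypothetical
helper, not an SPSS or library routine. In practice H is usually compared with a
chi-square distribution on (number of groups - 1) degrees of freedom.

```python
from collections import Counter


def kruskal_wallis_h(samples):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples.

    Note: raises ZeroDivisionError if every observation has the same value.
    """
    counts = Counter(x for sample in samples for x in sample)
    n_total = sum(counts.values())

    # Assign each distinct value its average rank in the pooled data (ranks start at 1).
    rank_of = {}
    position = 0
    for value, count in sorted(counts.items()):
        rank_of[value] = position + (count + 1) / 2
        position += count

    # Sum over groups of (rank sum squared / group size).
    rank_term = sum(
        sum(rank_of[x] for x in sample) ** 2 / len(sample) for sample in samples
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Correction for tied values, which are common with count data.
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1 - ties / (n_total ** 3 - n_total))


# Hypothetical toy data (NOT the survey data): three completely separated groups.
toy_samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_toy = kruskal_wallis_h(toy_samples)
print(f"H = {h_toy:.3f}")
```

Working on ranks rather than raw values makes the method less sensitive to the skew and
outliers seen in the box plot.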