Methods Guide
- Mean (average)
- Median (middle value of the ordered set)
- Mode (most frequently occurring value in the set)
Data set
You can only produce a frequency count for ‘institution’ and ‘gender’,
because both are categorical variables.
Measures of dispersion (how much the data points vary from the mean).
Answers
2. Median: 3, 4, 6, 6, 6, 7, 8, 8, 8, 9. With ten values listed, the median
is the average of the two middle values: (6 + 7) / 2 = 6.5
3. Mode: 6 and 8 (each occurs three times)
4. Range: Xmax - Xmin = 9 - 3 = 6
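As a quick cross-check, the same measures can be computed with Python's standard `statistics` module. This is only a sketch: it assumes the data set is exactly the ten values listed in answer 2, and the variable names are illustrative.

```python
import statistics

# Assumption: the data set is the ten values listed in answer 2.
values = [3, 4, 6, 6, 6, 7, 8, 8, 8, 9]

mean = statistics.mean(values)            # sum of the values / number of values
median = statistics.median(values)        # middle value; average of the two middle values when n is even
modes = statistics.multimode(values)      # every value tied for most frequent
value_range = max(values) - min(values)   # Xmax - Xmin

print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("range:", value_range)
```

`multimode` is used rather than `mode` because this data set has two modes, and `mode` would report only one of them.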
Variance
The sum of the squared distances of each value from the mean, divided by
the number of data points.
Each data point has a distance from the mean. We square each distance, add
the squares together, and divide by n for a population, or by n-1 in the
case of a sample.
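Variance Formulas
Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$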
Simple Example.
The sequence: 2, 3, 4, 5, 6, 7, 8
x̄ (mean) = 5
(2 - 5)^2 = 9
(3 - 5)^2 = 4
(4 - 5)^2 = 1
(5 - 5)^2 = 0
(6 - 5)^2 = 1
(7 - 5)^2 = 4
(8 - 5)^2 = 9
9+4+1+0+1+4+9 = 28
28 / 7 = 4
Variance = 4
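The worked example divides by the number of data points (7), which treats the sequence as a whole population. A minimal Python sketch of the same steps, with the sample version (dividing by n-1) shown alongside for comparison; the variable names are illustrative:

```python
import statistics

sequence = [2, 3, 4, 5, 6, 7, 8]

mean = sum(sequence) / len(sequence)                     # 5.0
squared_distances = [(x - mean) ** 2 for x in sequence]  # 9, 4, 1, 0, 1, 4, 9
sum_of_squares = sum(squared_distances)                  # 28.0

population_variance = sum_of_squares / len(sequence)      # divide by n
sample_variance = sum_of_squares / (len(sequence) - 1)    # divide by n - 1

print(population_variance)  # 4.0
print(sample_variance)

# The standard library gives the same results:
print(statistics.pvariance(sequence), statistics.variance(sequence))
```

The sample variance is slightly larger than the population variance because the same sum of squares is divided by 6 instead of 7.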
Standard Deviation
Properties of curve:
- Normal curve is symmetrical
- The average, median and mode are equal
- The tail ends do not touch the axis
- Given a mean (x̄) and a standard deviation (s), you can draw the curve
Example
In the previous example we found that the variance of the sequence was 4.
What is the standard deviation of the same sequence?
Standard Deviation = $\sqrt{\text{Variance}} = \sqrt{4}$
Standard Deviation = 2
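The same result can be checked in Python. This sketch assumes, as in the variance example, that the sequence is treated as a population (dividing by n):

```python
import math
import statistics

sequence = [2, 3, 4, 5, 6, 7, 8]

variance = statistics.pvariance(sequence)   # population variance, 4
standard_deviation = math.sqrt(variance)    # square root of the variance

print(standard_deviation)           # 2.0
print(statistics.pstdev(sequence))  # same value, computed in one call
```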
Z-Score
The properties of the normal curve listed under Standard Deviation also
apply here.
- Caution: normal curve assumptions need groups of around 50 or more
An example
Describe the Maltese population given that:
- The average age is 47
- The standard deviation of the population is 15
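One way to describe the population is with the 68-95-99.7 rule for normal curves. The sketch below assumes that ages are approximately normally distributed, which the exercise does not state outright. The `z_score` helper is an illustrative name and uses the usual definition: the distance from the mean measured in standard deviations.

```python
mean_age = 47
sd_age = 15

# Approximate share of a normal distribution within 1, 2 and 3
# standard deviations of the mean (the 68-95-99.7 rule).
coverage = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}

ranges = {}
for k, share in coverage.items():
    low = mean_age - k * sd_age
    high = mean_age + k * sd_age
    ranges[k] = (low, high)
    print(f"{share} of the population is aged between {low} and {high}")


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


print(z_score(62, mean_age, sd_age))  # 1.0: age 62 is one standard deviation above the mean
```

Under that assumption, about 68% of people are aged 32 to 62, about 95% are aged 17 to 77, and about 99.7% are aged 2 to 92.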
Sampling distribution
3 distributions
1. Population
2. Sample
3. Sampling distribution
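The difference between the three distributions can be illustrated with a small simulation. Everything here is an assumption made for illustration: the population is simulated (roughly normal, mean 47, standard deviation 15), and the sample size of 50 and the 2,000 repeated samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# 1. Population: every individual (simulated, hypothetical ages).
population = [random.gauss(47, 15) for _ in range(20_000)]

# 2. Sample: one random draw of 50 individuals from the population.
sample = random.sample(population, 50)

# 3. Sampling distribution: the means of many samples of the same size.
sample_means = [
    statistics.fmean(random.sample(population, 50)) for _ in range(2_000)
]

print("population:    ", statistics.fmean(population), statistics.pstdev(population))
print("one sample:    ", statistics.fmean(sample), statistics.stdev(sample))
print("sample means:  ", statistics.fmean(sample_means), statistics.pstdev(sample_means))
```

The sample means cluster around the population mean, and their spread is much smaller than the spread of the population, close to the population standard deviation divided by the square root of the sample size (15 / √50 ≈ 2.1).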
Confidence intervals
Chi-Squared
- A chi-squared test is a statistical test used to compare observed results
with expected results. The purpose of this test is to determine if a
difference between observed data and expected data is due to chance,
or if it is due to a relationship between the variables you are studying.
Example:
Suppose you are conducting a survey to determine whether there is a
relationship between gender and a preference for two different types of soft
drinks: "Soda A" and "Soda B." You survey 100 people and record their
preferences as follows:
- 40 males prefer Soda A
- 20 males prefer Soda B
- 30 females prefer Soda A
- 10 females prefer Soda B
Null Hypothesis (H0): There is no association between gender and soft drink
preference.
Alternative Hypothesis (H1): There is an association between gender and
soft drink preference.
1. Set up the table of observed frequencies, with row and column totals:
         Soda A   Soda B   Total
Male       40       20       60
Female     30       10       40
Total      70       30      100
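2. Calculate the expected frequencies for each cell under the assumption
of independence. To do this, you can use the formula for expected
frequency:
Expected frequency = (row total × column total) / grand total
For example, for males who prefer Soda A: (60 × 70) / 100 = 42. The other
three cells give 18, 28 and 12.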
3. Calculate the chi-squared value for each cell and sum them up:
Chi-squared = ((40 - 42)^2 / 42) + ((20 - 18)^2 / 18) + ((30 - 28)^2 / 28) +
((10 - 12)^2 / 12) = 0.10 + 0.22 + 0.14 + 0.33 = 0.79
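If you are asked to find the degrees of freedom, do not worry: for a table
like this it is (rows - 1) × (columns - 1), so here it is
(2 - 1) × (2 - 1) = 1.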
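The three steps can be sketched in plain Python using the survey counts from the example. The variable names are illustrative, and the expected frequencies use the standard formula (row total × column total) / grand total.

```python
# Step 1: observed frequencies from the survey.
observed = {
    "Male": {"Soda A": 40, "Soda B": 20},
    "Female": {"Soda A": 30, "Soda B": 10},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
column_totals = {}
for cells in observed.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count
grand_total = sum(row_totals.values())

# Steps 2 and 3: expected frequency per cell, then sum (O - E)^2 / E.
expected = {}
chi_squared = 0.0
for row, cells in observed.items():
    for column, count in cells.items():
        e = row_totals[row] * column_totals[column] / grand_total
        expected[(row, column)] = e
        chi_squared += (count - e) ** 2 / e

degrees_of_freedom = (len(row_totals) - 1) * (len(column_totals) - 1)

print(expected)                # 42.0, 18.0, 28.0 and 12.0
print(round(chi_squared, 2))   # 0.79
print(degrees_of_freedom)      # 1
```

With 1 degree of freedom, the usual critical value at the 0.05 significance level is about 3.84. Since 0.79 is well below that, this data would not lead you to reject the null hypothesis.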