Data Analysis: Florenda F. Cabatit RN MA Facilitator
DATA ANALYSIS
Interval Scale
A measurement in which
an attribute of a variable
is rank ordered on a scale
that has equal distances
between points on that
scale.
Ratio Scale
A quantitative measurement in which intervals
are equal and there is a true zero point.
The highest level of measurement
All arithmetic operations are permissible with
this measurement (add, subtract, multiply, and
divide numbers on this scale).
Descriptive Statistics
Three characteristics to fully
describe a set of data:
• Shape of the distribution of values
• Central tendency
• Variability
Review of Descriptive
Stats.
Descriptive Statistics are used to present
quantitative descriptions in a manageable
form.
This method works by reducing lots of data
into a simpler summary.
Example:
37° Centigrade as the average adult body
temperature
SU’s quality-point system
Univariate Analysis
This is the examination across cases of one
variable at a time.
Frequency distributions are used to group
data.
One may set up ranges (intervals) that allow
cases to be grouped into categories.
Examples include
Age categories
Price categories
Temperature categories.
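As a minimal sketch of grouping cases into categories, the Python snippet below bins a list of ages and reports the percent of cases in each bin. The ages, the category boundaries, and the helper name `age_category` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical ages, made up for illustration only.
ages = [29, 41, 52, 47, 63, 38, 55, 70, 49, 44]

def age_category(age):
    """Map an age in years to an (assumed) age category label."""
    if age <= 35:
        return "35 or under"
    if age <= 45:
        return "36-45"
    if age <= 55:
        return "46-55"
    if age <= 65:
        return "56-65"
    return "66+"

# Frequency distribution: how many cases fall in each category.
counts = Counter(age_category(a) for a in ages)

# Convert counts to percentages of all cases.
percents = {cat: 100 * n / len(ages) for cat, n in counts.items()}

for cat in ["35 or under", "36-45", "46-55", "56-65", "66+"]:
    print(f"{cat:12s} {percents.get(cat, 0.0):5.1f}%")
```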
Distributions
Two ways to describe a univariate
distribution
A table
A graph (histogram, bar chart)
Distributions (cont.)
Category    Percent
Under 35    9%
36-45       21%
46-55       45%
56-65       19%
66+         6%
Distributions (cont.)
A Histogram
[Figure: histogram of Percent (vertical axis, 0 to 45) by category
(Under 35, 36-45, 46-55, 56-65, 66+)]
Central Tendency
An estimate of the “center” of a
distribution
Three different types of
estimates:
Mean
Median
Mode
Mean
The most commonly used method of
describing central tendency.
One basically totals all the results
and then divides by the number of
units or “n” of the sample.
Example: The NCM 104 Quiz mean
was determined by the sum of all the
scores divided by the number of
students taking the exam.
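A minimal sketch of the mean calculation described above, in Python. The quiz scores are hypothetical values invented for illustration; they are not actual NCM 104 results.

```python
# Hypothetical quiz scores, made up for illustration only.
quiz_scores = [78, 85, 92, 70, 88]

# Mean = sum of all the scores divided by the number of scores (n).
n = len(quiz_scores)
quiz_mean = sum(quiz_scores) / n

print(f"n = {n}, mean = {quiz_mean:.1f}")
```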
Median
The median is the score found at the
exact middle of the set.
One must list all scores in numerical
order and then locate the score in
the center of the sample.
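Example: If there are 501 scores in
the list, score #251 would be the
median. If there are 500 scores, the
median is the average of scores #250
and #251.
Because it depends only on the middle
of the ordered list, the median is
little affected by outliers.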
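The procedure above can be sketched in Python as follows. The function name `median_of` and the short score lists are hypothetical, used only to show the odd-count case, the even-count case, and the effect of an outlier.

```python
def median_of(scores):
    """Return the middle value of the scores after sorting them."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: a single middle score exists.
        return ordered[mid]
    # Even number of scores: average the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical scores, made up for illustration only.
odd_set = [21, 15, 36, 20, 25]
even_set = [21, 15, 25, 20]
with_outlier = [21, 15, 360, 20, 25]  # 36 replaced by an extreme value

print(median_of(odd_set))       # middle of 15, 20, 21, 25, 36
print(median_of(even_set))      # average of 20 and 21
print(median_of(with_outlier))  # unchanged by the extreme value
```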
Mode
The mode is the most repeated score
in the set of results.
Let's take the set of scores:
15, 20, 21, 20, 36, 15, 25, 15
Again, we first line up the scores in order:
15, 15, 15, 20, 20, 21, 25, 36
15 is the most repeated score and is
therefore labeled the mode.
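As a small sketch, the same mode can be found in Python by counting how often each score occurs. The scores are the ones from the slide; the variable names are illustrative.

```python
from collections import Counter

# Scores from the example above.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Count how many times each score appears, then take the most frequent one.
frequency = Counter(scores)
mode_score, mode_count = frequency.most_common(1)[0]

print(f"mode = {mode_score} (appears {mode_count} times)")
```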
Central Tendency
If the distribution is normal (i.e., bell-
shaped), the mean, median and mode
are all equal.
In our analyses, we’ll use the mean.
Dispersion
Two types of estimates:
Range
Standard deviation
Standard deviation is more
accurate/detailed because an outlier can
greatly extend the range.
Range
The range is used to identify the
highest and lowest scores.
Let's take the set of
scores: 15, 20, 21, 20, 36, 15, 25, 15.
The range runs from 15 to 36, so
21 points separate the highest score
from the lowest score.
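A minimal Python sketch of the range calculation for the same set of scores; the variable names are illustrative.

```python
# Scores from the example above.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

lowest = min(scores)
highest = max(scores)

# The range is the distance between the highest and lowest scores.
score_range = highest - lowest

print(f"lowest = {lowest}, highest = {highest}, range = {score_range}")
```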
Standard Deviation
The standard deviation is a
value that shows the relation
that individual scores have to
the mean of the sample.
If scores are said to be
standardized to a normal curve,
there are several statistical
manipulations that can be
performed to analyze the data
set.
Standard Dev. (cont.)
Assumptions may be made about
the percentage of scores as they
deviate from the mean.
If scores are normally distributed,
one can assume that
approximately 68% of the scores in
the sample fall within one standard
deviation of the mean.
Approximately 95% of the scores
would then fall within two standard
deviations of the mean.
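These percentages can be checked against the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_one = share_within(1)
within_two = share_within(2)

print(f"within 1 SD: {within_one:.1%}")
print(f"within 2 SD: {within_two:.1%}")
```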
Standard Dev. (cont.)
To calculate the standard deviation,
sum the squared deviations of all the
scores from the mean, divide by the
number of scores, and take the square
root of the result.
Squaring ensures that positive and
negative deviations from the mean
do not cancel each other out.
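The calculation can be sketched step by step in Python, using the set of scores from the earlier examples. This sketch divides by the number of scores, as described above; dividing by n - 1 instead gives the sample standard deviation, which is a different but closely related estimate. The result is compared with `statistics.pstdev` as a sanity check.

```python
import math
from statistics import pstdev

# Scores from the earlier examples.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

n = len(scores)
mean_score = sum(scores) / n

# Squared deviation of each score from the mean (always non-negative).
squared_deviations = [(x - mean_score) ** 2 for x in scores]

# Average squared deviation (the variance), then its square root.
variance = sum(squared_deviations) / n
std_dev = math.sqrt(variance)

print(f"mean = {mean_score}, variance = {variance}, SD = {std_dev:.2f}")

# Sanity check against the standard library's population SD.
assert math.isclose(std_dev, pstdev(scores))
```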
RESEARCH QUESTION: DESCRIBE
NOMINAL
Distribution: Frequency Distribution, Contingency Table
Central Tendency: Mode
ORDINAL
Central Tendency: Mode, Median
RATIO/INTERVAL
Distribution: Frequency Distribution, Contingency Table, Scatterplot
Central Tendency: Mode, Median, Mean
Variability: Range, Variance, Standard Deviation
Inferential
Statistics
Based on the laws of probability
It provides a means for drawing
conclusions about a population,
given data from a sample
It estimates population parameters
from sample statistics
Inferential
Statistics
Statistical Inference consists of two
techniques:
1. Estimation of parameters
2. Hypothesis testing
Hypothesis Testing
Statistical hypothesis testing provides
objective criteria for deciding whether
hypotheses are supported by empirical
evidence.
It is a process of disproof or rejection.
Researchers seek to reject the null
hypothesis through various statistical
tests.
Hypothesis testing uses samples to draw
conclusions about relationships within the
population.
Type I and Type II
Errors
Type I Error - researchers make a type I
error when a true null hypothesis is
rejected.
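A Type I error can be illustrated with a small simulation. The sketch below is an assumption-laden example, not a prescribed method: it draws many samples from a population where the null hypothesis is true (mean 0, known standard deviation 1), applies a two-sided z-test at a significance level of 0.05, and counts how often the true null hypothesis is rejected. The sample size, number of trials, seed, and function name are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

random.seed(0)   # fixed seed so the simulation is repeatable

ALPHA = 0.05     # assumed significance level
N = 30           # assumed sample size
TRIALS = 2000    # assumed number of simulated studies

# Two-sided critical value of the standard normal curve (about 1.96).
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_null(sample, mu0=0.0, sigma=1.0):
    """z-test: reject H0 (population mean = mu0) if |z| exceeds the critical value."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return abs(z) > critical_z

# The null hypothesis is TRUE in every trial, so each rejection is a Type I error.
rejections = sum(
    rejects_null([random.gauss(0.0, 1.0) for _ in range(N)])
    for _ in range(TRIALS)
)
type_i_rate = rejections / TRIALS

print(f"Type I error rate: {type_i_rate:.3f} (expected to be near {ALPHA})")
```

The observed rate should land close to the chosen significance level, since that level is the probability of rejecting a true null hypothesis.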