STATISTICS

What is Statistics?
The word comes from the Latin "status", which means state.
A science (branch of Mathematics) that deals with the collection, presentation, analysis, and interpretation of data.

BRANCHES OF STATISTICS

1. Descriptive Statistics
Summarizes or describes the important characteristics of a data set.
Involves the collection, organization, summarization, and presentation of data.
Example:
The monthly income of a nurse in the Philippines is ₱22,500.

2. Inferential Statistics
Provides information about a population by studying a sample.
Interprets and draws conclusions from data.
Uses hypothesis testing.
Example:
90% of married men were alive at age 65.
Inference: Being married is associated with a longer life for men.

Basic Concepts

1. Set: A group or collection of objects or numbers considered as a single entity.
2. Universe (Target Population): The set of all entities under study.
3. Population (Accessible Population): The complete collection of all possible values of the variables.
4. Sample: A subset drawn from a population.

POPULATION AND SAMPLE

● Parameter: Describes the population.
● Statistic: Describes the sample.

Statistical Data

Data: Results from experimentation, observation, or investigation, often appearing as numerical figures. Refers to any information concerning a population or sample.

Classification (Nature) of Data

1. Types of Data According to Source
Primary Data: Information gathered directly from the original source.
Secondary Data: Information taken from a secondary source.
Examples:
Primary: Data gathered by an interviewer from an interviewee.
Secondary: Information from newspapers, books, theses, etc.
2. Types of Data According to Functional Relationship
Independent Data: Data not affected by other data.
Dependent Data: Data affected by controlling data.

Examples of Independent and Dependent Variables
● How does the amount of sleep impact test scores?
○ Independent Variable: Time spent sleeping before the exam.
○ Dependent Variable: Test score.

3. Types or Categories of Data
1. Qualitative Data: Uses categories or attributes distinguished by non-numeric characteristics.
2. Quantitative Data: Consists of numbers representing counts or measurements.
Examples:
Qualitative: Gender, religion, race/ethnicity.

Types of Quantitative Data
Discrete Data
● Can assume a finite or countable number of values.
● Values obtained by counting.
Continuous Data
● Can assume an infinite number of possible values corresponding to points on a line interval.
● Values obtained by measuring.

4. Classification of Data According to Scales of Measurement
Four Levels of Measurement
1. Nominal
○ Data consists of names, labels, or categories only.
2. Ordinal
○ Measurements deal with order or rank.
3. Interval
○ Shows likeness and difference, and gives meaningful amounts of difference between data values.
4. Ratio
○ The interval level extended to include a true zero as a starting point.

Important Notes
● Interval data have no true zero point, so values cannot be meaningfully added or compared as ratios (e.g., temperature).
● Ratio data have a true zero point, so values can be added (e.g., weight).

Steps in Statistical Investigation

1. Identification of the Problem
○ Formulate a problem or concept worth studying.
2. Collection of Data
○ Different methods and techniques for gathering data.
3. Presentation of Data
○ Tabulation and organization of data in tables, graphs, and charts.
4. Analysis of Data
○ Deriving relevant information from gathered data using statistical tools.
5. Interpretation of Data
○ Drawing conclusions or inferences from analyzed data.

Methods of Collecting Data

1. Direct or Interview Method
2. Indirect or Questionnaire Method
3. Registration Method
4. Observation Method
5. Experiment Method

Methods of Presenting Data

1. Textual Form
○ Data is presented in paragraph form.
2. Tabular Form
○ Data is presented in tables (rows and columns).
○ Parts of the table: title, row names, column names, cell descriptions.
3. Graphical Form
○ Data is presented in visual forms.
○ Common Graphical Presentations:
■ Bar Chart
■ Pie Chart
■ Pareto Chart
■ Pictograph
■ Line Chart
■ Map Graph or Map Chart

Random Variables

Discussion
● Random Variable
○ Also called a stochastic variable.
○ A set of possible values from a random experiment.
○ Denoted by X or any other capital letter; its value is not constant.
○ Assumes different values due to chance.
● Definition
○ A numerical description of the outcome of a statistical experiment.
● Types of Random Variables
○ Discrete Random Variable:
■ Obtained from a counting process.
○ Continuous Random Variable:
■ Obtained from a measurement.

Random Variables: Remember
● A random variable is discrete if its set of possible outcomes is countable.
○ Examples: Number of soda cans, number of chairs, number of students in a class.
● A random variable is continuous if it takes on values on a continuous scale.
○ Examples: Height, weight, volume of water, amount of alcohol in a solution, time required to run a mile.
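To make the idea of a random variable concrete, the sketch below simulates a random experiment and records a numerical description of each outcome. The dice experiment, the function names `roll_sum` and `mile_time`, and the range used for the running time are illustrative assumptions, not examples taken from these notes.

```python
import random

def roll_sum(rng):
    """Discrete random variable: sum of two dice (a value obtained by counting)."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def mile_time(rng):
    """Continuous random variable: minutes to run a mile (hypothetical range)."""
    return rng.uniform(6.0, 12.0)

rng = random.Random(0)  # fixed seed so the run is repeatable
discrete_values = [roll_sum(rng) for _ in range(10)]
continuous_values = [mile_time(rng) for _ in range(3)]

# The dice sum changes from trial to trial, but it always falls in a countable set.
possible_sums = set(range(2, 13))
print(discrete_values, all(v in possible_sums for v in discrete_values))

# The running time can land anywhere on an interval of the number line.
print(continuous_values)
```

The dice sum can only take the eleven values 2 through 12, while the running time may fall at any point of the interval, which is the distinction drawn between discrete and continuous random variables.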
● Probability Distribution
○ Also known as a probability mass function or discrete probability distribution.
○ A table that lists probability values with their associated values in the range of a discrete random variable.

Properties of Probability Distribution

1. The sum of all probabilities must equal 1.
2. Each probability must be between 0 and 1, inclusive.

Random Variables & Probability Distributions

Definition of Terms

1. Probability
○ Describes the level of certainty (likelihood, chance, or possibility).
○ Can be expressed as a decimal, fraction, or percentage.
2. Probability Distribution
○ A table, graph, formula, or notation that supplies the probability of a given outcome's occurrence.
3. Sample Space (S)
○ Set of all possible outcomes of an experiment.
○ Types of Sample Space:
■ INFINITE Sample Space: Not finite.
■ FINITE Sample Space: Finite or definite number of outcomes.
■ NULL Sample Space: No elements in the sample space.
4. Event (E)
○ A subset of the sample space; the set of outcomes of interest.
○ SIMPLE Event: Contains only one sample point.
5. Experiment
○ A simple process of noting an outcome.
○ Outcome: A direct measurement or answer obtained after an experiment.
6. n(S)
○ Total number of outcomes (sample points) in the sample space.

Sample Space and Events

● Finite Sample Space:
○ Example: Set A = {0, 1, 2, 3, 4, 5}.
● Infinite Sample Space: A sample space whose outcomes cannot be listed in a finite list.
● Simple Event:
○ Example: Set B showing the Philippine National Fish.
● Null Sample Space: A sample space with no elements, written { } or ∅.
● Sets can be described using:
○ SEMANTIC Form: Statement describing elements.
○ ROSTER Form: Listing method.
○ SET BUILDER Notation: Rule method.
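The terms defined above can be tied together with a small enumeration. The sketch below builds the sample space S for tossing three fair coins (an illustrative choice of experiment), counts n(S), picks out an event as a subset of S, and tabulates the probability distribution of X, the number of tails. The variable names are assumptions made for this illustration, and the outcomes are assumed to be equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: every possible outcome of tossing three fair coins.
S = ["".join(toss) for toss in product("HT", repeat=3)]
n_S = len(S)  # n(S), the number of outcomes in the sample space

# Event E: "exactly two tails", a subset of S.
E = [outcome for outcome in S if outcome.count("T") == 2]
P_E = Fraction(len(E), n_S)  # valid because outcomes are equally likely

# Random variable X = number of tails; count how often each value occurs.
freq = Counter(outcome.count("T") for outcome in S)
distribution = {x: Fraction(f, n_S) for x, f in sorted(freq.items())}

for x, p in distribution.items():
    print(x, p)

# Both properties of a probability distribution hold.
assert all(0 <= p <= 1 for p in distribution.values())
assert sum(distribution.values()) == 1
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 does not depend on floating-point rounding.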
Random Variables

● A variable determined by chance, denoted by X.
● Types of Random Variables:
○ Discrete Random Variable: Countable possible outcomes.
○ Continuous Random Variable: Takes values on a continuous scale.

Probability Distributions of Discrete Random Variable

● A probability distribution provides the probability for each value of the random variable, denoted by P(X).
● Conditions:
1. The probability of each value is between 0 and 1, inclusive.
■ 1 means the event is certain (100%).
■ 0 means the event is impossible (0%).
2. The sum of all probabilities is equal to one.

Constructing a Discrete Probability Distribution

1. Make a frequency distribution for the possible outcomes.
2. Find the sum of the frequencies.
3. Find the probability of each possible outcome by dividing its frequency by the sum of the frequencies.
4. Check that each probability is between 0 and 1 and that the sum is 1.

Examples of Probability of a Simple Event

Tossing Three Fair Coins
● Sample Space: S = { HHH, HHT, THH, THT, TTH, HTT, HTH, TTT }
● Number of Outcomes: n(S) = 2^3 = 8

Histogram for the Probability Distribution of the Discrete Random Variable
● Probability P(X) vs. Number of Tails (X).

Discrete Probability Distributions: Mean, Variance, & Standard Deviation

Mean of Discrete Probability Distribution
Formula: μ = ΣX·P(X)

Mean of Discrete Random Variable Formula:
Let X be a discrete random variable with possible outcomes X1, X2, ..., Xn. Then
μ = X1·P(X1) + X2·P(X2) + ... + Xn·P(Xn)
Where:
○ X1, X2, ..., Xn are the values of the random variable X.
○ P(X1), P(X2), ..., P(Xn) are the corresponding probabilities.

Steps to Calculate Mean:
1. Construct the probability distribution.
2. Multiply the value of the random variable X by the corresponding probability P(X).
3. Get the sum of the results obtained in step 2.

Finding the Probability of Getting TAILS Using 4 Fair Coins
● n(S) = 2^4 = 16.

Statistics and Probability

Introduction to Normal Curve

● A normal probability distribution is a bell-shaped frequency distribution curve.
● Most data values cluster around the mean.
● The bell curve has a small percentage of points on both tails and a larger percentage in the center.
● In graph form, the normal distribution appears as a bell curve.

Normal Probability Distributions

● The normal curve is often called the Gaussian distribution, named after Carl Friedrich Gauss.
● Gauss is recognized as one of the greatest mathematicians and was honored on Germany's 10 Deutsche Mark bill.

Properties of Normal Curve

Characteristics of Normal Curve

1. Its highest point is directly above the mean.
2. The distribution curve is bell-shaped, varying in shape based on the mean and standard deviation.
3. The tails of the curve are asymptotic to the horizontal axis.
4. The curve is symmetric with respect to the vertical line passing through the mean.
5. The area under the normal curve represents probability and equals 1 or 100%.
6. The mean, median, and mode have the same value and are located at the same point (center).

Standard Scores (Z-scores)

● The areas under the normal curve are expressed in terms of z-values or z-scores.
● The z-score indicates the position of a single score under the normal curve, measured in standard deviations relative to the mean.

Areas Under the Normal Curve
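As a rough illustration of standard scores, the sketch below converts a raw score to a z-score and computes the area under the standard normal curve to its left. It assumes the usual definition z = (x − μ) / σ, which is not written out in these notes, and it uses the error function from Python's `math` module in place of a printed Z-table. The function names and the sample numbers are hypothetical.

```python
import math

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def area_left_of(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical example: a score of 85 where the mean is 75 and the SD is 5.
z = z_score(85, 75, 5)                        # 2.0
left = area_left_of(z)                        # about 0.9772
between = area_left_of(1) - area_left_of(-1)  # about 0.6827

print(z, round(left, 4), round(between, 4))
```

A shaded region between two z-values is found by subtracting the two left-hand areas, as in the `between` line; a region to the right of z is 1 minus the left-hand area.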
To find the area under the normal curve:

1. Compute the z-value.
2. Graph the z-value.
3. Identify the shaded region/area.
4. Find the corresponding area of the z-value in the Z-table.
5. Compute the shaded region/area.

SAMPLING & SAMPLING DISTRIBUTION

Introduction to Random Sampling

● What is sampling?
1. A technique of drawing a sample from a population.
2. Used when the entire population is not available or is too large.
● How to determine the number of samples?
1. By percentage:
■ 10% - 20% based on the researcher's discretion.
■ At least 10% for large populations, 20% for small populations.
● How to select a sample?
1. Probability Sampling: Equal chances (unbiased), random sampling.
2. Non-probability Sampling: Non-equal chances (biased), non-random sampling.

Types of Probability Sampling Techniques

1. Simple Random Sampling:
○ Elements are selected through methods like "lottery" or "fish bowl".
○ Examples:
■ In a beauty contest, questions are randomly selected using a fish bowl.
■ A teacher picks a name for recitation using index cards.
2. Systematic Random Sampling:
○ Adopts a skipping pattern in selection.
○ Better cross-section if the listing is linear but risks bias if periodicity exists.
3. Stratified Random Sampling:
○ Population is divided into at least 2 subpopulations (strata).
○ Samples are drawn proportionately from each stratum.
○ Example: Determine the number of students from each college with a 5% margin of error.
4. Cluster Sampling:
○ Population is divided into clusters.
○ All elements of selected clusters are included in the sample.
○ Example: A medical student interviews all dengue patients in 10 randomly selected barangays.
5. Multi-Stage Sampling:
○ Selection of samples occurs in several stages.
○ Example: Monthly surveys before a presidential election for 5 months.

Types of Non-Probability Sampling Techniques

1. Purposive / Judgmental Sampling:
○ Samples are selected based on specific criteria set by the researcher.
2. Quota Sampling:
○ Sample size is limited to a required number of subjects.
3. Convenience Sampling:
○ Samples are selected from a specific place at a preferred time.
4. Snowball Sampling:
○ Samples are recruited in stages, with earlier subjects referring later ones; useful in studies of deviant behavior and subcultures.

SAMPLING DISTRIBUTIONS: Mean, Variance, & Standard Deviation

Formulas for Sampling Distributions

● Population vs Sample:
○ Mean, Variance, Standard Deviation.

Sampling Distribution of the Sample Mean

● Data: 1, 2, 3, 4, 5.
● Sample Mean and Standard Deviation:
○ Example calculations for sample mean and standard deviation.

Properties of the Sampling Distributions of Sample Mean

● Standard Error of the Mean:
○ σ_x̄ = σ / √n, where σ is the population standard deviation and n is the sample size.
○ Measures the accuracy of the sample mean as an estimate of the population mean.
○ Interpretation:
■ Good estimate: small standard error.
■ Poor estimate: large standard error.

Parameter vs Statistic

● Definitions:
1. Parameters: Descriptive measures from a population.
2. Statistics: Descriptive measures from a sample.
● Examples:
1. 60% of 550 churchgoers preferred a barangay hall for Christmas.
2. 100% of 2,140 marathon runners drank at least 3 bottles of water.
3. 30 students had a mean score of 82% in a diagnostic test.
4. 10 basketball games had a mean total score of 152.
5. Average height of 25 students was 5'2".

Sampling Distributions for the Variance and Standard Deviation

● Statisticians analyze variation of individual data values about the population mean.
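The behaviour of the sample mean can be checked by brute force on the data 1, 2, 3, 4, 5 used earlier in this section. The sketch below lists every sample of size 2 drawn with replacement and compares the spread of the resulting sample means with the usual standard error formula σ/√n. The sample size of 2 and the with-replacement scheme are assumptions chosen for illustration, because under them σ/√n holds exactly; the variable names are likewise hypothetical.

```python
import math
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5]
n = 2  # sample size (assumed for illustration)

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation

# All 5**2 = 25 ordered samples of size n, drawn with replacement.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)     # centre of the sampling distribution
sd_of_means = pstdev(sample_means)     # spread of the sampling distribution
standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)

print(mu, mean_of_means, round(sd_of_means, 4), round(standard_error, 4))
```

The mean of the sample means matches the population mean, and their standard deviation matches σ/√n. If samples were drawn without replacement from this small population, the spread would be smaller than σ/√n and a finite population correction would be needed.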