STATISTICS AND PROBABILITY
LECTURE 1 | SECOND SEMESTER
INTRODUCTION TO STATISTICS
EXPLORING DATA
Using data to find an answer or a solution to a problem or an inquiry is actually using the statistical process, or doing it with STATISTICS.

STATISTICS
Statistics is a mathematical science pertaining to the collection, presentation, analysis, and interpretation of data. It is used to:
- characterize persons, objects, situations, and phenomena;
- explain relationships among variables;
- formulate objective assessments and comparisons; and
- make evidence-based decisions and predictions.

IMPORTANCE OF STATISTICS
- Everybody watches weather forecasts. These come from computer models built on statistical concepts.
- Statistics is used mostly by researchers. They use their statistical skills to collect the relevant data; otherwise, the result is a loss of money, time, and data.
- News reporters predict the winner of an election based on political campaigns.
- Statistical data allow us to collect information from around the world. The internet is a device that helps us collect this information, and the fundamentals behind the internet are based on concepts from statistics and mathematics.

TWO MAJOR AREAS OF STATISTICS
1. Descriptive Statistics
It comprises the statistical tools that deal with the presentation of observed data, either in forms such as tables, graphs, and diagrams, or by computing measures that summarize the characteristics of the data set.
2. Inferential Statistics
It consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.
Example:
- Allergy therapy makes bees go away. (Source: Prevention)
- Drinking decaffeinated coffee can raise cholesterol levels by 7%. (Source: American Heart Association)

KNOW YOUR JARGON
1. Population
Totality of observations
2. Sample
Subset of a population
3. Data
The values (measurements or observations) that the variable can assume
4. Data Set
A collection of data values
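To make the terms population and sample, and the two areas of statistics, more concrete, here is a minimal Python sketch. The population of student heights is hypothetical and generated only for illustration, and the names `population`, `sample`, `sample_mean`, and `population_mean` are assumptions, not part of the lecture.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 students,
# generated with a fixed seed purely for illustration.
rng = random.Random(42)
population = [round(rng.gauss(160, 8), 1) for _ in range(1000)]

# A sample is a subset of the population (here, 50 students chosen at random).
sample = rng.sample(population, 50)

# Descriptive statistics: summarize the data that were actually observed.
sample_mean = statistics.mean(sample)

# Inferential statistics: use the sample summary as an estimate of the
# population value, which in practice is usually unknown.
population_mean = statistics.mean(population)

print(f"sample mean (estimate): {sample_mean:.1f}")
print(f"population mean (normally unknown): {population_mean:.1f}")
```

The two printed values should be close but not identical, which is why inferential statistics also deals with estimation error and hypothesis tests.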
ALLEYAH D. GARLAN 11 - XI
LEVELS OF MEASUREMENT
1. Nominal
Used for naming variables
Ex: Brands, Color, Blood Type, Tribe
2. Ordinal
Classifies data into categories that can be ranked
Ex: Sizes, Rank, Educ. Attainment
3. Interval
It lacks an inherent zero starting point, that is, it lacks an absolute zero. Absolute zero means the total absence of the characteristic being measured.
Ex: Temperature (in °C or °F), IQ
4. Ratio
There exists a true zero, meaning it has an inherent zero starting point
Ex: Height, Length, Weight, Salary, Age

STATISTICAL PROCESS IN SOLVING A PROBLEM
- Planning or designing the collection of data to answer statistical questions in a way that maximizes information content and minimizes bias;
- Collecting the data as required in the plan;
- Verifying the quality of the data after they were collected;
- Summarizing the information extracted from the data; and
- Examining the summary statistics so that insight and meaningful information can be produced to support decision-making or solutions to the question or problem at hand.
NOTE: Not all questions or problems can be answered by a simple statistical process; some need a complex statistical process. However, one can be assured that logical decisions or solutions can be formulated using a statistical process.

TRIVIA
- The word “statistics” actually comes from the word “state”, because governments have been involved in statistical activities, especially the conduct of censuses for either military or taxation purposes.
- In the Christian Bible, particularly the Book of Numbers, God is reported to have instructed Moses to carry out a census.
- Another census mentioned in the Bible is the census ordered by Caesar Augustus throughout the entire Roman Empire before the birth of Christ.
- The Philippine Statistics Authority (PSA) conducts censuses to obtain information about socio-demographic characteristics of the residents of the country.
- Census data are used by the government to make plans, such as how many schools and hospitals to build. Censuses of population and housing are conducted every 10 years on years ending in zero (e.g., 1990, 2000, 2010) to obtain population counts and demographic information about all Filipinos.

DATA
- A collection of facts from experiments, observations, sample surveys and censuses, and administrative reporting systems.
- Data are facts and figures that are collected, presented, and analyzed.
- Data are either numeric or non-numeric and must be contextualized.

CONTEXTUALIZATION OF DATA
To contextualize data, that is, to put meaning on the data, we must identify its six W’s:
Who? Who provided the data?
What? What is the unit of --
When? When was the data collected?
Where? Where was the data collected?
Why? Why was the data collected?
HoW? HoW was the data collected?
Once the data are contextualized, there is now meaning to the collection of numbers and symbols, which may now look like the following:
[Figure: sample of contextualized data]

METHODS OF PRESENTING DATA
Presentation of data refers to the organization of data into tables, graphs, or charts, so that logical and statistical conclusions can be derived from the collected measurements.

1. Textual Presentation
The data collected are presented in sentence form.
Ex: Ten of the respondents are male and thirteen of the respondents are female.

2. Tabular Presentation
A systematic and logical arrangement of data in the form of rows and columns.
Frequency Distribution Table
- A tabular arrangement of data into appropriate categories showing the number of observations in each category or group.

3. Graphical Presentation
Bar Graph
- Often used when comparing values from two or more groups or categories.
- It is constructed by plotting the class limits against the frequency.
Frequency Polygon
- It is constructed by plotting the class marks against the frequency.
Histogram
- This closely resembles the bar chart, with the basic difference that a bar chart uses class limits while a histogram employs class boundaries.
- Using the class boundaries eliminates the spaces between rectangles, thus giving it a solid appearance.
- It is constructed by plotting the class boundaries against the frequency.
Frequency Ogive
- It represents a cumulative frequency distribution.
- It is constructed by plotting the upper class boundaries (horizontal scale) against the "less than" cumulative frequencies (vertical scale).
Pie Graph
- This is a circle divided into pie-shaped sections, which look like slices of pizza.
- The angle of a sector is proportional in size to the frequencies or relative frequencies.
- Angle of a Sector = Rf x 360°, where Rf is the relative frequency of the category.

CONSTRUCTING A FREQUENCY DISTRIBUTION
1. Find the range (R) of the raw data:
R = (highest value) - (lowest value)
2. Decide on the number of class intervals (or simply classes), k:
k = 1 + 3.322 log N
where N is the number of observations and log is the base-10 logarithm
3. Determine the class size/class width, c
4. Determine the class limits of the k classes
5. Tally the observations in the frequency column
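The five steps for constructing a frequency distribution can be sketched in Python as follows. The quiz scores are hypothetical. Two choices are assumptions not stated in the notes: k is rounded to the nearest whole number, and the class width is taken as c = ceil((R + 1) / k), which suits integer data and guarantees that the highest value falls inside the last class. The sketch also records the class boundaries, class marks, "less than" cumulative frequencies, and pie-graph angles used by the graphs described above.

```python
import math

# Hypothetical raw data: 20 quiz scores, made up for illustration.
scores = [12, 15, 18, 21, 22, 24, 25, 27, 28, 30,
          31, 33, 34, 36, 38, 40, 41, 43, 45, 48]
N = len(scores)

# Step 1: range R = highest value - lowest value.
R = max(scores) - min(scores)

# Step 2: number of classes, k = 1 + 3.322 log N (base-10 log),
# rounded to the nearest whole number (the rounding rule is an assumption).
k = round(1 + 3.322 * math.log10(N))

# Step 3: class width c. Assumption: integer data, so the k classes must
# cover R + 1 distinct values; rounding up keeps the highest value included.
c = math.ceil((R + 1) / k)

# Steps 4 and 5: build the class limits, then tally each class.
table = []
cumulative = 0
for i in range(k):
    lower = min(scores) + i * c
    upper = lower + c - 1
    freq = sum(1 for s in scores if lower <= s <= upper)
    cumulative += freq
    table.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # histogram and ogive
        "class_mark": (lower + upper) / 2,          # frequency polygon
        "freq": freq,
        "less_than_cf": cumulative,                 # ogive
        "angle": freq / N * 360,                    # pie graph: Rf x 360 degrees
    })

for row in table:
    print(row)
```

Other textbooks round k or c differently; whichever rule is used, the frequencies should add up to N and the sector angles to 360°.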
STATISTICS AND PROBABILITY
LECTURE 2 | SECOND SEMESTER
SOME MATHEMATICAL NOTATION
THE SUMMATION SYMBOL
- The Greek letter “Σ” (upper case sigma) denotes the summation symbol.
- It is a compact way to write the sum of a set of data values.
- A convenient way of writing a data value in mathematical notation is the subscripted variable $x_i$, which is read as ‘x sub i’.
- When a set of data values is written as subscripted variables $x_1, x_2, x_3, \dots, x_n$, their sum is written as $\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$.
EXAMPLES
[Figure: worked examples of summation notation]

THE FACTORIAL SYMBOL
- The symbol “!” denotes the factorial symbol.
- The factorial notation is a compact way of writing the product of a sequence of positive integers.
- The symbol n! is defined as $n! = n(n-1)(n-2)\cdots(2)(1)$.
Note:
1. n! is the product of all positive integers less than or equal to n.
2. By convention, 0! = 1
EXAMPLE
[Figure: worked example]

DATA DESCRIPTION
The word average is ambiguous, since several methods can be used to obtain an average. Loosely stated, the average means the center of the distribution or the most typical case.
Measures of average are also called measures of central tendency. However, knowing the average of a data set is not enough to describe the data entirely.

UNGROUPED VS GROUPED DATA
- Ungrouped data, also known as raw data, is data that has not been placed in any group or category after collection.
- Grouped data is data that has been classified into groups after collection.

MEASURES OF CENTRAL TENDENCY
- A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data.
- As such, measures of central tendency are sometimes called measures of central location.

ARITHMETIC MEAN
UNGROUPED | GROUPED
[Figure: formulas for the mean of ungrouped and grouped data]
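The summation and factorial notation, and the arithmetic mean for ungrouped and grouped data, can be sketched in Python as below. The formulas used are the commonly taught ones, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ for ungrouped data and $\bar{x} = \frac{\sum f_i m_i}{\sum f_i}$ for grouped data (with $m_i$ the class mark and $f_i$ the class frequency). They are an assumption about what the notes intend, since the original formulas are not reproduced here. All data values are hypothetical.

```python
import math

# Hypothetical ungrouped (raw) data: x_1, x_2, ..., x_n.
x = [4, 8, 15, 16, 23, 42]
n = len(x)

# Sigma notation: the sum of x_i for i = 1, ..., n.
# Python indexes from 0, so x_i corresponds to x[i - 1].
total = sum(x[i - 1] for i in range(1, n + 1))

def factorial(m):
    """Product of all positive integers <= m; 0! = 1 by convention."""
    result = 1
    for j in range(2, m + 1):
        result *= j
    return result

# Ungrouped mean: sum of the values divided by the number of values.
mean_ungrouped = total / n

# Hypothetical grouped data: class marks m_i with frequencies f_i.
class_marks = [15.5, 23.5, 31.5, 39.5, 47.5]
freqs = [3, 5, 5, 5, 2]

# Grouped mean: each class mark stands in for the values in its class,
# so the result is an approximation of the mean of the raw data.
mean_grouped = sum(f * m for f, m in zip(freqs, class_marks)) / sum(freqs)

print(total, factorial(5), factorial(0))
print(mean_ungrouped, mean_grouped)
```

The grouped mean only approximates the mean of the raw data, because the individual values inside each class are replaced by the class mark.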
THE MEAN
The mean (or arithmetic average) is the
most popular and well known measure of
central tendency. It can be used with both
discrete and continuous data, although its
use is most often with continuous data. The
mean is obtained by adding all the values
of the data and dividing the sum by the
total number of values.
ADVANTAGES OF THE MEAN
1. The mean is often not one of the actual
values that you have observed in your
data set. However, one of its important
properties is that it minimizes error in the
prediction of any one value in your data
set. That is, it is the single value with the
smallest sum of squared deviations from
the values in the data set.
2. It includes every value in your data set
as part of the calculation.
3. The mean is the only measure of central
tendency where the sum of the
deviations of each value from the mean
is always zero.
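These three properties can be checked numerically. The sketch below uses a small hypothetical data set; "error" is taken to mean the sum of squared deviations, which is the usual sense in which the mean is the error-minimizing value, and the helper name `sse` is an assumption made for illustration.

```python
import statistics

# Hypothetical data set; note that the mean is not one of the observed values.
data = [3, 7, 8, 10, 27]
mean = statistics.mean(data)
median = statistics.median(data)

# Property 3: the deviations from the mean always sum to zero.
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

def sse(center):
    """Sum of squared deviations of the data from a candidate center."""
    return sum((x - center) ** 2 for x in data)

# Property 1: no other candidate center gives a smaller squared error.
errors = {"mean": sse(mean), "median": sse(median)}

# Property 2: every value enters the calculation, so changing a single
# value (here the largest one) changes the mean.
changed = data[:-1] + [127]
mean_after_change = statistics.mean(changed)

print(mean, sum_of_deviations, errors, mean_after_change)
```

The last step also shows the flip side of property 2: one extreme value is enough to pull the mean away from the bulk of the data.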