History and Basics of Statistics

Uploaded by

ramdomingo679
STAT

Origins of Statistics: Ppt.1

❖ Statistics is as old as man’s societal existence.

❖ As a discipline, it began when man started to count and measure.

❖ Historically, statistics dates back to the ancient Egyptians and Chinese, who used charts and tables to keep state records.

Statistics comes from the Latin word “status”, meaning “state”.

It involves the compilation of data and graphs describing various aspects of a state or country.

During the Chou (Zhou) Dynasty, which began around 1046 BC, the state extensively recorded its revenue collection and expenditure.

Thus, in ancient days, “state-istics” was the place to find information on taxes, soldiers, crop yields, and athletic endeavors. It helped governments investigate how many farms were taxable and how many men were ripe for military training. It was also used to determine birth and mortality rates.

Statistics is the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

Statistics is a science that helps us make decisions and draw conclusions in the presence of variability.

Statistics is a branch of Mathematics that deals with the scientific collection, organization, presentation, analysis, and interpretation of numerical data in order to obtain useful and meaningful information.

Statistics is simply the science of data.

How do we apply Statistics?
Statistics supports the creative process.

- in making decisions
- in solving problems
- in designing products and processes

STEPS IN SCIENTIFIC METHOD

⬥ Develop a clear description
⬥ Identify the important factors
⬥ Propose a model
⬥ Refine a model
⬥ Conduct experiments
⬥ Manipulate the model
⬥ Confirm the solution
⬥ Conclusions and recommendations

There are two major areas of statistics.

Were you able to remember them?

What are those two?

Differentiate them.

The two major areas of statistics:

Descriptive Statistics – is a statistical method concerned with the properties and characteristics of a set of data. It includes the techniques concerned with organizing, summarizing, and describing data.

Inferential Statistics – is a statistical method concerned with the analysis of sample data leading to predictions, inferences, interpretations, or conclusions about the entire population. It includes making conclusions, generalizations, predictions, or approximations in the face of uncertainty.

Activity 1

Which of these will use descriptive statistics and/or inferential statistics?

⮚ The DCS instructors list the number of enrolled BSIT students in A.Y. 2020-2021.
⮚ Maria makes a line graph to describe the net profit for the year 2019.
⮚ PAGASA forecasts how many typhoons will enter the PAR this year.
⮚ The CvSU guards (civil security) record the number of students on the first day of classes.
⮚ Robert records the number of customers he had this week in order to predict the number of customers he will have next week.

Types of Data:

Quantitative Data – refers to numerical information obtained from counting or measuring, which can be manipulated by any fundamental operation.

Qualitative Data – refers to descriptive attributes that cannot be subjected to mathematical operations.

Variability

Statistical methods are used to help us describe and understand variability.

By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result.

We all encounter variability in our everyday lives, and statistical thinking can give us a useful way to incorporate this variability into our decision-making processes.

Consider the gasoline mileage performance of one’s car.
Do you always get exactly the same mileage performance on every tank of fuel?

Of course not; in fact, sometimes the mileage performance varies considerably.

This observed variability in gasoline mileage depends on many factors, such as:

* the type of driving that has occurred most recently (city versus highway),
* the changes in the vehicle’s condition over time (which could include factors such as tire inflation, engine compression, or valve wear),
* the brand and/or octane number of the gasoline used, or
* possibly even the weather conditions that have been recently experienced.

Suppose an engineer is designing a nylon connector to be used in an auto engine application.

The engineer is considering establishing the design specification on wall thickness at 3/32 inch but is somewhat uncertain about the effect of this decision on the connector pull-off force.

If the pull-off force is too low, the connector may fail when it is installed in an engine.

Eight prototype units are produced and their pull-off forces measured, resulting in the following data (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.

As we anticipated, not all of the prototypes have the same pull-off force. We say that there is variability in the pull-off force measurements.

We consider the pull-off force to be a random variable. A convenient way to think of a random variable, say X, that represents a measurement is by using the model X = μ + ε, where μ is a constant and ε is a random disturbance.
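
As a rough sketch of this idea in Python, the snippet below reads the eight measurements as 12.6, 12.9, ..., 13.1 pounds and uses the additive model X = μ + ε (a constant plus a random disturbance). Estimating μ by the sample mean is an assumption made here for illustration only.

```python
from statistics import mean

# Pull-off forces in pounds, read with decimal points (12.6 rather than "12 6");
# this reading of the data is an assumption.
pull_off_forces = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

# Estimate the constant mu by the sample mean (illustrative choice).
mu_hat = mean(pull_off_forces)

# Each deviation from the mean plays the role of the disturbance for one unit.
disturbances = [round(x - mu_hat, 2) for x in pull_off_forces]

print(f"estimated mu: {mu_hat:.2f} pounds")
print("disturbances:", disturbances)
```

The disturbances are not all zero, which is exactly the variability described above.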

Types of Variables:

Quantitative Variables – are numerical in nature and can be ordered or ranked.

Qualitative Variables – can be separated into different categories that are distinguished by some non-numeric characteristic.

Types of Quantitative Variables

❑ Discrete Variable – a variable whose values can be counted using integral values; its values cannot take the form of decimals.

❑ Continuous Variable – a variable that can assume any numerical value over an interval.

Activity 2

A survey of students in a certain school is conducted. The survey questionnaire details the information on the following variables. For each of these variables, identify whether the variable is qualitative or quantitative, and if the latter, state whether it is discrete or continuous.

a. number of family members who are working
b. ownership of a cell phone among family members
c. length (in minutes) of the longest call made on each cell phone owned per month
d. ownership/rental of dwelling

Activity for the week.
Write a reflective paper on your views about the basic concepts of, or related to, statistics that you have learned, and include the following in your reflections.

1. Who are the prominent persons involved in the history of statistics?

2. What is the importance of statistics? Is it also important to you as a student?

See our G classroom for the rubrics

Thank you so much for attending the class.

Have a nice and fruitful day.

STATISTICAL TERMS:
by A. Torres
Instructor
Here are some of the statistical terms often used in the study of statistics.
1. Data – is any quantitative or qualitative information.
a. Quantitative data – refers to numerical information obtained from counting or measuring, which may be manipulated by any fundamental operation.
b. Qualitative data – refers to descriptive attributes that cannot be subjected to mathematical operations.

2. Population – refers to the totality of all the elements or persons in which one has interest at a particular time. The usual notation for the population size is N.

3. Sample – is a part of the population determined by sampling procedures. Its size is usually denoted by n.
4. Parameter – is any statistical information or attribute taken from a population.

5. Statistic – is any estimate of a statistical attribute taken from a sample.

6. Variable – is a specific factor, property, or characteristic of a population or a sample which differentiates a sample or group of samples from another group.

a. Discrete Variable – is a variable whose values are obtained by counting.
b. Continuous Variable – is a variable whose values are obtained by measuring objects or attributes.
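
The difference between a parameter and a statistic can be sketched in Python. The population below is entirely made up (random ages), and the sizes N = 1000 and n = 50 are arbitrary choices for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of N = 1000 people (made-up values).
population = [random.randint(18, 60) for _ in range(1000)]
N = len(population)

# Draw a simple random sample of size n = 50 without replacement.
sample = random.sample(population, 50)
n = len(sample)

parameter = mean(population)  # computed from the whole population
statistic = mean(sample)      # computed from the sample; estimates the parameter

print(f"N = {N}, population mean (parameter) = {parameter:.2f}")
print(f"n = {n}, sample mean (statistic)     = {statistic:.2f}")
```

The two means are usually close but not identical, which is why a statistic is described as an estimate.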
SCALES OF MEASUREMENTS
1. Nominal Measurement – a type of statistical data that depicts the presence or absence of a certain attribute. This usually involves the arbitrary assignment of numbers to represent the attribute.

2. Ordinal Measurement – provides the degree of the presence of an attribute. Usually, data are classified according to order or rank.

3. Interval Measurement – the measurement where data are arranged in some order and the differences between data are meaningful. Data at this level may lack an inherent zero starting point.

4. Ratio Measurement – an interval-level measurement modified to include an inherent zero starting point.
▶ Write the following in sigma notation.
ELEMENTARY
STATISTICS
BSEM 26
STAT
Ppt 3

Measure of Central Tendency or Measure of Average

Prepared by: Ms. Analissa S. Torres

A measure of central tendency (or measure of average) is a quantitative representation of the data under investigation. This statistic tends to lie within the center of the set of data.

Shape of the Distribution

► Symmetrical (mean is about equal to median)
► Skewed
    ► Negatively (example: years of education): mean < median
    ► Positively (example: income): mean > median
► Bimodal (two distinct modes)
► Multi-modal (more than 2 distinct modes)
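
The relationship between skewness and the order of the mean and median can be illustrated with a small Python sketch. The income figures are hypothetical, and mirroring them around 220 is simply a convenient way to build a negatively skewed set.

```python
from statistics import mean, median

# Hypothetical incomes (in thousands); one very large value stretches
# the right tail, which makes the data positively skewed.
incomes = [20, 22, 25, 27, 30, 200]

# Mirror the incomes around 220 to obtain a negatively skewed set.
mirrored = [220 - x for x in incomes]

for name, data in [("positively skewed", incomes),
                   ("negatively skewed", mirrored)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In the first set the mean exceeds the median; in the mirrored set the mean falls below it.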

[Figures: Distribution Shape]

Considerations for Choosing a Measure of Central Tendency

► For a nominal variable, the mode is the only measure that can be used.
► For ordinal variables, the mode and the median may be used. The median provides more information (taking into account the ranking of categories).
► For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.
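
These considerations can be written as a small lookup in Python. The function name `applicable_measures` and all three data sets are hypothetical, chosen only to illustrate the rule for each level of measurement.

```python
from statistics import mean, median, multimode

def applicable_measures(level):
    """Return the measures of central tendency usable at a measurement level."""
    table = {
        "nominal": ["mode"],
        "ordinal": ["mode", "median"],
        "interval-ratio": ["mode", "median", "mean"],
    }
    return table[level]

# Hypothetical data for each level
blood_types = ["O", "A", "O", "B", "O", "AB"]  # nominal
satisfaction = [1, 2, 2, 3, 3, 3, 4]           # ordinal ranks, 1 = lowest
heights_cm = [150, 152, 155, 158, 160]         # interval-ratio

print(applicable_measures("nominal"), multimode(blood_types))
print(applicable_measures("ordinal"), multimode(satisfaction), median(satisfaction))
print(applicable_measures("interval-ratio"), median(heights_cm), mean(heights_cm))
```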
Measure of Central Tendency
Introduction:
• In statistics, a central tendency is a central value or a typical value for a probability distribution.
• It is occasionally called an average or just the center of the distribution.
• The most common measures of central tendency are the arithmetic mean, the median, and the mode.
• Measures of central tendency are defined for a population (a large set of objects of a similar nature) and for a sample (a portion of the elements of a population).
Some Definitions
Simpson and Kafka defined it as “A measure of central tendency is a typical value around which other figures gather.”
Waugh expressed it as “An average stands for the whole group of which it forms a part, yet represents the whole.”
In layman’s terms, a measure of central tendency is an AVERAGE. It is a single number or value which can be considered typical of a set of data as a whole.
Importance of Central Tendency

• To find a representative value
• To make data more concise
• To make comparisons
• Helpful in further statistical analysis
Mean
• The MEAN of a set of values or measurements is the sum of all the measurements divided by the number of measurements in the set.

• The mean is the most popular and widely used measure of central tendency. It is sometimes called the arithmetic mean.
Mean for Ungrouped Data
• If we get the mean of the sample, we call it the sample mean, and it is denoted by x̄ (read “x bar”).

• If we compute the mean of the population, we call it the parametric or population mean, denoted by μ (read “mu”).
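Methods of Calculating the Arithmetic Mean for Grouped Data:

In the formulas below, x is the class midpoint and f is the class frequency.

• Direct method: x̄ = Σfx / Σf

• Short cut method: x̄ = A + Σfd / Σf, where A is an assumed mean and d = x − A

• Step deviation method: x̄ = A + (Σfd′ / Σf) × c, where d′ = (x − A) / c and c is the class width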
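
The three methods should always agree, which the Python sketch below checks. It uses the standard textbook forms of the direct, short cut, and step deviation methods (an assumption, since the original formulas are not shown) and reuses the marks distribution from the harmonic mean example later in this deck. The assumed mean A = 64.5 is an arbitrary choice.

```python
import math

# Class midpoints (x) and frequencies (f) for classes 30-39, 40-49, ..., 90-99
midpoints = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [2, 3, 11, 20, 32, 25, 7]
n = sum(freqs)

# Direct method: mean = sum(f * x) / sum(f)
direct = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Short cut method: choose an assumed mean A and use deviations d = x - A
A = 64.5  # assumed mean; any convenient value works
short_cut = A + sum(f * (x - A) for f, x in zip(freqs, midpoints)) / n

# Step deviation method: divide the deviations by the class width c
c = 10
step_dev = A + c * sum(f * ((x - A) / c) for f, x in zip(freqs, midpoints)) / n

print(direct, short_cut, step_dev)
```

The short cut and step deviation methods exist mainly to simplify hand calculation; with a computer, the direct method is usually sufficient.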


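Example of A.M.:

A sample of five executives received the following bonuses last year ($000):
14.0, 15.0, 17.0, 16.0, 15.0
Solution:
x̄ = (14.0 + 15.0 + 17.0 + 16.0 + 15.0) / 5 = 77.0 / 5 = 15.4, that is, a mean bonus of $15,400.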
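
The arithmetic for this example can be checked with a few lines of Python; the variable names are illustrative.

```python
# Bonuses of the five executives, in thousands of dollars
bonuses = [14.0, 15.0, 17.0, 16.0, 15.0]

# Arithmetic mean = sum of the values / number of values
sample_mean = sum(bonuses) / len(bonuses)

print(f"sample mean = {sample_mean} thousand dollars")
```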
Weighted Mean
• The weighted mean is the mean of a set of values wherein each value or measurement has a different weight or degree of importance. The following is its formula:

x̄ = Σwx / Σw

where x̄ = mean
x = measurement or value
w = weight of each measurement or value
Example of W.M.
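
As a hypothetical illustration (the grades and unit weights below are made up), the weighted mean can be computed in Python as the sum of weight times value divided by the sum of the weights.

```python
# Hypothetical course grades and the unit weights of the corresponding courses
grades = [85, 90, 78]
units = [3, 2, 5]

# Weighted mean = sum(w * x) / sum(w)
weighted_mean = sum(w * x for w, x in zip(units, grades)) / sum(units)

# Unweighted mean, for comparison
simple_mean = sum(grades) / len(grades)

print(weighted_mean)
print(round(simple_mean, 2))
```

The weighted mean is lower than the simple mean here because the lowest grade carries the largest weight.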
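Harmonic Mean
• The harmonic mean is the quotient of the “number of the given values” and the “sum of the reciprocals of the given values”.
• For ungrouped data: H.M. = n / Σ(1/x)

• For grouped data: H.M. = Σf / Σ(f/x), where x is the class midpoint and f is the class frequency
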
Harmonic Mean Example
Calculate the harmonic mean of the numbers 13.2, 14.2, 14.8, 15.2 and 16.1.
Solution:
The harmonic mean is calculated as below:

x       1/x
13.2    0.0758
14.2    0.0704
14.8    0.0676
15.2    0.0658
16.1    0.0621
Total   0.3417

H.M. = n / Σ(1/x) = 5 / 0.3417 ≈ 14.63
Example: Calculate the harmonic mean for the data given below:

Marks   30-39   40-49   50-59   60-69   70-79   80-89   90-99
f       2       3       11      20      32      25      7

Solution: We find the H.M. as follows, where x is the class midpoint:

Marks   x       f       f/x
30-39   34.5    2       0.0580
40-49   44.5    3       0.0674
50-59   54.5    11      0.2018
60-69   64.5    20      0.3101
70-79   74.5    32      0.4295
80-89   84.5    25      0.2959
90-99   94.5    7       0.0741
Total           100     1.4368

H.M. = Σf / Σ(f/x) = 100 / 1.4368 ≈ 69.60
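
Both harmonic mean examples can be reproduced with a short Python sketch. It applies the definitions directly (number of values over the sum of reciprocals, and total frequency over the sum of f/x for grouped data) and cross-checks the ungrouped case against `statistics.harmonic_mean` from the standard library.

```python
from statistics import harmonic_mean

# Ungrouped data: H.M. = n / sum(1 / x)
values = [13.2, 14.2, 14.8, 15.2, 16.1]
hm_ungrouped = len(values) / sum(1 / x for x in values)

# Grouped data: H.M. = sum(f) / sum(f / x), with x the class midpoint
midpoints = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [2, 3, 11, 20, 32, 25, 7]
hm_grouped = sum(freqs) / sum(f / x for f, x in zip(freqs, midpoints))

print(round(hm_ungrouped, 2), round(hm_grouped, 2))

# Cross-check the ungrouped result against the standard library
print(round(harmonic_mean(values), 2))
```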
Geometric Mean
• The geometric mean is a kind of average of a set of numbers that is different from the arithmetic average.
• The geometric mean is well defined only for sets of positive real numbers. It is calculated by multiplying all the numbers (call the number of numbers n) and taking the nth root of the product.
• A common example of where the geometric mean is the correct choice is when averaging growth rates.
• The geometric mean is NOT the arithmetic mean and it is NOT a simple average.
• Mathematical definition: the nth root of the product of n numbers.
Formulas
G.M. = (x₁ · x₂ · … · xₙ)^(1/n)
G.M. = antilog(Σ log x / n)
Question 1: Find the geometric mean of the following values:
15, 12, 13, 19, 10

x       log x
15      1.1761
12      1.0792
13      1.1139
19      1.2788
10      1.0000
Total   5.648

G.M. = antilog(5.648 / 5) = antilog(1.1296) ≈ 13.48
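
The geometric mean for Question 1 can be computed in Python both ways: as the nth root of the product and through base-10 logarithms, as in the table. The sketch also cross-checks against `statistics.geometric_mean` (available in Python 3.8 and later).

```python
import math
from statistics import geometric_mean

values = [15, 12, 13, 19, 10]
n = len(values)

# Definition: nth root of the product of the n values
gm_product = math.prod(values) ** (1 / n)

# Log method: antilog of the mean of the base-10 logarithms
gm_logs = 10 ** (sum(math.log10(x) for x in values) / n)

print(round(gm_product, 2), round(gm_logs, 2), round(geometric_mean(values), 2))
```

The log method avoids very large intermediate products, which is why it is preferred for hand calculation with long lists of values.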
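Median
• The MEDIAN, denoted Md, is the middle value of the sample when the data are ranked in order according to size.
• Connor defined it as “The median is that value of the variable which divides the group into two equal parts, one part comprising all values greater, and the other, all values less than the median.”
• For ungrouped data, the median is calculated as:
Md = the value of the ((n + 1) / 2)th item in the ranked data (for even n, the mean of the two middle values)
• For grouped data:
Md = L + ((n/2 − cf) / f) × c, where L is the lower boundary of the median class, cf is the cumulative frequency before that class, f is its frequency, and c is the class width

Example of Median
Example of Median for Grouped Data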
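
As an illustrative sketch, the median can be computed in Python for both cases. The ungrouped data are the five executive bonuses from the arithmetic mean example; the grouped data are the marks distribution from the harmonic mean example. The interpolation formula L + ((n/2 − cf) / f) × c is the standard textbook form and is assumed here, as is the helper name `grouped_median`.

```python
from statistics import median

# Ungrouped data: middle value of the ranked observations
bonuses = [14.0, 15.0, 17.0, 16.0, 15.0]
md_ungrouped = median(bonuses)

def grouped_median(lower_limits, freqs, width):
    """Interpolated median: L + ((n/2 - cf) / f) * c for the median class."""
    n = sum(freqs)
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:
            boundary = lower - 0.5  # lower class boundary, assuming integer limits
            return boundary + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must not be empty")

# Classes 30-39, 40-49, ..., 90-99 with class width 10
md_grouped = grouped_median([30, 40, 50, 60, 70, 80, 90],
                            [2, 3, 11, 20, 32, 25, 7], 10)

print(md_ungrouped, md_grouped)
```

For the grouped data, the median class is 70-79 because the cumulative frequency first reaches half of the 100 observations there.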
Mode
• The MODE, denoted Mo, is the value which occurs most frequently in a set of measurements or values. In other words, it is the most popular value in a given set.

• Croxton and Cowden defined it as “The mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values.”
Example 2: In a crash test, 11 cars were tested to determine what impact speed was required to obtain minimal bumper damage. Find the mode of the speeds given in miles per hour below.
24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24

Solution: Ordering the data from least to greatest, we get:
15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26
Answer: Since both 18 and 24 occur three times, the modes are 18 and 24 miles per hour.
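
The same answer can be obtained in Python with `statistics.multimode` (Python 3.8 and later), which returns every value tied for the highest frequency.

```python
from statistics import multimode

speeds = [24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24]

# multimode lists the tied values in order of first appearance, so sort them
modes = sorted(multimode(speeds))

print(modes)  # [18, 24]
```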
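Formula:
Mo = L + (d₁ / (d₁ + d₂)) × c, where L is the lower boundary of the modal class, d₁ is the modal class frequency minus the frequency of the class before it, d₂ is the modal class frequency minus the frequency of the class after it, and c is the class width

Find the modal class and the actual mode of the data set below.

Number      Frequency
1 - 3       7
4 - 6       6
7 - 9       4
10 - 12     2
13 - 15     2
16 - 18     8
19 - 21     1
22 - 24     2
25 - 27     3
28 - 30     2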
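
One way to work this exercise is sketched below in Python. The interpolation formula L + d1 / (d1 + d2) × c is the standard textbook form and is an assumption here, since the original formula is not shown.

```python
# Classes (lower limit, upper limit) and their frequencies from the table
classes = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15),
           (16, 18), (19, 21), (22, 24), (25, 27), (28, 30)]
freqs = [7, 6, 4, 2, 2, 8, 1, 2, 3, 2]

# The modal class is the class with the highest frequency
i = freqs.index(max(freqs))
modal_class = classes[i]

# Frequencies of the neighbouring classes (0 if there is no neighbour)
f_prev = freqs[i - 1] if i > 0 else 0
f_next = freqs[i + 1] if i < len(freqs) - 1 else 0

d1 = freqs[i] - f_prev                   # excess over the class before
d2 = freqs[i] - f_next                   # excess over the class after
L = modal_class[0] - 0.5                 # lower class boundary
c = modal_class[1] - modal_class[0] + 1  # class width

mode = L + d1 / (d1 + d2) * c

print(modal_class, round(mode, 2))
```

Under these assumptions the modal class is 16 - 18 and the interpolated mode is about 16.88.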
Advantages of Mode :
• Mode is readily comprehensible and easily calculated
• It is the best representative of data
• It is not affected at all by extreme values.
• The value of mode can also be determined graphically.
• It is usually an actual value of an important part of the series.
Disadvantages of Mode :
• It is not based on all observations.
• It is not capable of further mathematical manipulation.
• Mode is affected to a great extent by sampling fluctuations.
• Choice of grouping has great influence on the value of mode.
Advantages of Median:
• The median can be calculated in all distributions.
• The median can be understood even by common people.
• The median can be ascertained even with extreme items.
• It can be located graphically.
• It is most useful when dealing with qualitative data.

Disadvantages of Median:
• It is not based on all the values.
• It is not capable of further mathematical treatment.
• It is affected by sampling fluctuations.
• In the case of an even number of values, it may not be a value from the data.
Properties of mode

• It is used when you want to find the value which occurs most
often.
• It is a quick approximation of the average.
• It is an inspection average.
• It is the most unreliable among the three measures of central
tendency because its value is undefined in some
observations.
Properties of Mean
1. Mean can be calculated for any set of numerical data, so it always exists.
2. A set of numerical data has one and only one mean.
3. Mean is the most reliable measure of central tendency since it takes into
account every item in the set of data.
4. It is greatly affected by extreme or deviant values (outliers)
5. It is used only if the data are interval or ratio.
Relations Between the Measures of Central Tendency

• In symmetrical distributions, the median and mean are equal. For normal distributions, mean = median = mode.
• In positively skewed distributions, the mean is greater than the median.
• In negatively skewed distributions, the mean is smaller than the median.
Conclusion

• A measure of central tendency is a measure that tells us where the middle of a set of data lies.
• Mean is the most common measure of central tendency. It is simply the
sum of the numbers divided by the number of numbers in a set of data.
This is also known as average.
• Median is the number present in the middle when the numbers in a set
of data are arranged in ascending or descending order. If the number of
numbers in a data set is even, then the median is the mean of the two
middle numbers.
• Mode is the value that occurs most frequently in a set of data.
Thank You…  
