Unit 1

Introduction and Forecasting Techniques

Basic Statistical Concepts
Statistics: Statistics is a science dealing with the collection, analysis, interpretation, and
presentation of numerical data.
Descriptive Statistics: Statistics computed from data gathered on a group and used to
describe or reach conclusions about that same group.
Inferential Statistics: Statistics computed from data gathered from a sample and used to
reach conclusions about the population from which the sample was taken.
Population: A collection of persons, objects, or items of interest. It can be narrowly or
widely defined.
Census: A census is taken when the researcher gathers data from the whole population for a
given measurement of interest.
Sample: A sample is a portion of the whole population.

Charts & Graphs
Ungrouped data: Raw data that have not been summarized in any way are called ungrouped
data.
Grouped data: Data that have been organized into a frequency distribution are called
grouped data.
Quantitative Data Graphs: Histograms, frequency polygons, ogives, dot plots, stem and
leaf plots
Qualitative Data Graphs: Pie charts, bar graphs, Pareto charts, etc. (Scatter plots, by
contrast, show the relationship between two numerical variables.)
The following data represent the ages of patients admitted to a small hospital in
September 2023.
85 75 66 43 40
88 80 56 56 67
89 83 65 53 75
87 83 52 44 48
Construct a frequency distribution. Compute the sample mean from the frequency
distribution.
Steps for designing a frequency distribution:
1. Calculate the range (highest value minus lowest value).
2. Determine the number of classes (usually 5-15).
3. Determine the class width (range / number of classes, rounded up).
4. Start the frequency distribution at a value equal to or lower than the lowest number
of the ungrouped data and end it at a value equal to or higher than the highest number.
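The steps above can be sketched in Python for the hospital ages listed earlier. This is only one possible solution: the choice of 5 classes is an assumption (any count in the usual 5-15 range is acceptable), and the variable names are illustrative.

```python
import math

# Ages of the 20 patients from the exercise
ages = [85, 75, 66, 43, 40, 88, 80, 56, 56, 67,
        89, 83, 65, 53, 75, 87, 83, 52, 44, 48]

# Step 1: range
data_range = max(ages) - min(ages)            # 89 - 40 = 49

# Step 2: number of classes (assumption: 5)
num_classes = 5

# Step 3: class width, rounded up to a whole number
width = math.ceil(data_range / num_classes)   # ceil(9.8) = 10

# Step 4: start at (or below) the lowest value; classes are "lower to under upper".
# Note: if the highest value landed exactly on the last upper limit,
# one more class would be needed. That does not happen here (89 < 90).
start = min(ages)
classes = []
for i in range(num_classes):
    lower = start + i * width
    upper = lower + width
    frequency = sum(lower <= age < upper for age in ages)
    midpoint = (lower + upper) / 2
    classes.append((lower, upper, midpoint, frequency))

# Mean from the frequency distribution: sum(f * midpoint) / sum(f)
total_frequency = sum(f for _, _, _, f in classes)
grouped_mean = sum(m * f for _, _, m, f in classes) / total_frequency

for lower, upper, midpoint, frequency in classes:
    print(f"{lower}-under {upper}: midpoint {midpoint}, frequency {frequency}")
print("Mean from frequency distribution:", grouped_mean)
print("Mean from raw data:", sum(ages) / len(ages))
```

The grouped mean (67.0) differs slightly from the raw-data mean (66.75) because every value in a class is treated as if it sat at the class midpoint.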
Descriptive Statistics
1. Measure of Central Tendency
a. Arithmetic mean
b. Median
c. Mode
2. Measure of Dispersion
a. Absolute
b. Relative
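As a small illustration of the three measures of central tendency, the sketch below applies Python's standard `statistics` module to the hospital ages from the earlier exercise. Using that data set here is a choice made for illustration; the notes do not work this example themselves.

```python
import statistics

# Ages of the 20 patients from the frequency distribution exercise
ages = [85, 75, 66, 43, 40, 88, 80, 56, 56, 67,
        89, 83, 65, 53, 75, 87, 83, 52, 44, 48]

mean_age = statistics.mean(ages)      # arithmetic mean: sum / count
median_age = statistics.median(ages)  # middle value of the sorted data
modes = statistics.multimode(ages)    # every value tied for most frequent

print("Mean:", mean_age)
print("Median:", median_age)
print("Mode(s):", sorted(modes))
```

With an even number of observations the median is the average of the two middle values, and this data set has more than one mode because several ages occur twice.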
Measures of Dispersion/Variability/Spread
A measure of variability is a summary statistic that represents the amount of dispersion in a
dataset. It is used to see how widely the values are spread across the data.
While a measure of central tendency describes the typical value, measures of variability
define how far away the data points tend to fall from the center.
Low dispersion indicates that the data points tend to be clustered tightly around the
center. High dispersion signifies that they tend to fall further away.
Graphical Presentation of Dispersion/Variability/Spread
Why Understanding of Variability is Important
Some degree of variation is unavoidable.
However, too much inconsistency can cause problems.
If our morning commute takes much longer than the mean travel
time, we will be late for work.
If a restaurant dish tastes very different from how it usually does, we
might not like it at all.
Let’s take an example
Let’s take a look at two hypothetical pizza restaurants. They both advertise a
mean delivery time of 20 minutes. When we’re ravenous, they both sound
equally good! However, this equivalence can be deceptive! To determine which
restaurant we should order from when we’re hungry, we need to analyze
their variability.
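To make the point concrete, here is a minimal sketch with made-up delivery times. The two lists are hypothetical numbers chosen only so that both restaurants share a mean of 20 minutes; they are not data from the notes.

```python
import statistics

# Hypothetical delivery times in minutes (illustrative only)
restaurant_a = [18, 19, 20, 20, 21, 22]
restaurant_b = [5, 12, 20, 20, 28, 35]

mean_a = statistics.mean(restaurant_a)
mean_b = statistics.mean(restaurant_b)

# Sample standard deviation: how far deliveries typically fall from the mean
sd_a = statistics.stdev(restaurant_a)
sd_b = statistics.stdev(restaurant_b)

print(f"Restaurant A: mean {mean_a} min, SD {sd_a:.2f} min")
print(f"Restaurant B: mean {mean_b} min, SD {sd_b:.2f} min")
```

Both means are 20 minutes, but restaurant B has a much larger standard deviation, so a hungry customer is more likely to face a long wait there.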
Measures of Variability
Range
Inter-quartile Range
Variance/standard deviation
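The three measures listed above can be computed with Python's standard `statistics` module. The sketch below reuses the hospital ages from the earlier exercise as sample data. Note that textbooks and software use several conventions for locating quartiles, so the inter-quartile range may differ slightly from a hand calculation; `statistics.quantiles` is used here with its default method.

```python
import statistics

# Ages of the 20 patients from the frequency distribution exercise
ages = [85, 75, 66, 43, 40, 88, 80, 56, 56, 67,
        89, 83, 65, 53, 75, 87, 83, 52, 44, 48]

# Range: highest value minus lowest value
data_range = max(ages) - min(ages)

# Inter-quartile range: spread of the middle 50% of the data (Q3 - Q1)
q1, q2, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1

# Sample variance (divides by n - 1) and its square root, the standard deviation
variance = statistics.variance(ages)
std_dev = statistics.stdev(ages)

print("Range:", data_range)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print(f"Variance: {variance:.2f}")
print(f"Standard deviation: {std_dev:.2f}")
```

The range depends only on the two extreme values, while the inter-quartile range ignores them entirely; variance and standard deviation use every observation.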
But who is the Hero of the Story?
It’s the standard deviation.
The standard deviation (SD) is a single number that summarizes the variability in a dataset.
The standard deviation uses the original data units, simplifying the interpretation.
Suppose a pizza restaurant measures its delivery time in minutes and has an SD of 5. In that case, the
interpretation is that a typical delivery arrives about 5 minutes before or after the mean time.
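The notes do not state the formula, so for reference, the standard textbook definitions are given below. The symbols are the usual conventions rather than notation taken from these notes: x_i is an observation, x-bar and mu are the sample and population means, and n and N are the sample and population sizes.

```latex
% Sample standard deviation (divides by n - 1)
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}

% Population standard deviation (divides by N)
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
```

The quantity under the square root is the variance. Taking the square root returns the result to the original units, such as minutes.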
After calculating the standard deviation, you can use various methods to evaluate it. The graphs for this
section incorporate the SD into the normal probability distribution. You can also use the Empirical Rule
or Chebyshev’s Theorem to assess how the standard deviation relates to the distribution of values.
Finally, you can calculate the coefficient of variation, which uses both the SD and the mean.
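Two of these tools reduce to one-line formulas. The sketch below assumes the standard definitions: the coefficient of variation is the SD expressed as a percentage of the mean, and Chebyshev’s Theorem says that at least 1 - 1/k^2 of the values lie within k standard deviations of the mean for any distribution, where k > 1. The function names are illustrative.

```python
def coefficient_of_variation(mean, sd):
    """Relative dispersion: the SD as a percentage of the mean."""
    return sd / mean * 100


def chebyshev_lower_bound(k):
    """Minimum proportion of values within k SDs of the mean (any distribution)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is informative only for k > 1")
    return 1 - 1 / k ** 2


# Pizza example from the notes: mean 20 minutes, SD 5 minutes
cv = coefficient_of_variation(mean=20, sd=5)
at_least_within_2_sd = chebyshev_lower_bound(2)

print(f"Coefficient of variation: {cv}%")
print(f"At least {at_least_within_2_sd:.0%} of deliveries fall within 10-30 minutes")
```

Chebyshev’s bound is deliberately conservative because it makes no assumption about the shape of the distribution. For normally distributed data the Empirical Rule gives a sharper figure.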
The Empirical Rule for the Standard Deviation of a Normal Distribution
For data that follow a normal distribution, the Empirical Rule gives the approximate
proportion of the values that fall within a specified number of standard deviations of the mean.
Consider the pizza delivery example, where we have a mean delivery time of 20 minutes and a
standard deviation of 5 minutes. Using the Empirical Rule, we can use the mean and
standard deviation to determine that about 68% of the delivery times will fall between 15-25
minutes (20 +/- 5) and about 95% will fall between 10-30 minutes (20 +/- 2*5).
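The percentages above can be checked against the normal distribution itself. This minimal sketch uses `statistics.NormalDist` from the Python standard library and assumes delivery times are exactly normal with mean 20 and SD 5. It also includes the three-SD interval, which the Empirical Rule puts at about 99.7%.

```python
from statistics import NormalDist

mean, sd = 20, 5                        # pizza example from the notes
delivery = NormalDist(mu=mean, sigma=sd)

results = []
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    # Proportion of the distribution lying between low and high
    share = delivery.cdf(high) - delivery.cdf(low)
    results.append((low, high, share))
    print(f"Within {k} SD ({low}-{high} minutes): {share:.1%}")
```

The exact shares are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule is stated as an approximation.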
Comparing Summary Statistics among groups
