Module 1
Statistics is a branch of applied mathematics that involves the collection, description, analysis, and
inference of conclusions from quantitative data. The mathematical theories behind statistics rely heavily
on differential and integral calculus, linear algebra, and probability theory. In everyday usage, the
word "statistics" can also refer to a collection of quantitative data.
People who do statistics are referred to as statisticians. They're particularly concerned with
determining how to draw reliable conclusions about large groups and general events from the
behavior and other observable characteristics of small samples. These small samples represent a
portion of the large group or a limited number of instances of a general phenomenon.
Types of Statistics:
● Theoretical Statistics: Theoretical statistics concerns the study and development of the
mathematical, computational, and philosophical foundations of statistics. Pure or theoretical
statistics focuses primarily on the numbers, math, and problems themselves.
● Applied Statistics: Applied statistics is the use of statistical techniques to solve real-world data
analysis problems. In contrast to the pure study of mathematical statistics, applied statistics is
typically used by and for non-mathematicians in fields ranging from social science to business.
Thus, applied statistics can be thought of as “statistics-in-action”. Theoretical statistics can also be
used pragmatically, but in general the emphasis of applied statistics is more oriented toward
practical benefits.
Typically, there are two general types of statistics that are used to describe data:
● Measures of central tendency: these are ways of describing the central position of a
frequency distribution for a group of data. We can describe this central position using a
number of statistics, including the mode, median, and mean.
● Measures of spread: these are ways of summarizing a group of data by describing how
spread out the scores are. A number of statistics are available for this purpose, including the
range, quartiles, absolute deviation, variance, and standard deviation.
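As a minimal sketch of the measures listed above, the following uses Python's standard `statistics` module on a small made-up list of scores. The data and variable names are illustrative assumptions, not part of the module material. The variance and standard deviation here treat the scores as a complete group, which is one of several possible conventions.

```python
import statistics

# Made-up scores, used only for illustration
scores = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of spread
data_range = max(scores) - min(scores)
# Quartiles: three cut points that split the sorted data into four parts
# (uses the module's default "exclusive" method)
q1, q2, q3 = statistics.quantiles(scores, n=4)
# Mean absolute deviation: average distance of each score from the mean
mean_abs_dev = sum(abs(x - mean) for x in scores) / len(scores)
# Variance and standard deviation, treating the scores as a whole group
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "quartiles:", q1, q2, q3)
print("absolute deviation:", mean_abs_dev)
print("variance:", variance, "standard deviation:", std_dev)
```

The quartile values depend on the interpolation method chosen, so other tools may report slightly different cut points for the same data.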
Sample vs Population
A population is the entire group about which one wants to draw conclusions. In statistics, the
population is the complete set of items from which data is drawn in a statistical study. It can be
a group of individuals or a set of items. Because working on the whole population is often
impractical, one usually makes predictions about it by taking a small sample instead. The
population mean is usually denoted by the Greek letter ‘μ’.
A sample is a subset of the population of interest that is used to represent the whole. Ideally, the
sample is an unbiased subset of the population, so that it reflects the data as a whole. A sample
consists of the elements actually participating in the survey or study, and it is of a manageable
size. Samples are collected and statistics are calculated from them so that one can make
inferences or extrapolations from the sample to the population. The process of selecting a sample
from the population and collecting information from it is called sampling. The sample size is
denoted by n.
When you collect data from a population or a sample, there are various measurements and
numbers you can calculate from the data. Parameters are numbers that describe the properties of
entire populations. Statistics are numbers that describe the properties of samples.
● Parameter = Population
● Statistic = Sample
For example, the average income for India is a population parameter. Conversely, the
average income for a sample drawn from India is a sample statistic. Both values represent the
mean income, but one is a parameter and the other is a statistic.
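The distinction can be sketched in Python as follows. The incomes are synthetic numbers generated for illustration (not real data for any country), and the population size, sample size, and variable names are all assumptions made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 synthetic incomes (not real data)
population = [random.gauss(50_000, 12_000) for _ in range(10_000)]

# Parameter: computed from every element of the population
mu = statistics.mean(population)

# Statistic: computed from a random sample of size n
n = 200
sample = random.sample(population, n)
x_bar = statistics.mean(sample)

print("population mean (parameter):", round(mu, 2))
print("sample mean (statistic):", round(x_bar, 2))
```

Drawing a different sample would generally give a different sample mean, while the population mean stays the same for a given population.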
Measure               Population parameter    Sample statistic
Mean                  μ (mu)                  x̄ (x-bar)
Standard deviation    σ (sigma)               s
Correlation           ρ (rho)                 r
1. A statistic is a characteristic of a small part of the population, i.e. a sample. A parameter
is a fixed measure that describes the target population.
2. A statistic is a known number that varies with the sample drawn from the
population, while a parameter is a fixed and usually unknown numerical value.
3. Statistical notation differs for population parameters and sample statistics, for
example:
o For population parameters, μ (Greek letter mu) represents the mean, P denotes the
population proportion, the standard deviation is labeled σ (Greek letter sigma), etc.
o For sample statistics, x̄ (x-bar) represents the mean, the standard deviation is labeled s,
etc.
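The notation above can be tied to a small computation. The sketch below assumes the common convention that the population standard deviation σ divides the sum of squared deviations by N, while the sample standard deviation s divides by n - 1. The values and variable names are made up for illustration.

```python
import math
import statistics

# Made-up values, used only for illustration
values = [4, 8, 6, 5, 7]
count = len(values)

mean = sum(values) / count
sum_sq_dev = sum((x - mean) ** 2 for x in values)

# sigma: treat the values as the whole population (divide by N)
sigma = math.sqrt(sum_sq_dev / count)

# s: treat the values as a sample (divide by n - 1)
s = math.sqrt(sum_sq_dev / (count - 1))

# The standard library provides both conventions directly
print("sigma:", sigma, "pstdev:", statistics.pstdev(values))
print("s:", s, "stdev:", statistics.stdev(values))
```

For the same data, s is slightly larger than σ because of the smaller divisor.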
Basis      Population                                Sample
Meaning    Collection of all the units or elements   A subgroup of the members of the
           that possess common characteristics       population
Includes   Each and every element of a group         Only a handful of units of the
                                                     population

Basis      Primary data                              Secondary data
Meaning    First-hand data gathered by the           Data collected earlier by someone
           researcher himself or herself             else
Specific   Always specific to the researcher's       May or may not be specific to the
           needs                                     researcher's needs