Chapter-1
Introduction to Data and Statistics
Integration with Business: Modern businesses heavily rely on data and statistics for
operations and decision-making.
Importance of Data: Collecting and analyzing data is crucial for business operations and
making informed decisions.
Definition and Utility: Data, associated variables, and scales of measurement are
fundamental for management professionals.
Statistics: Statistics transform numbers into useful information, aiding fact-based
decision-making and understanding variation.
Variables
Definition: Variables represent numbers, amounts, or situations that can change.
Types:
o Categorical (Qualitative): Variables whose values are labels or categories (e.g., yes/no, day of
the week).
o Numerical (Quantitative): Variables representing quantities.
Discrete: Countable values (e.g., number of employees).
Continuous: Measurable values (e.g., time waiting at an ATM).
Measurement Scales
Definition: Determines which comparisons (equivalence, order, differences, and ratios) are
meaningful for a variable's values.
Types:
o Nominal: Categories without order (e.g., car brands).
o Ordinal: Categories with a meaningful order (e.g., grades).
o Interval: Differences between values are meaningful, no true zero (e.g.,
temperature in Celsius).
o Ratio: Differences and ratios are meaningful, true zero exists (e.g., salary).
Collecting Data
Importance: Objective data collection is crucial for accuracy.
Population and Sample:
o Population (N): Entire group of interest.
o Sample (n): Subset of the population used for analysis.
Parameter vs. Statistic: Parameter describes a population; a statistic describes a sample.
Methods of Data Collection
Primary Data: Collected directly by the researcher.
Secondary Data: Collected by someone else, used by the researcher.
Techniques:
o Data from Organizations: Collected and distributed by entities (e.g., financial
data).
o Designed Experiment: Controlled experiments to collect specific data.
o Surveys: Questionnaires collecting opinions and behaviors.
o Observational Studies: Directly observing behavior in a natural setting.
Sampling Methods
Probability Sampling: Each element has a known chance of being selected.
o Simple Random Sampling: Equal chance for all elements.
o Systematic Sampling: Every nth element is selected.
o Stratified Sampling: Subsamples drawn from different strata.
o Cluster Sampling: Samples drawn from clusters of elements.
Non-Probability Sampling: Selection probability is unknown.
o Convenience Sampling: Selecting the most easily available elements.
o Judgment Sampling: Selected based on the researcher's judgment.
o Quota Sampling: Ensuring subgroups are represented.
o Snowball Sampling: Initial respondents recruit further participants.
Survey Design
Components: Designing a questionnaire, pretesting, and editing.
Google Forms: An example tool for creating and distributing surveys.
Summary
Statistics: Science of collecting, analyzing, presenting, and interpreting data.
Data: Facts and figures used for analysis.
Key Terms:
o Data: Collected information.
o Variable: A characteristic of interest.
o Nominal Scale: Identifies attributes.
o Ordinal Scale: Indicates order or rank.
Chapter-2
Basic Concepts
Data: Facts and figures used for analysis.
Statistics: Collection, organization, analysis, interpretation, and presentation of data.
Organizing Data
Categorical Variables: Values that are names or labels (e.g., color, breed).
Quantitative Variables: Numerical values that can be measured or counted.
o Discrete Variables: Countable values (e.g., number of heads in coin flips).
o Continuous Variables: Measurable values within a range (e.g., weight).
Frequency Distribution
Definition: Tabular summary of data showing the number of items in each class.
Example:
o Coke Classic: 9
o Pepsi: 8
o Diet Coke: 13
o Sprite: 9
o Dr. Pepper: 11
Relative and Percent Frequency
Relative Frequency: Proportion of items in a class (Frequency of class / Total
frequency).
Percent Frequency: Relative frequency expressed as a percentage (Relative frequency ×
100).
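The arithmetic above can be sketched in a few lines of Python, using the counts from the frequency distribution in the text:

```python
# Relative and percent frequencies for the soft-drink frequency distribution above
freq = {"Coke Classic": 9, "Pepsi": 8, "Diet Coke": 13, "Sprite": 9, "Dr. Pepper": 11}

total = sum(freq.values())  # 50 purchases in all
for drink, f in freq.items():
    rel = f / total   # relative frequency = class frequency / total frequency
    pct = rel * 100   # percent frequency = relative frequency x 100
    print(f"{drink}: {rel:.2f} ({pct:.0f}%)")
```

Note that the relative frequencies always sum to 1 and the percent frequencies to 100%.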
Visualizing Categorical Data
Bar Graph: Summarizes frequency distribution with bars.
Pie Chart: Represents data as slices of a circle.
Pareto Chart: Bar chart in descending order with cumulative percentage line.
Visualizing Numerical Data
Dot Plot: Plots individual values along a number line to show their distribution.
Scatter Plot: Plots paired data points to show trends or relationships.
Histogram: Graphical representation of data distribution; helps identify skewness.
Cumulative Distribution (Ogive): Plots cumulative frequency on y-axis.
Steps to Create Frequency Distribution (Example)
1. Determine Classes: Decide on 5-20 classes based on data size.
2. Class Width: Calculate using the formula: (Largest value - Smallest value) / Number of
classes.
3. Class Limits: Ensure each data item belongs to one class.
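The class-width step can be sketched with a small hypothetical data set:

```python
# Class width for a frequency distribution (hypothetical data)
data = [12, 15, 21, 8, 30, 25, 18, 27, 11, 22]

num_classes = 5  # chosen between 5 and 20 based on data size
width = (max(data) - min(data)) / num_classes  # (largest - smallest) / classes
print(width)  # (30 - 8) / 5 = 4.4, usually rounded up to a convenient value like 5
```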
Creating Graphs in Excel
Scatter Plot:
1. Select data cells.
2. Insert scatter plot from chart group.
Histogram:
1. Select class and frequency data.
2. Insert column chart and format data series.
Best Practices for Visualization
Use simple graphs/charts.
Provide clear titles and labels.
Avoid unnecessary decorative elements (chart junk).
Key Definitions
Frequency Distribution: Number of data values in each class.
Cumulative Frequency Distribution: Number of data values less than or equal to the
upper class limit.
Important Graph Types
Bar Chart: For categorical data.
Pie Chart: For proportional representation.
Histogram: For numerical data distribution.
Scatter Plot: For relationships between two numerical variables.
Pareto Chart: For prioritizing categories based on frequency.
By focusing on these key points, you'll be well-prepared for questions on data organization and
visualization in your exam.
Chapter-3
Objectives
Understand types of statistics
Use measures of location (descriptive statistics)
Comprehend measures of variability
Grasp covariance and the coefficient of correlation
Utilize Excel for descriptive statistics
Introduction
Numerical Measures: Summarize data using measures of location, dispersion, shape,
and association.
Sample Statistics vs. Population Parameters: Statistics for a sample are called sample
statistics; for a population, they are called population parameters.
Point Estimator: Sample statistic used to estimate a population parameter.
Central Tendency
Mean (Average):
o Population Mean (µ): Sum of all values divided by the total number of values.
o Sample Mean (𝑥̅): Sum of sample values divided by the sample size.
Median: Middle value when data is ordered. For even number of observations, it's the
average of the two middle values.
Mode: Most frequently occurring value.
Example Calculation
Sample Mean in Excel: =AVERAGE(D4:D13) or =SUM(D4:D13)/10.
Weighted Mean: =SUMPRODUCT(weights, values) / SUM(weights).
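The same measures have direct equivalents in Python's standard `statistics` module; the values below are hypothetical:

```python
import statistics

values = [4, 7, 7, 9, 10, 12]

print(statistics.mean(values))    # like =AVERAGE(...): 49/6, about 8.17
print(statistics.median(values))  # even count, so average of the two middle values (7, 9) = 8.0
print(statistics.mode(values))    # most frequent value: 7

# Weighted mean, like =SUMPRODUCT(weights, values)/SUM(weights)
weights = [1, 2, 1, 1, 2, 1]
wmean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
print(wmean)  # 66/8 = 8.25
```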
Variation and Shape
Range (R): Difference between the largest and smallest values.
Variance and Standard Deviation:
o Variance measures the average squared deviation from the mean.
o Standard Deviation is the square root of the variance; it expresses the typical
deviation from the mean in the same units as the data.
Exploring Numerical Data
Percentile: Value below which a given percentage of observations fall.
Interquartile Range (IQR): Difference between the third quartile (Q3) and the first
quartile (Q1).
Five-Number Summary: Minimum, Q1, Median, Q3, Maximum.
Boxplot: Visual representation of the five-number summary.
Covariance and Correlation
Covariance: Measures the direction of the linear relationship between two variables; its
magnitude depends on the units, so it does not by itself indicate strength.
Coefficient of Correlation (r):
o Ranges from -1 to +1.
o Values close to 0 indicate no relationship.
o Positive values indicate a positive relationship; negative values indicate a negative
relationship.
o r = Cov(X, Y) / (sX · sY) = ∑(X − x̄)(Y − ȳ) / √(∑(X − x̄)² ∑(Y − ȳ)²)
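A short sketch computing sample covariance and correlation; the paired data are hypothetical:

```python
import math

# Sample covariance and correlation coefficient (hypothetical paired data)
x = [2.0, 4.0, 6.0, 8.0]
y = [3.0, 5.0, 8.0, 10.0]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

cov = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
sx = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
sy = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))
r = cov / (sx * sy)  # correlation = covariance scaled by both standard deviations

print(cov, r)  # r is close to +1 for this strongly linear data
```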
Key Formulas
Sample Mean: x̄ = ∑X / n
Population Mean: µ = ∑X / N
Variance (Sample): s² = ∑(X − x̄)² / (n − 1)
Standard Deviation (Sample): s = √[∑(X − x̄)² / (n − 1)]
Covariance: Cov(X, Y) = ∑(X − x̄)(Y − ȳ) / (n − 1)
Correlation Coefficient: r = ∑(X − x̄)(Y − ȳ) / √[∑(X − x̄)² ∑(Y − ȳ)²]
Key Terms
Mean: Average value.
Median: Middle value.
Mode: Most frequent value.
Range: Spread between maximum and minimum values.
Variance: Average squared deviation from the mean.
Standard Deviation: Square root of the variance; the typical deviation from the mean.
Covariance: Measure of the linear relationship between two variables.
Correlation Coefficient: Measure of the strength and direction of the linear relationship
between two variables.
Chapter-4
Probability: Numerical measure of the likelihood that an event occurs, ranging from 0 to 1.
Formula: Probability (P) = Number of favorable outcomes / Total number of possible outcomes.
Types of Probability:
o A Priori Probability: Based on prior knowledge or logical deduction (e.g., January days in
a year).
o Empirical Probability: Based on observed data (e.g., interest in a class).
o Subjective Probability: Based on personal judgment or experience (e.g., predicting sales
of a new product).
Probability of Events
Event: A set of outcomes (e.g., days in January).
Complement of an Event: All outcomes not in the event (e.g., days not in January).
Union of Events (A ∪ B): Probability of either event A or B occurring.
o Formula: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Intersection of Events (A ∩ B): Probability of both events A and B occurring.
o For independent events: P(A ∩ B) = P(A) × P(B)
Mutually Exclusive Events: No common outcomes (e.g., days in January and February).
Conditional Probability
Definition: Probability of event A given that event B has occurred.
o Formula: P(A|B) = P(A ∩ B) / P(B)
Example: Probability of promotion given that an officer is a man or a woman.
Ethical Issues in Probability
Ensuring clarity and transparency in probability-related information to avoid public confusion
and mistrust, particularly in advertisements.
Bayes' Theorem
Purpose: To update prior probability estimates based on new information.
Application: Used for revising probabilities, especially when initial probabilities are known, and
additional data is obtained.
Key Formulas and Concepts
Prior Probability: Initial probability estimate.
Posterior Probability: Revised probability based on new information.
Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
Joint Probability: Probability of two events occurring simultaneously.
Marginal Probability: Probability of a single event occurring, ignoring the other events.
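A minimal sketch of Bayes' theorem in Python; all the probabilities below are hypothetical, chosen only to illustrate the prior-to-posterior update:

```python
# Bayes' theorem: revise a prior probability with new evidence.
# Hypothetical numbers: a diagnostic test for a rare condition.
p_a = 0.01              # prior P(A): condition present
p_b_given_a = 0.95      # likelihood P(B|A): test positive given condition
p_b_given_not_a = 0.10  # false-positive rate P(B|not A)

# Marginal probability of the evidence, P(B), by total probability
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior: P(A|B) = P(B|A) P(A) / P(B)
posterior = p_b_given_a * p_a / p_b
print(posterior)  # about 0.088: a positive test raises 1% to roughly 9%
```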
Key Words
Probability: Numerical value indicating the likelihood of an event.
Conditional Probability: Probability of an event given another event has occurred.
Joint Probability: Probability of two events happening together.
Marginal Probability: Probability of an individual event occurring.
Bayes' Theorem: Method for calculating revised probabilities.
This condensed summary captures the essential concepts and examples related to probability as
they apply to business decision-making and statistical analysis.
Chapter-5
5.0 Objectives
Understand properties of probability distributions.
Differentiate between discrete and continuous probability distributions.
Compute expected value and variance.
Calculate probabilities for Binomial and Poisson distributions.
5.1 Introduction
Familiarize with probability distributions, especially Binomial and Poisson.
Learn assumptions and applications through problems.
5.2 Definitions
Random Variable: Numerical value representing outcomes of a statistical experiment.
Discrete Random Variables: Countable outcomes (e.g., number of customers).
Continuous Random Variables: Measurable outcomes over a range (e.g., time).
5.3 Probability Distributions
Probability Distribution: Function providing probabilities of all possible outcomes.
Discrete Probability Distributions: Probabilities for discrete random variables.
o Represented by Probability Mass Function (PMF) or Cumulative Distribution
Function (CDF).
o PMF gives the probability that the variable takes a particular value x exactly.
o CDF gives the cumulative probability of all values up to and including x.
Continuous Probability Distributions: Probabilities for continuous random variables
defined as area under the curve of its PDF.
Properties of Discrete Probability Distributions
Probabilities lie between 0 and 1.
Outcomes are mutually exclusive.
Total probabilities sum to 1.
5.4 The Importance of Expected Value in Decision-Making
Expected Value (E[X]): Measure of the center of the distribution.
Variance (Var(X)): Measure of spread around the expected value.
Standard Deviation (SD(X)): Square root of variance, indicating spread.
Properties of Mean (Expected Value)
E(X + Y) = E(X) + E(Y)
E(aX) = a · E(X)
E(X + a) = E(X) + a
Properties of Variance
V(aX + b) = a² · V(X)
V(X + Y) = V(X) + V(Y) (when X and Y are independent)
For pairwise independent variables: V(a1X1 + a2X2 + ... + anXn) = a1²V(X1) + a2²V(X2) + ... + an²V(Xn)
5.5 Binomial Probability Distribution
Used for number of successes in n independent trials with probability p of success.
PMF: P(X = k) = C(n, k) · p^k · (1 − p)^(n−k)
Mean: µ = np
Variance: σ² = np(1 − p)
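A minimal sketch of the binomial PMF, mean, and variance (n = 10 and p = 0.5 are hypothetical):

```python
from math import comb

# Binomial PMF: probability of exactly k successes in n independent trials
def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
print(binomial_pmf(3, n, p))  # P(X = 3) = 120/1024 = 0.1171875
print(n * p)                  # mean np = 5.0
print(n * p * (1 - p))        # variance np(1-p) = 2.5
```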
5.6 Poisson Distribution
Used for number of events in a fixed interval of time/space.
PMF: P(X = k) = λ^k e^(−λ) / k!
Mean and Variance: both equal λ
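A matching sketch for the Poisson PMF (λ = 2 events per interval is hypothetical):

```python
from math import exp, factorial

# Poisson PMF: probability of k events in a fixed interval of time or space
def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

print(poisson_pmf(0, 2.0))  # P(X = 0) = e^-2, about 0.1353
print(poisson_pmf(2, 2.0))  # P(X = 2) = 2e^-2, about 0.2707
```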
5.7 Let Us Sum Up
Summary of key concepts: probability distributions, expected value, variance, and
specific distributions (Binomial, Poisson).
5.8 Key Words
Random Variable, Discrete, Continuous, Probability Distribution, Expected Value,
Variance, Binomial, Poisson.
5.9 Case
Practical application of probability distributions in decision-making.
Chapter-6
Introduction
Objective: Understand and apply continuous distributions, including Uniform, Normal,
and Exponential distributions.
Purpose: Solve practical problems using continuous distributions, with exercises for
practice.
6.2 Continuous Distributions: Introduction
Continuous Random Variable: A variable with a range of possible values within an
interval.
Common Example: Normal distribution.
Use: Probability distributions help predict outcomes based on known properties.
Probability Distributions of Continuous Variables
Definition: Continuous random variables take all values in an interval.
Calculation: Probabilities found using calculus, not physical measurement.
6.3 Normal Distribution
Type: Continuous and most commonly used distribution.
Applications: Variables like weight, height, etc.
Parameters: Mean (µ) and standard deviation (σ).
Standard Normal Distribution: Mean of 0 and standard deviation of 1.
Characteristics:
o Symmetric distribution.
o Uni-modal.
o Continuous range from –∞ to +∞.
o Total area under the curve is 1.
o Mean, median, and mode are equal.
Properties of Normal Distribution
Symmetry: Curve is symmetric around the mean.
Mean, Median, Mode: All are equal.
Asymptotic: Curve approaches but never touches the x-axis.
Unimodal: One peak point.
Quartiles: Equidistant from mean.
Linear Combination: If X and Y are independent normal variates, aX + bY is also
normal.
Importance of Normal Distribution
1. Sample size increase leads to normal properties.
2. Skewed variables can be transformed to normal.
3. Sampling distributions tend to normal.
4. Basis for hypothesis testing.
5. Statistical Quality Control relies on normal distribution.
6. Approximation to binomial and Poisson distributions.
7. Theoretical and applied usefulness.
8. Mathematically convenient.
Area Under the Normal Curve
Total Area: 1 (50% on each side of the mean).
Z-Scores: Standard normal table used for probabilities.
Example: P(-2 ≤ z ≤ +2) = 0.9544 (approx 95%).
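In place of a printed z-table, the standard normal CDF can be evaluated with the error function; a sketch reproducing the example above:

```python
from math import erf, sqrt

# Standard normal CDF via the error function (replaces a z-table lookup)
def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

# Area between z = -2 and z = +2, the example from the text
print(phi(2) - phi(-2))  # about 0.9545, i.e. roughly 95%
```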
6.5 The Uniform Distribution
Definition: Equal probability over an interval.
Density Function: f(x) = 1/(b − a) for a ≤ x ≤ b, so P(x1 ≤ X ≤ x2) = (x2 − x1)/(b − a).
Mean: (a + b)/2.
Standard Deviation: (b − a)/√12.
Example: Uniform Distribution
Process time between 20 to 40 minutes.
Probability for 25 to 30 minutes is 25%.
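The process-time example above can be sketched directly from the formulas (interval [20, 40] minutes):

```python
# Uniform distribution on [a, b]: probabilities are proportional to interval length
a, b = 20.0, 40.0

def uniform_prob(x1, x2, a, b):
    # P(x1 <= X <= x2) = (x2 - x1) / (b - a)
    return (x2 - x1) / (b - a)

mean = (a + b) / 2         # 30.0 minutes
sd = (b - a) / 12 ** 0.5   # about 5.77 minutes
print(uniform_prob(25, 30, a, b))  # 0.25, i.e. 25%
```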
6.6 The Exponential Distribution
Definition: Time between random occurrences.
Characteristics:
o Continuous, right-skewed.
o Ranges from 0 to ∞.
o Apex at x = 0.
o Decreases gradually as x increases.
Density Function: f(x) = λe^(−λx) for x ≥ 0.
Parameter: λ (inverse of mean).
Example: Exponential Distribution
Arrivals at ticket counter (Poisson distributed, 3 customers/minute).
Probability of an interval of 2+ minutes is 85%.
6.7 The Normal Approximation to the Binomial Distribution
Definition: Approximate binomial distribution using normal distribution for large sample
sizes.
Conditions: np ≥ 10 and n(1 − p) ≥ 10.
Transformation:
o Mean: µ = np.
o Standard Deviation: σ = √(np(1 − p)).
Example: Normal Approximation
Convert binomial parameters to normal.
Use normal distribution properties to estimate probabilities.
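A sketch of the approximation with hypothetical parameters (n = 100, p = 0.5), including a continuity correction (a standard refinement, not discussed above):

```python
from math import erf, sqrt

# Normal approximation to the binomial (hypothetical n and p)
n, p = 100, 0.5
mu = n * p                     # 50.0
sigma = sqrt(n * p * (1 - p))  # 5.0

def normal_cdf(x, mu, sigma):
    # Normal CDF via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Approximate P(X <= 55), using a continuity correction of +0.5
prob = normal_cdf(55.5, mu, sigma)
print(prob)  # about 0.864
```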
6.8 Summary
Key formulas, concepts, and distributions (Normal, Uniform, Exponential).
Application of standard normal variables, mean, and standard deviation.
Practical examples and exercises included.
Chapter-7
Introduction
Objective: Understand the concepts of the sampling distribution, central limit theorem,
distribution of a sample’s mean, and sample proportions.
Purpose: Solve practical problems related to sampling distributions, with exercises for
practice.
7.2 Sampling Distribution
Definition: The probability distribution of a statistic.
Concept: If all possible samples of size n are drawn from a population and a statistic is
computed for each sample, the probability distribution of this statistic is called a sampling
distribution.
7.3 Sampling Distribution of the Mean (X̄ )
Definition: The sample mean is a random variable with its probability distribution.
Example: Drawing a sample of size n = 2 from a uniformly distributed population over the
integers 1 to 6.
Key Points:
o The distribution of the sample mean may differ from the population distribution.
o Probability calculations for sample means often involve z-scores and normal
distribution tables.
Central Limit Theorem (CLT)
Definition: As the sample size increases, the sampling distribution of the mean tends to a
normal distribution.
Conditions:
o Sample size n > 30 for non-normal populations.
o Any sample size if the population is normally distributed.
Formulas:
o Mean of the sample means: µx̄ = µ.
o Standard deviation of the sample means (Standard Error): σx̄ = σ/√n.
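A small simulation sketch of the central limit theorem; the population and the sample counts below are hypothetical:

```python
import random
import statistics

# CLT sketch: means of repeated samples cluster around mu with spread sigma/sqrt(n)
random.seed(42)
population = [random.uniform(0, 100) for _ in range(10_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

n = 36
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(2_000)
]

print(mu, statistics.mean(sample_means))                 # close to each other
print(sigma / n ** 0.5, statistics.stdev(sample_means))  # both near the standard error
```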
Sampling Distribution of the Difference of Means
Definition: Comparing means from two different populations.
Formulas:
o Mean of the difference: µ(x̄1 − x̄2) = µ1 − µ2.
o Standard error of the difference: σ(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2).
Example: Probability calculations for comparing the lifetimes of products from two
manufacturers.
7.4 Sampling Distribution of the Proportion
Definition: Distribution of sample proportions based on the binomial distribution.
Formulas:
o Mean of the sample proportion: µp̄ = p.
o Standard deviation of the sample proportion: σp̄ = √(p(1 − p)/n).
Example: Calculating the probability of a sample proportion deviating from the
population proportion.
Sampling Distribution of the Difference of Proportions
Definition: Comparing proportions from two different populations.
Formulas:
o Mean of the difference: µ(p̄1 − p̄2) = p1 − p2.
o Standard error of the difference: σ(p̄1 − p̄2) = √(p1(1 − p1)/n1 + p2(1 − p2)/n2).
Example: Probability calculations for the difference in defect rates between products
from two companies.
7.5 Determining Sample Size
Factors to Consider:
1. Tolerable error.
2. Desired confidence level.
3. Population variance.
Formula: n = (Z(α/2) · σ / E)²,
where Z(α/2) is the z-score for the desired confidence level, σ is
the population standard deviation, and E is the tolerable error.
Example: Calculating the required sample size to estimate average income within a
specific confidence interval and error tolerance.
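The sample-size formula can be sketched with hypothetical inputs:

```python
from math import ceil

# Required sample size n = (z * sigma / E)^2 (all inputs hypothetical)
z = 1.96       # z-score for 95% confidence
sigma = 500.0  # assumed population standard deviation (e.g., of income)
E = 100.0      # tolerable error in the estimate

n = ceil((z * sigma / E) ** 2)  # always round up to the next whole unit
print(n)  # (1.96 * 5)^2 = 96.04, so n = 97
```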
7.6 Summary
Key formulas and concepts of sampling distribution and determining estimates within
samples.
Application of central limit theorem, distribution of sample means, and sample
proportions.
Examples and exercises to practice calculating sample sizes and understanding sampling
distributions.
Chapter-8
1. Basic Terms
Null Hypothesis (H0): Statement of no effect or status quo.
Alternative Hypothesis (Ha): Statement indicating the presence of an effect or
difference.
2. Types of Errors
Type I Error (α): Rejecting a true null hypothesis (false positive).
Type II Error (β): Accepting a false null hypothesis (false negative).
3. Significance Level
Common levels: 1%, 5%, 10%.
p-value: Probability of obtaining test results at least as extreme as the results observed,
under the assumption that the null hypothesis is correct.
Decision Rule: Reject H0 if p-value < significance level.
4. Steps in Hypothesis Testing
1. Formulate Hypotheses:
o Example: H0: μ = μ0, Ha: μ ≠ μ0.
2. Choose the Test:
o Z-test for known population standard deviation (σ) or large samples (n > 30).
o t-test for unknown population standard deviation or small samples (n ≤ 30).
3. Calculate Test Statistic:
o Z-test: Z = (X̄ − µ0) / (σ/√n)
o t-test: t = (X̄ − µ0) / (s/√n)
4. Find Critical Value:
o Use statistical tables or software.
5. Make Decision:
o Compare test statistic with critical value or use p-value.
5. One-Tail vs. Two-Tail Tests
One-Tail Test: Tests for effect in one direction (e.g., μ > μ0 or μ < μ0).
Two-Tail Test: Tests for effect in both directions (e.g., μ ≠ μ0).
6. Example Formulas
Z-Test: Z = (X̄ − µ0) / (σ/√n)
t-Test: t = (X̄ − µ0) / (s/√n)
o Degrees of Freedom (df): n − 1
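Both test statistics can be sketched directly from the formulas above; the sample figures are hypothetical:

```python
from math import sqrt

# One-sample Z and t test statistics (hypothetical sample figures)
def z_statistic(xbar, mu0, sigma, n):
    # Use when the population standard deviation sigma is known (or n > 30)
    return (xbar - mu0) / (sigma / sqrt(n))

def t_statistic(xbar, mu0, s, n):
    # Use when sigma is unknown and n <= 30; compare against t-table with df = n - 1
    return (xbar - mu0) / (s / sqrt(n))

print(z_statistic(xbar=52.0, mu0=50.0, sigma=8.0, n=64))  # 2.0
print(t_statistic(xbar=52.0, mu0=50.0, s=8.0, n=16))      # 1.0
```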
Key Points to Remember
Null Hypothesis (H0): Usually states no effect or no difference.
Alternative Hypothesis (Ha): Indicates the presence of an effect or difference.
Type I Error (α): Probability of rejecting true H0.
Type II Error (β): Probability of accepting false H0.
Significance Level (α): Commonly 0.05 (5%).
p-value: If p < α, reject H0.
Z-Test: Use when σ is known or n > 30.
t-Test: Use when σ is unknown and n ≤ 30.
One-Tail Test: Tests for a specific direction.
Two-Tail Test: Tests for any difference in either direction.
Chapter-9
Objectives
1. Test hypothesis of difference in two means with known population standard
deviation.
2. Test hypothesis of difference in two means with unknown population standard
deviation.
3. Calculate Z test and t-test in the case of two dependent populations.
4. Test hypothesis of differences in two population proportions.
5. Test hypothesis of the average difference in two related populations.
Key Concepts and Steps
9.1 Introduction to Two-Sample Hypothesis Testing
Comparing Two Independent Populations
Two-Sample Z-Test:
o Used when population standard deviations (σ1 and σ2) are known or sample sizes
are large (n > 30).
o Formula: Z = ((X̄1 − X̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)
o Example Steps:
1. Formulate H0: µ1 = µ2 and Ha: µ1 ≠ µ2.
2. Calculate Z-statistic.
3. Compare with critical value from Z-table.
Two-Sample t-Test:
o Used when population standard deviations are unknown.
o Formula: t = ((X̄1 − X̄2) − (µ1 − µ2)) / √(s1²/n1 + s2²/n2)
o Degrees of Freedom (df): Smaller of n1 − 1 and n2 − 1.
Comparing Two Related Populations
Paired t-Test:
o Used when samples are related (e.g., before and after measurements).
o Formula: t = D̄ / (sD/√n)
o D̄: Mean of the differences; sD: Standard deviation of the differences.
Comparing Two Population Proportions
Z-Test for Proportions:
o Formula: Z = (p̄1 − p̄2) / √(p̂(1 − p̂)(1/n1 + 1/n2))
o p̂ is the pooled proportion: p̂ = (x1 + x2) / (n1 + n2).
9.2 Detailed Steps for Z-Test and t-Test
Steps for Hypothesis Testing
1. Formulate Hypotheses:
o H0: µ1 = µ2
o Ha: µ1 ≠ µ2 (two-tailed) or Ha: µ1 > µ2 / Ha: µ1 < µ2 (one-tailed)
2. Select Significance Level (α):
o Common choices: 0.01, 0.05, 0.10
3. Choose the Test:
o Z-Test for known σ or large samples (n > 30).
o t-Test for unknown σ or small samples (n ≤ 30).
4. Calculate the Test Statistic:
o Z-Test Formula: Z = (X̄1 − X̄2) / √(σ1²/n1 + σ2²/n2)
o t-Test Formula: t = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2)
5. Find Critical Value or p-Value:
o Use Z-table or t-table based on the selected test.
6. Make Decision:
o Compare calculated value with critical value.
o If |calculated value| > critical value, reject H0.
Examples
1. Comparing Means of Electric Bulbs:
o Given:
n1 = 100, X̄1 = 1300, σ1 = 82
n2 = 100, X̄2 = 1288, σ2 = 93
o Test Statistic: Z = (1300 − 1288) / √(82²/100 + 93²/100) = 0.968
o Decision:
Critical value for α = 0.05 is ±1.96.
Since 0.968 < 1.96, do not reject H0.
2. Comparing Proportions of Tea Consumption:
o Given:
n1 = 100, x1 = 60, p̄1 = 0.60
n2 = 200, x2 = 100, p̄2 = 0.50
o Pooled proportion: p̂ = (60 + 100) / (100 + 200) ≈ 0.533
o Test Statistic: Z = (0.60 − 0.50) / √(0.533 × 0.467 × (1/100 + 1/200)) ≈ 1.64
o Decision:
Critical value for α = 0.05 is ±1.96.
Since 1.64 < 1.96, do not reject H0.
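Both worked examples can be rechecked in a few lines of Python, using only the counts given in the text (the tea example recomputes the pooled proportion from the raw counts):

```python
from math import sqrt

# 1. Two-sample Z-test for the electric-bulb means
z_bulbs = (1300 - 1288) / sqrt(82**2 / 100 + 93**2 / 100)
print(round(z_bulbs, 3))  # 0.968 < 1.96, so do not reject H0

# 2. Two-proportion Z-test for tea consumption, with the pooled proportion
n1, x1, n2, x2 = 100, 60, 200, 100
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)  # (60 + 100) / 300, about 0.533
z_tea = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
print(round(z_tea, 2))  # 1.64 < 1.96, so do not reject H0
```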
Summary
1. Use Z-test for large samples or known σ\sigmaσ.
2. Use t-test for small samples or unknown σ\sigmaσ.
3. For paired samples, use the paired t-test.
4. For proportions, use the Z-test for proportions.
5. Follow standard steps: Formulate hypotheses, choose significance level, select test,
calculate statistic, find critical value, make a decision.
Chapter-10