QT2 Example QuestionPapers
Q1. Soil quality is considered to be good for growing tea plants if it contains 3% or more organic
matter. A study has been conducted to test whether two tea gardens at Darjeeling have different
soil quality.
A random sample of soil specimens of size 10 from each of Tea Garden A (TGA) and Tea Garden B
(TGB) were collected and the amount of organic matter (%) in the soil present for each specimen
is provided in the following table:
TGA 2.03 2.51 2.21 2.58 2.83 2.84 2.39 2.62 3.79 2.89
TGB 1.48 3.03 3.08 2.02 1.49 1.15 2.85 1.59 3.35 2.08
TGA 1.03 0.51 2.21 2.58 2.83 2.84 2.39 2.62 0.79 0.89
TGB 3.48 1.03 2.08 2.02 2.49 2.15 2.85 2.59 1.35 1.08
Assume that the percentage of organic matter data follows normal distribution, and the variance
of the amount of the organic matter (%) present is the same for the two tea gardens.
Q1.a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: 𝐻0 : 𝜇 𝑇𝐺𝐴 = 𝜇 𝑇𝐺𝐵
Alternative hypothesis: 𝐻1 : 𝜇 𝑇𝐺𝐴 ≠ 𝜇 𝑇𝐺𝐵
Q1.b) [2 Marks] What is the test statistic, and what is its observed value based on the sample?
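Formula for test statistic used: Let us denote TGA as X and TGB as Y, with sample sizes n and m.
$$T = \frac{\bar{X} - \bar{Y}}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}}, \quad \text{where } s_p^2 = \frac{(n-1)s_X^2 + (m-1)s_Y^2}{n+m-2}$$
For the first data set, $s_p = 0.6601284$ and
$$T_{obs} = \frac{0.457}{0.6601284 \times \sqrt{\frac{1}{10} + \frac{1}{10}}} = 1.548$$
For the second data set, $s_p = 0.8696056$ and
$$T_{obs} = \frac{-0.243}{0.8696056 \times \sqrt{\frac{1}{10} + \frac{1}{10}}} = -0.62484$$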
Q1.c) [2 Marks] What is your conclusion about the test at 1% and 5% level of significance?
Q1.d) [2 Marks] Obtain a 90% confidence interval of the difference of average organic matter (%)
in the soils of the two tea gardens based on the test statistic value computed in Q1.b).
𝑡18;0.05 = 1.734064
90% confidence interval of the difference of average organic matter (%) in the
soils of the two tea gardens:
Business Management 2021-23 (Term-II)
Quantitative Techniques – II, Quiz 2 Solution, Full Marks: 15
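$$\left[\bar{X} - \bar{Y} \pm t_{18;0.05} \times s_p \sqrt{\frac{1}{n} + \frac{1}{m}}\right]$$
$= [-0.05492744,\ 0.96892744]$ for the first data set ($[-0.9173764,\ 0.4313764]$ for the second data set)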
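The computations in Q1.b) and Q1.d) can be reproduced with a short script. The following is a minimal sketch in Python (standard library only) for the first data set; the function name `pooled_t` is illustrative, and the critical value 1.734064 is copied from the solution above rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Return (mean difference, pooled SD, T statistic) assuming equal variances."""
    n, m = len(x), len(y)
    # statistics.variance is the sample variance (divisor n - 1)
    sp = sqrt(((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2))
    diff = mean(x) - mean(y)
    return diff, sp, diff / (sp * sqrt(1 / n + 1 / m))


tga = [2.03, 2.51, 2.21, 2.58, 2.83, 2.84, 2.39, 2.62, 3.79, 2.89]
tgb = [1.48, 3.03, 3.08, 2.02, 1.49, 1.15, 2.85, 1.59, 3.35, 2.08]

diff, sp, t_obs = pooled_t(tga, tgb)

t_crit = 1.734064  # t_{18;0.05}, taken from the solution text
half_width = t_crit * sp * sqrt(1 / len(tga) + 1 / len(tgb))
ci_90 = (diff - half_width, diff + half_width)

print(f"s_p = {sp:.7f}, T_obs = {t_obs:.3f}")
print(f"90% CI = [{ci_90[0]:.8f}, {ci_90[1]:.8f}]")
```

Passing the second pair of samples to `pooled_t` in the same way gives the second set of values quoted in the solution.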
Q2. State RTO of Jharkhand records indicate that of all vehicles undergoing emissions testing
during the previous year (July 2019-June 2020), 70% (65%) passed the initial test.
A random sample of 200(300) cars were tested in Jharkhand this year (July 2020-June 2021) and
124 (174) passed the initial test. The Jharkhand RTO suggests that the true proportion of cars
passing the initial test differs for this year compared to the previous year.
Q2. a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: 𝐻0 : 𝑝 = 0.7(0.65)
Alternative hypothesis: 𝐻1 : 𝑝 ≠ 0.7 (0.65)
Q2.b) [2 Marks] What is the test statistic, and what is its observed value based on the sample?
Formula for test statistic used:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$
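Observed value of the test statistic (show calculations):
$$z_{obs} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{\frac{124}{200} - 0.7}{\sqrt{\frac{0.7(1 - 0.7)}{200}}} = -2.468854$$
$$\left(= \frac{\frac{174}{300} - 0.65}{\sqrt{\frac{0.65(1 - 0.65)}{300}}} = -2.541956\right)$$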
Q2.c) [2 Marks] What is the p-value of the test statistic? What is your conclusion at 5% level of
significance?
p-value $= 2 \times P(Z > |z_{obs}|) = 0.01355467$
Conclusion at 5% (justify): α = 0.05; as p-value (0.01355467) < α (0.05), we reject $H_0$.
For the values in parentheses: p-value $= 2 \times P(Z > |z_{obs}|) = 0.01102342$
Conclusion at 5% (justify): α = 0.05; as p-value (0.01102342) < α (0.05), we reject $H_0$.
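The z statistic and two-sided p-value above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prop_z_test` is an illustrative choice, not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist


def prop_z_test(successes, n, p0):
    """One-sample z-test for a proportion; returns (z, two-sided p-value)."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z1, p1 = prop_z_test(124, 200, 0.70)  # first version of the question
z2, p2 = prop_z_test(174, 300, 0.65)  # version given in parentheses

print(f"z = {z1:.6f}, p-value = {p1:.8f}")
print(f"z = {z2:.6f}, p-value = {p2:.8f}")
```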
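Q2.d) [3 Marks] Compute the power of the test when the true proportion is 60% (55%), and the
test is conducted at 5% level of significance.
Reject $H_0$ if $\frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} < -z_{\alpha/2}$ or $\frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} > z_{\alpha/2}$
$$\Rightarrow \hat{p} < p_0 - z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}} \quad \text{or} \quad \hat{p} > p_0 + z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}}$$
For $p_0 = 0.7$, $n = 200$:
$\Rightarrow \hat{p} < 0.6364899$ or $\hat{p} > 0.7635101$
Under $H_1$, $\hat{p} \sim N\left(0.6, \frac{0.6 \times 0.4}{200}\right)$ (in the form $N(\mu, \sigma^2)$)
Power: $P(\hat{p} < 0.6364899) + P(\hat{p} > 0.7635101)$
For $p_0 = 0.65$, $n = 300$:
$\Rightarrow \hat{p} < 0.5960268$ or $\hat{p} > 0.7039732$
Under $H_1$, $\hat{p} \sim N\left(0.55, \frac{0.55 \times 0.45}{300}\right)$ (in the form $N(\mu, \sigma^2)$)
Power: $P(\hat{p} < 0.5960268) + P(\hat{p} > 0.7039732)$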
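The two probabilities in the power expression can be evaluated numerically under the normal approximation used above. The following is a sketch under that assumption; the function name `power_two_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(p0, p_true, n, alpha=0.05):
    """Return (lower cut-off, upper cut-off, power) for the two-sided z-test."""
    z = NormalDist().inv_cdf(1 - alpha / 2)           # z_{alpha/2}
    margin = z * sqrt(p0 * (1 - p0) / n)              # uses the null standard error
    lower, upper = p0 - margin, p0 + margin
    # Sampling distribution of p-hat under the stated true proportion
    alt = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    power = alt.cdf(lower) + (1 - alt.cdf(upper))
    return lower, upper, power


lo1, hi1, power1 = power_two_sided(0.70, 0.60, 200)
lo2, hi2, power2 = power_two_sided(0.65, 0.55, 300)

print(f"reject if p_hat < {lo1:.7f} or p_hat > {hi1:.7f}; power = {power1:.4f}")
print(f"reject if p_hat < {lo2:.7f} or p_hat > {hi2:.7f}; power = {power2:.4f}")
```

With these inputs the power comes out at roughly 0.85 for the first version and roughly 0.95 for the second; almost all of it comes from the lower rejection region, since the true proportion lies below $p_0$.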
Programme: Business Management 2021-23 (Term-II)
Course: Quantitative Techniques - II
Quiz 3: Solution
Q1. [This question has parts a, b, c and d] Rock-paper-scissors is a game played by two or more people where
players choose to sign either rock, paper, or scissors with their hands. You want to evaluate whether players
choose between these three options randomly, or if certain options are favoured above others. You ask two
friends to play rock-paper-scissors and count the times each option is played. The following table summarizes
the data:
You have to use these data to evaluate whether players choose between these three options randomly, or if
certain options are favoured above others.
Q1.a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: $H_0: p_R = p_P = p_S = \frac{1}{3}$, i.e., choice of options is random
Alternative hypothesis: $H_1$: At least one $p_i \neq \frac{1}{3}$, i.e., choice of options is not random
Q1.b) [4 Marks] What is the test statistic, and what is its observed value based on the sample? Show
computations.
Formula for test statistic used: Pearson’s Chi-Square Test Statistic
$$\chi_o^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$: number of observations in the i-th class, $E_i$: expected number of observations in the i-th class under $H_0$, k: number of classes, m: number of parameters estimated under $H_0$.
Under $H_0$, $\chi_o^2 \sim \chi_d^2$, where d = k − m − 1 = 3 − 0 − 1 = 2.
Computations for the test statistic:
Q2. [This question has parts a, b and c] The better-selling candies are often high in calories. The following
data show the calorie content for one serving (28g) from samples of M&M’s, Kit Kat, and Milky Way II.
M&M 232, 210, 240, 250, 225
Kit Kat 235, 205, 245, 215, 220
Milky Way II 200, 208, 202, 190, 180, 185, 189
Suppose the data are not known to be normally distributed. You have to perform an appropriate test to
check whether there is a difference in the calorie content for the three types of candies.
Q2. a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: 𝐻0 : 𝜇̃MM = 𝜇̃KK = 𝜇̃MW (𝜇̃i is the average calorie content)
Alternative hypothesis: 𝐻1 : at least one 𝜇̃i is different
Q2.b) [4 Marks] What is the test statistic, and what is its observed value based on the sample? Show
computations.
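Formula for test statistic used: Kruskal-Wallis Test
Test Statistic: $H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n + 1)$, and under the null hypothesis $H \sim \chi^2_{k-1}$, where k is the number of classes, $T_i$ is the sum of the ranks in the i-th class and $n_i$ is its size.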
Computations for the test statistic:
n1 = n2 = 5, n3 = 7, n = n1 + n2 + n3 = 17
$$H_{obs} = \frac{12}{17 \times 18}\left(\frac{66^2}{5} + \frac{58^2}{5} + \frac{29^2}{7}\right) - 3 \times 18 = 11.2605$$
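Ranks for the second version of the data (each entry is value, with its rank in parentheses):
M&M          Kit Kat      Milky Way II
232 (13)     235 (14)     200 (4)
210 (9)      205 (7)      208 (8)
240 (15)     245 (16)     202 (5)
250 (17)     215 (10)     190 (2)
225 (12)     220 (11)     180 (1)
204 (6)
197 (3)
T1 = 75      T2 = 58      T3 = 20
n1 = 7, n2 = n3 = 5, n = n1 + n2 + n3 = 17
$$H_{obs} = \frac{12}{17 \times 18}\left(\frac{75^2}{7} + \frac{58^2}{5} + \frac{20^2}{5}\right) - 3 \times 18 = 7.034174$$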
Q2.c) [2 Marks] What is the cut-off value for the test at 5% level of significance? What is your conclusion for
the test at 5% level of significance based on the cut-off value? Justify.
Cut-off value at 5% level of significance: $\chi^2_{2;0.05} = 5.991465$
Conclusion:
As $H_{obs}$ (11.2605) > $\chi^2_{2;0.05}$ (5.991465), we reject the null hypothesis at 5% level of significance.
As $H_{obs}$ (7.034174) > $\chi^2_{2;0.05}$ (5.991465), we reject the null hypothesis at 5% level of significance.
Thus, based on the sample we can conclude that there is some difference in the average calorie content of
the candies M & M, Kit Kat and Milky Way II.
Programme: Business Management 2021-23 (Term-II)
Course: Quantitative Techniques - II
End Term Solution
Q1. [10 Marks] Should illegal downloading of intellectual property (music, images, etc.) be punished?
This question was asked to 500 teenagers in a study. The teenagers were also asked whether they were
familiar with the laws against illegal downloading.
Q1.d) [5 Marks] What is its observed value of the test statistic based on the sample? Show computations:
$$\chi_o^2 = \frac{(209 - 177.292)^2}{177.292} + \frac{(140 - 171.708)^2}{171.708} + \frac{(45 - 76.708)^2}{76.708} + \frac{(106 - 74.292)^2}{74.292} = 38.16599$$
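The expected counts in this sum follow from the row and column totals of the 2×2 table of observed counts. The sketch below assumes the observed counts are arranged as `[[209, 140], [45, 106]]`; that arrangement reproduces the expected counts shown above (up to transposing the table), but the row and column labels are not given in this excerpt, so none are assigned here.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


observed = [[209, 140], [45, 106]]  # assumed layout, see the note above
chi2_obs, df = chi_square_independence(observed)
print(f"chi-square = {chi2_obs:.5f}, df = {df}")
```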
Q1.e) [1 Mark] Cut-off value for the test at 5% level of significance (mention the distribution used, along
with the appropriate degrees of freedom):
Degrees of freedom, d = (r-1)(c-1) = 1,
Under the null hypothesis, the test statistic follows $\chi^2_1$.
α = 0.05
Thus, cut-off value: $\chi^2_{1;0.05} = 3.84146$
Q1.f) [1 Mark] With justification, give your conclusion for the test at 5% level of significance:
As $\chi_o^2$ (38.16599) > $\chi^2_{1;0.05}$ (3.84146), we reject the null hypothesis at 5% level of significance.
Q2. [17 Marks] To predict price of houses (in Rs.’0000) based on area of the house (in SqFt), number of
bedrooms and bathrooms present in the house and whether there is a shopping mall (Mall) within 3km
radius of the house (Yes/No), a sample of size 30 has been collected and a multiple regression model is
fitted using R. Following is part of the output using the lm command:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.921451 2.897348 -0.663 0.51329
SqFt 0.002610 0.002123 1.230 0.23028
Bedrooms 2.134413 0.652694 3.270 0.00313 **
Bathrooms 0.702250 0.816212 0.860 0.39776
factor(Mall)Yes 2.631052 0.744950 3.532 0.00163 **
Following is a partial output obtained using the anova command in R on the fitted regression model:
Q2.c) [1 Mark] Predict the price of a house (in Rs.’0000) which does not have a shopping mall within 3km
radius and has an area of 1800 SqFt, 3 bedrooms and 2 bathrooms:
Predicted Price:
$\widehat{Price} = -1.921451 + 0.002610 \times 1800 + 2.134413 \times 3 + 0.702250 \times 2 = 10.58429$
Q2.d) [2 Marks] What is the adjusted R-squared value of the model? Write the formula and show your
computations.
$$R_a^2 = 1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)} = 1 - \frac{(n - 1)}{(n - k - 1)}(1 - R^2) = 1 - \frac{(30 - 1)}{(30 - 4 - 1)}(1 - 0.665) = 0.6114$$
n = 30 (sample size); k = 4 (There are four predictor variables)
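The prediction in Q2.c) and the adjusted R-squared in Q2.d) are both short arithmetic checks. A minimal sketch, with the coefficients copied from the `lm` output above; the dictionary keys and the function name `adjusted_r2` are illustrative only.

```python
coef = {
    "intercept": -1.921451,
    "SqFt": 0.002610,
    "Bedrooms": 2.134413,
    "Bathrooms": 0.702250,
    "MallYes": 2.631052,
}

# House with no mall within 3 km: the Mall indicator is 0
house = {"SqFt": 1800, "Bedrooms": 3, "Bathrooms": 2, "MallYes": 0}
price_hat = coef["intercept"] + sum(coef[name] * value for name, value in house.items())


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)


r2_adj = adjusted_r2(0.665, n=30, k=4)
print(f"predicted price = {price_hat:.5f}, adjusted R^2 = {r2_adj:.4f}")
```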
Q2.e) [9 Marks] For this part, you need to construct the regression ANOVA table, and perform a test of
significance for the fitted model at 5% level of significance.
i) [6 Marks] Showing the steps involved, construct the complete regression ANOVA table corresponding
to the regression model fitted.
Regression ANOVA Table:
iii) [1 Mark] Cut-off value for the test at 5% level of significance (mention the distribution used, along
with the appropriate degrees of freedom):
Under the null hypothesis, the test statistic follows an F distribution with degrees of freedom 4 and 25.
The cut-off value at α = 0.05 is $F_{4,25;0.05} = 2.76$
iv) [1 Mark] With justification, give your conclusion for the test at 5% level of significance:
At 5% level of significance, 𝐹𝑜𝑏𝑠 (12.41) > 𝐹4,25;0.05 (2.76), thus we reject the null hypothesis.
Q2.f) [3 Marks] For this part, the goal is to perform an appropriate t-test for the significance of the impact
of SqFt on price, controlling for the presence of the other variables in the model.
i) [1 Mark] Write the null and the alternative hypotheses in terms of your equation in part 2(a):
Null hypothesis: H0 : β1 = 0 Alternative hypothesis: H1 : β1 ≠ 0
ii) [1 Mark] With reason, write the appropriate degree of freedom for the t-test:
Number of observations, n = 30, number of independent variables used in the model, k = 4. Thus, degrees
of freedom for the t-test is, n-k-1 = 30 – 4 – 1 = 25.
iii) [1 Mark] With justification, give your conclusion for the test at 5% level of significance:
p-value for the t-test is: 0.23028 and α = 0.05. As p-value (0.23028) > α (0.05), we cannot reject the null
hypothesis H0 : β1 = 0.
Otherwise: 𝑡25;0.025 = 2.060 and |𝑡𝑏1 | = 1.230 (from the given output). As |𝑡𝑏1 | < 𝑡25;0.025, we cannot
reject the null hypothesis.
Q3. [7 Marks] An e-retail website uses a 100-point customer satisfaction score to rate themselves. From
the past experience with the satisfaction rating score, a population average of 80 with a population
standard deviation of 12 is expected. Due to the pandemic, the e-retail website thinks that their ratings
have gone up on an average.
A sample of 10 customer satisfaction scores collected in November 2021 are as follows: 95, 90, 83, 75,
95, 80, 83, 82, 93, 64. Suppose that the customer satisfaction scores data follow normal distribution. You
have to perform an appropriate test to check whether the average rating score has gone up for the e-
retail website.
Q3.a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: H0 : μ = 80 Alternative hypothesis: H1 : μ > 80
Q3.b) [1 Mark] What is the formula for the appropriate test statistic?
Test statistic: $z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
Q3. c) [2 Marks] What is the observed value of the test statistic based on the sample? Show
computations.
$\bar{X} = \frac{840}{10} = 84$, $\sigma = 12$, $n = 10$
$$z_{obs} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{84 - 80}{12/\sqrt{10}} = 1.054093 \approx 1.05$$
Q3.d) [2 Marks] What is the p-value for testing whether the average rating score has gone up? Justify.
p-value = P(Z > zobs ) = P(Z > 1.05) = 1 − P(Z ≤ 1.05) = 1 − 0.8531 = 0.1469
Q3. e) [1 Mark] What is your conclusion for the test at 5% level of significance? Justify.
Here α = 0.05 and p-value for the test is 0.1469.
As p-value (0.1469) > α (0.05), we do not reject the null hypothesis.
Otherwise: 𝑧0.05 = 1.65 and 𝑧𝑜𝑏𝑠 = 1.05. As 𝑧𝑜𝑏𝑠 < 𝑧0.05 , we cannot reject the null hypothesis.
Thus, based on the given sample we cannot conclude that the average rating score has gone up for the
e-retail website at 5% level of significance.
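The z-test in Q3 can be reproduced as follows. This is a sketch using the standard library only; note that the solution reads the p-value from a normal table at the rounded value z = 1.05, so the code reports both that table-style value and the p-value at the unrounded statistic.

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [95, 90, 83, 75, 95, 80, 83, 82, 93, 64]
mu0, sigma = 80, 12  # hypothesised mean and known population SD from the question

z_obs = (mean(scores) - mu0) / (sigma / sqrt(len(scores)))

std_normal = NormalDist()
p_exact = 1 - std_normal.cdf(z_obs)            # upper-tail p-value at unrounded z
p_table = 1 - std_normal.cdf(round(z_obs, 2))  # mimics a table lookup at z = 1.05

print(f"z_obs = {z_obs:.6f}")
print(f"p-value (table-style, z = 1.05) = {p_table:.4f}")
print(f"p-value (unrounded z) = {p_exact:.4f}")
```

Either way the p-value is well above 0.05, so the conclusion stated above does not depend on the rounding.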
Q4. a) [4 Marks] Suppose X1, X2, X3, …, Xn is an independently and identically distributed (IID) random
sample from Poisson(θ). Find a Method of Moment Estimator (MME) of θ.
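As $X \sim Poisson(\theta) \Rightarrow E(X) = \theta$
Also, the first sample moment is $\hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X}$
Equating the population moment to the sample moment gives $\hat{\theta} = \bar{X}$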
Q4. b) [2 Marks] Suppose that the number of phone calls received by a customer care service between
the hours 10AM-7PM follows a Poisson distribution with parameter θ. A random sample of the number
of phone calls received between the hours 10AM-7PM from Monday to Friday is: 35, 32, 45, 15, 21.
Based on the sample, compute the value of the MME of θ.
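Since part (a) gives the MME as the sample mean, the value for this sample is a one-line computation. A minimal sketch:

```python
from statistics import mean

calls = [35, 32, 45, 15, 21]  # Monday to Friday, from the question

# For a Poisson sample the method-of-moments estimate is the sample mean
theta_hat = mean(calls)
print(theta_hat)
```

The five counts sum to 148, so the estimate is 148/5 = 29.6.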
Programme: Business Management 2022-24 (Term-II)
Course: Quantitative Techniques - II
Quiz 1
Full Marks: 20
Q1. [This question has parts a, b, and c] Let X be a discrete random variable with the following
probability mass function, where 0 ≤ θ ≤ ½:
X       0      1            2          3
P(X)    1/4    1/4 + θ/2    1/2 − θ    θ/2
Q1.a) [5 Mark] Suppose you have an independent and identically distributed random sample
𝑋1 , 𝑋2 , … , 𝑋𝑛 of size n. Find a method of moment estimate (MME) of θ.
Q1.b) [3 Marks] Is the estimate obtained in part (a) unbiased for the parameter θ? Justify.
Q1.c) [2 Mark] Suppose the following sample was taken from the distribution of X: 0, 0, 2, 1, 3, 2, 1, 0,
2, 1. Compute the value of the method of moment estimate (MME) of θ based on the sample.
Solution:
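a) The first population moment is
$$\mu_1 = E(X) = 0 \times \frac{1}{4} + 1 \times \left(\frac{1}{4} + \frac{\theta}{2}\right) + 2 \times \left(\frac{1}{2} - \theta\right) + 3 \times \frac{\theta}{2} = \frac{5}{4}.$$
Since $\mu_1$ does not depend on θ, we have to consider the second moments.
The second population moment is
$$\mu_2 = E(X^2) = 0^2 \times \frac{1}{4} + 1^2 \times \left(\frac{1}{4} + \frac{\theta}{2}\right) + 2^2 \times \left(\frac{1}{2} - \theta\right) + 3^2 \times \frac{\theta}{2} = \frac{9}{4} + \theta.$$
The second sample moment is $\hat{\mu}_2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2$.
Thus, a method of moment estimate (MME) of θ is
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \frac{9}{4}.$$
b) $$E(\hat{\theta}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i^2 - \frac{9}{4}\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i^2) - \frac{9}{4} = \frac{1}{n}\left(\frac{9n}{4} + n\theta\right) - \frac{9}{4} = \frac{9}{4} + \theta - \frac{9}{4} = \theta.$$
Thus, the estimate obtained in part (a) is unbiased for the parameter θ.
c) $$\frac{1}{n} \sum_{i=1}^{n} X_i^2 = \frac{1}{10}(0 + 0 + 4 + 1 + 9 + 4 + 1 + 0 + 4 + 1) = \frac{12}{5}$$
Thus,
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \frac{9}{4} = \frac{12}{5} - \frac{9}{4} = \frac{3}{20}$$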
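The moment calculations in parts (a) and (c) can be verified with exact rational arithmetic. The following sketch uses Python's `fractions.Fraction`; the function names are illustrative, and θ = 1/5 is an arbitrary test value within the allowed range 0 ≤ θ ≤ ½.

```python
from fractions import Fraction


def population_moment(theta, power):
    """E(X^power) under the probability mass function given in the question."""
    pmf = {
        0: Fraction(1, 4),
        1: Fraction(1, 4) + theta / 2,
        2: Fraction(1, 2) - theta,
        3: theta / 2,
    }
    return sum(x ** power * p for x, p in pmf.items())


def mme_theta(sample):
    """Method-of-moments estimate: second sample moment minus 9/4."""
    second_sample_moment = Fraction(sum(x * x for x in sample), len(sample))
    return second_sample_moment - Fraction(9, 4)


theta = Fraction(1, 5)                 # arbitrary test value (an assumption)
first = population_moment(theta, 1)    # does not depend on theta
second = population_moment(theta, 2)   # equals 9/4 + theta

sample = [0, 0, 2, 1, 3, 2, 1, 0, 2, 1]
theta_hat = mme_theta(sample)
print(first, second, theta_hat)
```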
Q2. [4 Mark] An MLA wishes to survey residents of her assembly to see what proportion of the electorate
is supportive of her plan of using state funds to pay for installation of solar panels. What sample size is
necessary if the 95% confidence interval (CI) for the above proportion is to have a width of at most 0.10
irrespective of the value of the said proportion?
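Solution: The width of the 95% CI is to be at most 0.1, thus the margin of error is M = 0.1/2 = 0.05 and α = 0.05.
Now, the 100(1-α)% confidence interval for the population proportion is
$$\left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right].$$
Thus,
$$z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le M \Rightarrow n \ge \left(\frac{z_{\alpha/2}}{M}\right)^2 \hat{p}(1 - \hat{p}) \quad \ldots (a)$$
Now, $z_{\alpha/2} = z_{0.025} = 1.96$, so from (a),
$$n \ge \left(\frac{z_{\alpha/2}}{M}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{1.96}{0.05}\right)^2 \hat{p}(1 - \hat{p}) = 1536.64 \times \hat{p}(1 - \hat{p})$$
To find the sample size that works irrespective of the proportion, we need to maximize the right-hand side of (a) with respect to $\hat{p}$. It is maximized when $\hat{p} = 0.5$.
Thus, sample size,
$$n \ge 1536.64 \times 0.5 \times (1 - 0.5) = 384.16$$
$\Rightarrow n \ge 385$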
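The worst-case sample size can be computed directly from the bound (a). A minimal sketch, assuming the same normal-approximation interval; the function name `sample_size_for_proportion` is hypothetical, and the quantile is computed rather than rounded to 1.96, which does not change the answer here.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(width, confidence=0.95, p=0.5):
    """Smallest n so that the CI for a proportion has at most the given width.

    p = 0.5 maximizes p * (1 - p), so it covers every possible proportion.
    """
    margin = width / 2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z / margin) ** 2 * p * (1 - p))


n_required = sample_size_for_proportion(0.10)
print(n_required)
```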
Q3. [This question has parts a and b] A sample of 10 bills for meals was obtained from a restaurant.
For each of the 10 bills the tip by the customers was computed as a percentage of the bill. Following are
the 10 tip percentages:
14.21, 20.24, 20.10, 14.94, 15.69, 15.04, 12.04, 20.16, 17.85, 16.35
Assume that the tip percentage follows a normal distribution.
Q3.b)[4 Marks] Perform an appropriate test at 10% level of significance to conclude whether on an
average the tip percentage for this restaurant exceeds the standard 15% by constructing the null and
alternate hypothesis and a proper test statistic.
Solution:
Null Hypothesis: $H_0: \mu = 15$, Alternate Hypothesis: $H_1: \mu > 15$
Test statistic: $t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = 1.854534$. Critical value: $t_{n-1;\alpha} = t_{9;0.1} = 1.383$
As $t > t_{9;0.1}$, we reject the null hypothesis, i.e., on average the tip percentage for this restaurant exceeds the standard
15%.
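The observed t statistic can be recomputed from the ten tip percentages. This is a minimal sketch using the standard library; the critical value 1.383 is taken from the solution above rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

tips = [14.21, 20.24, 20.10, 14.94, 15.69, 15.04, 12.04, 20.16, 17.85, 16.35]
mu0 = 15  # the "standard" tip percentage under the null hypothesis

# statistics.stdev is the sample standard deviation (divisor n - 1)
t_obs = (mean(tips) - mu0) / (stdev(tips) / sqrt(len(tips)))

t_crit = 1.383  # t_{9;0.1}, copied from the solution text
print(f"t_obs = {t_obs:.6f}, reject H0: {t_obs > t_crit}")
```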
Programme: Business Management 2022-24 (Term-II)
Course: Quantitative Techniques - II
Quiz 2 Solution
Q1. [This question has parts a, b, c and d] Homes are typically appraised before sale. Lenders
such as banks have an incentive to assign a higher value to a house (so the home loan will be larger),
while borrowers might be inclined to value the same house at a lower price.
A random sample of 8 residential properties being purchased in Jamshedpur after foreclosure was
selected. Each property was appraised both by the borrower and by the lender, resulting in the
following data (in Lakhs of Rs.).
House                   1      2      3      4      5      6      7      8
Lender’s appraisal      24.8   31.1   21.13  76.9   45.3   52.5   71.3   58.2
Borrower’s appraisal    18.6   21.8   14.6   67.5   35.8   42.2   56.5   50.2
Q1. Suppose that the appraisals follow a normal distribution.
Q1.a) [1 Mark] State the null and the alternative hypotheses to test whether the lender’s assessment
is higher than the borrower’s assessment.
Solution: 𝐻0 : 𝜇𝐿 = 𝜇𝐵 , 𝐻1 : 𝜇𝐿 > 𝜇𝐵
Q1.b) [1 Mark] Which of the following tests would you perform to test the hypothesis in Q1.a)?
A) Kruskal-Wallis Test
B) Two-Sample Paired t-Test
C) Two Sample Independent t-Test
D) Chi-Square Test for independence
Q1.c) [4 Marks] Calculate the value of appropriate test statistic to test the hypotheses in Q1.a).
Solution:
House   Lender’s Appraisal   Borrower’s Appraisal   $D_i$ (Lender − Borrower)
1       24.8                 18.6                   6.2
2       31.1                 21.8                   9.3
3       21.13                14.6                   6.53
4       76.9                 67.5                   9.4
5       45.3                 35.8                   9.5
6       52.5                 42.2                   10.3
7       71.3                 56.5                   14.8
8       58.2                 50.2                   8.0
$\bar{D} = 9.25375$, $s_D = 2.67944$, $n = 8$
Test Statistic: $T_{obs} = \frac{\bar{D}}{s_D/\sqrt{n}} = 9.768294$
Q1.d) [2 Marks] What is your conclusion at 10% level of significance?
Solution: Critical value: 𝑡7;0.1 = 1.415
As 𝑇𝑜𝑏𝑠 > 𝑡7;0.1 , we reject the null hypothesis, i.e. the lender’s assessment is higher than the
borrower’s assessment.
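The paired t statistic follows from the differences $D_i$ alone. A minimal sketch with the standard library; the critical value 1.415 is copied from the solution above.

```python
from math import sqrt
from statistics import mean, stdev

lender = [24.8, 31.1, 21.13, 76.9, 45.3, 52.5, 71.3, 58.2]
borrower = [18.6, 21.8, 14.6, 67.5, 35.8, 42.2, 56.5, 50.2]

# Paired design: work with the per-house differences only
d = [l - b for l, b in zip(lender, borrower)]
d_bar, s_d = mean(d), stdev(d)
t_obs = d_bar / (s_d / sqrt(len(d)))

t_crit = 1.415  # t_{7;0.1}, taken from the solution text
print(f"D_bar = {d_bar:.5f}, s_D = {s_d:.5f}, T_obs = {t_obs:.6f}")
print("reject H0" if t_obs > t_crit else "do not reject H0")
```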
Q2. [This question has parts a, b and c] An advertising agency is trying to test the effects of the
size and the design of a website advertisement based on the number of clicks it receives for two
months. There are three different advertising designs (Designs A, B and C) and two different sizes
(Small and Large) for the advertisement.
Following partial output was obtained from R upon performing an analysis of variance (ANOVA):
Response: Ad_clicks
Sum Sq Df
Ad_Size 48 1
Ad_Design 344 2
Ad_Size:Ad_Design 56 2
Residuals 96 6
Q2. a) [6 Marks] Construct the full two-way ANOVA table with interaction based on the given
information. Indicate the formulas used.
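The missing columns of the two-way ANOVA table follow from the partial output using Mean Sq = Sum Sq / Df and F = Mean Sq / MSE. The following is a sketch of that arithmetic; variable names are illustrative.

```python
# (Sum Sq, Df) for each source, copied from the partial R output
sources = {
    "Ad_Size": (48, 1),
    "Ad_Design": (344, 2),
    "Ad_Size:Ad_Design": (56, 2),
}
ss_error, df_error = 96, 6

mse = ss_error / df_error  # mean square error
table = {}
for name, (ss, df) in sources.items():
    mean_sq = ss / df
    table[name] = {"SS": ss, "Df": df, "MS": mean_sq, "F": mean_sq / mse}

ss_total = sum(ss for ss, _ in sources.values()) + ss_error
df_total = sum(df for _, df in sources.values()) + df_error

for name, row in table.items():
    print(f"{name:<20}{row['SS']:>6}{row['Df']:>4}{row['MS']:>8.2f}{row['F']:>8.2f}")
print(f"{'Residuals':<20}{ss_error:>6}{df_error:>4}{mse:>8.2f}")
print(f"{'Total':<20}{ss_total:>6}{df_total:>4}")
```

The F values for the interaction and for Ad_Design agree with the test statistic values used in parts Q2.b) iii) and Q2.c) iii).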
Q2.b) Suppose you want to test for possible interaction between website advertisement size and
website advertisement design for the number of clicks.
Q2.b) i) [1 Mark] Construct the null and alternative hypotheses to perform the test.
Q2.b) ii) [1 Mark] Which of the following would be the critical value for the test if you are testing
at 5% level of significance?
Solution:
A) 𝐹2,11;0.05 = 3.982298
B) 𝐹6,2;0.05 = 19.32953
C) 𝐹11,2;0.05 = 19.40496
D) 𝐹2,6;0.05 = 5.143253
Q2.b) iii) [2 Marks] What is the value of the test statistic for this test? What is your conclusion
about the test at 5% level of significance? (Justify)
Solution: Test statistic value, $F_{AB} = \frac{MS_{AB}}{MSE} = \frac{56/2}{96/6} = 1.75$
Critical value: $F_{2,6;0.05} = 5.143253$.
As $F_{AB} < F_{2,6;0.05}$, we cannot reject the null hypothesis, i.e., there is no significant evidence of interaction between Ad size
and Ad Design.
Q2.c) Suppose you now want to test for the effect of the website advertisement design for the
number of clicks.
Q2.c) i) [1 Mark] Construct the null and alternative hypotheses to perform the test.
Solution: 𝐻0 : μA = 𝜇𝐵 = 𝜇𝐶
𝐻1 : At least one μi is different
Q2.c) ii) [1 Mark] Which of the following would be the critical value for the test if you are testing
at 5% level of significance?
Solution:
A) 𝐹2,11;0.05 = 3.982298
B) 𝐹6,2;0.05 = 19.32953
C) 𝐹11,2;0.05 = 19.40496
D) 𝐹2,6;0.05 = 5.143253
Q2.c) iii) [2 Marks] What is the value of the test statistic for this test? What is your conclusion
about the test at 5% level of significance? (Justify)
Solution: Test statistic value, $F_B = \frac{MS_B}{MSE} = \frac{344/2}{96/6} = 10.75$
Critical value: $F_{2,6;0.05} = 5.143253$.
As 𝐹𝐵 > 𝐹2,6;0.05 , we reject the null hypothesis, i.e., there is some effect of the website
advertisement design for the number of clicks.
Business Management 2022-24 (Term-II)
Quantitative Techniques – II, End Term Solution
Full Marks: 40(+4 Bonus), Time: 90 Minutes
Q1. [10 Marks: This question has parts a, b, c and d] There are three different travel websites (Website
A, B and M) which are offering competitive prices (rent per night in Rs.) for hotels. The following data show
the samples of double room rent per night for budget hotels at a hill station during some days in January from
the three websites:
Website A 1200, 1190, 1180, 1208, 1120
Website B 2210, 1235, 1220, 1185, 1175, 1335, 1189
Website M 1135, 2190, 1300, 1150, 1380
Suppose the data are not known to be normally distributed. You have to perform an appropriate test to check
whether there is a difference in the room rents offered by the three websites.
Q1.a) [1 Mark] Formulate the null and the alternative hypotheses.
Solution: $H_0: \mu_A = \mu_B = \mu_M$, $H_1$: at least one $\mu_i$ is different
Q1.b) [1 Mark] Which of the following tests you would be performing to test the above hypothesis? Choose
and write the correct alternative:
Solution: B) Kruskal-Wallis Test
Q1.c) [6 Marks] What is the test statistic (write the formula), and what is its observed value based on the
sample? Show computations.
Solution: Test statistic: $H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n + 1)$
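The observed value of H can be obtained by ranking the pooled sample and summing the ranks within each website. The sketch below implements the formula above directly and assumes there are no tied values (true for this data set); the function name `kruskal_wallis` is illustrative.

```python
def kruskal_wallis(groups):
    """Return (rank sums, H) for the Kruskal-Wallis test, assuming no ties."""
    pooled = sorted(value for group in groups for value in group)
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied values present; this sketch does not handle ties")
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # rank 1 = smallest
    n = len(pooled)
    rank_sums = [sum(rank[v] for v in group) for group in groups]
    h = 12 / (n * (n + 1)) * sum(
        t ** 2 / len(group) for t, group in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    return rank_sums, h


website_a = [1200, 1190, 1180, 1208, 1120]
website_b = [2210, 1235, 1220, 1185, 1175, 1335, 1189]
website_m = [1135, 2190, 1300, 1150, 1380]

rank_sums, h_obs = kruskal_wallis([website_a, website_b, website_m])
print(rank_sums, round(h_obs, 6))
```

The resulting value agrees with the observed test statistic quoted in the conclusion of Q1.d).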
Q1.d) [2 Marks] What is the critical value for the test at 5% level of significance? What is your conclusion
for the test at 5% level of significance based on the cut-off value? Justify.
Solution: $H \sim \chi^2_{k-1}$
Reject $H_0$ at level of significance α if $H > \chi^2_{k-1;\alpha}$
Critical value at 5% level of significance: $\chi^2_{2;0.05} = 5.99147$
As Critical value at 5% level of significance (5.99147) > the observed test statistic value (1.613445), we
cannot reject the null hypothesis, i.e., there is no significant evidence of difference in room rents due to
different websites.
Q2. [14 Marks: This question has parts a, b, c, d, e and f] The HR department of a company wants to
know how monthly income in Rs. (income), amount spent monthly on fast food in Rs. (fast_food_spend),
and the time spent in minutes for exercise per day (Exercise_per_day) affect whether an employee of that
company would contract heart disease. The response variable, heart disease contracted (coded as 1)/ not
contracted (coded as 0), is a binary variable. The following logistic regression model is fitted to the data:
glm(heart_disease ~ income + fast_food_spend + Exercise_per_day, family =
binomial(link="logit")).
The partial summary is as follows:
Coefficients:
Estimate Std. Error z value
(Intercept) 3.846 2.459e+00 1.564
income 3.388e-05 4.177e-05 0.811
fast_food_spend 2.259e-03 1.028e-03 2.197
Exercise_per_day -3.570e-01 1.121e-01 -3.185
Q2.c) [1 Mark] Compute the predicted logit score of heart disease for an employee with monthly income
Rs. 40000, who spends Rs. 2000 on fast food monthly and does 20 minutes exercise per day.
Solution: logit score:
$\log\frac{\hat{\pi}_i}{1 - \hat{\pi}_i} = 3.846 + 0.00003388 \times 40000 + 0.002259 \times 2000 - 0.357 \times 20 = 2.5792$
Q2.d) [2 Marks] Predict the probability of heart disease for the employee mentioned in part (c) (income =
40000, fast_food_spend = 2000 and Exercise_per_day = 20).
Solution: Probability, $\hat{p}_i = \frac{\exp(2.5792)}{1 + \exp(2.5792)} = 0.9295109$
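Parts (c) and (d) amount to evaluating the linear predictor and passing it through the logistic function. A minimal sketch, with the coefficients copied from the `glm` summary above; the dictionary layout is an illustrative choice.

```python
from math import exp

coef = {
    "intercept": 3.846,
    "income": 3.388e-05,
    "fast_food_spend": 2.259e-03,
    "Exercise_per_day": -3.570e-01,
}
employee = {"income": 40000, "fast_food_spend": 2000, "Exercise_per_day": 20}

# Linear predictor (logit score): intercept plus sum of coefficient * value
logit = coef["intercept"] + sum(coef[k] * v for k, v in employee.items())

# Inverse logit: exp(x) / (1 + exp(x)) written in its equivalent form
prob = 1 / (1 + exp(-logit))

print(f"logit = {logit:.4f}, probability = {prob:.7f}")
```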
Q2.e) [5 Marks] Perform a G-test for the model: state the null and alternative hypotheses, compute the
value of the test statistic, the corresponding degree of freedom and perform the test at a 5% level of
significance.
Solution: G-Test
$H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: at least one $\beta_i \neq 0$
Test statistic value = $\chi^2_{obs}$ = Null Deviance – Residual Deviance = 66.406 – 21.217 = 45.189
Degree of freedom = Null Deviance degrees of freedom – Residual Deviance degrees of freedom = 49 – 46
= 3.
Critical value: $\chi^2_{3;0.05} = 7.815$.
As $\chi^2_{obs}$ (45.189) > $\chi^2_{3;0.05}$ (7.815), we reject the null hypothesis, i.e., there is at least one significant variable in the
model.
Q2.f) [4 Marks] Perform a test of significance for the coefficient of the independent variable income in the
fitted model: state the null and alternative hypotheses, compute the p-value for the test statistic and hence
perform the test at 5% level of significance. What is the conclusion?
Solution: Test of significance for the coefficient of the independent variable income in the fitted model:
𝐻0 : 𝛽1 = 0, 𝐻1 : 𝛽1 ≠ 0
Test statistic value 𝑍𝑜𝑏𝑠 = 0.811
p-value = 2 × 𝑃(𝑍 > 0.811) = 2 × (1 − 0.791) = 0.418
Now 𝛼 = 0.05.
As p − value (0.418) > 𝛼(0.05), we cannot reject the null hypothesis, i.e., the income variable in the
fitted model is not significant.
Q3. [20 Marks: This question has parts a, b, c, d, e, f and g] To predict the price of houses (in Rs.
lakhs) based on area of the house (in SqFt), number of bedrooms (Bed) and bathrooms (Bath) present in
the house and the neighbourhood (NBD: East/North/West) where the house is situated, a sample of 128
houses is collected and a multiple regression model is fitted using R.
Following is a part of the output obtained using the lm command:
Coefficients:
Estimate Std. Error
(Intercept) 3.0079 1.3381
SqFt 0.0035 0.0008
Bed 0.0703 0.2307
Bath 0.9479 0.3090
factor(NBD)North -0.9667 0.3225
factor(NBD)West 2.9214 0.3485
Solution:
$\widehat{Price} = 3.0079 + 0.0035\,SqFt + 0.0703\,Bed + 0.9479\,Bath - 0.9667\,I_{North} + 2.9214\,I_{West}$
Q3.c) [1 Mark] Predict the price of a house (in Rs. lakhs) which has area 1200 Sq Ft, 2 bedrooms, 1
bathroom and is in the eastern neighbourhood (NBD = EAST).
Solution: $\widehat{Price} = 3.0079 + 0.0035 \times 1200 + 0.0703 \times 2 + 0.9479 \times 1 = 8.2964$,
i.e., Rs. 8.2964 lakhs.
Q3.d) [2 Marks] What is the adjusted R-squared value of the model? Write the formula and show your
computations.
Solution: The multiple R-squared value = 0.7089. Thus adjusted R-squared,
$$R_a^2 = 1 - \frac{(n - 1)}{(n - k - 1)}(1 - R^2) = 1 - \frac{(128 - 1)}{(128 - 5 - 1)}(1 - 0.7089) = 0.6969697$$
n = 128 (sample size); k = 5 (there are five predictor variables, counting the two indicator variables for NBD)
Q3.e) For this part, you need to construct the regression ANOVA table, and perform a test of significance
for the fitted model at 5% level of significance. Your solution needs to cover the following steps:
i) [6 Marks] Construction of the complete regression ANOVA table corresponding to the regression
model fitted, showing necessary calculations.
Solution: Regression ANOVA Table:
Source     | DF            | Sum of Squares                                        | Mean Square                          | F-stat
Regression | 5             | SSR = 280.364 + 79.921 + 42.726 + 246.954 = 649.965   | MSR = SSR/DF = 649.965/5 = 129.993   | F_obs = MSR/MSE = 129.993/2.187607 = 59.42246
Residuals  | 122           | SSE = 266.888                                         | MSE = SSE/DF = 266.888/122 = 2.187607 |
Total      | 5 + 122 = 127 | SST = SSR + SSE = 649.965 + 266.888 = 916.853         |                                      |
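The entries of the regression ANOVA table can be recomputed from the four sums of squares and the residual sum of squares that appear in it. A minimal sketch of that arithmetic; the variable names are illustrative.

```python
# Sums of squares that add up to SSR in the table (total of 5 regression df)
regression_ss_terms = [280.364, 79.921, 42.726, 246.954]
sse, df_regression, df_residual = 266.888, 5, 122

ssr = sum(regression_ss_terms)
sst = ssr + sse
msr = ssr / df_regression   # mean square for regression
mse = sse / df_residual     # mean square error
f_obs = msr / mse
r_squared = ssr / sst       # should agree with the multiple R-squared

print(f"SSR = {ssr:.3f}, SST = {sst:.3f}")
print(f"MSR = {msr:.3f}, MSE = {mse:.6f}, F_obs = {f_obs:.5f}")
print(f"R^2 = {r_squared:.4f}")
```

The ratio SSR/SST reproduces the multiple R-squared value 0.7089 used in Q3.d), which is a useful consistency check on the table.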
ii) [1 Mark] Statement of the null and the alternate hypotheses for the testing process.
Solution: Null hypothesis: 𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 𝛽4 = 𝛽5 = 0,
Alternative hypothesis: 𝐻1 : at least one 𝛽𝑖 ≠ 0.
iii) [1 Mark] Critical value for the test at 5% level of significance (with brief reasoning, choose from
the following):
𝐹5,122;0.05 = 2.288 , 𝐹122,5;0.05 = 4.398, 𝐹4,122;0.05 = 2.446, 𝐹122,4;0.05 = 5.658
Solution: Under the null hypothesis, the test statistic follows a F distribution with degrees of freedom 5
and 122. The critical value at α = 0.05 is 𝐹5,122;0.05 = 2.288
iv) [1 Mark] Your conclusion for the test at 5% level of significance (with justification).
Solution: At 5% level of significance, 𝐹𝑜𝑏𝑠 (59.42246) > 𝐹5,122;0.05 (2.288), thus we reject the null
hypothesis.
Q3.f) For this part, the goal is to perform an appropriate t-test for the significance of the impact of
number of Bedrooms (Bed) on price, controlling for the presence of the other variables in the model.
Your solution needs to cover the following steps:
i) [1 Mark] Statement of the null and the alternate hypotheses for the testing process in term of your
model equation in part 3(a).
Solution: Null hypothesis: H0 : β2 = 0, Alternative hypothesis: H1 : β2 ≠ 0
ii) [1 Mark] Computation of the appropriate degree of freedom for the t-test (with brief reason).
Solution: Degrees of freedom = n – k – 1 = 128 – 5 – 1 = 122
iii) [1 Mark] Your conclusion for the test at 5% level of significance (with justification).
Solution: α = 0.05.
Test statistic: $t_{obs} = \frac{0.0703}{0.2307} = 0.3047248$
$t_{122;0.025} = 1.96$
As 𝑡𝑜𝑏𝑠 < 𝑡122;0.025 , we cannot reject the null hypothesis, i.e., the number of Bedrooms (Bed) variable
in the fitted model is not significant.
Q3.g) [3 Marks] Using the output given, justify whether there is a significant impact of the
neighbourhood (NBD) on price, controlling for the presence of the other variables in the model.
Solution: The t-value for factor(NBD)West is 2.9214/0.3485 = 8.382783, which is far larger than the critical value, so this
coefficient is significant in the model. As the level West of the categorical variable neighbourhood (NBD) is
significant, the neighbourhood (NBD) has a significant impact on price, controlling for the presence of the other
variables in the model.
Programme: Business Management 2023-25 (Term-II)
Course: Quantitative Techniques - II
Quiz 1 Solution, Full Marks: 20, Time: 30 Minutes
Q1. [1 Mark] (Choose the correct option) A simple random sample of size n is one in which
A) The probability of picking any element of the population must be 1/n.
B) Each possible sample of size n has equal chance of getting selected.
C) Every nth member is selected from the population.
D) You keep sampling until you have a fixed number of elements of the population having various
characteristics.
E) Randomly selected from the population using any simple sampling scheme.
Q2. [1 Mark] (Choose the correct option) Which of the following is never used in computing the margin
of error of any confidence interval for the population mean?
A) Sample mean
B) Sample standard deviation
C) Sample size
D) Confidence level
E) Population standard deviation
Q3. [1 Mark] (Choose the correct option) Which of the following statements regarding Testing of
Hypotheses is not true:
A) The probability of making a Type I Error is called the level of the test.
B) The probability of rejecting the null hypothesis, when it is false, is known as the power of the test.
C) The sum of the probabilities of a Type I Error and a Type II Error is 1.
D) As the Type I error is more serious, it is ideal to avoid making Type I error as much as we can.
E) The probability of the Type II Error depends on the true value of the parameter.
Q4. [This question has parts a, b, c, d] Suppose 𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 is an IID (independently and identically
distributed) sample from a Poisson population with parameter µ.
a)[2 Marks] What is the likelihood function for µ using the sample?
b)[4 Marks] Showing all relevant steps, obtain the maximum likelihood estimator for the parameter µ.
c)[1 Mark] Show that the obtained maximum likelihood estimator is an unbiased estimator of µ.
d)[2 Marks] Each of the 150 newly manufactured items from a manufacturing process is examined and
the number of scratches per item is recorded (the items are supposed to be free of scratches), yielding
the following data:
Number of scratches per item    0    1    2    3    4    5    6    7
Observed frequency             18   37   42   30   13    7    2    1
If the number of scratches on a randomly chosen item follows a Poisson distribution with parameter µ,
compute the maximum likelihood estimate (MLE) for µ based on the data.
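b) The log-likelihood function (up to an additive constant that does not depend on µ):
\[ \ln L(\mu) = -n\mu + \left(\sum_{i=1}^{n} X_i\right) \ln \mu \]
Taking the first derivative with respect to µ of the log-likelihood function:
\[ \frac{d \ln L(\mu)}{d\mu} = -n + \frac{1}{\mu}\left(\sum_{i=1}^{n} X_i\right) \]
Now,
\[ \frac{d \ln L(\mu)}{d\mu} = 0 \implies \frac{1}{\mu}\left(\sum_{i=1}^{n} X_i\right) = n \implies \hat{\mu} = \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right) \implies \hat{\mu} = \bar{X} \]
Taking the second derivative with respect to µ of the log-likelihood function:
\[ \frac{d^2 \ln L(\mu)}{d\mu^2} = -\frac{1}{\mu^2}\left(\sum_{i=1}^{n} X_i\right) \]
When $\hat{\mu} = \bar{X}$, $\dfrac{d^2 \ln L(\mu)}{d\mu^2} = -\dfrac{n}{\bar{X}} < 0$.
Thus, MLE of parameter µ is $\bar{X}$.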
c) To show $E(\bar{X}) = \mu$.
\[ E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n} \times n\mu = \mu \]
d) From the dataset,
\[ \bar{X} = \frac{1}{150}(0 \times 18 + 1 \times 37 + 2 \times 42 + 3 \times 30 + 4 \times 13 + 5 \times 7 + 6 \times 2 + 7 \times 1) = \frac{317}{150} = 2.11333 \]
The maximum likelihood estimate (MLE) for µ based on the data is 2.11333.
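The arithmetic in part (d) can be checked with a short Python sketch. The variable names are illustrative and not part of the original solution; the only fact used is that the Poisson MLE equals the sample mean, as derived in part (b).

```python
# Frequency table from part (d): number of scratches -> observed frequency
scratch_counts = [0, 1, 2, 3, 4, 5, 6, 7]
frequencies = [18, 37, 42, 30, 13, 7, 2, 1]

n_items = sum(frequencies)  # total number of items examined
total_scratches = sum(x * f for x, f in zip(scratch_counts, frequencies))

# For an IID Poisson sample the MLE of mu is the sample mean
mu_hat = total_scratches / n_items
print(n_items, total_scratches, round(mu_hat, 5))
```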
Q5. [This question has parts a, b, c, d, e] Before opening a new store in a new town, a national sandwich
chain conducted a survey to investigate whether there is sufficient demand for their products. They
contacted 300 households in that town through random-digit dialling, and 198 respondents indicated they
would patronize this shop. Let p be the proportion of all households in this town that would patronize the
sandwich franchise.
a) [1 Mark] The sample proportion $\hat{p} = \frac{198}{300} = 0.66$ (Only write the correct value; you do not need to
give any justification.)
b) [1 Mark] Write the formula for a 100(1 − 𝛼)% confidence interval for p:
\[ \left[\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] \]
c) [2 Marks] Using the formula written in part (b), compute a 95% confidence interval for p (Show your
computations). Note: z0.05 = 1.645, z0.025 = 1.96, z0.005 = 2.58.
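Solution: Here $\alpha = 0.05$, thus $Z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$.
Thus,
\[ \left[\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] = \left[0.66 - 1.96\sqrt{\frac{0.66(1-0.66)}{300}},\ 0.66 + 1.96\sqrt{\frac{0.66(1-0.66)}{300}}\right] \]
\[ = [0.66 - 0.05360519,\ 0.66 + 0.05360519] = [0.6063948,\ 0.7136052] \]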
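The interval in part (c) can be reproduced with a minimal Python sketch of the formula from part (b). The helper name `proportion_ci` is hypothetical; the critical value 1.96 is the one quoted in the question.

```python
import math

def proportion_ci(successes, n, z):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - margin, p_hat + margin

# 198 of 300 households, 95% confidence (z_{0.025} = 1.96)
lower, upper = proportion_ci(198, 300, 1.96)
print(round(lower, 7), round(upper, 7))
```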
d) [2 Marks] Which of the following/s is/are the correct interpretation for the confidence interval
computed in part (c) of this problem? (Only choose the correct option/s; you do not need to give any
justification.)
A) We can be 95% confident that 𝑝̂ is between the confidence interval computed in part (c) of this
problem.
B) We can be 95% confident that p is between the confidence interval computed in part (c) of this
problem.
C) 95% of the values of 𝑝̂ for this sample are between the confidence interval computed in part (c) of
this problem.
D) If random samples of size 300 were repeatedly selected, then 95% of the time 𝑝̂ would fall between
the confidence interval computed in part (c) of this problem.
E) If random samples of size 300 were repeatedly selected, then in the long run 95% of the confidence
intervals formed would contain the true value of p.
e) [1 Mark] What is the minimum sample size required if the 95% confidence interval (CI) for the above
proportion is to have a width of at most 0.2 (0.4)? (Only choose the correct option/s; you do not need
to give any justification.)
A) 16 B) 22 C) 62 D) 87 E) 150
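f) [1 Mark] Write the formula used to obtain the answer in part (e):
Solution: $n \geq \left(\dfrac{z_{\alpha/2}}{M}\right)^2 \hat{p}(1 - \hat{p})$, where M is the margin of error (half the width of the confidence interval).
When the width is at most 0.2, M = 0.1; with $\hat{p} = 0.66$ and $z_{0.025} = 1.96$, $n \geq \left(\dfrac{1.96}{0.1}\right)^2 0.66(1 - 0.66) = 86.2055 \implies n \geq 87$
When the width is at most 0.4, M = 0.2; with $\hat{p} = 0.66$ and $z_{0.025} = 1.96$, $n \geq \left(\dfrac{1.96}{0.2}\right)^2 0.66(1 - 0.66) = 21.55138 \implies n \geq 22$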
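A short Python sketch of the sample-size formula in part (f), rounding up to the next whole household. The helper name `min_sample_size` is an assumption for illustration; M is taken to be half the required interval width.

```python
import math

def min_sample_size(p_hat, z, margin):
    """Smallest n with z * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    return math.ceil((z / margin) ** 2 * p_hat * (1 - p_hat))

n_width_02 = min_sample_size(0.66, 1.96, 0.1)  # width at most 0.2, so M = 0.1
n_width_04 = min_sample_size(0.66, 1.96, 0.2)  # width at most 0.4, so M = 0.2
print(n_width_02, n_width_04)
```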
[End of paper]
Programme: Business Management 2023-25 (Term-II)
Course: Quantitative Techniques - II
Quiz 2 Solution
Full Marks: 20
Q1. [This question has parts a and b] The screening process for detecting AI-written content is not perfect.
Researchers have developed AI detection software that is considered fairly reliable. It correctly identifies 89% of
AI-written content. However, it also erroneously flags 5% of content that is not AI generated as AI generated. Based
on the null hypothesis “the written content is not AI generated” and the alternative hypothesis “the written content
is AI generated”, answer the following questions:
a) [1 Mark] (Fill in the blanks) The probability of Type I error is 0.05
b) [1 Mark] (Fill in the blanks) The power of the test is 0.89
Q2. [This question has parts a, b, c and d] An aspirin manufacturer fills bottles by weight rather than by count.
Since each bottle should contain 100 tablets, the average weight per tablet should be 5 grains. Each of 100 tablets
taken from a very large lot is weighed, resulting in a sample average weight per tablet of 4.87 grains and a sample
standard deviation of 0.5 grain.
Q2. a) [1 Mark] Write the appropriate hypotheses to test whether this information provides strong evidence for
concluding that the company is not filling its bottles as advertised.
Null hypothesis: 𝐻0 : 𝜇 = 5, Alternative hypothesis: 𝐻1 : 𝜇 ≠ 5
Q2.b) [1 Mark] Formula for the test statistic used: Solution: $t_{obs} = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
Q2.c) [2 Marks] What is its observed value of the test statistic based on the sample? Show computations.
Solution: $n = 100$, $\bar{X} = 4.87$, $s = 0.5$
\[ t_{obs} = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{4.87 - 5}{0.5/\sqrt{100}} = -2.6 \]
Q2.c) [2 Marks] What is the p-value of the test? (Write the expression and show computation.)
Solution: Under the null hypothesis, the test statistic 𝑡𝑜𝑏𝑠 ~𝑡𝑛−1 . As the sample size is 100, we can replace the t-
distribution (𝑡𝑛−1 ) by standard normal distribution (Z). Thus, we compute the p-value using the standard normal
distribution.
p-value: 2𝑃(𝑍 > |𝑡𝑜𝑏𝑠 |) = 2 × 𝑃(𝑍 > 2.6) = 2 × (1 − 0.9953) = 0.0094
Q2. d) [2 Marks] What is your conclusion (with justification) based on the p-value if you are doing the test at 1%
level of significance?
Solution: Here level of significance α = 0.01. As p-value (0.0094) < α (0.01), we reject the null hypothesis 𝐻0 .
Thus, we can conclude that the information provides evidence for concluding that the company is not filling its
bottles as advertised.
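The test statistic and the normal-approximation p-value can be sketched in Python as follows. The variable names are illustrative; the standard normal tail is obtained from `math.erfc`, so the p-value may differ slightly in the last digit from the table-based 0.0094.

```python
import math

n, x_bar, s, mu_0 = 100, 4.87, 0.5, 5.0

# Observed test statistic: (sample mean - hypothesised mean) / standard error
t_obs = (x_bar - mu_0) / (s / math.sqrt(n))

# Two-sided p-value under the standard normal approximation:
# 2 * P(Z > |t_obs|) = erfc(|t_obs| / sqrt(2))
p_value = math.erfc(abs(t_obs) / math.sqrt(2))

alpha = 0.01
reject_h0 = p_value < alpha
print(round(t_obs, 4), round(p_value, 4), reject_h0)
```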
Q3. [This question has parts a, b, c, d, and e] Are seatbelts effective to prevent serious injuries during a car
accident? This question was asked to 1150 passengers in cars who were involved in accidents in Jharkhand.
                      Injury
                   Yes     No
Seat Belt   Yes    250    290
            No     300    310
Q3.a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: 𝐻0 : Wearing a seatbelt and having an injury during a car accident are independent.
Alternative hypothesis: 𝐻1 : Wearing a seatbelt and having an injury during a car accident are not independent.
Q3.b) [1 Mark] Formula for the test statistic used:
\[ \chi_o^2 = \sum_{\text{all cells}} \frac{(O_{ij} - \hat{E}_{ij})^2}{\hat{E}_{ij}} \]
$\chi_o^2 \sim \chi_d^2$, where $d = (r - 1)(c - 1)$, r = total number of rows, c = total number of columns, $O_{ij}$ = cell frequency
corresponding to row i and column j, $r_i$ = row total for row i, $c_j$ = column total for column j; $\hat{E}_{ij} = \dfrac{r_i \times c_j}{n}$
Q3.c) [5 Marks] What is its observed value of the test statistic based on the sample? Show computations.
Solution:
                      Injury
                   Yes     No    Total
Seat Belt   Yes    250    290      540
            No     300    310      610
Total              550    600     1150
Table for computation of $\hat{E}_{ij}$:
Seat Belt Yes, Injury Yes: $\hat{E}_{11} = \dfrac{r_1 c_1}{n} = \dfrac{540 \times 550}{1150} = 258.2609$
Seat Belt Yes, Injury No: $\hat{E}_{12} = \dfrac{r_1 c_2}{n} = \dfrac{540 \times 600}{1150} = 281.7391$
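As a minimal sketch, the expected counts and the statistic defined in Q3.b) can be computed in Python directly from the observed table. The variable names are illustrative, and the code simply applies $\hat{E}_{ij} = r_i c_j / n$ to every cell.

```python
# Observed 2x2 table: rows = seat belt (yes, no), columns = injury (yes, no)
observed = [[250, 290],
            [300, 310]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic summed over all cells
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print([[round(e, 4) for e in row] for row in expected], dof)
```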
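\[ s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.4782609(1 - 0.4782609)\left(\frac{1}{540} + \frac{1}{610}\right)} = 0.02951524 \]
Test statistic $z_{obs} = \dfrac{\hat{p}_1 - \hat{p}_2}{s_{\hat{p}_1 - \hat{p}_2}} = \dfrac{-0.0288403}{0.02951524} = -0.9771325$, thus $|z_{obs}| = 0.9771325$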
At 5% level of significance, the critical value is 𝑧0.025 = 1.96
As |𝑧𝑜𝑏𝑠 | < 𝑧0.025, we do not reject the null hypothesis at 5% level of significance.
Thus, there is not enough evidence to believe that wearing a seatbelt helps in preventing injuries during a car
accident.
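The pooled two-proportion z-test above can be reproduced with a short Python sketch. The variable names are illustrative; group 1 is taken to be the seatbelt wearers and group 2 the non-wearers, which matches the sample sizes 540 and 610 used in the standard error.

```python
import math

x1, n1 = 250, 540  # injured among those wearing a seatbelt
x2, n2 = 300, 610  # injured among those not wearing a seatbelt

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0

# Standard error of p1 - p2 under H0, using the pooled proportion
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_obs = (p1 - p2) / se

reject_at_5pct = abs(z_obs) > 1.96  # critical value z_{0.025}
print(round(se, 8), round(z_obs, 7), reject_at_5pct)
```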
Programme: Business Management 2023-25 (Term-II)
Course: Quantitative Techniques - II
End Term Solution
Full Marks: 40 Time: 90 min
Q1. [1 Mark] Following are the AIC values for the given fitted models for a data:
Model AIC
Model_0: 𝑌 = 𝛽0 + ε 2611.87
Model_1: 𝑌 = 𝛽0 + 𝛽1 𝑋1 + ε 2509.33
Model_2: 𝑌 = 𝛽0 + 𝛽1 𝑋1 + 𝛽2 𝑋2 + ε 2469.98
Model_3: 𝑌 = 𝛽0 + 𝛽1 𝑋1 + 𝛽2 𝑋2 + 𝛽3 𝑋3 + ε 2600.88
Based on the AIC value, the best model would be: (Choose the correct option)
A) Model_0
B) Model_1
C) Model_2
D) Model_3
Q5. [2 Marks] Which of the following statement/s is/are TRUE for p-value:
A) A small p-value means that the data we observed would have been usual if the null hypothesis
𝐻0 were true.
B) The significance level of a test is a number such that we reject the null hypothesis 𝐻0 if the p-
value is less than that number.
C) When the p-value is more than 1, we can consider the alternative hypothesis 𝐻1 to be true.
D) We calculate the p-value under the presumption that the null hypothesis 𝐻0 is true.
E) The p-value of a test not being very small contradicts the null hypothesis 𝐻0 .
Q6. [2 Marks] (Choose the option/s which are TRUE) One-way Analysis of variance (ANOVA)
and regression are similar in the sense that,
A) They both always consider multiple independent variables.
B) They both provide ways of partitioning the variation in the response variable into explained and
unexplained components.
C) They both assume a qualitative response variable.
D) They both have t tests for testing that the response variable is statistically dependent of the
explanatory variable(s).
E) They both have F tests for testing that the response variable is statistically independent of the
explanatory variable (s).
Q8. [5 Marks] Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be an IID random sample from a 𝑁(𝜇, 𝜎 2 ). Find one estimator of
𝜇 and one estimator of 𝜎 2 using the method of moments (MME).
Solution: Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be an IID random sample from a 𝑁(𝜇, 𝜎 2 ).
Then, 𝐸(𝑋) = 𝜇, 𝑉𝑎𝑟(𝑋) = 𝜎 2
In method of moments (MME),
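The first population moment: $\mu_1 = E(X) = \mu$
First sample moment: $\hat{\mu}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$
Thus, $\mu_1 = \hat{\mu}_1 \Rightarrow \hat{\mu} = \bar{X}$
The second population moment: $\mu_2 = E(X^2)$
As $Var(X) = E(X^2) - \mu^2 \Rightarrow E(X^2) = Var(X) + \mu^2 = \sigma^2 + \mu^2$
Thus, $\mu_2 = E(X^2) = \sigma^2 + \mu^2$
The second sample moment: $\hat{\mu}_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$
As in MME, $\mu_2 = \hat{\mu}_2$
\[ \Rightarrow \hat{\sigma}^2 + \hat{\mu}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 \]
\[ \Rightarrow \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \hat{\mu}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]
Thus, the MME of $\mu$ and $\sigma^2$ are
\[ \hat{\mu} = \bar{X}, \quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]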
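A small numerical check of the two estimators, sketched in Python. The sample values are hypothetical and chosen only for illustration; the code matches the first two sample moments to the population moments and confirms that the result equals the divide-by-n variance.

```python
# Hypothetical sample, used only to illustrate the estimators
sample = [2.0, 4.0, 4.0, 6.0]
n = len(sample)

m1 = sum(sample) / n                 # first sample moment
m2 = sum(x * x for x in sample) / n  # second sample moment

mu_hat = m1                          # from mu_1 = sample first moment
sigma2_hat = m2 - m1 ** 2            # from sigma^2 + mu^2 = sample second moment

# Equivalent form: average squared deviation from the sample mean (divisor n)
sigma2_alt = sum((x - mu_hat) ** 2 for x in sample) / n
print(mu_hat, sigma2_hat, sigma2_alt)
```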
Q9. A store accepts four different types of credit cards (Visa, MasterCard, American Express and
Rupay) from customers for payment. The store manager wants to evaluate whether the customers
choose the card for payment in the store randomly, or if certain options are favoured above others.
She has collected the data for 200 credit card transactions, following are the data:
Visa    Master Card    American Express    Rupay
52      50             54                  44
You have to use these data to evaluate whether customers choose between these four options
randomly, or if certain options are favoured above others.
a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: $H_0: p_{Visa} = p_{MasterCard} = p_{AmericanExpress} = p_{Rupay} = \frac{1}{4}$
Alternative hypothesis: 𝐻1 : At least one of 𝑝𝑖 is different from ¼.
c) [2 Marks] Compute the value of the test statistic (Show computations).
Categories    Visa    Master Card    American Express    Rupay    Total
$O_i$         52      50             54                  44       200
$E_i$         50      50             50                  50       200
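From the observed and expected counts in the table, the usual Pearson goodness-of-fit statistic $\sum_i (O_i - E_i)^2 / E_i$ can be sketched in Python as below. This is an illustrative computation under the null hypothesis of equal proportions; the variable names are assumptions.

```python
# Observed transaction counts per card type
observed = {"Visa": 52, "Master Card": 50, "American Express": 54, "Rupay": 44}

n = sum(observed.values())
expected = n / len(observed)  # equal share for each card under H0 (p_i = 1/4)

# Pearson goodness-of-fit statistic
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
dof = len(observed) - 1       # number of categories minus one
print(n, expected, round(chi_sq, 2), dof)
```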
Q10. To understand the sentiments of the teachers about whether they find their job satisfying
(yes/no) a survey was conducted on 395 elementary school teachers and 266 high school teachers.
Of the elementary school teachers, 224 said they were satisfied with their jobs, whereas 126 of the
high school teachers were satisfied with their job.
Perform a suitable test of proportions to check whether the sentiment is different for the two groups
of teachers.
a) [1 Mark] Formulate the null and the alternative hypotheses.
Null hypothesis: 𝐻0 : 𝑝𝐸𝑙𝑒 = 𝑝𝐻𝑖𝑔ℎ
Alternative hypothesis: 𝐻1 : 𝑝𝐸𝑙𝑒 ≠ 𝑝𝐻𝑖𝑔ℎ
b) [1 Mark] Give the formula for a 100(1-α)% confidence interval for the true difference of
proportions, regarding job satisfaction, for the two groups of teachers.
Solution:
\[ \left[(\hat{p}_{Ele} - \hat{p}_{High}) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_{Ele}(1 - \hat{p}_{Ele})}{n_{Ele}} + \frac{\hat{p}_{High}(1 - \hat{p}_{High})}{n_{High}}}\right] \]
c) [2 Marks] What is the observed 95% confidence interval for the true difference of proportions
based on the sample? Show computations.
Solution:
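$\hat{p}_{Ele} = \frac{224}{395} = 0.5670886$, $\hat{p}_{High} = \frac{126}{266} = 0.4736842$
\[ \left[(0.5670886 - 0.4736842) \pm 1.96\sqrt{\frac{0.5670886(1 - 0.5670886)}{395} + \frac{0.4736842(1 - 0.4736842)}{266}}\right] = [0.01602131,\ 0.1707875] \]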
d) [2 Marks] With justification, give your conclusion for the test at 5% level of significance.
Note: z0.05 = 1.645, z0.025 = 1.96, z0.005 = 2.58.
Solution: We reject the null hypothesis 𝐻0 for the two-sided test, as 0 (the value of 𝑝𝐸𝑙𝑒 − 𝑝𝐻𝑖𝑔ℎ under 𝐻0) does
not belong to the 95% confidence interval [0.01602131, 0.1707875].
Thus, based on the data, there is evidence that the sentiment differs between the two groups of
teachers.
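The interval in part (c) and the decision in part (d) can be checked with a minimal Python sketch of the formula from part (b). The helper name `diff_proportion_ci` is hypothetical.

```python
import math

def diff_proportion_ci(x1, n1, x2, n2, z):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Elementary: 224 of 395 satisfied; high school: 126 of 266 satisfied
lower, upper = diff_proportion_ci(224, 395, 126, 266, 1.96)

# The two-sided 5% test rejects H0 exactly when 0 lies outside the 95% interval
contains_zero = lower <= 0 <= upper
print(round(lower, 6), round(upper, 6), contains_zero)
```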
Q11. To understand whether sales of a particular product depend on both social media
(SocialMedia) advertising budget and effect of Influencers (Infl), a BM student has fitted a
multiple regression model with interaction between the predictor variables to predict sales to a
sample of size 30. She has considered four different types of influencers: Nano, Micro, Macro and
Mega, in her dataset. Following is part of the output using the lm command in R:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 165.005 94.058 1.754 0.0933
SocialMedia 12.810 17.760 0.721 0.4783
Infl_Macro 25.125 118.484 0.212 0.8340
Infl_Mega -137.183 118.720 -1.156 0.2603
Infl_Micro -24.049 110.812 -0.217 0.8302
SocialMedia:Infl_Macro -7.147 25.678 -0.278 0.7834
SocialMedia:Infl_Mega 27.106 26.059 1.040 0.3096
SocialMedia:Infl_Micro -4.003 22.049 -0.182 0.8576
Following is a partial output obtained using the anova command in R on the fitted regression
model:
Response: Sales
Df Sum Sq Mean Sq F value Pr(>F)
SocialMedia 1 33421 33421 3.7670 0.06519
Infl 3 13442 4481 0.5050 0.68280
SocialMedia:Infl 3 19520 6507 0.7334 0.54320
Residuals 22 195184 8872
Now answer the following questions:
a) [1 Mark] Write the regression model.
Solution:
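\[ Sales = \beta_0 + \beta_1\, SocialMedia + \beta_2\, I_{INFL\_Macro} + \beta_3\, I_{INFL\_Mega} + \beta_4\, I_{INFL\_Micro} \]
\[ \quad + \beta_5\, SocialMedia \times I_{INFL\_Macro} + \beta_6\, SocialMedia \times I_{INFL\_Mega} + \beta_7\, SocialMedia \times I_{INFL\_Micro} + \epsilon \]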
b) [1 Mark] Give the fitted regression equation.
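Solution:
\[ \widehat{Sales} = 165.005 + 12.810\, SocialMedia + 25.125\, I_{INFL\_Macro} - 137.183\, I_{INFL\_Mega} - 24.049\, I_{INFL\_Micro} \]
\[ \quad - 7.147\, SocialMedia \times I_{INFL\_Macro} + 27.106\, SocialMedia \times I_{INFL\_Mega} - 4.003\, SocialMedia \times I_{INFL\_Micro} \]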
c) [1 Mark] Compute the predicted sales for a market if the social media advertising budget is 3.5
and influencer is of type Mega.
Solution: $\widehat{Sales} = 165.005 + 12.810 \times 3.5 - 137.183 + 27.106 \times 3.5 = 167.528$
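The prediction in part (c) can be sketched in Python from the coefficient estimates in the R output. The function name `predict_sales` is hypothetical, and Nano is assumed to be the reference level because it is the only influencer type without its own coefficient in the output.

```python
INTERCEPT = 165.005
SLOPE = 12.810  # coefficient of SocialMedia for the reference level

# Influencer main effects and SocialMedia:Infl interaction estimates;
# Nano is assumed to be the reference level (both effects zero)
main_effect = {"Nano": 0.0, "Macro": 25.125, "Mega": -137.183, "Micro": -24.049}
interaction = {"Nano": 0.0, "Macro": -7.147, "Mega": 27.106, "Micro": -4.003}

def predict_sales(social_media, influencer):
    """Fitted sales for a given social media budget and influencer type."""
    return (INTERCEPT + main_effect[influencer]
            + (SLOPE + interaction[influencer]) * social_media)

predicted = predict_sales(3.5, "Mega")
print(round(predicted, 3))
```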
d) For this part, you need to construct the regression ANOVA table, and perform a test of
significance for the fitted model at 5% level of significance.
i) [1 Mark] State the null and the alternate hypotheses for the ANOVA:
Solution: Null hypothesis: 𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = 𝛽4 = 𝛽5 = 𝛽6 = 𝛽7 = 0
Alternative hypothesis: 𝐻1 : At least one 𝛽𝑖 ≠ 0
ii) [6 Marks] Showing the steps involved, construct the complete regression ANOVA table
corresponding to the regression model fitted.
Solution:
Regression ANOVA Table
Source       DF           Sum of Squares                              Mean Square                   F-stat
Regression   1+3+3 = 7    SSR = 33421 + 13442 + 19520 = 66383         MSR = 66383/7 = 9483.286      F_obs = MSR/MSE = 9483.286/8872 = 1.068901
Residual     22           SSE = 195184                                MSE = 195184/22 = 8872
Total        7+22 = 29    SST = 261567
iii) [1 Mark] (Choose the correct option) Critical value for the test at 5% level of significance:
A) 𝐹7,22;0.05 = 2.463774
B) 𝐹22,7;0.05 = 3.426042
C) 𝐹7,29;0.05 = 2.346342
D) 𝐹29,7;0.05 = 3.380632
iv) [1 Mark] With justification, give your conclusion for the test at 5% level of significance:
Solution: As 𝐹𝑜𝑏𝑠 (1.068901) < 𝐹7,22;0.05 (2.463774), we cannot reject the null hypothesis 𝐻0 .
e) [2 Marks] What is the adjusted R-squared value of the model? Write the formula and show your
computations.
Solution:
\[ R_a^2 = 1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)} = 1 - \frac{195184/(30 - 7 - 1)}{261567/(30 - 1)} = 1 - 0.9836409 = 0.0163591 \]
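The regression ANOVA quantities and the adjusted R-squared can be reproduced with a short Python sketch that pools the sequential sums of squares from the anova output. Variable names are illustrative; the critical value is not computed here and would come from the options in part (iii).

```python
# (Df, Sum Sq) for each model term from the R anova output
model_terms = {"SocialMedia": (1, 33421), "Infl": (3, 13442), "SocialMedia:Infl": (3, 19520)}
df_resid, sse = 22, 195184

df_reg = sum(df for df, _ in model_terms.values())  # k = number of predictors
ssr = sum(ss for _, ss in model_terms.values())     # regression sum of squares
sst = ssr + sse                                     # total sum of squares

msr = ssr / df_reg
mse = sse / df_resid
f_obs = msr / mse

n = df_reg + df_resid + 1                           # sample size
adj_r2 = 1 - (sse / (n - df_reg - 1)) / (sst / (n - 1))
print(df_reg, ssr, sst, round(f_obs, 6), round(adj_r2, 7))
```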
[End of paper]