Statistics and Probability Solved Assignments - Semester Spring 2008

The document contains solutions to statistics assignment questions. In Assignment 1, Question 1 defines key statistical terms (population, sample, parameter, statistic and variable) and builds a frequency distribution of word lengths in a passage; Question 2 calculates the mean, median and mode from grouped frequency data. In Assignment 2, Question 1 compares the coefficients of variation of the weekly sales of two products and finds that product A fluctuates more than product B; Question 2 explains the empirical rule for normal distributions and evaluates an appropriate measure of variation, and its coefficient, for farm size data in acres.

Uploaded by

Muhammad Umair
Copyright
© Attribution Non-Commercial (BY-NC)

Assignment 1

Question 1

(a) Define the following terms population, sample, parameter, statistic and variable.

Solution:
Population: The collection of all possible observations relevant to the problem under consideration.
Sample: A representative part of the population.
Parameter: Any numerical value computed from the population.
Statistic: Any numerical value computed from a sample.
Variable: A characteristic that varies from individual to individual or from object to object.

(b) Count the number of letters in each word of the following passage, and make a
frequency distribution of word length.

“The Virtual University of Pakistan delivers education through a judicious combination


of broadcast television and the Internet. VU courses are written in meticulous detail by
acknowledged experts in the field. Lectures are then recorded in a professional studio
environment and after insertion of slides, movie clips and other material, become ready
for broadcast. Course lectures are broadcast over television and are also made available in
the form of multimedia CDs. The multiple formats allows for a high degree of flexibility
for students who may view the lectures at a time of their choosing within a 24 hour
period. Additionally, students can use the lectures to review an entire course before their
examinations; a facility simply not available in the conventional face to face
environment.”

Solution:

Length of word   Tally                      Frequency
1                |||| |                     6
2                |||| |||| |||| |||         18
3                |||| |||| |||| |||| ||||   25
4                |||| |||| |                11
5                |||| |||                   8
6                |||| |||| ||||             14
7                |||| |                     6
8                |||| |||| ||||             14
9                |||| |||                   8
10               ||||                       5
11               ||||                       4
12               ||||                       5
Total                                       124
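
The tally above can also be produced programmatically. The following is a minimal sketch, assuming that a "word" is any maximal run of letters or digits and that punctuation is ignored; the function name `word_length_frequency` and the short sample string are illustrative only, and a different counting convention (for example for "24" or "CDs") would change the totals.

```python
from collections import Counter
import re


def word_length_frequency(text):
    """Return a Counter mapping word length -> number of words of that length."""
    # Assumption: a word is a maximal run of letters or digits.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return Counter(len(word) for word in words)


# Short illustrative sample (the opening words of the passage).
sample = "The Virtual University of Pakistan delivers education"
freq = word_length_frequency(sample)

for length in sorted(freq):
    print(length, freq[length])
```

Passing the full passage instead of `sample` gives a table of the same shape as the one above.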
Question 2

Find the mean, median and mode from the following data

Class Interval Frequency


20-29 6
30-39 15
40-49 21
50-59 29
60-69 25
70-79 22
80-89 11
90-99 9
100-109 3
110-119 1
120-129 2

Solution:

The given data and the required calculations are shown in the following table.

Class      Class         Frequency   Mid-point   fx       Cumulative
interval   boundaries    f           x                    freq. cf
20-29      19.5-29.5     6           24.5        147      6
30-39      29.5-39.5     15          34.5        517.5    21
40-49      39.5-49.5     21          44.5        934.5    42
50-59      49.5-59.5     29          54.5        1580.5   71     (modal class)
60-69      59.5-69.5     25          64.5        1612.5   96     (median class)
70-79      69.5-79.5     22          74.5        1639     118
80-89      79.5-89.5     11          84.5        929.5    129
90-99      89.5-99.5     9           94.5        850.5    138
100-109    99.5-109.5    3           104.5       313.5    141
110-119    109.5-119.5   1           114.5       114.5    142
120-129    119.5-129.5   2           124.5       249      144
Total                    144                     8888

Mean:
$\bar{x} = \dfrac{\sum fx}{\sum f}$
Here $\sum fx = 8888$ and $\sum f = 144$, so
$\bar{x} = \dfrac{8888}{144} = 61.722$

Median:
$\text{Median} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - c\right)$
Here $n = 144$, so $n/2 = 72$ and the median class is 60-69, with $c = 71$, $f = 25$, $h = 10$, $l = 59.5$.
$\text{Median} = 59.5 + \dfrac{10}{25}(72 - 71) = 59.5 + 0.4 = 59.9$

Mode:
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
Here $l = 49.5$, $f_m = 29$, $f_1 = 21$, $f_2 = 25$, $h = 10$.
$\text{Mode} = 49.5 + \dfrac{29 - 21}{(29 - 21) + (29 - 25)} \times 10 = 49.5 + \dfrac{8}{12} \times 10 = 49.5 + 6.67 = 56.17$
Assignment 2
Question 1
(a) What is the difference between absolute measures of dispersion and relative measures of
dispersion?

(b) The weekly sales of two products A and B were recorded as given below:

Product A 59 75 27 63 27 28 56
Product B 150 200 125 310 330 250 225

Find out which of the two shows greater fluctuation in sales.

Solution (a):

Absolute measures of dispersion describe the amount of variation among the values in a
data set by a single number. Such values are expressed in the same unit of measurement as
the data, such as rupees, inches, or feet.
Relative measures of dispersion are the ratio of an absolute measure of dispersion to an
average; this value is independent of any unit of measurement. Such a measure is also
called a coefficient of variation.

Solution (b):
For this we find the coefficient of variation (C.V.) of both products. The required
calculations are shown below.

         Product A          Product B
         X      X^2         X       X^2
         59     3481        150     22500
         75     5625        200     40000
         27     729         125     15625
         63     3969        310     96100
         27     729         330     108900
         28     784         250     62500
         56     3136        225     50625
Total    335    18453       1590    396250

For Product A:
$\bar{X} = \dfrac{\sum X}{n} = \dfrac{335}{7} = 47.86$
$S = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2} = \sqrt{\dfrac{18453}{7} - \left(\dfrac{335}{7}\right)^2} = \sqrt{2636.14 - 2290.31} = 18.60$
$\text{C.V.} = \dfrac{S}{\bar{X}} \times 100 = \dfrac{18.60}{47.86} \times 100 = 38.86\%$

For Product B:
$\bar{X} = \dfrac{\sum X}{n} = \dfrac{1590}{7} = 227.14$
$S = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2} = \sqrt{\dfrac{396250}{7} - \left(\dfrac{1590}{7}\right)^2} = \sqrt{56607.14 - 51593.88} = 70.80$
$\text{C.V.} = \dfrac{S}{\bar{X}} \times 100 = \dfrac{70.80}{227.14} \times 100 = 31.17\%$

Conclusion/Interpretation:
Comparing the two, the C.V. of product A (38.86%) is greater than that of product B
(31.17%), which shows that product A has greater fluctuation in sales.
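
The comparison can be reproduced with the standard library. This is a minimal sketch, assuming the population form of the standard deviation (divisor n), which is the form used in the calculation above; `coefficient_of_variation` is an illustrative helper name.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """C.V. in percent, using the population standard deviation (divisor n)."""
    return pstdev(values) / mean(values) * 100


product_a = [59, 75, 27, 63, 27, 28, 56]
product_b = [150, 200, 125, 310, 330, 250, 225]

cv_a = coefficient_of_variation(product_a)
cv_b = coefficient_of_variation(product_b)

# The product with the larger C.V. has the greater relative fluctuation.
more_variable = "A" if cv_a > cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```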

Question 2
(a) What is the empirical rule?

(b) Evaluate an appropriate measure of variation for the following data. Also find the
coefficient of that variation.

Farm size (acres)   No. of farms
Below 40            394
41-80               461
81-120              391
121-160             334
161-200             169
201-240             113
241 and above       148
Solution (a):
Empirical Rule:
For a data set having a symmetrical bell-shaped distribution (normal curve), the
percentage of values likely to fall within a specified number of standard deviations
of the mean is as follows:
$\bar{X} \pm S$ covers approximately 68% of the values in the data set
$\bar{X} \pm 2S$ covers approximately 95% of the values in the data set
$\bar{X} \pm 3S$ covers approximately 100% (99.73%) of the values in the data set

Solution (b):
Since the frequency distribution has open-end class intervals at the two extremes,
the quartile deviation (Q.D.) is an appropriate measure of variation. The computation of
Q.D. is shown below.

Farm size       Class             No. of farms   Cumulative
(acres)         boundaries        (f)            frequency (cf)
Below 40        Below 40.5        394            394
41-80           40.5-80.5         461            855     (Q1 class)
81-120          80.5-120.5        391            1246
121-160         120.5-160.5       334            1580    (Q3 class)
161-200         160.5-200.5       169            1749
201-240         200.5-240.5       113            1862
241 and above   240.5 and above   148            2010
Total                             2010

First we find the first quartile:
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - c\right)$
Here $\dfrac{n}{4} = \dfrac{2010}{4} = 502.5$th value, which lies in the class 40.5-80.5, so
$f = 461$, $c = 394$, $h = 40$, $l = 40.5$.
$Q_1 = 40.5 + \dfrac{40}{461}(502.5 - 394) = 40.5 + 9.41 = 49.91$

And the third quartile:
$Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - c\right)$
Here $\dfrac{3n}{4} = \dfrac{3(2010)}{4} = 1507.5$th value, which lies in the class 120.5-160.5, so
$f = 334$, $c = 1246$, $h = 40$, $l = 120.5$.
$Q_3 = 120.5 + \dfrac{40}{334}(1507.5 - 1246) = 120.5 + 31.31 = 151.81$

Thus the quartile deviation is
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{151.81 - 49.91}{2} = 50.95$

And the coefficient of Q.D. is
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{151.81 - 49.91}{151.81 + 49.91} = 0.505$
Assignment 3
Question 1
(a) Define Set and its properties. Also explain the Venn diagram.

(b) The first four moments of a distribution about the origin are 1, 4, 10, and 46
respectively. Obtain the four moments about mean. Also calculate moment’s
ratios.

Solution:
a)

Set: A set is any well-defined collection or list of distinct objects, e.g. a group of
students, the books in a library, the integers between 1 and 100, all human beings on
the earth, etc.
Properties of set:
The main operations defined on sets are

i) Union
ii) Intersection
iii) Difference

Venn Diagram:
It is a diagram used to represent sets in such a way that the universal set
or sample space S is represented by a rectangle, while its subsets are represented by
circles inside the rectangle, e.g.

[Figure: Venn diagram with two circles A and B inside a rectangle S]

b)
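In the usual notation, with origin $A = 0$, we have
$\mu'_1 = 1,\ \mu'_2 = 4,\ \mu'_3 = 10,\ \mu'_4 = 46$
$\bar{x}$ = first moment about the origin $= \mu'_1 = 1$
The first moment about the mean is always zero: $\mu_1 = 0$
Variance: $\sigma^2 = \mu_2 = \mu'_2 - (\mu'_1)^2 = 4 - 1 = 3$
S.D.: $\sigma = \sqrt{\mu_2} = \sqrt{3} = 1.732$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 10 - 3(4)(1) + 2(1)^3 = 0$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 46 - 4(10)(1) + 6(4)(1)^2 - 3(1)^4 = 27$

The moment ratios are
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{0}{27} = 0$ and $\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{27}{9} = 3$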

Question 2
(a) In simple linear regression analysis, interpret “a” and “b”.

(b) A company is introducing a job evaluation scheme in which all jobs are graded
by points for skill, responsibility, and so on. Monthly pay scales (Rs. in 1000’s)
are then drawn up according to the number of points allocated and other factors
such as experience and local conditions. To date the company has applied this
scheme to 9 jobs:

Job: A B C D E F G H I

Points: 5 25 7 19 10 12 15 28 16

Pay: 3.0 5.0 3.25 6.5 5.5 5.6 6.0 7.2 6.1

(i) Find the least squares line linking pay scales to points.
(ii) Estimate the monthly pay for a job graded at 20 points.
(iii) Calculate the standard error of estimate.

Solution:

a)

If $y = a + bx$, then
a = the y-intercept, which represents the average value of the dependent variable y when x = 0;

b = the slope of the regression line, which represents the expected change in the value of y
(either positive or negative) for a unit change in the value of x.
b)
The required calculations are as follows.

        x      y       x^2     y^2        xy
        5      3       25      9          15
        25     5       625     25         125
        7      3.25    49      10.5625    22.75
        19     6.5     361     42.25      123.5
        10     5.5     100     30.25      55
        12     5.6     144     31.36      67.2
        15     6       225     36         90
        28     7.2     784     51.84      201.6
        16     6.1     256     37.21      97.6
Total   137    48.15   2569    273.4725   797.65

(i)
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{137}{9} = 15.22, \quad \bar{y} = \dfrac{\sum y}{n} = \dfrac{48.15}{9} = 5.35$

$b_{yx} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \dfrac{9(797.65) - (137)(48.15)}{9(2569) - (137)^2} = 0.133$

$a = \bar{y} - b\bar{x} = 5.35 - 0.133(15.22) = 3.33$

So the required regression line is $\hat{y} = 3.33 + 0.133x$.

(ii) For a job graded at x = 20 points, the estimated average pay is

$\hat{y} = 3.33 + 0.133x = 3.33 + 0.133(20) = 5.99$ (Rs. in 1000's)

(iii) The standard error of estimate is

$S_{y.x} = \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} = \sqrt{\dfrac{273.47 - 3.33(48.15) - 0.133(797.65)}{9 - 2}} = \sqrt{\dfrac{273.47 - 266.43}{7}} = \sqrt{\dfrac{7.04}{7}} \approx 1.00$
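
The least squares line and the standard error of estimate can be recomputed without intermediate rounding. This is a sketch using only the formulas quoted in the solution; because it keeps full precision for a and b, its results differ slightly from the hand calculation (the standard error comes out a little above 1).

```python
import math

points = [5, 25, 7, 19, 10, 12, 15, 28, 16]
pay = [3.0, 5.0, 3.25, 6.5, 5.5, 5.6, 6.0, 7.2, 6.1]  # Rs. in 1000's

n = len(points)
sx, sy = sum(points), sum(pay)
sxx = sum(x * x for x in points)
syy = sum(y * y for y in pay)
sxy = sum(x * y for x, y in zip(points, pay))

# Slope and intercept of the least squares line y = a + b x.
b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a = sy / n - b * sx / n


def predict(x):
    """Estimated pay for a job graded at x points."""
    return a + b * x


# Standard error of estimate with n - 2 degrees of freedom.
se = math.sqrt((syy - a * sy - b * sxy) / (n - 2))
print(round(a, 3), round(b, 4), round(predict(20), 2), round(se, 3))
```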
Assignment 4
Question 1
a. Is the sample space changed/reduced in conditional probability? If yes, why?

b. The following Venn diagram indicates the number of outcomes of an
experiment corresponding to each event.

Total outcomes = 50
[Venn diagram: 8 outcomes in A only, 6 outcomes in both A and B, 13 outcomes in B only]

Find (i) P (A) (ii) P (B) (iii) P (AUB)

c. Two events, A and B are statistically dependent. If P (A) =0.39, P (B) = 0.21, and
P (A or B) = 0.47, find the probability that

(i) Neither A nor B will occur.


(ii) Both A and B will occur.

Solution:

a. Yes, the sample space is changed in conditional probability because some additional
information regarding the outcomes of the experiment is given.
The effect of such information is to reduce the sample space by excluding some outcomes
as impossible which, before receiving the information, were believed possible.

b.

From the Venn diagram, $n(A) = 8 + 6 = 14$, $n(B) = 13 + 6 = 19$, $n(A \cap B) = 6$ and $n(S) = 50$.

(i)
$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{14}{50} = 0.28$

(ii)
$P(B) = \dfrac{n(B)}{n(S)} = \dfrac{19}{50} = 0.38$

(iii)
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{14}{50} + \dfrac{19}{50} - \dfrac{6}{50} = \dfrac{27}{50} = 0.54$
c.

(i)
P(neither A nor B will occur) $= P(\bar{A} \cap \bar{B}) = 1 - P(A \cup B) = 1 - 0.47 = 0.53$

(ii)
P(both A and B will occur) $= P(A \cap B) = P(A) + P(B) - P(A \cup B) = 0.39 + 0.21 - 0.47 = 0.13$
Question 2
(a). Two cards are selected at random from a pack of 52 cards. What is the probability that
the second card is a king, if it is known that the first card is (i) a red card (ii) a diamond
(iii) a spade or a diamond (iv) a picture card?

(b). A company is considering upgrading its computer system, and a major portion of
the upgrade is a new operating system. Suppose the probability of a favorable evaluation
is 0.65. If the probability that the company will upgrade its system given a favorable
evaluation is 0.85, what is the probability that the company will upgrade and receive a
favorable evaluation?

Solution:

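a.
Let K = the second card is a king
R = the first card is red
D = the first card is a diamond
SD = the first card is a spade or a diamond
P = the first card is a picture card
The two cards are assumed to be drawn without replacement, so there are $52 \times 51$ equally likely ordered pairs.

(i)
$P(R \cap K) = \dfrac{2 \times 3 + 24 \times 4}{52 \times 51} = \dfrac{2}{52}$
$P(K \mid R) = \dfrac{P(R \cap K)}{P(R)} = \dfrac{2/52}{26/52} = \dfrac{2}{26} = \dfrac{1}{13} = 0.077$

(ii)
$P(D \cap K) = \dfrac{1 \times 3 + 12 \times 4}{52 \times 51} = \dfrac{1}{52}$
$P(K \mid D) = \dfrac{P(D \cap K)}{P(D)} = \dfrac{1/52}{13/52} = \dfrac{1}{13} = 0.077$

(iii)
$P(SD \cap K) = \dfrac{2 \times 3 + 24 \times 4}{52 \times 51} = \dfrac{2}{52}$
$P(K \mid SD) = \dfrac{P(SD \cap K)}{P(SD)} = \dfrac{2/52}{26/52} = \dfrac{1}{13} = 0.077$

(iv)
Of the 12 picture cards, 4 are kings, so
$P(P \cap K) = \dfrac{4 \times 3 + 8 \times 4}{52 \times 51} = \dfrac{44}{2652}$
$P(K \mid P) = \dfrac{P(P \cap K)}{P(P)} = \dfrac{44/2652}{12/52} = \dfrac{44}{612} = \dfrac{11}{153} = 0.072$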
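
A brute-force check of part (a) is possible by enumerating every ordered pair of cards. This is a sketch under the assumption that the two cards are drawn without replacement; the deck encoding and the helper name `second_is_king_given` are illustrative. Under that assumption the enumeration gives 1/13 for cases (i) to (iii) and 11/153 for case (iv).

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in ranks]


def second_is_king_given(first_condition):
    """P(second card is a king | first card satisfies first_condition)."""
    # All ordered pairs of distinct cards whose first card meets the condition.
    pairs = [(c1, c2) for c1, c2 in permutations(deck, 2) if first_condition(c1)]
    hits = sum(1 for _, c2 in pairs if c2[0] == "K")
    return Fraction(hits, len(pairs))


p_red = second_is_king_given(lambda c: c[1] in ("hearts", "diamonds"))
p_diamond = second_is_king_given(lambda c: c[1] == "diamonds")
p_spade_or_diamond = second_is_king_given(lambda c: c[1] in ("spades", "diamonds"))
p_picture = second_is_king_given(lambda c: c[0] in ("J", "Q", "K"))

print(p_red, p_diamond, p_spade_or_diamond, p_picture)
```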

b.

Let
U = the company upgrades the system
F = favorable evaluation

We are given $P(F) = 0.65$ and $P(U \mid F) = 0.85$, and we are to find $P(U \cap F)$.

By the formula of conditional probability,

$P(U \mid F) = \dfrac{P(U \cap F)}{P(F)}$, so $P(U \cap F) = P(U \mid F) \cdot P(F)$

Putting in the values,

$P(U \cap F) = (0.85)(0.65) = 0.5525$
Assignment 5
Question 1
a. Find the first four moments about the origin, $\mu'_1, \mu'_2, \mu'_3, \mu'_4$, for
X      8     12    16    20    24
P(X)   1/8   1/6   3/8   1/4   1/12

b. Find the distribution function of the given p.d.f.

$f(x) = \dfrac{1}{2} - \dfrac{1}{8}x, \quad 0 < x < 4$

a. Solution:

X            8        12        16         20         24          Total
P(X)         1/8      1/6       3/8        1/4        1/12        1
X P(X)       8/8      12/6      48/8       20/4       24/12       16
X^2 P(X)     64/8     144/6     768/8      400/4      576/12      276
X^3 P(X)     512/8    1728/6    12288/8    8000/4     13824/12    5040
X^4 P(X)     4096/8   20736/6   196608/8   160000/4   331776/12   96192

Now,

$\mu'_1 = E(X) = \sum x\,p(x) = 16$

$\mu'_2 = E(X^2) = \sum x^2 p(x) = 276$

$\mu'_3 = E(X^3) = \sum x^3 p(x) = 5040$

$\mu'_4 = E(X^4) = \sum x^4 p(x) = 96192$

b. Solution

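The given p.d.f. is

$f(x) = \dfrac{1}{2} - \dfrac{1}{8}x, \quad 0 < x < 4$

For the distribution function, we proceed as follows. For $0 < x < 4$,

$F(x) = \int_0^x f(t)\,dt = \int_0^x \left(\dfrac{1}{2} - \dfrac{1}{8}t\right)dt = \left[\dfrac{t}{2} - \dfrac{1}{8}\cdot\dfrac{t^2}{2}\right]_0^x = \dfrac{x}{2} - \dfrac{x^2}{16} = \dfrac{x}{2}\left(1 - \dfrac{x}{8}\right)$

with $F(x) = 0$ for $x \le 0$ and $F(x) = 1$ for $x \ge 4$.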

Question 2
a. A continuous random variable X that can assume values only between X = 2 and
X = 8 inclusive has a density function given by

$f(x) = \dfrac{1}{48}(x + 3)$

Show that it is a complete p.d.f. and find E(X).

b. Find the expected number of boys on a committee of 3 selected at random from 4
boys and 3 girls.

a. Solution

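Given that

$f(x) = \dfrac{1}{48}(3 + x), \quad 2 < x < 8$

To prove that it is a complete p.d.f., it should satisfy the property

$\int_{-\infty}^{\infty} f(x)\,dx = 1$

$\int_2^8 \dfrac{1}{48}(3 + x)\,dx = \int_2^8 \dfrac{3}{48}\,dx + \int_2^8 \dfrac{x}{48}\,dx$

$= \dfrac{3}{48}\Big[x\Big]_2^8 + \dfrac{1}{48}\left[\dfrac{x^2}{2}\right]_2^8$

$= \dfrac{3}{48}(8 - 2) + \dfrac{1}{48}(32 - 2)$

$= \dfrac{3}{48}(6) + \dfrac{1}{48}(30)$

$= 1$, proved.

$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_2^8 \dfrac{x}{48}(3 + x)\,dx$

$= \int_2^8 \dfrac{3x}{48}\,dx + \int_2^8 \dfrac{x^2}{48}\,dx$

$= \dfrac{3}{48}\left[\dfrac{x^2}{2}\right]_2^8 + \dfrac{1}{48}\left[\dfrac{x^3}{3}\right]_2^8$

$= \dfrac{3}{48}(32 - 2) + \dfrac{1}{48}\left(\dfrac{512}{3} - \dfrac{8}{3}\right)$

$= \dfrac{3}{48}(30) + \dfrac{1}{48}\cdot\dfrac{504}{3}$

$= 5.375$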
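
Both integrals are integrals of polynomials, so they can be verified exactly with rational arithmetic. This is a sketch; the helper `integrate_poly` and the coefficient-list representation are illustrative choices, not part of the assignment.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exact definite integral over [a, b] of sum(coeffs[k] * x**k)."""
    def antiderivative(t):
        return sum(Fraction(c) * Fraction(t) ** (k + 1) / (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)


# f(x) = 3/48 + x/48 on [2, 8]
pdf = [Fraction(3, 48), Fraction(1, 48)]
# x * f(x): shift every coefficient up by one power of x.
x_times_pdf = [Fraction(0)] + pdf

total_probability = integrate_poly(pdf, 2, 8)
expected_value = integrate_poly(x_times_pdf, 2, 8)
print(total_probability, expected_value, float(expected_value))
```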

b. Solution

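Let X represent the number of boys on the committee. Then

$f(x) = \dfrac{\binom{4}{x}\binom{3}{3-x}}{\binom{7}{3}}$, where $x = 0, 1, 2, 3$

Now

$f(0) = \dfrac{\binom{4}{0}\binom{3}{3}}{\binom{7}{3}} = \dfrac{1}{35}$

$f(1) = \dfrac{\binom{4}{1}\binom{3}{2}}{\binom{7}{3}} = \dfrac{12}{35}$

$f(2) = \dfrac{\binom{4}{2}\binom{3}{1}}{\binom{7}{3}} = \dfrac{18}{35}$

$f(3) = \dfrac{\binom{4}{3}\binom{3}{0}}{\binom{7}{3}} = \dfrac{4}{35}$

and

$E(X) = (0)(1/35) + (1)(12/35) + (2)(18/35) + (3)(4/35) = \dfrac{60}{35} = 1.71$

OR,

Since X follows the hypergeometric distribution with $n = 3$, $K = 4$, $N = 7$, its p.m.f. is

$P(X = x) = \dfrac{\binom{4}{x}\binom{3}{3-x}}{\binom{7}{3}}, \quad x = 0, 1, 2, 3$

and

$E(X) = \dfrac{nK}{N} = \dfrac{3 \times 4}{7} = 1.7143$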
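
Both routes to E(X) can be confirmed with exact fractions. This is a minimal sketch; `hypergeom_pmf` is an illustrative helper built on `math.comb`.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, n, K, N):
    """P(X = x) when n items are drawn from N, of which K are 'successes'."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))


n, K, N = 3, 4, 7  # committee size, boys, total children
pmf = {x: hypergeom_pmf(x, n, K, N) for x in range(n + 1)}

# Expected value from the definition and from the shortcut n*K/N.
expected_from_pmf = sum(x * p for x, p in pmf.items())
expected_from_formula = Fraction(n * K, N)
print(pmf, expected_from_pmf, float(expected_from_pmf))
```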
Assignment 6
Question 1
a. Assume that the time X required for a runner to run a mile is a normal random
variable with parameters $\mu$ = 4 minutes 1 second and $\sigma$ = 2 seconds. What is the
probability that this athlete will run the mile in more than 3 minutes 55 seconds?

Solution:

3 minutes 55 seconds = 235 seconds, and $\mu$ = 4 minutes 1 second = 241 seconds.

$Z = \dfrac{x - \mu}{\sigma} = \dfrac{235 - 241}{2} = -3$

The area between Z = 0 and Z = 3.0 is 0.49865.

Required probability $= P(X > 235) = P(Z > -3) = 0.50 + 0.49865 = 0.99865$

b. One hundred passengers have made reservations for an airport flight. If the
probability that a passenger who has a reservation will not show up is 0.01, what
is the probability that exactly 3 will not show up?

Solution:
This is essentially a binomial experiment with n = 100 and p = 0.01. Since p is very
small and n is considerably large, we apply the Poisson distribution, using
$\mu = np = (100)(0.01) = 1$
If X represents the number of passengers who do not show up, we have
$P(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}$
$P(X = 3) = \dfrac{e^{-1}(1)^3}{3!} = \dfrac{0.3679}{6}$, since $e^{-1} = 0.3679$
$P(X = 3) = 0.0613$
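
To see how good the Poisson approximation is here, it can be compared with the exact binomial probability. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 100, 0.01, 3
mu = n * p  # Poisson mean used in the approximation

# Poisson approximation: e^(-mu) * mu^x / x!
poisson_prob = exp(-mu) * mu ** x / factorial(x)

# Exact binomial probability: C(n, x) p^x (1 - p)^(n - x)
binomial_prob = comb(n, x) * p ** x * (1 - p) ** (n - x)

print(round(poisson_prob, 4), round(binomial_prob, 4))
```

The two values agree to about three decimal places, which is why the approximation is acceptable for small p and large n.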
Question 2
a. Discuss the statement that in a binomial distribution $\mu = 6$ and $\sigma = 2.5$.

Solution:
$\mu = np = 6$
$\sigma = \sqrt{npq} = 2.5$
Squaring both sides,
$npq = 6.25$
Putting $np = 6$ in the above,
$6q = 6.25$
$q = 1.04 > 1$
Since a probability can never be greater than 1,
it is not possible for a binomial distribution to have $\mu = 6$ and $\sigma = 2.5$.

b. The experience of a house agent indicates that he can provide suitable


accommodation for 75% of the clients who come to him. If on a particular
occasion 6 clients approach him independently, calculate the probability that at
least 5 clients will get satisfactory accommodation

Solution:

$P(X = x) = \binom{n}{x} p^x q^{n-x}$, where $q = 1 - p$, $n = 6$ and $p = 0.75$

$P(X \ge 5) = P(X = 5) + P(X = 6)$
$P(X \ge 5) = \binom{6}{5}(0.75)^5(0.25)^1 + \binom{6}{6}(0.75)^6(0.25)^0$

$= 0.3560 + 0.1780$
$= 0.534$
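
The same tail probability can be computed directly. This is a minimal sketch; `binom_pmf` is an illustrative helper name.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 6, 0.75
# P(X >= 5) = P(X = 5) + P(X = 6)
p_at_least_5 = sum(binom_pmf(x, n, p) for x in (5, 6))
print(round(p_at_least_5, 3))
```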

c. A committee of size 3 is selected from 4 men and 2 women. Find the probability
distribution by the hyper geometric experiment for the number of men on the
committee.

Solution:

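Let X denote the number of men on the committee.

No. of men      Total      No. of women      No. of persons selected
K = 4           N = 6      N - K = 2         n = 3

$P(X = x) = \dfrac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}$

x      P(X = x)
1      $\dfrac{\binom{4}{1}\binom{2}{2}}{\binom{6}{3}} = \dfrac{4 \times 1}{20} = \dfrac{4}{20}$
2      $\dfrac{\binom{4}{2}\binom{2}{1}}{\binom{6}{3}} = \dfrac{6 \times 2}{20} = \dfrac{12}{20}$
3      $\dfrac{\binom{4}{3}\binom{2}{0}}{\binom{6}{3}} = \dfrac{4 \times 1}{20} = \dfrac{4}{20}$
Sum    1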
Assignment 7
Question 1
a. Differentiate between an estimate and an estimator.
Solution:

Estimate:
An estimate is the numerical value, computed from a particular sample, that is taken as
the value of the unknown parameter.
Estimator:
An estimator is the rule or method that is used to estimate a parameter.

b. A test in statistics was given to 50 girls and 75 boys. The girls made an average grade
of 76 with a standard deviation of 6, while the boys made an average of 82 with a
standard deviation of 8. Find a 96% confidence interval for the difference $\mu_1 - \mu_2$, where
$\mu_1$ is the mean score of all boys and $\mu_2$ is the mean score of all girls who might take this
test.

Solution:
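Boys: $n_1 = 75$, $\bar{x}_1 = 82$, $S_1 = 8$

Girls: $n_2 = 50$, $\bar{x}_2 = 76$, $S_2 = 6$

$1 - \alpha = 0.96 \Rightarrow \alpha = 0.04$
$Z_{\alpha/2} = Z_{0.04/2} = Z_{0.02} = 2.05$

The 96% confidence interval is:

$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$
$(82 - 76) \pm 2.05\sqrt{\dfrac{(8)^2}{75} + \dfrac{(6)^2}{50}}$
$6 \pm 2.05(1.254)$
$6 \pm 2.571$
$(6 - 2.571,\ 6 + 2.571)$
$(3.429,\ 8.571)$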

Question 2
a. What is meant by bias?
Bias is the difference between the expected value of a statistic and the true value of the
unknown parameter being estimated.
It is defined as:

$B = E(T) - \theta$
where T is the sample statistic used to estimate the population parameter $\theta$.
The bias is positive if $E(T) > \theta$, and negative if $E(T) < \theta$.

b. In a random sample of 75 axle shafts, 12 have a surface finish that is rougher than the
specification will allow. How large a sample is required if we want to be 95% confident
that the error in using $\hat{p}$ to estimate p is less than 0.05?

Solution:
$e = 0.05$
$\hat{p} = \dfrac{12}{75} = 0.16$
$\hat{q} = 1 - \hat{p} = 1 - 0.16 = 0.84$
$Z_{\alpha/2} = Z_{0.025} = 1.96$
$n = \left(\dfrac{Z_{\alpha/2}}{e}\right)^2 \hat{p}\hat{q}$
$n = \left(\dfrac{1.96}{0.05}\right)^2 (0.16)(0.84) = 206.5$
Rounding up, $n = 207$.
Question 3
a. A random sample of 500 residents of a city is chosen and the number of smokers is
noted. It is found that 100 are smokers. Obtain an unbiased estimate of the proportion of
smokers in the city.

Solution:

n = 500
x = 100

The sample proportion is an unbiased estimator of the population proportion, and the estimate is

$\hat{p} = x/n = 100/500 = 1/5 = 0.2$

b. The manufacturer of a patent medicine claimed that it was at least 90% effective in
relieving an allergy for a period of 8 hours. In a sample of 200 people who had an allergy,
the medicine provided relief for 160 people. Determine whether the manufacturer's
claim is legitimate.

Solution:

n = 200, x = 160, $\hat{p} = x/n = 160/200 = 0.8$

Hypothesis:
$H_0: P \ge 0.9$
$H_1: P < 0.9$
Level of significance:

α = 5% = 0.05
Test statistic:
$Z = \dfrac{\hat{p} - P}{\sqrt{\dfrac{PQ}{n}}}$
Critical region:
Reject $H_0$ if $Z_{\text{calculated}} < -Z_{0.05} = -1.645$
Calculation:
$Z = \dfrac{0.8 - 0.9}{\sqrt{\dfrac{(0.9)(0.1)}{200}}} = \dfrac{-0.1}{0.0212} = -4.71$

Conclusion:
Since Z-calculated = -4.71 lies in the critical region, we reject $H_0$ and conclude that the
manufacturer's claim is not legitimate.
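
The one-sided proportion test can be sketched with the standard library. `statistics.NormalDist` (Python 3.8+) supplies the critical value; the variable names are illustrative, and the sketch follows the same left-tailed test at the 5% level used in the solution.

```python
from math import sqrt
from statistics import NormalDist

n, x = 200, 160
p0 = 0.9       # proportion claimed under H0
alpha = 0.05   # level of significance

p_hat = x / n
# Test statistic uses the hypothesized proportion in the standard error.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed critical value: reject H0 when z falls below it.
z_critical = NormalDist().inv_cdf(alpha)
reject_h0 = z < z_critical
print(round(z, 3), round(z_critical, 3), reject_h0)
```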
Assignment 8
Question 1
a. If α = 0.10, how many intervals would be expected to contain μ?
Solution:
In repeated sampling, we would expect about 100(1 - α)% = 90% of the intervals to contain μ in the
long run.
b. A random sample of 10 university professors gave their salaries (in thousand Rs.) as 13,
11, 19, 15, 22, 20, 14, 17, 14, 15. Another random sample of 5 college professors gave
their salaries (in thousand Rs.) as 9, 12, 8, 10, 16. Construct a 95% confidence interval for
the difference between the mean salaries of university and college professors,
assuming that their population variances are equal.
Solution:
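        X1     X1^2     X2     X2^2
        13     169      9      81
        11     121      12     144
        19     361      8      64
        15     225      10     100
        22     484      16     256
        20     400
        14     196
        17     289
        14     196
        15     225
Total   160    2666     55     645

Now,

$\bar{x}_1 = \dfrac{\sum x_1}{n_1} = \dfrac{160}{10} = 16$

$\bar{x}_2 = \dfrac{\sum x_2}{n_2} = \dfrac{55}{5} = 11$

$s_p = \sqrt{\dfrac{\left\{\sum x_1^2 - \dfrac{(\sum x_1)^2}{n_1}\right\} + \left\{\sum x_2^2 - \dfrac{(\sum x_2)^2}{n_2}\right\}}{n_1 + n_2 - 2}}$

$s_p = \sqrt{\dfrac{\left\{2666 - \dfrac{(160)^2}{10}\right\} + \left\{645 - \dfrac{(55)^2}{5}\right\}}{10 + 5 - 2}} = \sqrt{\dfrac{106 + 40}{13}} = 3.351$

$v = n_1 + n_2 - 2 = 10 + 5 - 2 = 13$
$1 - \alpha = 0.95$
$\alpha = 0.05$
$t_{\alpha/2,\,v} = t_{0.025,\,13} = 2.16$

Now, the 95% confidence interval for $\mu_1 - \mu_2$ is

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\; s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

$(16 - 11) \pm 2.16(3.351)\sqrt{\dfrac{1}{10} + \dfrac{1}{5}}$

$5 \pm 2.16(3.351)(0.548)$

$5 \pm 3.965$

$1.035 < \mu_1 - \mu_2 < 8.965$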
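
The pooled-variance interval can be recomputed from the raw salaries. This is a sketch that takes the tabulated value t = 2.16 from the solution as given rather than computing it, since the standard library has no t distribution; the helper name `sum_sq_dev` is illustrative.

```python
from math import sqrt
from statistics import mean

university = [13, 11, 19, 15, 22, 20, 14, 17, 14, 15]
college = [9, 12, 8, 10, 16]


def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


n1, n2 = len(university), len(college)
# Pooled standard deviation with n1 + n2 - 2 degrees of freedom.
sp = sqrt((sum_sq_dev(university) + sum_sq_dev(college)) / (n1 + n2 - 2))

t_critical = 2.16  # t(0.025, 13) from a t table, as quoted in the solution
diff = mean(university) - mean(college)
margin = t_critical * sp * sqrt(1 / n1 + 1 / n2)
lower, upper = diff - margin, diff + margin
print(round(sp, 3), round(lower, 3), round(upper, 3))
```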

Question 2
Three cards are drawn from an ordinary deck of playing cards, with replacement, and the
number y of spades is recorded. After repeating the experiment 64 times, the following
outcomes were recorded:

Y 0 1 2 3
f 21 31 12 0

Test the hypothesis at 1 % level of significance, that the recorded data may be fitted by
the binomial distribution with values b(y; 3, 1/4) for y=0,1,2,3

Solution:
Hypothesis:
$H_0$: The fit is good
$H_1$: The fit is not good

Level of Significance:
$\alpha = 0.01$

Test statistic:
$\chi^2 = \sum \dfrac{(o_i - e_i)^2}{e_i}$

Calculations:

y       f = o_i    fy    e_i                                  o_i - e_i    (o_i - e_i)^2    (o_i - e_i)^2 / e_i
0       21         0     64 C(3,0) (1/4)^0 (3/4)^3 = 27       -6           36               1.3333
1       31         31    64 C(3,1) (1/4)^1 (3/4)^2 = 27       4            16               0.5926
2       12         24    64 C(3,2) (1/4)^2 (3/4)^1 = 9
3       0          0     64 C(3,3) (1/4)^3 (3/4)^0 = 1
2 and 3 combined: o = 12, e = 10*                             2            4                0.4
Total   64         55    64                                                                 χ2 = 2.3259

* The last two classes are combined because the expected frequency for y = 3 is too small.

Critical region:
No. of degrees of freedom = k - 1 - no. of estimated parameters
= 3 - 1 - 0 = 2
(The binomial distribution has two parameters, and both are provided in the question.)

Reject $H_0$ if $\chi^2 > \chi^2_{0.01,\,2} = 9.21$

Conclusion: Since our calculated value of chi-square does not fall in the critical region, we
do not reject $H_0$ and conclude that the fit is good.
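
The goodness-of-fit statistic can be reproduced in a few lines. This is a sketch that combines the last two classes exactly as in the table and takes the tabulated critical value 9.21 from the solution as given; the variable names are illustrative.

```python
from math import comb

trials, p, repetitions = 3, 0.25, 64
observed = [21, 31, 12, 0]

# Expected frequencies under b(y; 3, 1/4).
expected = [repetitions * comb(trials, y) * p ** y * (1 - p) ** (trials - y)
            for y in range(trials + 1)]

# Combine the classes y = 2 and y = 3 because the last expected frequency is small.
observed_c = observed[:2] + [sum(observed[2:])]
expected_c = expected[:2] + [sum(expected[2:])]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed_c, expected_c))

critical_value = 9.21  # chi-square(0.01, 2) from tables, as quoted in the solution
reject_h0 = chi_square > critical_value
print(expected, round(chi_square, 4), reject_h0)
```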
