Discrete Probability Concepts Explained
SAMPLE SPACE
Structure
6.1 Introduction
Objectives
6.2 Probability : Axiomatic Approach
Probability of an Event : Definition
Probability of an Event : Properties
6.3 Classical Definition of Probability
6.4 Conditional Probability
6.5 Independence of Events
6.6 Repeated Experiments and Trials
6.7 Summary
6.8 Solutions and Answers
6.1 INTRODUCTION
In this unit, we shall introduce you to some simple properties of the probability of an event
associated with a discrete sample space. Our definitions require you to first specify the
probabilities to be attached to each individual outcome of the random experiment.
Therefore, we need to answer the question : How does one assign probabilities to each and
every individual outcome? This question was answered very simply by the classical
probabilists (like Jacob Bernoulli). They assumed that all outcomes are equally likely.
Therefore, for them, when a random experiment has a finite number N of outcomes, the
probability of each outcome would be 1/N. Based on this assumption they developed a
probability theory, which we shall briefly describe in Sec. 6.3. However, this approach has a
number of logical difficulties. One of them is to find a reasonable way of specifying
"equally likely outcomes."
However, one possible way out of this difficulty is to relate the probability of an event to the
relative frequency with which it occurs. To illustrate this point, we consider the experiment
of tossing a coin a large number of times and noting the number of times "Head" appears.
In fact, the famous mathematician, Karl Pearson, performed this experiment 24000 times.
He found that the relative frequency, which is the number of heads divided by the total
number of tosses, approaches 1/2 as more and more repetitions of the experiment are
performed. This is the same figure which the classical probabilists would assign to the
probability of obtaining a head on the toss of a balanced coin.
Thus, it appears that the probability of an event could be interpreted as the long-run
relative frequency with which it occurs. This is called the statistical interpretation or the
frequentist approach to the interpretation of the probability of an event. This approach has its
own difficulties. We'll not discuss these here. Apart from these two, there are a few other
approaches to the interpretation of probability. These issues are full of philosophical
controversies, which are still not settled.
We shall adopt the axiomatic approach formulated by Kolmogorov and treat probabilities as
numbers satisfying certain basic rules. This approach is introduced in Sec. 6.2.
In Sec. 6.2 and 6.3 we deal with properties of probabilities of events and their computation.
We discuss the important concept of conditional probability of an event given that another
event has occurred in Sec. 6.4. It also includes the celebrated Bayes' theorem. In Sec. 6.5 we
discuss the definition and consequences of the independence of two or more events. Finally,
we talk about the probabilistic structure of independent repetitions of experiments in Sec.
6.6. After getting familiar with the computation of probabilities in this unit, we shall take up
the study of probability distributions in the next one.
Objectives
After studying this unit you should be able to :
assign probabilities to the outcomes of a random experiment with discrete sample space,
establish properties of probabilities of events,
calculate the probability of an event,
calculate conditional probabilities and establish Bayes' theorem,
check and utilise the independence of two or more events.
We have considered a number of examples of random experiments in the last unit. The
outcomes of such experiments cannot be predicted in advance. Nevertheless, we frequently
make vague statements about the chances or probabilities associated with outcomes of
random experiments. Consider the following examples of such vague statements :
i) It is very likely that it would rain today.
ii) The chance that the Indian team will win this match is very small.
iii) A person who smokes more than 10 cigarettes a day will most probably develop
lung cancer.
iv) The chances of my winning the first prize in a lottery are negligible.
v) The price of sugar would most probably increase next week.
Probability theory attempts to quantify such vague statements about the chances being good
or bad, small or large. To give you an idea of such quantification, we describe two simple
random experiments and associate probabilities with their outcomes.
Example 1
i) A balanced coin is tossed. The two possible outcomes are head (H) and tail (T). We
associate probability $P\{H\} = 1/2$ to the outcome H and probability $P\{T\} = 1/2$ to T.
ii) A person is selected from a large group of persons and his blood group is determined.
It can be one of the four blood groups O, A, B and AB. One possible assignment of
probabilities to these outcomes is given below.
Blood group    O     A     B     AB
Probability    0.34  0.27  0.31  0.08
Now look carefully at the probabilities attached to the sample points in Example 1 (i) and
(ii). Did you notice that
i) these are numbers between 0 and 1, and
ii) the sum of the probabilities of all the sample points is one ?
This is not true of this example alone. In general, we have the following rule or axiom about
the assignment of probabilities to the points of a discrete sample space.
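Axiom : Let $\Omega$ be a discrete sample space containing the points $\omega_1, \omega_2, \ldots$; i.e.,
$$\Omega = \{\omega_1, \omega_2, \ldots\}.$$
To each point $\omega_j$ of $\Omega$ we assign a number $P\{\omega_j\}$, $0 \le P\{\omega_j\} \le 1$, such that
$$P\{\omega_1\} + P\{\omega_2\} + \cdots = 1. \qquad \ldots (1)$$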
Now see if you can do the following exercise on the basis of this axiom.
E1) A sample space $\Omega$ consists of eight points $\omega_1, \omega_2, \ldots, \omega_8$. Which of the following
assignments of probabilities are valid ones ?
[Table of assignments of probabilities to the sample points $\omega_1, \ldots, \omega_8$: entries missing.]
If you have done El, you would have noticed that it is possible to have more than one valid
assignment of probabilities to the same sample space. If the discrete sample space is not
finite, the left side of Equation (1) should be interpreted as an infinite series. For example,
suppose $\Omega = \{\omega_1, \omega_2, \ldots\}$ and
$$P\{\omega_j\} = \frac{1}{2^j}, \quad j = 1, 2, \ldots$$
Then this assignment is valid because $0 \le P\{\omega_j\} \le 1$, and
$$\sum_{j=1}^{\infty} \frac{1}{2^j} = \frac{1/2}{1 - 1/2} = 1.$$
(If $|r| < 1$, the sum of the infinite geometric series $a + ar + ar^2 + \cdots$ is $\dfrac{a}{1-r}$.)
So far we have not explained what the probability $P\{\omega_j\}$ assigned to the point $\omega_j$ signifies.
We have just said that they are all arbitrary numbers between 0 and 1, except for the
requirement that they add up to 1. In fact, we have not even tried to clarify the nature of the
sample space except to assert that it be a discrete sample space. Such an approach is
consistent with the usual procedure of beginning the study of a mathematical discipline with
a few undefined notions and axioms and then building a theory based on the laws of logic
(Remember the axioms of geometry ?). It is for this reason that this approach to the
specification of probabilities to discrete sample spaces is called the axiomatic approach. It
was introduced by the Russian mathematician A.N. Kolmogorov in 1933. This approach is
mathematically precise and is now universally accepted. But when we try to use the
mathematical theory of probability to solve some real life problems, we have to
interpret the significance of statements like "The probability of an event A is 0.6."
Definition 1 : The probability $P(A)$ of an event A is the sum of the probabilities of the
points in A. More formally,
$$P(A) = \sum_{\omega_j \in A} P\{\omega_j\},$$
where $\sum_{\omega_j \in A}$ stands for the fact that the sum is taken over all the points $\omega_j \in A$. A is, of
course, a subset of $\Omega$. By convention, we assign probability zero to the empty set. Thus,
$P(\emptyset) = 0$.
The following example should help in clarifying this concept.
Example 2 : Let $\Omega$ be the sample space corresponding to three tosses of a coin with the
following assignment of probabilities.
Sample point   HHH  HHT  HTH  THH  TTH  THT  HTT  TTT
Probability    1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
Let's find the probabilities of the events A and B, where
A : There is exactly one head in three tosses, and
B : All the three tosses yield the same result.
Now A = {HTT, THT, TTH}.
Therefore,
P(A) = 1/8 + 1/8 + 1/8 = 3/8.
Similarly, B = {HHH, TTT}, so that P(B) = 1/8 + 1/8 = 1/4.
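The computation in Example 2 can be sketched in code. The following is only an illustration of Definition 1 under the equal assignment of 1/8 to each point; the names `omega`, `prob` and `event_probability` are hypothetical and not part of the unit.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses of a coin: HHH, HHT, ..., TTT.
omega = ["".join(t) for t in product("HT", repeat=3)]

# Assignment of probabilities from Example 2: 1/8 to every sample point.
prob = {w: Fraction(1, 8) for w in omega}

def event_probability(event, prob):
    """Definition 1: add up the probabilities of the points in the event."""
    return sum((prob[w] for w in event), Fraction(0))

A = {w for w in omega if w.count("H") == 1}   # exactly one head
B = {w for w in omega if len(set(w)) == 1}    # all three tosses alike

p_A = event_probability(A, prob)
p_B = event_probability(B, prob)
p_empty = event_probability(set(), prob)      # the empty event

print(p_A, p_B, p_empty)
```

Exact fractions are used instead of floating point numbers so that the results can be compared directly with the hand computation.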
A word about our notation and nomenclature is necessary at this stage. Although we say that
$P\{\omega_j\}$ is the probability assigned to the point $\omega_j$ of the sample space, it can be also
interpreted as the probability of the singleton event $\{\omega_j\}$.
In fact, it would be useful to remember that probabilities are defined only for events and that
$P\{\omega_j\}$ is the probability of the singleton event $\{\omega_j\}$. This type of distinction will be all the
more necessary when you proceed to study probability theory for non-discrete sample
spaces in Block 3.
Now let us look at some properties of the probabilities of events.
Proof of P2, i.e., of $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ : Recall that according to the definition, $P(A \cup B)$ is the sum of the probabilities
attached to the points of $A \cup B$, each point being considered only once. However, when we
compute $P(A) + P(B)$, a point in $A \cap B$ is included once in the computation of $P(A)$ and
once in the computation of $P(B)$. Thus, the probabilities of points in $A \cap B$ get added twice
in the computation of $P(A) + P(B)$. If we subtract the probabilities of all points in $A \cap B$
from $P(A) + P(B)$, then we shall be left with $P(A \cup B)$, i.e.,
$$P(A \cup B) = P(A) + P(B) - \sum_{\omega_j \in A \cap B} P\{\omega_j\}.$$
The last term in the above relation is, by definition, $P(A \cap B)$. Hence we have proved P2.
We now list some properties which follow from P1 and P2.
P3 : If A and B are disjoint events, then $P(A \cup B) = P(A) + P(B)$.
Why don't you try to prove these yourself? That's what we suggest in the following exercise.
Fig. 1
Proof : By P5, the result is true for N = 2. Assume that it is true for $N \le r$, and observe that
$A_1 \cup A_2 \cup \cdots \cup A_{r+1}$ is the same as $B \cup A_{r+1}$, where $B = A_1 \cup A_2 \cup \cdots \cup A_r$. Then by P5,
$$P(A_1 \cup A_2 \cup \cdots \cup A_{r+1}) = P(B \cup A_{r+1}) \le P(B) + P(A_{r+1}) \le \sum_{j=1}^{r+1} P(A_j),$$
where the last inequality is a consequence of the induction hypothesis. Hence, if Boole's
inequality holds for $N \le r$, it holds for N = r + 1 and hence for all $N \ge 2$.
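Example 4 : We can extend the property P2 to the case of three events, i.e., we can show
that
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(C \cap A) + P(A \cap B \cap C). \qquad \ldots (5)$$
Fig. 2
Denote $B \cup C$ by H. Then $A \cup B \cup C = A \cup H$ and by P2,
$$P(A \cup B \cup C) = P(A) + P(H) - P(A \cap H).$$
Applying P2 once more to $P(H) = P(B \cup C)$ and to $P(A \cap H) = P((A \cap B) \cup (A \cap C))$ yields (5).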
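Property P2 and its three-event extension (5) can be checked numerically on a small sample space. The sketch below assumes two rolls of a fair die with equally likely points; the events A, B and C are chosen only for illustration and do not come from the unit.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two rolls of a die, 36 equally likely points.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    # Equally likely points: (points in the event) / (points in omega).
    return Fraction(len(event), len(omega))

# Illustrative events (hypothetical, chosen so that they overlap).
A = {w for w in omega if w[0] % 2 == 0}   # first roll is even
B = {w for w in omega if w[1] > 4}        # second roll is 5 or 6
C = {w for w in omega if sum(w) == 7}     # total score is 7

# P2 for two events.
lhs_two = P(A | B)
rhs_two = P(A) + P(B) - P(A & B)

# Relation (5) for three events.
lhs_three = P(A | B | C)
rhs_three = (P(A) + P(B) + P(C)
             - P(A & B) - P(B & C) - P(C & A)
             + P(A & B & C))

print(lhs_two == rhs_two, lhs_three == rhs_three)
```

A check of this kind does not replace the proof; it only confirms the identities for one particular choice of events.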
Here are some simple exercises which you can solve by using P1-P3
E4) Prove the following :
Through the examples and exercises in this section we hope you have grasped the axiomatic
approach to probability. In the next section we'll describe the classical approach.
6.3 CLASSICAL DEFINITION OF PROBABILITY
In the early stages, probability theory was mainly concerned with its applications to games
of chance. The sample space for these games consisted of a finite number of outcomes.
These simple situations led to a definition of probability which is now called the classical
definition. It has many limitations. For example, it cannot be applied to infinite sample
spaces. However, it is useful in understanding the concept of randomness so essential in the
planning of experiments, small and large-scale sample surveys, as well as in solving some
interesting problems. We shall motivate the classical definition with some examples. We
shall then formulate the classical definition and apply it to solve some simple problems.
Suppose we toss a coin. This experiment has only two possible outcomes : Head (H) and
Tail (T). If the coin is a balanced coin and is symmetric, there is no particular reason to
expect that H is more likely than T or that T is more likely than H. In other words, we may
assume that the two outcomes H and T have the same probability or that they are equally
likely. If they have the same probability, and if the sum of the two probabilities $P\{H\}$ and
$P\{T\}$ is to be one, we must have $P\{H\} = P\{T\} = 1/2$.
Similarly, if we roll a symmetric, balanced die once, we should assign the same probability,
viz. 1/6, to each of the six possible outcomes 1, 2, . . . , 6.
The same type of argument, when used for assigning probabilities to the results of drawing a
card from a well-shuffled pack of 52 playing cards, leads us to say that the probability of
drawing any specified card is 1/52.
Definition 2 : Suppose a sample space $\Omega$ has a finite number n of points $\omega_1, \omega_2, \ldots, \omega_n$.
The classical definition assigns the probability 1/n to each of these points, i.e.,
$$P\{\omega_j\} = \frac{1}{n}, \quad j = 1, 2, \ldots, n.$$
The above assignment is also referred to as the assignment in case of equally likely
outcomes. You can check that in this case, the total of the probabilities of all the n points is
$n \times \frac{1}{n} = 1$. In fact, this is a valid assignment even from the axiomatic point of view.
Now suppose that an event A contains m points. Then under the classical assignment, the
probability $P(A)$ of A is m/n. The early probabilists called m the number of cases
favourable to A and n the total number of cases. Thus, according to the classical definition,
$$P(A) = \frac{\text{Number of cases favourable to } A}{\text{Total number of cases}}.$$
We have already mentioned that this is a valid assignment consistent with the Axiom in Sec.
6.2. Therefore, it follows that the probabilities of events, defined in this manner, possess the
properties P1-P7.
The total number of possible outcomes is 6 x 6 = 36. There are 5 sample points, (2,6),
(3,5), (4,4), (5,3), (6,2), which are favourable to the event A of getting a total score of 8.
Hence the required probability is 5/36.
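The count of favourable cases can be confirmed by listing the sample space. The sketch below assumes that the experiment is the rolling of two fair dice, which is what the total of 6 x 6 = 36 outcomes suggests; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: two dice, each showing a score from 1 to 6.
outcomes = list(product(range(1, 7), repeat=2))

# Cases favourable to the event "total score is 8".
favourable = [w for w in outcomes if sum(w) == 8]

# Classical definition: favourable cases / total cases.
p_total_8 = Fraction(len(favourable), len(outcomes))

print(favourable)
print(p_total_8)
```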
Example 6 :If each card of an ordinary deck of 52 playing cards has the same probability of
being drawn, let us find the probability of drawing
i) a red king or a black ace
ii) a 3, 4, 5, 6 or 8.
Let's tackle these one by one.
i) Since there are two red kings (diamond and heart) and two black aces (spade and
club), the number of favourable cases is 4. The required probability is 4/52 = 1/13.
ii) There are 4 cards of each of the 5 denominations 3, 4, 5, 6 and 8. Thus, the total
number of favourable cases is 20 and the required probability is 20/52 = 5/13.
You must have realised by this time that in order to apply the classical definition of
probability, you must be able to count the number of points favourable to an event A as well
as the total number of sample points. This is not always easy. We can, however, use the
theory of permutations and combinations for this purpose.
To refresh your memory, here we give two important rules which are used in counting.
1) Multiplication Rule : If an operation is performed in $n_1$ ways and for each of these $n_1$
ways, a second operation can be performed in $n_2$ ways, then the two operations can be
performed together in $n_1 n_2$ ways. See Fig. 3.
Fig. 3
Fig. 4
We now illustrate the use of this theory in calculating probabilities by considering some
examples. We assume that all outcomes in each of these examples are equally likely. Under
this assumption, the classical definition of probability is applicable.
Example 7 : We first select a digit out of the ten digits 0, 1, 2, 3, . . ., 9. Then we select
another digit out of the remaining nine. What will be the probability that both these digits
are odd?
We can select the first digit in 10 ways and for each of these ways we can select the second
digit in 9 ways. Therefore, the total number of points in the sample space is 10 x 9 = 90. The
first digit can be odd in 5 ways (1, 3, 5, 7, 9), and then the second digit can be odd in 4
ways. Thus, the total number of ways in which both the digits can be odd is 5 x 4 = 20. The
required probability is therefore 20/90 = 2/9.
(This is called selection without replacement since we do not replace the first selected digit back
before the second selection.)
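The multiplication rule used in Example 7 can be verified by listing every ordered pair of distinct digits. This is a minimal sketch; `ordered_pairs` and `both_odd` are illustrative names.

```python
from fractions import Fraction
from itertools import permutations

# Ordered selections of two different digits (selection without replacement).
ordered_pairs = list(permutations(range(10), 2))

# Selections in which both digits are odd.
both_odd = [(a, b) for a, b in ordered_pairs if a % 2 == 1 and b % 2 == 1]

p_both_odd = Fraction(len(both_odd), len(ordered_pairs))

print(len(ordered_pairs), len(both_odd), p_both_odd)
```

Here `permutations(range(10), 2)` produces exactly the 10 x 9 ordered pairs counted by the multiplication rule, because a digit is never paired with itself.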
Remark 2 : In Example 7, every digit had the same chance of being selected. This is
sometimes expressed by saying that the digits were selected at random (with equal
probability). Selection at random is generally taken to be synonymous with the assignment
of the same probability to all the sample points, unless stated otherwise.
We now give a number of examples to show how to calculate the probabilities of events in a
variety of situations. Please go through these examples carefully. If you understand them,
you will have no difficulty in doing the exercises later.
Example 8 :A box contains ninety good and ten defective screws. Let us find the
probability that 5 screws selected at random out of this box are all good.
Let A be the event that the 5 selected screws are all good.
Now we can choose 5 screws out of 100 screws in $\binom{100}{5}$ ways. If the selected 5 screws are
to be good, they will have to be selected out of the 90 good screws. This can be done in
$\binom{90}{5}$ ways. This is the number of sample points favourable to A. Hence the probability of A
is
$$P(A) = \frac{\binom{90}{5}}{\binom{100}{5}} \approx 0.584.$$
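The ratio of binomial coefficients in Example 8 is tedious to evaluate by hand. A short sketch using the standard library function `math.comb` (available from Python 3.8) is given below; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total_selections = comb(100, 5)   # ways to choose 5 screws out of 100
good_selections = comb(90, 5)     # ways to choose 5 screws out of the 90 good ones

# Classical definition: favourable selections / all selections.
p_all_good = Fraction(good_selections, total_selections)

print(p_all_good, float(p_all_good))
```

The common factor 5! cancels, so the same value is obtained from the product (90 x 89 x 88 x 87 x 86) / (100 x 99 x 98 x 97 x 96).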
Example 9 : A government prints 10 lakh lottery tickets of value of Rs. 2 each. We would
like to know the number of tickets that must be bought to have a chance of 0.5 or more to
win the first prize of 2 lakhs.
The prize-winning ticket can be randomly selected out of the 10 lakh tickets in $10^6$ ways.
Now, let m denote the number of tickets that we must buy. Then m is the number of points
favourable to our winning the first prize. Therefore, the probability of our winning the first
prize is $\dfrac{m}{10^6}$.
Since we want that $\dfrac{m}{10^6} \ge \dfrac{1}{2}$, we need $m \ge \dfrac{10^6}{2}$. This means that we must buy at least
$10^6/2$ = 500,000 tickets, at a cost of at least Rs. 10 lakhs ! Not a profitable proposition at all !
Example 10 : In a study centre batch of 100 students, 54 opted for MTE-06, 69 opted for
MTE-11 and 35 opted for both MTE-06 and MTE-11. If one of these students is selected at
random, let us find the probability that the student has opted for MTE-06 or MTE-11.
Let M denote the event that the randomly selected student has opted for MTE-06 and S the
event that he/she has opted for MTE-11. We want to know $P(M \cup S)$. According to the
classical definition, $P(M) = \dfrac{54}{100}$, $P(S) = \dfrac{69}{100}$ and $P(M \cap S) = \dfrac{35}{100}$. Thus, by P2,
$$P(M \cup S) = \frac{54}{100} + \frac{69}{100} - \frac{35}{100} = \frac{88}{100} = 0.88.$$
Suppose now we want to know the probability that the randomly selected student has opted
for neither MTE-06 nor MTE-11. This means that we want to know $P(M^c \cap S^c)$.
Now,
$$P(M^c \cap S^c) = P((M \cup S)^c) = 1 - P(M \cup S) = 0.12.$$
Example 11 : A throws six unbiased dice and wins if he has at least one six. B throws
twelve unbiased dice and wins if he has at least two sixes. Who do you think is more likely
to win?
We would urge you to make a guess first and then go through the following computations.
The total number of outcomes for A is $n_A = 6^6$ and that for B is $n_B = 6^{12}$. We will first
calculate the probabilities $q_A$ and $q_B$ that A and B, respectively, lose their games. Then the
probabilities of their winning are $P_A = 1 - q_A$ and $P_B = 1 - q_B$, respectively. We do this
because $q_A$ and $q_B$ are easier to compute.
Now A loses if he does not have a six on any of the 6 dice he rolls. This can happen in $5^6$
different ways, since he can have no six on each die in 5 ways. Hence $q_A = 5^6/6^6$ and
therefore, $P_A = 1 - (5/6)^6 \approx 0.665$.
In order to calculate $q_B$, observe that B loses if he has no six or exactly one six. The
probability that he has no six is $5^{12}/6^{12} = (5/6)^{12}$. Now the single six can occur on any one
of the 12 dice, i.e., in $\binom{12}{1} = 12$ ways. Then all the remaining 11 dice have to have a score other
than six. This can happen in $5^{11}$ ways, so the probability of exactly one six is $12 \times 5^{11}/6^{12}$.
The events of "no six" and "one six" in the throwing of 12 dice are disjoint events. Hence
the probability
$$q_B = \left(\frac{5}{6}\right)^{12} + \frac{12 \times 5^{11}}{6^{12}} \approx 0.381,$$
and therefore $P_B = 1 - q_B \approx 0.619$.
Comparing $P_A$ and $P_B$, we can conclude that A has a greater probability of winning.
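The two winning probabilities in Example 11 can be recomputed exactly by counting, for each possible number of sixes, the outcomes that lead to a loss. The helper `prob_at_least` below is a hypothetical name used only in this sketch.

```python
from fractions import Fraction
from math import comb

def prob_at_least(k, n_dice):
    """P(at least k sixes when n_dice unbiased dice are thrown).

    Works through the complement: the outcomes with exactly j sixes
    number comb(n_dice, j) * 5**(n_dice - j), out of 6**n_dice in all.
    """
    p_fewer = sum(
        Fraction(comb(n_dice, j) * 5 ** (n_dice - j), 6 ** n_dice)
        for j in range(k)
    )
    return 1 - p_fewer

P_A = prob_at_least(1, 6)    # A: at least one six with 6 dice
P_B = prob_at_least(2, 12)   # B: at least two sixes with 12 dice

print(float(P_A), float(P_B), P_A > P_B)
```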
Now here are some exercises which you should try to solve.
E9) Two cards are drawn in succession from a deck of 52 playing cards with replacement.
What is the probability that both cards are of denomination greater than 2 and less
than 5?
E10) If 3 books are selected at random from a shelf containing 5 novels, 3 books of poems
and a dictionary, what is the probability that
a) dictionary is not selected
b) 2 novels and 1 book of poems are selected.
E11) A person has 4 keys of which only one fits the lock. He tries them successively at
random without replacement. This procedure may require 1, 2, 3 or 4 attempts. Show
that the probability of any one of these 4 outcomes is 1/4.
E12) In an experiment to study the dependence of blood pressure on smoking habits, the
following data were collected on 220 individuals.
                        Non-smoker   Moderate smoker   Heavy smoker
High blood pressure     20           40                40
Normal blood pressure   60
So far we have seen various examples of assigning probabilities to sample points and have
also discussed some properties of probabilities of events. In the next section we shall talk
about the concept of conditional probability.
Suppose that two series of tickets are issued for a lottery. Let 1, 2, 3, 4, 5 be the numbers on
the 5 tickets in series I and let 6, 7, 8, 9 be the numbers on the 4 tickets in series II. I hold
the ticket bearing number 3. Suppose the first prize in the lottery is decided by selecting one
of the 5 + 4 = 9 numbers at random. The probability that I will win the prize is 1/9. Does this
probability change if it is known that the prize-winning ticket is from series I? In effect, we
want to know the probability of my winning the prize, conditional on the knowledge that the
prize-winning ticket is from series I.
In order to answer this question, observe that the given information reduces our
sample space from the set {1, 2, 3, 4, 5, 6, 7, 8, 9} to its subset {1, 2, 3, 4, 5} containing 5
points. In fact, this subset {1, 2, 3, 4, 5} corresponds to the event H that the prize-winning
ticket belongs to series I. If the prize-winning ticket is selected by choosing one of these 5
numbers at random, the probability that I will win the prize is 1/5. Therefore, it seems
logical to say that the conditional probability of the event A of my winning the prize, given
that the prize-winning number is from series I, is
$$P(A \mid H) = \frac{1}{5}.$$
Here $P(A \mid H)$ is read as the conditional probability of A given the event H. Note that we can
write
$$P(A \mid H) = \frac{1}{5} = \frac{1/9}{5/9} = \frac{P(A \cap H)}{P(H)}.$$
This discussion enables us to introduce the following formal definition. In what follows we
assume that we are given a random experiment with discrete sample space $\Omega$, and all
relevant events are subsets of $\Omega$.
Definition 3 : Let H be an event of positive probability, that is, $P(H) > 0$. The conditional
probability $P(A \mid H)$ of an event A, given the event H, is
$$P(A \mid H) = \frac{P(A \cap H)}{P(H)}.$$
Notice that we have not put any restriction on the event A except that A and H be subsets of
the same sample space $\Omega$ and that $P(H) > 0$.
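Definition 3 translates directly into a small function. The sketch below applies it to the lottery with nine tickets discussed before the definition; the names `prob`, `P` and `conditional` are illustrative, and the equal assignment of 1/9 to each ticket is the one used in that discussion.

```python
from fractions import Fraction

# Lottery with tickets 1..9, each equally likely to win.
omega = set(range(1, 10))
prob = {w: Fraction(1, 9) for w in omega}

def P(event):
    """Probability of an event: sum of the probabilities of its points."""
    return sum((prob[w] for w in event), Fraction(0))

def conditional(A, H):
    """Definition 3: P(A | H) = P(A n H) / P(H), defined only when P(H) > 0."""
    if P(H) == 0:
        raise ValueError("P(H) must be positive")
    return P(A & H) / P(H)

A = {3}                # I hold ticket number 3
H = {1, 2, 3, 4, 5}    # the winning ticket is from series I

p_A = P(A)
p_A_given_H = conditional(A, H)

print(p_A, p_A_given_H)
```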
Now we give two examples to help clarify this concept.
Example 12 :In a small town of 1000 people there are 400 females and 200 colour-blind
persons. It is known that ten per cent, i.e. 40, of the 400 females are colour-blind. Let us find
the probability that a randomly chosen person is colour-blind, given that the selected person
is a female.
Now suppose we denote by A the event that the randomly chosen person is colour-blind and
by H the event that the randomly chosen person is a female. You can see that
$P(A \cap H) = 40/1000 = 0.04$ and that $P(H) = 400/1000 = 0.4$.
Then
$$P(A \mid H) = \frac{P(A \cap H)}{P(H)} = \frac{0.04}{0.4} = 0.1.$$
Now can you find the probability that a randomly chosen person is colour-blind, given that
the selected person is a male?
If you denote by M the event that the selected person is a male, then
$$P(A \mid M) = \frac{P(A \cap M)}{P(M)} = \frac{160/1000}{600/1000} = \frac{4}{15},$$
since 200 - 40 = 160 of the 600 males are colour-blind.
Let A be the event that an order is delivered on time and H the event that it is completed on
time. Then $P(H) = 0.75$ and $P(A \cap H) = 0.60$. We need $P(A \mid H)$.
Have you understood the definition of conditional probability? You can find out for yourself
by doing these simple exercises.
E14) If A is the event that a person suffers from high blood pressure and B is the event that
he is a smoker, explain in words what the following probabilities represent.
a) $P(A \mid B)$
b) $P(A^c \mid B)$
c) $P(A \mid B^c)$
E15) Two unbiased dice are rolled. They both show the same score. What is the probability
that their common score is 6?
Compare these properties with the properties of (unconditional) probabilities given in Sec. 6.2.2.
You will find that the conditional probabilities, given the event H, have all the properties of
unconditional probabilities, which are sometimes called the absolute probabilities.
We can use the conditional probabilities to compute the unconditional probabilities of
events by employing the following obvious fact,
$$P(A \cap H) = P(H)\,P(A \mid H), \qquad \ldots (10)$$
obtained from Definition 3 of $P(A \mid H)$.
Here is an important remark related to (10).
Remark 3 : Relation (10) holds even if $P(H) = 0$, provided we interpret $P(A \mid H) = 0$ if
$P(H) = 0$. In words, this means that if the probability of occurrence of H is zero, we say that the
probability of occurrence of A, given that H has occurred, is also zero. This is so, because
$P(H) = 0$ implies $P(A \cap H) = 0$, $A \cap H$ being a subset of H.
(See E14 for the interpretations of $P(A \mid H)$ and $P(A^c \mid H)$.)
Example 14 : Two cards are drawn at random and without replacement from a pack of 52
playing cards. Let us find the probability that both the cards are red.
Let $A_1$ and $A_2$ denote, respectively, the events that cards drawn on the first and second draw
are red. Then by the classical definition, $P(A_1) = 26/52$, since there are 26 red cards. If the
first card is red, we are left with 25 red cards in the pack of 51 cards. Hence
$P(A_2 \mid A_1) = 25/51$. Thus, by (10), the probability $P(A_1 \cap A_2)$ of both cards being red is
$$P(A_1 \cap A_2) = P(A_1)\,P(A_2 \mid A_1) = \frac{26}{52} \times \frac{25}{51} = \frac{25}{102}.$$
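The product rule used in Example 14 can be compared with a direct count over all ordered draws of two cards. In the sketch below the cards are simply numbered 0 to 51 and the first 26 are taken to be red; this labelling is an assumption made only for the illustration.

```python
from fractions import Fraction
from itertools import permutations

# Assumed labelling: cards 0..25 are red (R), cards 26..51 are black (B).
colour = ["R"] * 26 + ["B"] * 26

# All ordered draws of two different cards (drawing without replacement).
draws = list(permutations(range(52), 2))
both_red = [d for d in draws if colour[d[0]] == "R" and colour[d[1]] == "R"]

# Direct count under the classical definition.
p_enumerated = Fraction(len(both_red), len(draws))

# Relation (10): P(A1 n A2) = P(A1) * P(A2 | A1).
p_product = Fraction(26, 52) * Fraction(25, 51)

print(p_enumerated, p_product)
```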
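Relation (10) specifies the probability of $A \cap H$ in terms of $P(H)$ and $P(A \mid H)$. We can
extend this relation to obtain the probability $P(A_1 \cap A_2 \cap A_3)$ in terms of $P(A_1)$,
$P(A_2 \mid A_1)$ and $P(A_3 \mid A_1 \cap A_2)$. We, of course, assume that $P(A_1)$ and $P(A_1 \cap A_2)$ are
both positive. Can you guess what this relation could be? Suppose we write $H = A_1 \cap A_2$. Then, by (10),
$$P(A_1 \cap A_2 \cap A_3) = P(H)\,P(A_3 \mid H) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2).$$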
We end this section with a derivation of a well-known theorem in probability theory, called
the Bayes' theorem.
Consider an event B and its complementary event $B^c$. The pair $(B, B^c)$ is called a partition of
$\Omega$, since they satisfy $B \cap B^c = \emptyset$, and $B \cup B^c$ is the whole sample space $\Omega$. Observe that for
any event A,
$$A = (A \cap B) \cup (A \cap B^c).$$
Since $A \cap B$ and $A \cap B^c$ are subsets of the disjoint sets B and $B^c$, respectively, they
themselves are disjoint. As a consequence, $P(A) = P(A \cap B) + P(A \cap B^c)$. Using (10), this becomes
$$P(A) = P(B)\,P(A \mid B) + P(B^c)\,P(A \mid B^c).$$
Here we do not insist that $P(B)$ and $P(B^c)$ be positive and follow the convention stated in
Remark 3.
Since $A \cap B_i$ and $A \cap B_j$ are respectively subsets of $B_i$ and $B_j$, $i \ne j$, they are disjoint.
Consequently by P7,
$$P(A) = \sum_{j=1}^{n} P(A \cap B_j)$$
or
$$P(A) = \sum_{j=1}^{n} P(B_j)\,P(A \mid B_j), \qquad \ldots (12)$$
which is obtained by using (10). This result (12) leads to the celebrated Bayes' theorem,
which we now state.
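Theorem 1 (Bayes' Theorem) : If $B_1, B_2, \ldots, B_n$ are n events which constitute a partition
of $\Omega$ and A is an event of positive probability, then
$$P(B_r \mid A) = \frac{P(B_r)\,P(A \mid B_r)}{\sum_{j=1}^{n} P(B_j)\,P(A \mid B_j)}$$
for any r, $1 \le r \le n$.
Proof : Observe that by definition,
$$P(B_r \mid A) = \frac{P(A \cap B_r)}{P(A)} = \frac{P(B_r)\,P(A \mid B_r)}{\sum_{j=1}^{n} P(B_j)\,P(A \mid B_j)},$$
where the numerator is rewritten by using (10) and the denominator by using (12).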
Example 17 :We have three boxes, each containing two covered compartments. The first
box has a gold coin in each compartment. The second box has a gold coin in one
compartment and a silver coin in the other. The third box has a silver coin in each of its
compartments. We choose a box at random and open a drawer at random. It contains a gold
coin. We would like to know the probability that the other compartment also has a gold coin.
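Let $B_1$, $B_2$, $B_3$, respectively, denote the events that Box 1, Box 2 and Box 3 are selected. It
is easy to see that $B_1$, $B_2$, $B_3$ constitute a partition of the sample space of the experiment.
Let A denote the event that a gold coin is located. The composition of the boxes implies that
$$P(A \mid B_1) = 1, \quad P(A \mid B_2) = \frac{1}{2}, \quad P(A \mid B_3) = 0.$$
Also, since the box is chosen at random, $P(B_1) = P(B_2) = P(B_3) = 1/3$.
Since one gold coin is observed, we will have a gold coin in the other unobserved
compartment of the box only if we have selected Box 1. Thus, we need to obtain $P(B_1 \mid A)$.
Now by Bayes' theorem,
$$P(B_1 \mid A) = \frac{\frac{1}{3} \times 1}{\frac{1}{3} \times 1 + \frac{1}{3} \times \frac{1}{2} + \frac{1}{3} \times 0} = \frac{2}{3}.$$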
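Bayes' theorem for a finite partition is easy to express as a function of the prior probabilities $P(B_j)$ and the conditional probabilities $P(A \mid B_j)$. The sketch below applies it to the three boxes of Example 17; `bayes_posteriors` is a hypothetical helper name, and the priors of 1/3 reflect the assumption that the box is chosen at random.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return [P(B_1 | A), ..., P(B_n | A)] for a partition B_1, ..., B_n.

    priors[j] is P(B_j) and likelihoods[j] is P(A | B_j).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A n B_j), by (10)
    total = sum(joint)                                     # P(A), by (12)
    if total == 0:
        raise ValueError("A must have positive probability")
    return [j / total for j in joint]

# Example 17: three boxes chosen at random; A = "a gold coin is found".
priors = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
likelihoods = [Fraction(1), Fraction(1, 2), Fraction(0)]

posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)
```

The first entry of `posteriors` is $P(B_1 \mid A)$, the probability that the other compartment also holds a gold coin.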
Do you feel confident enough to try and solve these exercises now? In each of them, the
crucial step is to define the relevant events properly. Once you do that, the actual calculation
of probabilities is child's play.
E16) In a city the weather changes frequently. It is known from past experience that a rainy
day is followed by a sunny day with probability 0.4, and that a sunny day is followed
by a rainy day with probability 0.7. Assume that the weather on any given day
depends only on the weather of the previous day. Find the probability that
a) a rainy day is followed by a rainy day
b) it would rain on Saturday and Sunday when Friday was rainy
c) the entire period from Monday to Friday is rainy, given that the previous Sunday
was sunny.
(This is an example of a Markov chain, named after the Russian mathematician A. Markov
(1856-1922) who initiated their study.)
E17) An urn contains 4 white and 4 black balls. A ball is drawn at random, its colour is
noted and is returned to the urn. Moreover, 2 additional balls of the colour drawn are
put in the urn and then a ball is drawn at random. What is the probability that the
second ball is black?
(This procedure is called Polya's urn scheme.)
E18) In a community 2 per cent of the people suffer from cancer. The probability that a
doctor is able to correctly diagnose a person with cancer as suffering from cancer is
0.80. The doctor wrongly diagnoses a person without cancer as having cancer with
probability 0.05. What is the probability that a randomly selected person diagnosed as
having cancer is really suffering from cancer?
E19) An explosion in a factory manufacturing explosives can occur because of (i) leakage
of electricity, (ii) defects in machinery, (iii) carelessness of workers or (iv) sabotage.
The probability that
Using the concept of conditional probability, we now introduce independent events in the
next section.
6.5 INDEPENDENCE OF EVENTS
From the examples discussed in the previous section you know that the conditional
probability $P(A \mid H)$ is, in general, not the same as the unconditional probability $P(A)$. Thus,
the knowledge of H affects the chances of occurrence of A. The following example
illustrates this fact more explicitly.
Example 18 : A box has 4 tickets numbered 1, 2, 3 and 4. One of these tickets is drawn at
random. Let A = {1, 2} be the event that the randomly selected ticket bears the number 1 or
2. Similarly define B = {1}. Then
$P(A) = 1/2$, $P(B) = 1/4$ and $P(A \cap B) = 1/4$.
Therefore, $P(B \mid A) = (1/4)/(1/2) = 1/2$.
This example illustrates that additional information (about the occurrence of an event) can
increase or decrease the probability of occurrence of another event. We would be interested
in those situations which correspond to the cases when $P(B \mid A) = P(B)$, as in the following
example.
Example 19 : We continue with the previous example. But now we define H = {1, 2} and
K = {1, 3}. Then
$P(H) = 1/2$, $P(K) = 1/2$ and $P(H \cap K) = 1/4$.
Hence
$$P(K \mid H) = \frac{P(H \cap K)}{P(H)} = \frac{1/4}{1/2} = \frac{1}{2} = P(K).$$
In this example, knowledge of the occurrence of H does not alter the probability of
occurrence of K. We call such events independent events.
Thus, two events A and B are independent, if
$$P(B \mid A) = P(B). \qquad \ldots (13)$$
However, in this definition, we need to have $P(A) > 0$. Using the definition of $P(B \mid A)$, we
can rewrite (13) as
$$P(A \cap B) = P(A)\,P(B), \qquad \ldots (14)$$
which does not require that $P(A)$ or $P(B)$ be positive. We shall now use (14) to define
independence of two events.
Definition 4 : Let A and B be two events associated with the same random experiment.
They are said to be stochastically independent or simply independent if
$$P(A \cap B) = P(A)\,P(B).$$
So the events A and B in Example 18 are not independent. Similarly, events C and D are
also not independent. But events K and H in Example 19 are independent.
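Definition 4 can be checked mechanically for the ticket examples. The sketch below assumes the box of 4 equally likely tickets from Examples 18 and 19; `independent` is an illustrative helper name.

```python
from fractions import Fraction

# Box with tickets 1, 2, 3, 4; each ticket is equally likely to be drawn.
omega = {1, 2, 3, 4}

def P(event):
    return Fraction(len(event & omega), len(omega))

def independent(E, F):
    """Definition 4: E and F are independent if P(E n F) = P(E) P(F)."""
    return P(E & F) == P(E) * P(F)

A, B = {1, 2}, {1}       # events of Example 18
H, K = {1, 2}, {1, 3}    # events of Example 19

print(independent(A, B))      # product is 1/8 but P(A n B) is 1/4
print(independent(H, K))      # both sides equal 1/4
print(independent(set(), A))  # a null event paired with any other event
```

Because the test uses the product form rather than a conditional probability, it also works when one of the events has probability zero.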
We now proceed to study some implications of independence of two events $A_1$ and $A_2$.
Recall that
$$P(A_1) = P(A_1 \cap A_2) + P(A_1 \cap A_2^c).$$
Then
$$P(A_1 \cap A_2^c) = P(A_1) - P(A_1 \cap A_2).$$
Now, if $A_1$ and $A_2$ are independent, we get
$$P(A_1 \cap A_2^c) = P(A_1) - P(A_1)\,P(A_2) = P(A_1)\,[1 - P(A_2)] = P(A_1)\,P(A_2^c).$$
Thus, the independence of $A_1$ and $A_2$ implies that of $A_1$ and $A_2^c$. Now interchange the roles
of $A_1$ and $A_2$. What do you get? We get that if $A_1$ and $A_2$ are independent, then so are
$A_1^c$ and $A_2$. The independence of $A_1^c$ and $A_2$ then implies the independence of $A_1^c$ and $A_2^c$
too.
Now here is an interesting fact.
Can you prove a similar result for a null event ? You can check that if A is a null event, then
A and any other event B are independent.
Now, can we extend the definition of independence of two events to that of the
independence of three events? The obvious way seems to be to call A1, A2, A3 independent
if P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3). But this does not work, because if three events are
independent, we would expect any two of them also to be independent. But this is not
ensured by the condition above. To appreciate this, consider the case when A1 = A2 = A,
0 < P(A) < 1, and P(A3) = 0. Then P(A1 ∩ A2) = P(A) ≠ P(A1) P(A2) = [P(A)]^2.
Thus, A1 and A2 are not independent, but P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3) is satisfied.
So, to get around this problem we add some more conditions and get the following
definition.
Definition 5 : Three events A1, A2 and A3 corresponding to the same random experiment
are said to be stochastically or mutually independent if
P(A1 ∩ A2) = P(A1) P(A2)
P(A2 ∩ A3) = P(A2) P(A3)
P(A3 ∩ A1) = P(A3) P(A1)                           (15)
P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3).
Since the coin is unbiased, we assign the same probability, 118, to each of the eight possible
outcomes.
Check that
P(A1) = P(A2) = P(A3) = 1/2,
P(A1 ∩ A2) = P(A2 ∩ A3) = P(A3 ∩ A1) = 1/4, and
P(A1 ∩ A2 ∩ A3) = 1/8.
Thus, all the four equations in (15) are satisfied and the events A1, A2, A3 are mutually
independent.
We have seen that the last condition in (15) alone is not enough, since it does not guarantee
the independence of pairs of events.
Similarly, the first three equations of (15) alone are not sufficient to guarantee that all the
four conditions required for mutual independence would be satisfied. To see this, consider
the following example.
Example 21 : An unbiased die is rolled twice. Let A1 denote the event "odd face on the
first roll", A2 denote the event "odd face on the second roll" and A3 denote the event that
the total score is odd. With the classical assignment of probability 1/36 to each of the sample
points, you can easily check that
P(A1) = P(A2) = P(A3) = 18/36 = 1/2, and that
P(A1 ∩ A2) = P(A2 ∩ A3) = P(A3 ∩ A1) = 9/36 = 1/4.
Thus, the first three equations in (15) are satisfied. But the last one is not valid. The reason
for it is that P(A1 ∩ A2 ∩ A3) is zero (Do you agree? Two odd faces always give an even
total), and P(A1), P(A2), P(A3) are all positive.
If the first three equations of (15) are satisfied, we say that A1, A2 and A3 are pairwise
independent. Example 21 shows that pairwise independence does not guarantee mutual
independence.
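The claims of Example 21 can be verified by listing all 36 outcomes. The sketch below is one way of doing so; the function names a1, a2, a3, both and prob are illustrative choices and not notation from the unit.

```python
from fractions import Fraction
from itertools import product

# Sample space for two rolls of a die: 36 equally likely ordered pairs.
OMEGA = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability of an event given as a predicate on outcomes."""
    favourable = sum(1 for w in OMEGA if event(w))
    return Fraction(favourable, len(OMEGA))

def a1(w): return w[0] % 2 == 1            # odd face on the first roll
def a2(w): return w[1] % 2 == 1            # odd face on the second roll
def a3(w): return (w[0] + w[1]) % 2 == 1   # total score is odd

def both(e, f):
    """Intersection of two events."""
    return lambda w: e(w) and f(w)

# Product rule for each of the three pairs.
pairwise = all(
    prob(both(e, f)) == prob(e) * prob(f)
    for e, f in [(a1, a2), (a2, a3), (a3, a1)]
)
# Probability of the triple intersection, to be compared with the product.
triple = prob(lambda w: a1(w) and a2(w) and a3(w))

print(pairwise)
print(triple, prob(a1) * prob(a2) * prob(a3))
```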
Now we are sure you can define the concept of independence of n events. Does your
definition agree with Definition 6? (If n events are independent, then any r of them,
2 ≤ r ≤ n, should also be independent.)
Definition 6 : The n events A1, A2, ..., An corresponding to the same random experiment
are mutually independent if for all r = 2, ..., n and 1 ≤ i1 < i2 < ... < ir ≤ n, the product rule
P(A_i1 ∩ A_i2 ∩ ... ∩ A_ir) = P(A_i1) P(A_i2) ... P(A_ir)          (17)
holds.
Since r of the n events can be chosen in C(n, r) ways, (17) represents
C(n, 2) + C(n, 3) + ... + C(n, n) = 2^n − n − 1
conditions.
Try to write Definition 6 for n = 3 and see if it matches Definition 5.
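Definition 6 asks for the product rule on every sub-collection of two or more events, and this can be tested by brute force when the sample space is finite with equally likely points. The following sketch does so; the function name mutually_independent and the test events ("head on the j-th toss" of three fair tosses) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def mutually_independent(events, omega):
    """Check the product rule for every sub-collection of r >= 2 events.

    events: list of sets of outcomes; omega: list of equally likely outcomes.
    Returns (verdict, number_of_conditions_checked).
    """
    def prob(s):
        return Fraction(len(s), len(omega))

    ok, checked = True, 0
    for r in range(2, len(events) + 1):
        for group in combinations(events, r):
            checked += 1
            intersection = set(omega).intersection(*group)
            expected = Fraction(1)
            for e in group:
                expected *= prob(e)
            ok = ok and prob(intersection) == expected
    return ok, checked

# Illustration (assumed events): A_j = "head on the j-th toss" of three fair tosses.
omega = list(product("HT", repeat=3))
heads = [{w for w in omega if w[j] == "H"} for j in range(3)]
print(mutually_independent(heads, omega))
```

The second component of the result counts the conditions examined, which for n events is the number of sub-collections of size at least 2.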
We have already seen that if A1 and A2 are independent, then
A1^c and A2, or A1 and A2^c, or A1^c and A2^c are independent. We now give a similar remark about
n independent events.
Remark 4 : If A1, A2, ..., An are n independent events, then we may replace some or all
of them by their complements without losing independence. In particular, when
A1, A2, ..., An are independent, the product rule (17) holds even if some or all of
A_i1, ..., A_ir are replaced by their complements.
We shall not prove this assertion, but shall use it in the following examples.
Example 22 : Suppose A1, A2, A3 are three independent events, with P(Aj) = pj, and we
want to obtain the probability that at least one of them occurs.
We want to find P(A1 ∪ A2 ∪ A3). Recall that (Example 8)
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A2 ∩ A3)
− P(A3 ∩ A1) + P(A1 ∩ A2 ∩ A3)
= p1 + p2 + p3 − p1 p2 − p2 p3 − p3 p1 + p1 p2 p3.
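The expansion in Example 22 can be cross-checked against a second route that is not spelled out in the text: by Remark 4 the complements of independent events are independent, so the probability that none of the events occurs is (1 − p1)(1 − p2)(1 − p3). The sketch below compares the two forms for illustrative values of p1, p2, p3; the function names are hypothetical.

```python
from fractions import Fraction

def at_least_one_expanded(p1, p2, p3):
    """Inclusion-exclusion form obtained in Example 22."""
    return p1 + p2 + p3 - p1 * p2 - p2 * p3 - p3 * p1 + p1 * p2 * p3

def at_least_one_complement(p1, p2, p3):
    """1 - P(none occurs), using independence of the complements (Remark 4)."""
    return 1 - (1 - p1) * (1 - p2) * (1 - p3)

p = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4))  # illustrative values only
print(at_least_one_expanded(*p), at_least_one_complement(*p))
```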
We have
P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2)
= P(A1) + P(A2) − P(A1) P(A2)
and P((A1 ∪ A2) ∩ A3) = P((A1 ∩ A3) ∪ (A2 ∩ A3))
= P(A1 ∩ A3) + P(A2 ∩ A3) − P(A1 ∩ A2 ∩ A3)
= {P(A1) + P(A2) − P(A1) P(A2)} P(A3)
= P(A1 ∪ A2) P(A3),
implying the independence of A1 ∪ A2 and A3.
Example 24 :An automatic machine produces bolts. Each bolt has probability 1/10 of being
defective. Assuming that a bolt is defective independently of all other bolts, let's find
i) the probability that a good bolt is followed by two defective ones.
ii) the probability of getting one good and two defective bolts, not necessarily in that
order.
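Let Aj denote the event that the j-th inspected bolt is defective, j = 1, 2, 3. The assumption of
independence implies that A1, A2 and A3 are independent.
i) The required probability is
P(A1^c ∩ A2 ∩ A3) = P(A1^c) P(A2) P(A3) = (9/10)(1/10)(1/10) = 0.009.
ii) One good and two defective bolts are obtained when one of the events
A1^c ∩ A2 ∩ A3, A1 ∩ A2^c ∩ A3, A1 ∩ A2 ∩ A3^c occurs.
Notice that these events are disjoint and that each has the probability 0.009 (see (i)).
Hence, the required probability is 3 × 0.009 = 0.027.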
Example 25 :The probability that a person A will be alive 20 years hence is 0.7 and the
probability that another person B will be alive 20 years hence is 0.5. Assuming
independence, let's find the probability that neither of them will be alive after 20 years.
The probability that A dies before twenty years have elapsed is 0.3 and the corresponding
probability for B is 0.5. Hence the probability that neither of them will be alive 20 years
hence is 0.3 × 0.5 = 0.15.
E21) If A1, A2 and A3 are independent events, examine for independence the following
pairs of events :
a) A1 and A2 ∩ A3
c) A1^c and A2^c ∩ A3^c.
E22) Obtain the probabilities of
a) A1 ∪ (A2 ∩ A3)
b) A1 ∩ (A2^c ∩ A3^c)
c) A1^c ∩ (A2^c ∩ A3^c)
under the assumptions of E21, if
P(A1) = P(A2) = P(A3) = 1/3.
E23) Suppose that a sample space Ω consists of six permutations of (a, b, c) and three
additional points (a, a, a), (b, b, b) and (c, c, c). Each one of the nine points is assigned
the probability 1/9. Let Ak denote the event that k-th place is occupied by the letter c,
k = 1, 2, 3. Are A1, A2 and A3 mutually independent events?
E24) Let A1, A2, A3 and A4 be four independent events with the same probability 1/3.
Obtain the probability that exactly two of them occur.
Hint : You have to first find the probabilities
P(A1 ∩ A2 ∩ A3^c ∩ A4^c), P(A1 ∩ A2^c ∩ A3^c ∩ A4),
P(A1 ∩ A2^c ∩ A3 ∩ A4^c), P(A1^c ∩ A2^c ∩ A3 ∩ A4),
P(A1^c ∩ A2 ∩ A3^c ∩ A4), P(A1^c ∩ A2 ∩ A3 ∩ A4^c).
E25) Let Ω = {(a, a), (a, b), (b, a), (b, b)}. Let Ak be the event that letter 'a' appears at the
k-th place, k = 1, 2. Examine A1 and A2 for independence under the following
assignments of probabilities.
                      Sample point
Assignment    (a, a)    (a, b)    (b, a)    (b, b)
    1          1/4       1/4       1/4       1/4
    2          1/18      5/18      1/2       1/6
In E25 you must have found that A1 and A2 are independent under Assignment 1 but not
under Assignment 2. This shows that independence of events depends on the assignment of
probabilities to the sample points and is not their intrinsic property.
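The dependence of independence on the probability assignment can be seen by computing P(A1), P(A2) and P(A1 ∩ A2) under each assignment of E25. The sketch below takes the second assignment to be 1/18, 5/18, 1/2, 1/6 (these values add up to 1); the function name check is illustrative.

```python
from fractions import Fraction as F

OMEGA = [("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")]

# Probabilities of the four sample points, in the order of OMEGA.
assignments = {
    1: [F(1, 4), F(1, 4), F(1, 4), F(1, 4)],
    2: [F(1, 18), F(5, 18), F(1, 2), F(1, 6)],
}

def check(weights):
    """Return (P(A1), P(A2), P(A1 ∩ A2), independent?) for one assignment."""
    p = dict(zip(OMEGA, weights))
    assert sum(p.values()) == 1                        # probabilities add up to 1
    p1 = sum(v for w, v in p.items() if w[0] == "a")   # 'a' in the first place
    p2 = sum(v for w, v in p.items() if w[1] == "a")   # 'a' in the second place
    p12 = p[("a", "a")]                                # 'a' in both places
    return p1, p2, p12, p12 == p1 * p2

for label, weights in assignments.items():
    print(label, check(weights))
```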
The discussion so far has related to a random experiment performed only once. But usually
scientists carry out the same experiment more than once and preferably under identical
conditions. In the next section, we shall consider the extension of our study to cover such
cases which involve repetition of an experiment or which involve performing two or more
distinct experiments.
6.6 REPEATED EXPERIMENTS AND TRIALS
We must mention that we have earlier discussed rolls of two dice or three or more tosses of
a coin without bringing in the concept of repeated trials. The following discussion is only an
elementary introduction to the topic of repeated trials.
To fix ideas, consider the simple experiment of tossing a coin twice. The sample space
corresponding to the first toss is S1 = {H, T}, say, where H = Head, T = Tail. Similarly the
sample space S2 for the second toss is also {H, T}. Now observe that the sample space for
two tosses is Ω = {(H, H), (H, T), (T, H), (T, T)}, where (H, H) stands for head on first toss
followed by a head on the second toss. The pairs (H, T), etc. are also similarly defined. Note
that Ω consists of all ordered pairs that can be formed by choosing a point from S1 followed
by a point from S2. Mathematically we say that Ω is the Cartesian product S1 × S2 (read, S1
cross S2) of S1 and S2.
Now consider an experiment of tossing a coin and then rolling a die. The sample space
corresponding to the toss of the coin is S1 = {H, T} and that corresponding to the roll of the die
is S2 = {1, 2, 3, 4, 5, 6}. The sample space of the combined experiment is
Ω = S1 × S2 = {(H, 1), (H, 2), ..., (H, 6), (T, 1), (T, 2), ..., (T, 6)}.
Taking a cue from these two examples we can say that if S1 and S2 are the sample spaces for
two random experiments ε1 and ε2, then the Cartesian product S1 × S2 is the sample space of
the experiment consisting of both ε1 and ε2.
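For small experiments the Cartesian product can be listed directly. The following sketch builds the sample space of the coin-and-die experiment with the standard-library function itertools.product; the variable names are illustrative.

```python
from itertools import product

S1 = ["H", "T"]              # sample space for the toss of a coin
S2 = [1, 2, 3, 4, 5, 6]      # sample space for the roll of a die

# Sample space of the combined experiment: all ordered pairs (coin, die).
omega = list(product(S1, S2))

print(len(omega))   # number of points, len(S1) * len(S2)
print(omega[:3])    # the first few ordered pairs
```

The same call with `repeat=n`, for example `product(S1, repeat=3)`, lists the sample space for n repetitions of one experiment.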
We are sure that you will be able to do this simple exercise.
E26) Find the sample spaces of the following experiments
a) Rolling two dice
b) Drawing two cards from a pack of 52 playing cards, with replacement.
Do you remember the definition of the Cartesian product of n (n ≥ 3) sets? We say that the
Cartesian product
S1 × S2 × ... × Sn = {(x1, ..., xn) | xj ∈ Sj, j = 1, ..., n}.
Now, if S1, S2, ..., Sn represent the sample spaces corresponding to repetitions
ε1, ε2, ..., εn of the same experiment ε, then the Cartesian product S1 × S2 × ... × Sn
represents the sample space for n repetitions or n trials of the experiment ε.
(H, T), (T, H), (T, T)}, which is the Cartesian product of {H, T} with itself. Suppose the
coin is unbiased so that P{H} = P{T} = 1/2 for both the first and the second toss. Since the
coin is unbiased, we may regard the four points in Ω as equally likely and assign probability
1/4 to each one of them. However, another way of looking at this assignment is to assume
that the results in the two tosses are independent. More specifically, we may consider
The statement that "the successive units are independent of each other" is interpreted by
assigning probabilities to points of Ω by the product rule. In particular,
P{(D, D, D)} = P{D} P{D} P{D} = p^3,
P{(D, D, G)} = P{D} P{D} P{G} = p^2 (1 − p)
and lastly,
P{(G, G, G)} = P{G} P{G} P{G} = (1 − p)^3.
Notice that the sum of the probabilities of the eight points in Ω is
p^3 + 3p^2 (1 − p) + 3p (1 − p)^2 + (1 − p)^3 = [p + (1 − p)]^3 = 1.
experiments if the events "first outcome is ui" and "second outcome is vj" are
independent,
i.e., if the assignment of probabilities on the product space S1 × S2 is such that
P{(ui, vj)} = P{ui} P{vj}.
Before we conclude our discussion of product spaces and repeated trials, let us revert to the
case of two independent experiments ε1 and ε2 with sample spaces S1 and S2.
Suppose
S1 = {u1, u2, ...}, P{ui} = pi, i ≥ 1,
S2 = {v1, v2, ...}, P{vj} = qj, j ≥ 1.
= Σ pi Σ qj.
Thus, the multiplication rule is valid not only for individual sample points of S1 × S2 but
also for events in the component sample spaces S1 and S2. Here we have talked about
events related to two experiments. But we can easily extend this fact to events related to
three or more experiments.
The independent Bernoulli trials provide the simplest example of repeated independent
trials. Here each trial has only two possible outcomes, usually called success (S) and failure
(F). We further assume that the probability of success is the same in each trial, and therefore,
the probability of failure is also the same for each trial. Usually we denote the probability of
success by p and that of failure by q = 1 − p.
(Can you see the connection between three independent Bernoulli trials and the situation in
Example 26?)
Suppose we consider three independent Bernoulli trials. The sample space is the Cartesian
product {S, F} × {S, F} × {S, F}. It, therefore, consists of the eight points
SSS, SSF, SFS, FSS, FFS, FSF, SFF, FFF.
In view of independence, the corresponding probabilities are
p^3, p^2 q, p^2 q, p^2 q, p q^2, p q^2, p q^2, q^3.
In general, the sample space corresponding to n independent Bernoulli trials consists of 2^n
points. A generic point in this sample space consists of a sequence of n letters, j of which
are S and n − j are F, j = 0, 1, ..., n. Each such point carries the probability p^j q^(n−j).
Let us now obtain the probability of j successes in n independent Bernoulli trials. We first
note that there are C(n, j) = n!/(j!(n − j)!) points with j successes and (n − j) failures (we
ask you to prove this in E27). Since each such point carries the probability p^j q^(n−j), the
probability of j successes, denoted by b(j, n, p), is
b(j, n, p) = C(n, j) p^j q^(n−j), j = 0, 1, ..., n.
These are called binomial probabilities and we shall return to a discussion of this topic when
we discuss the binomial distribution in Unit 8.
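The binomial probabilities can be computed directly, and for small n they can be compared with a brute-force sum over the points of the product space. The sketch below does both; the function names and the value p = 1/10 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binomial_probability(j, n, p):
    """b(j, n, p) = C(n, j) p^j q^(n-j) with q = 1 - p."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

def by_enumeration(j, n, p):
    """Add up the product-rule probabilities of all points with j successes."""
    total = 0
    for point in product("SF", repeat=n):
        if point.count("S") == j:
            total += p**j * (1 - p)**(n - j)
    return total

p = Fraction(1, 10)   # illustrative value of the success probability
print([binomial_probability(j, 3, p) for j in range(4)])
```

The enumeration visits all 2^n points, so it is practical only for small n; the closed form needs just one binomial coefficient.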
Now we bring this unit to a close. But before that let's briefly recall the important concepts
that we studied in it.
6.7 SUMMARY
In this rather lengthy unit, we discussed the following main points :
1) We have introduced you to the axiomatic approach to the definition of probability. In
this approach we assign probabilities P(ωj) to the points ωj of a discrete sample space
such that
i) 0 ≤ P(ωj) ≤ 1, j = 1, 2, ...
ii) Σ P(ωj) = 1, the sum being taken over all the points ωj.
2) We have seen how to compute the probability of an event A and have discussed its
various properties.
3) We have noted that the classical definition of probability assigns equal probabilities to
each of the points of a finite sample space.
4) We have acquainted you with the concept of conditional probability P(A | B) of an event A
given the event B.
for any r, 1 ≤ r ≤ n.
6) We have defined and discussed the consequences of independence of two or more
events.
7) We have seen the method of assignment of probabilities when dealing with
independent repetitions of an experiment.
6.8 SOLUTIONS AND ANSWERS
E3) If A and B are disjoint, A ∩ B = ∅ and P3 follows from P2.
0.35
--
E6) a) violates P4
b) violates P5
c) and d) violate the Axiom.
E7) Let S and W also denote the events that they are absent.
Then P(S) = 0.05, P(W) = 0.10, P(S ∩ W) = 0.02. Then a) P(S^c ∩ W^c) = 0.87,
b) P(S^c ∪ W^c) = 0.98 and c) P(S ∩ W^c) + P(S^c ∩ W) = 0.03 + 0.08 = 0.11.
E8) Use result (5) in Example 4 to obtain the required probability which is 0.80.
E11) Let p1, p2, p3 and p4 denote the probabilities that the number of attempts is 1, 2, 3, and
4, respectively. Then,
p1 = 1/4, p2 = (3 × 1)/(4 × 3) = 1/4, p3 = (3 × 2 × 1)/(4 × 3 × 2) = 1/4
and p4 = (3 × 2 × 1 × 1)/(4 × 3 × 2 × 1) = 1/4.
E12) a) 80/200 = 0.4
b) 60/200 = 0.3
c) 120/200 = 0.6.
E16) a) 1 − 0.4 = 0.6
b) 0.6 × 0.6 = 0.36
6/36
E19) Let A1, A2, A3 and A4 denote the four causes of explosion and E denote the event of
explosion. We need to compute P(A1 | E), P(A2 | E), P(A3 | E) and P(A4 | E). We
have P(A1) = 0.20, P(A2) = 0.30, P(A3) = 0.40, P(A4) = 0.10 and P(E | A1) = 0.25,
P(E | A2) = 0.20, P(E | A3) = 0.50, P(E | A4) = 0.75.
Using Bayes' theorem, we get
P(A1 | E) = 0.181, P(A2 | E) = 0.218,
P(A3 | E) = 0.327, P(A4 | E) = 0.273.
Thus, the most likely cause of explosion is the carelessness of workers.
and P(A1 ∩ A2) = 9/36 = 1/4.
∴ P(A1 ∩ A2) = P(A1) P(A2), and hence, A1 and A2 are independent.
how many ways can we put these j letters in the n slots? You know that this can be
done in C(n, j) ways. Once the Ss are in place, the remaining vacant slots can be filled
with Fs.









