How To Control Randomness
In a slightly more popular formulation, probability theory studies how the probabilities of simple events can be used to
determine the probabilities of more complex events.
of the highly chaotic nature of weather, even a slight variation in the current state of weather can
grow into a huge difference within a week. Consequently, weather modelling is usually based on
other methods. Nevertheless, probability theory has an extremely wide and varied field of
applications, from natural sciences (physics, biology, genetics), medicine (pharmaceutical trials)
and economics (study of financial markets, insurance) to social sciences and humanities
(demographics, linguistics, etc.).
Theory of Probability, 1933), integrating the concepts of probability theory with general
measure theory. The axiomatic definition of probability [a] proposed in this book is still widely used.
However, we should also mention some alternative concepts of probability. The originator of one
of those, Richard von Mises (1883-1953), attempted to define probability as the limiting value of
relative frequencies in an infinite sequence of trials. There is also the school of subjective
probability (Leonard Savage, Bruno de Finetti), which, following Thomas Bayes (1702-1761), views
probability as something dependent on the subject's previous knowledge.
In conclusion, we could highlight three major stages in the development of probability theory. The
first brought forth the idea that predictions of future events are possible, with a certain degree of
accuracy, even in the case of random phenomena. The second important period was the beginning
of the 19th century, with the establishment of links with statistics and the start of a clearly defined
scientific discipline with unlimited applications and opportunities. In the third stage, in the 1930s,
the theory of probability was given its strict mathematical form.
For example, if you have 400 acquaintances in a large city, meaning that you could be
acquainted with a random passer-by with the probability of 400/400,000 = 0.001, then the
a
Probability is expressed as the number $P(A)$, showing the likelihood of event $A$, with a value between 0 and 1, and
equal to 1 in the case of a certain event. If events $A_1, A_2, \dots$ are mutually exclusive (i.e., they can occur only
one at a time), then the probability of one of them occurring is equal to the sum of individual probabilities:
$P(A_1 \cup A_2 \cup \dots) = P(A_1) + P(A_2) + \dots$.
b
We can explain the principles of this formula. Firstly, it uses the formula for calculating the probability of the opposite event,
$P(\bar{A}) = 1 - P(A)$, where $\bar{A}$ means non-occurrence of event $A$ (opposite event). Secondly, it uses the principle of
independence of events, which allows the probabilities of individual events to be multiplied to calculate the probability of
them occurring together: $P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) P(A_2) \cdots P(A_n)$. In our example, $1 - P(A_i)$
is the probability of $A_i$ not occurring, while $(1 - P(A_1))(1 - P(A_2)) \cdots (1 - P(A_n))$ is the
probability that none of the events $A_1, A_2, \dots, A_n$ occurs.
world. Where it gets exciting is when somebody makes the correct guess in five consecutive
attempts. In a group of this size, the probability of that happening would be only 0.4%. A sceptic
could now start suspecting a crooked arrangement between the roller and the guesser. For a
believer in esotericism, this would be further proof of the existence of extrasensory perception.
(The reader can probably guess the preference of the author in this matter.)
third with three and so on until the last man, who would get the only remaining number;
consequently, the number of potential numbering variants [a] would be 5 · 4 · 3 · 2 · 1 = 120. Now we
can count the number of correct pairs in each numbering variant (permutation). For instance, in
the sequence 1 3 2 4 5, three numbers, namely 1, 4 and 5, are in correct places, while the
numbers 2 and 3 are in wrong positions in the sequence. If we examine all 120 possible
sequences in this manner, we can find the probability distribution of correct numbers, as shown in
Table 1, p. 10.
Now we know what could happen and the probability of each outcome. We can see that even five
pairs could be guessed correctly by chance, but this would happen very rarely: only in one
instance out of 120. However, identifying three correct pairs is relatively simple, as it would occur,
on average, once in 12 attempts. In fact, the probability of this is much higher if we consider that
there was a whole bunch of participating psychics in the show. Assuming that there were eight,
the probability of at least one of them hitting three (or more) correct pairs is rather high. The
reasoning would be as follows. First, the probability of a single psychic getting a score under
three (i.e., 0, 1 or 2 correct pairs) can be found from the table as the sum of the first three
probabilities: 109/120 = 0.908. Then the probability of all of them getting a score under three would
be $0.908^8 = 0.463$. Finally, the probability of at least one psychic correctly identifying three or more
pairs is as high as 1 - 0.463 = 0.537 = 53.7% (!). This probability is indeed rather high and we
can confidently conclude that correctly guessing three pairs would be quite a usual outcome even
if all participants paired the men and women completely randomly, without any additional
considerations.
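The figures in this argument can be checked by brute force. The following is only an illustrative sketch in Python (the function name `fixed_point_distribution` is our own, not from the text): it lists all 120 permutations of five numbers, counts how many numbers stay in their correct places, and then repeats the calculation for eight psychics, assuming they guess independently.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def fixed_point_distribution(n):
    """Probability of exactly k correct pairs (k = 0..n) when n pairs are matched at random."""
    counts = Counter(
        sum(1 for position, value in enumerate(perm) if position == value)
        for perm in permutations(range(n))
    )
    total = sum(counts.values())  # n! permutations in all
    return {k: Fraction(counts.get(k, 0), total) for k in range(n + 1)}


dist = fixed_point_distribution(5)

# A single psychic scores under three (0, 1 or 2 correct pairs).
p_under_three = dist[0] + dist[1] + dist[2]

# At least one of eight independent psychics scores three or more.
p_at_least_one = 1 - p_under_three ** 8

print(p_under_three, float(p_under_three))
print(float(p_at_least_one))
```

Exact fractions are used so that the intermediate value 109/120 can be compared directly with the text; the final probability is converted to a decimal only for printing.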
Now we could ask, would our scepticism have been dispelled by a perfect score, i.e., correctly
guessing all five pairs? It is not difficult to calculate the probability of a perfect score in the case of
eight psychics and random pairing: $1 - (1 - 1/120)^8 \approx 6.5\%$. For a sceptic, even this is not
a sufficiently low probability value and, consequently, even a complete success could not be
taken as a proof of psychic powers. We can see that such an experiment cannot actually prove
anything: the number of pairs is simply too low for that. In this case, the experiment should be
repeated with a larger number of pairs, for instance, six, seven or more. Naturally, we could now
ask, would perfect identification of six pairs, for instance, convince our sceptic? We can again
calculate the respective probability in the case of random pairing. With six pairs, this probability is
$1 - (1 - 1/720)^8 \approx 1.1\%$. Now this is a rather small number and if somebody got six correct
results, there would be little doubt that this person is really good at identifying pairs. Especially if
we consider that pairing is made easier by visible compatibility of certain external indicators (such
as age or height).
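The same calculation can be repeated for any number of pairs. The sketch below is a hedged illustration (the helper name `p_some_perfect_score` is hypothetical): it assumes that each psychic pairs completely at random and independently of the others, so that a single perfect score has probability 1/n!.

```python
from math import factorial


def p_some_perfect_score(n_pairs, n_psychics):
    """Chance that at least one of n_psychics random guessers matches all n_pairs correctly."""
    p_single = 1 / factorial(n_pairs)       # one correct permutation out of n!
    return 1 - (1 - p_single) ** n_psychics  # complement of "nobody is perfect"


for n_pairs in (5, 6, 7):
    print(n_pairs, round(100 * p_some_perfect_score(n_pairs, 8), 2), "%")
```

With eight guessers, each additional pair reduces the chance of an accidental perfect score several times over, which is why a larger number of pairs makes the experiment more convincing.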
is theoretically possible to achieve a perfect result simply by chance. The question is, how
probable such an outcome would be? For instance, with two cups (1 mil and 1 mif), the
probability of getting the correct result by chance is 1/2, which means that even half of bluffers
would pass this test. What then is the probability of randomly getting the correct result with eight
cups? We can make some simple calculations. Assume that the lady is bluffing and divides the
eight cups into two groups on a completely random basis. There are, in total, 70 different
possibilities for dividing 8 cups into two equal groups. Indeed, this can be found as the number of
combinations of 8 taken 4 at a time:
$$\binom{8}{4} = \frac{8!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(4 \cdot 3 \cdot 2 \cdot 1)} = 70.$$
As only one of such combinations represents the perfect result, the probability of getting the
correct division simply by chance is:
$$\frac{1}{70} \approx 0.014 = 1.4\%.$$
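The count of 70 and the resulting probability can be reproduced in a few lines. This is a minimal sketch using Python's standard `math.comb`; the variable names are our own.

```python
from math import comb

n_cups = 8
n_chosen = 4  # cups the lady assigns to one of the two groups

n_divisions = comb(n_cups, n_chosen)  # ways to pick 4 cups out of 8
p_chance = 1 / n_divisions            # only one division is fully correct

print(n_divisions)
print(round(100 * p_chance, 1), "%")
```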
We can see that this is quite a low probability. This probability should be compared with a certain
critical level, which is often placed at 5% (this boundary depends on the importance of the
problem or the price of an incorrect choice). Since the calculated probability is below the
critical probability level, a mathematical statistician can draw a carefully weighed conclusion: as
the probability of getting the correct result simply by chance is very low (less than 5%), there is no
reason to believe that the lady is bluffing; it is likely that she is indeed able to tell the difference
between mil and mif.
In this context, the reader might see certain similarities with a court trial based on the
presumption of innocence: a person is considered innocent until there is convincing proof to the
contrary. Similarly, in statistics a hypothesis (in our example, the claimed skill to differentiate
between two methods of tea preparation) is considered proven only if it is supported by sufficient
data.
What could be the probability of incorrect conviction? Is there a specific critical boundary that
should not be crossed? We can call it the permissible probability of a type I error. It seems that
no such exact boundary exists; rather, it depends on the particular problem. Our preceding
examples have been of a relatively entertaining variety where the price of an incorrect decision is
not particularly high. Therefore, we have permitted the boundary of type I error to be as
high as 5%. The same does not apply to decisions in the world of medicine and justice. Indeed,
there is a proverb saying that
feat of labour on the part of Estonian statisticians. There is no need to add that most of our
arguments belonged to the field of probability theory [a].
, where each $X_i$ is either 1 or 0, depending on whether egg $i$ arrived at
its destination or not. A marvellous property of the mean value is that the mean of a sum equals
a
More information can be found in an article by Krista Fischer, "Mõõtmise dilemmad: et süütut ei kuulutataks
kurjategijaks, et haigused ei jääks avastamata" (Dilemmas of measurement: how to ensure that innocents are not declared
guilty and diseases are not left undiagnosed), Postimees, 6 April 2013.
b
The dispersion of a random variable $X$ is the number $D(X) = E\big[(X - E(X))^2\big]$, i.e., the mean quadratic deviation
from the mean value. Dispersion describes the spread of individual values of a random variable around its mean value
$E(X)$. The dispersion of a discrete random variable can be calculated with the formula
$D(X) = \sum_i (x_i - E(X))^2 \, p_i$, where $p_i$ is the probability of value $x_i$. If $X$ and $Y$ are independent,
then the dispersions are added up: $D(X + Y) = D(X) + D(Y)$.
This claim is particularly interesting in the case of sampling with replacement, because it would be
trivial in the case of sampling without replacement. For instance, if a student were to forget (!) the
formula of the mean (expected value), he or she could use a statistical method for finding the
mean value of a die roll by making 100 rolls and calculating the arithmetic mean
$\bar{x} = (x_1 + x_2 + \dots + x_{100})/100$ of the results. The student can then be confident that the calculation
produces a value close to the actual mean value m (which would be 3.5 in this case).
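The student's experiment is easy to imitate on a computer. The sketch below is an illustration only: it uses Python's pseudo-random generator in place of a physical die, the seed is an arbitrary choice made for repeatability, and the helper name `mean_of_rolls` is our own.

```python
import random


def mean_of_rolls(n_rolls, rng):
    """Arithmetic mean of n_rolls simulated rolls of a fair six-sided die."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return sum(rolls) / n_rolls


rng = random.Random(2013)          # arbitrary fixed seed (an assumption, for repeatability)
exact_mean = sum(range(1, 7)) / 6  # (1 + 2 + ... + 6) / 6 = 3.5

for n_rolls in (100, 10_000, 1_000_000):
    print(n_rolls, mean_of_rolls(n_rolls, rng), "exact:", exact_mean)
```

The printed means typically move closer to 3.5 as the number of rolls grows, although any single run of 100 rolls can still miss it by a few tenths.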
The next natural question, namely how close the calculated arithmetic mean is to the mean value m, is
answered by another theorem: the central limit theorem. We can formulate it here only in an
approximate wording (as a rule of thumb), which should still convey the essence of this important
theorem:
If the value of a random variable is determined by many independent factors in such a manner
that their impacts combine and the impact of each individual factor is negligible compared to the
combined impact of all factors, then the variable is likely to have the normal or Gaussian distribution.
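This rule of thumb can be illustrated with a small simulation. The sketch below rests on assumptions of our own: each "factor" is modelled as an independent uniform random number between -1 and 1, fifty such factors are added together, and the result is checked against one well-known property of the normal distribution, namely that roughly 68% of values fall within one standard deviation of the mean.

```python
import random
import statistics


def combined_effect(n_factors, rng):
    """Sum of n_factors small independent influences, each uniform on [-1, 1]."""
    return sum(rng.uniform(-1, 1) for _ in range(n_factors))


rng = random.Random(0)  # arbitrary fixed seed (an assumption, for repeatability)
samples = [combined_effect(50, rng) for _ in range(20_000)]

mu = statistics.fmean(samples)
sigma = statistics.pstdev(samples)

# Share of simulated values lying within one standard deviation of the mean.
within_one_sigma = sum(abs(x - mu) <= sigma for x in samples) / len(samples)
print(round(within_one_sigma, 3))
```

A single uniform factor is far from bell-shaped, yet the sum of fifty of them already behaves very much like a normally distributed quantity.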
The normal distribution of people's height (for instance, the height of adult men), referred to at the
start of this section, is a good example of how this rule can be applied. Clearly, height is
determined by a number of factors, including genetics, the environment, nutrition, physical
activity, etc. Genetic factors can be further broken down into many individual factors. The final
distribution of height can be described very well with the normal distribution.
In conclusion, we can see that probability theory and statistics complement each other, creating a
powerful tandem. The needs of statistics have often stimulated developments of probability
theory, while probability theory provides statistics with the methods for solving its problems.