
Probability Distributions Guide

Uploaded by

mehtabmiruet
Random Variables and Probability Distributions

Modified from a presentation by Carlos J. Rosas-Anderson
Fundamentals of Probability
• The probability P that an outcome occurs is:
$$P = \frac{\text{number of outcomes}}{\text{number of trials}}$$
• The sample space is the set of all possible outcomes of an event
  • Example: Visit = {(Capture), (Escape)}
Axioms of Probability
1. The sum of all the probabilities of outcomes within a single sample space equals one:
$$\sum_{i=1}^{n} P(A_i) = 1.0$$
2. The probability of a complex event equals the sum of the probabilities of the (mutually exclusive) outcomes making up the event:
$$P(A \text{ or } B) = P(A) + P(B)$$
3. The probability of 2 independent events equals the product of their individual probabilities:
$$P(A \text{ and } B) = P(A) \times P(B)$$
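The three axioms can be checked on a small example. The sketch below assumes a fair six-sided die as the sample space (the die and the variable names are illustrative, not part of the original slides) and uses exact fractions so the results are not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Axiom 1: the probabilities over the whole sample space sum to one.
total = sum(die.values())

# Axiom 2: for mutually exclusive outcomes, P(1 or 2) = P(1) + P(2).
p_one_or_two = die[1] + die[2]

# Axiom 3: for two independent rolls, P(first is 1 and second is 2)
# is the product of the individual probabilities.
p_one_then_two = die[1] * die[2]

print(total, p_one_or_two, p_one_then_two)  # 1 1/3 1/36
```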
Probability distributions
• We use probability distributions because they fit many types of data in the living world

[Figure: histogram of height (cm) of Hypericum cumulicola at Archbold Biological Station, Ht (cm) 1996; Mean = 35.3, Std. Dev = 14.76, N = 713]
Probability distributions
• Most people are familiar with the Normal Distribution, BUT…
  • …many variables relevant to biological and ecological studies are not normally distributed!
  • For example, many variables are discrete (presence/absence, # of seeds or offspring, # of prey consumed, etc.)
  • Because normal distributions apply only to continuous variables, we need other types of distributions to model discrete variables.
Random variable
• The mathematical rule (or function) that assigns a given numerical value to each possible outcome of an experiment in the sample space of interest.
• 2 Types:
  • Discrete random variables
  • Continuous random variables
The Binomial Distribution
Bernoulli Random Variables
• Imagine a simple trial with only two possible outcomes:
  • Success (S)
  • Failure (F)
• Examples
  • Toss of a coin (heads or tails)
  • Sex of a newborn (male or female)
  • Survival of an organism in a region (live or die)

[Portrait: Jacob Bernoulli (1654-1705)]
The Binomial Distribution
Overview
• Suppose that the probability of success is p
• What is the probability of failure?
  • q = 1 - p
• Examples
  • Toss of a coin (S = head): p = 0.5 → q = 0.5
  • Roll of a die (S = 1): p = 0.1667 → q = 0.8333
  • Fertility of a chicken egg (S = fertile): p = 0.8 → q = 0.2


The Binomial Distribution
Overview
• Imagine that a trial is repeated n times
• Examples:
  • A coin is tossed 5 times
  • A die is rolled 25 times
  • 50 chicken eggs are examined
• ASSUMPTIONS:
  1) p is constant from trial to trial
  2) the trials are statistically independent of each other


The Binomial Distribution
Overview
• What is the probability of obtaining X successes in n trials?
• Example
  • What is the probability of obtaining 2 heads from a coin that was tossed 5 times?

$$P(\text{HHTTT}) = (1/2)^5 = 1/32$$


The Binomial Distribution
Overview
• But there are more possibilities:

HHTTT HTHTT HTTHT HTTTH
THHTT THTHT THTTH
TTHHT TTHTH
TTTHH

$$P(\text{2 heads}) = 10 \times 1/32 = 10/32$$
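The count of 10 arrangements can be confirmed by brute force. This is a minimal sketch that enumerates every possible sequence of 5 tosses of a fair coin; the variable names are illustrative.

```python
from itertools import product

# Enumerate all 2**5 equally likely sequences of 5 tosses of a fair coin.
sequences = ["".join(s) for s in product("HT", repeat=5)]

# Keep only the sequences that contain exactly 2 heads.
two_heads = [s for s in sequences if s.count("H") == 2]

print(len(sequences))                    # 32
print(len(two_heads))                    # 10
print(len(two_heads) / len(sequences))   # 0.3125, i.e. 10/32
```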


The Binomial Distribution
Overview
• In general, if n trials result in a series of successes and failures,

FFSFFFFSFSFSSFFFFFSF…

then the probability of X successes in that order is
$$P(X) = q \cdot q \cdot p \cdot q \cdots = p^X q^{n-X}$$
The Binomial Distribution
Overview
• However, if order is not important, then
$$P(X) = \frac{n!}{X!(n-X)!}\, p^X q^{n-X}$$
where $\frac{n!}{X!(n-X)!}$ is the number of ways to obtain X successes in n trials, and $n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$
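As a rough illustration of the formula, the sketch below evaluates it with Python's standard `math.comb` for the coin example (2 heads in 5 tosses). The function name `binomial_pmf` is an assumption made for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): (number of orderings) * p^x * q^(n - x)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Coin example from the slides: 2 heads in 5 tosses of a fair coin.
print(binomial_pmf(2, 5, 0.5))  # 0.3125 = 10/32

# The probabilities for X = 0..5 form the whole distribution and sum to 1.
dist = [binomial_pmf(x, 5, 0.5) for x in range(6)]
print(sum(dist))
```

The same function gives the distribution for any other n and p by changing the arguments, for example `binomial_pmf(x, 4, p)` for 4 trials.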
The Binomial Distribution
• Remember the example of the wood lice that can turn either toward or away from moisture?
• Use Excel to generate a binomial distribution for the number of damp turns out of 4 trials.

[Figure: bar chart of expected frequency (0.00 to 0.40) against number of damp turns (0 to 4)]

The Binomial Distribution
Overview

[Figure: five bar charts of binomial distributions over 0 to 5 successes, with panels Bin(0.1, 5), Bin(0.3, 5), Bin(0.5, 5), Bin(0.7, 5), and Bin(0.9, 5)]
The Poisson Distribution
Overview
• When there are a large number of trials but a small probability of success, binomial calculations become impractical
  • Example: Number of deaths from horse kicks in the Prussian Army in different years
• The mean number of successes from n trials is λ = np
  • Example: 64 deaths in 20 years out of thousands of soldiers

[Portrait: Simeon D. Poisson (1781-1840)]
The Poisson Distribution
Overview
• If we substitute λ/n for p, and let n approach infinity, the binomial distribution becomes the Poisson distribution:
$$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}$$
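The limit can be seen numerically. The sketch below fixes illustrative values λ = 2 and x = 3 (both chosen only for this example), substitutes p = λ/n into the binomial formula, and compares the result with the Poisson value as n grows.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3  # illustrative values, not taken from the slides

# With p = lambda / n, the binomial probability approaches the Poisson one.
for n in (10, 100, 10_000):
    print(n, binomial_pmf(x, n, lam / n))
print("Poisson", poisson_pmf(x, lam))
```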
The Poisson Distribution
Overview
• The Poisson distribution is applied when random events are expected to occur in a fixed area or a fixed interval of time
• Deviation from a Poisson distribution may indicate some degree of non-randomness in the events under study
• See Hurlbert (1990) for some caveats and suggestions for analyzing random spatial distributions using Poisson distributions
The Poisson Distribution
Example: Emission of α-particles
• Rutherford, Geiger, and Bateman (1910) counted the number of α-particles emitted by a film of polonium in 2608 successive intervals of one-eighth of a minute
  • What is n?
  • What is p?
  • Do their data follow a Poisson distribution?


The Poisson Distribution
Emission of α-particles
• Calculation of λ:
$$\lambda = \text{No. of particles per interval} = 10097/2608 = 3.87$$
• Expected values:
$$2608 \times P(x) = 2608 \times \frac{e^{-3.87}(3.87)^x}{x!}$$

No. α-particles   Observed
0                 57
1                 203
2                 383
3                 525
4                 532
5                 408
6                 273
7                 139
8                 45
9                 27
10                10
11                4
12                0
13                1
14                1
Over 14           0
Total             2608
The Poisson Distribution
Emission of α-particles

No. α-particles   Observed   Expected
0                 57         54
1                 203        210
2                 383        407
3                 525        525
4                 532        508
5                 408        394
6                 273        255
7                 139        140
8                 45         68
9                 27         29
10                10         11
11                4          4
12                0          1
13                1          1
14                1          1
Over 14           0          0
Total             2608       2608
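The expected column can be recomputed from the observed counts. The sketch below is one way to do it, with illustrative variable names; it uses the unrounded λ, so individual values may differ slightly from a calculation based on λ = 3.87.

```python
from math import exp, factorial

# Observed counts from the table: list index = number of alpha-particles (0..14).
observed = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1]

n_intervals = sum(observed)                                       # 2608
n_particles = sum(x * count for x, count in enumerate(observed))  # 10097
lam = n_particles / n_intervals                                   # about 3.87

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# Expected count for each x is 2608 * P(x).
expected = [n_intervals * poisson_pmf(x, lam) for x in range(len(observed))]

for x, (obs, exp_count) in enumerate(zip(observed, expected)):
    print(f"{x:>2} {obs:>4} {exp_count:7.1f}")
```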
The Poisson Distribution
Emission of α-particles

[Figure: three panels labeled Random events, Regular events, and Clumped events]
The Poisson Distribution

[Figure: five panels of Poisson distributions labeled 0.1, 0.5, 1, 2, and 6; vertical axes from 0 to 1]
Review of Discrete Probability Distributions
• If X is a discrete random variable,
  • What does X ~ Bin(n, p) mean?
  • What does X ~ Poisson(λ) mean?


The Expected Value of a Discrete Random Variable

$$E(X) = \sum_{i=1}^{n} a_i p_i = a_1 p_1 + a_2 p_2 + \dots + a_n p_n$$
The Variance of a Discrete Random Variable

$$\sigma^2(X) = E\left[\left(X - E(X)\right)^2\right] = \sum_{i=1}^{n} p_i \left(a_i - \sum_{j=1}^{n} a_j p_j\right)^2$$
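Both sums translate directly into code. The sketch below applies them to a fair six-sided die, which is an illustrative choice rather than an example from the slides; `values` plays the role of the a_i and `probs` the role of the p_i.

```python
from fractions import Fraction

def expected_value(values, probs):
    """E(X) = sum of a_i * p_i."""
    return sum(a * p for a, p in zip(values, probs))

def variance(values, probs):
    """sigma^2(X) = sum of p_i * (a_i - E(X))^2."""
    mean = expected_value(values, probs)
    return sum(p * (a - mean) ** 2 for a, p in zip(values, probs))

# Hypothetical discrete random variable: the face shown by a fair die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

print(expected_value(faces, probs))  # 7/2
print(variance(faces, probs))        # 35/12
```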
Continuous Random Variables
• If X is a continuous random variable, then X has an infinitely large sample space
• Consequently, the probability of any particular outcome within a continuous sample space is 0
• To calculate the probabilities associated with a continuous random variable, we focus on events that occur within particular subintervals of X, which we will denote as Δx
Continuous Random Variables
• The probability density function (PDF):
$$P(X = x_i) = f(x_i)\,\Delta x$$
• To calculate E(X), we let Δx get infinitely small:
$$E(X) = \sum_{i=1}^{n} x_i f(x_i)\,\Delta x$$
$$E(X) = \int x f(x)\,dx$$
Uniform Random Variables
• Defined for a closed interval (for example, [0,10], which contains all numbers between 0 and 10, including the two end points 0 and 10).
$$f(x) = \begin{cases} 1/10, & 0 \le x \le 10 \\ 0, & \text{otherwise} \end{cases}$$

[Figure: the probability density function (PDF), P(X) against X for X from 0 to 10, with subintervals [3,4] and [5,6] marked]
Uniform Random Variables
For a uniform random variable X, where f(x) is defined on the interval [a,b] and where a < b:
$$f(x) = \begin{cases} 1/(b-a), & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
$$E(X) = (b + a)/2$$
$$\sigma^2(X) = \frac{(b-a)^2}{12}$$
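These closed forms can be compared with the Riemann-sum definition of E(X) for a continuous variable. The sketch below assumes the interval [0, 10] and a midpoint sum with a small Δx; the number of subintervals and the variable names are illustrative choices.

```python
def uniform_pdf(x, a, b):
    """f(x) = 1/(b - a) on [a, b], 0 otherwise."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

a, b, n = 0.0, 10.0, 100_000
dx = (b - a) / n                               # width of each subinterval
mids = [a + (i + 0.5) * dx for i in range(n)]  # midpoint of each subinterval

# E(X) as the sum of x_i * f(x_i) * dx, and the variance around that mean.
mean = sum(x * uniform_pdf(x, a, b) * dx for x in mids)
var = sum((x - mean) ** 2 * uniform_pdf(x, a, b) * dx for x in mids)

print(mean, (b + a) / 2)        # numerical sum vs. closed form
print(var, (b - a) ** 2 / 12)   # numerical sum vs. closed form
```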
The Normal Distribution
Overview
• Discovered in 1733 by de Moivre as an approximation to the binomial distribution when the number of trials is large
• Derived in 1809 by Gauss
• Importance lies in the Central Limit Theorem, which states that the sum of a large number of independent random variables (binomial, Poisson, etc.) will approximate a normal distribution
  • Example: Human height is determined by a large number of factors, both genetic and environmental, which are additive in their effects. Thus, it follows a normal distribution.

[Portraits: Abraham de Moivre (1667-1754); Karl F. Gauss (1777-1855)]
The Normal Distribution
Overview
• A continuous random variable is said to be normally distributed with mean μ and variance σ² if its probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$
• f(x) is not the same as P(x)
  • P(x) would be virtually 0 for every x because the normal distribution is continuous
  • However,
$$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$
The Normal Distribution
Overview

[Figure: normal probability density f(x), from 0.00 to 0.45, plotted against x from -3 to 3]
The Normal Distribution
Overview

[Figure: two panels of normal curves labeled Mean changes and Variance changes]


The Normal Distribution
Length of Fish
• A sample of rock cod in Monterey Bay suggests that the mean length of these fish is μ = 30 in. and the variance is σ² = 4 in.²
• Assume that the length of rock cod is a normal random variable → X ~ N(μ = 30, σ = 2)
• If we catch one of these fish in Monterey Bay,
  • What is the probability that it will be at least 31 in. long?
  • That it will be no more than 32 in. long?
  • That its length will be between 26 and 29 inches?


The Normal Distribution
Length of Fish
• What is the probability that it will be at least 31 in. long?

[Figure: normal density (0.00 to 0.25) against fish length (in.) from 25 to 35]


The Normal Distribution
Length of Fish
• That it will be no more than 32 in. long?

[Figure: normal density (0.00 to 0.25) against fish length (in.) from 25 to 35]


The Normal Distribution
Length of Fish
• That its length will be between 26 and 29 inches?

[Figure: normal density (0.00 to 0.25) against fish length (in.) from 25 to 35]
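The three rock cod questions are areas under the normal curve, so they can be answered with the cumulative distribution function. The sketch below builds that function from `math.erf` in the standard library and assumes μ = 30 and σ = 2 as stated for the fish; the helper name `normal_cdf` is an assumption made for this example.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sigma = 30.0, 2.0  # rock cod length in inches

p_at_least_31 = 1.0 - normal_cdf(31, mu, sigma)
p_at_most_32 = normal_cdf(32, mu, sigma)
p_26_to_29 = normal_cdf(29, mu, sigma) - normal_cdf(26, mu, sigma)

print(round(p_at_least_31, 4))  # 0.3085
print(round(p_at_most_32, 4))   # 0.8413
print(round(p_26_to_29, 4))     # 0.2858
```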


Standard Normal Distribution
• μ = 0 and σ² = 1

[Figure: histogram with horizontal axis from -6 to 4 and vertical axis from 0 to 5000]
Useful properties of the normal distribution
• The normal distribution has useful properties:
  • Can be added: E(X+Y) = E(X) + E(Y) and, for independent X and Y, σ²(X+Y) = σ²(X) + σ²(Y)
  • Can be transformed with shift and change of scale operations
Consider two random variables X and Y
Let X ~ N(μ, σ) and let Y = aX + b, where a and b are constants
Change of scale is the operation of multiplying X by a constant a, because one unit of X becomes "a" units of Y.
Shift is the operation of adding a constant b to X, because we simply move our random variable X "b" units along the x-axis.

If X is a normal random variable, then the new random variable Y created by these operations on X is also a normal random variable.
For X ~ N(μ, σ) and Y = aX + b
• E(Y) = aμ + b
• σ²(Y) = a²σ²
• A special case of a change of scale and shift operation in which a = 1/σ and b = -(μ/σ):
  • Y = (1/σ)X - (μ/σ) = (X - μ)/σ
  • This gives E(Y) = 0 and σ²(Y) = 1
• Thus, any normal random variable can be transformed to a standard normal random variable.
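The mean and variance rules for Y = aX + b can be checked on any set of numbers. The sketch below uses a small hypothetical sample and arbitrary constants a and b (all illustrative), then applies the special case a = 1/σ, b = -μ/σ. It verifies only the two moment rules, not that the transformed values are normally distributed.

```python
from statistics import fmean, pvariance

# Hypothetical sample of lengths; any numbers would do for this check.
x = [26.0, 28.5, 30.0, 31.0, 34.5]
mu, var = fmean(x), pvariance(x)

# Change of scale by a, then shift by b.
a, b = 2.0, -3.0
y = [a * xi + b for xi in x]
print(fmean(y), a * mu + b)        # E(Y) = a*mu + b
print(pvariance(y), a**2 * var)    # sigma^2(Y) = a^2 * sigma^2

# Special case: a = 1/sigma, b = -mu/sigma standardizes the values.
sigma = var ** 0.5
z = [(xi - mu) / sigma for xi in x]
print(fmean(z), pvariance(z))      # approximately 0 and 1
```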
The Central Limit Theorem
• Asserts that standardizing any random variable that itself is a sum or average of a set of independent random variables results in a new random variable that is "nearly the same as" a standard normal one.
• So what? The C.L.T. allows us to use statistical tools that require our sample observations to be drawn from normal distributions, even though the underlying data themselves may not be normally distributed!
• The only caveats are that the sample size must be "large enough" and that the observations themselves must be independent and all drawn from a distribution with common expectation and variance.
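A small simulation gives a feel for the theorem. The sketch below sums 30 independent Uniform(0, 1) draws, standardizes each sum, and checks how often the result falls within ±1.96, which would be about 95% of the time for a standard normal variable. The choice of distribution, the sample size of 30, the number of replicates, and the seed are all illustrative assumptions.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

n, reps = 30, 20_000
mu, sigma = 0.5, sqrt(1 / 12)  # mean and standard deviation of Uniform(0, 1)

def standardized_sum(n):
    """Sum n independent uniform draws, then subtract the mean and divide by the sd of the sum."""
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / (sigma * sqrt(n))

z = [standardized_sum(n) for _ in range(reps)]

# Fraction of standardized sums inside +/-1.96.
inside = sum(abs(v) <= 1.96 for v in z) / reps
print(inside)
```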
Log-normal Distribution
• X is a log-normal random variable if its natural logarithm, ln(X), is a normal random variable [NOTE: ln(X) is the same as log_e(X)]
• Original values of X give a right-skewed distribution (A), but plotting on a logarithmic scale gives a normal distribution (B).
• Many ecologically important variables are log-normally distributed.

[Figure A: histogram of rep 1994; Mean = 127.5, Std. Dev = 183.79, N = 765]
[Figure B: histogram of LOGREP94; Mean = 4.00, Std. Dev = 1.44, N = 765]

SOURCE: Quintana-Ascencio et al. 2006; Hypericum data from Archbold Biological Station
Log-normal Distribution

$$\text{mean} = e^{\mu + \sigma^2/2}$$
$$\text{variance} = \left[e^{\mu + \sigma^2/2}\right]^2 \left(e^{\sigma^2} - 1\right)$$
where μ and σ² are the mean and variance of ln(X).
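The two log-normal formulas can be compared with simulated data. The sketch below draws values with `random.lognormvariate` from the standard library, using illustrative parameters μ = 1 and σ = 0.5 for ln(X) (these are not the Hypericum values), and prints the sample mean and variance next to the formula values. Agreement is only approximate because the sample is random.

```python
import random
from math import exp

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 0.5  # illustrative mean and sd of ln(X)
x = [random.lognormvariate(mu, sigma) for _ in range(100_000)]

# Sample mean and (population-style) sample variance of the simulated X.
sample_mean = sum(x) / len(x)
sample_var = sum((xi - sample_mean) ** 2 for xi in x) / len(x)

# Formula values: mean = e^(mu + sigma^2/2), variance = mean^2 * (e^(sigma^2) - 1).
mean_formula = exp(mu + sigma**2 / 2)
var_formula = mean_formula**2 * (exp(sigma**2) - 1)

print(sample_mean, mean_formula)
print(sample_var, var_formula)
```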
Exercise
• Next, we will perform an exercise in R that will allow you to work with some of these probability distributions!
