04 Discrete and Continuous Random Variables
Dr. S. Jain
Data Types
Data
  Numerical
    Discrete
    Continuous
  Qualitative
Random Variables
A random variable is a function or rule that assigns a number to each outcome of an experiment. Basically it is just a symbol that represents the outcome of an experiment.
X = number of heads when the experiment is flipping a coin 20 times.
C = the daily change in a stock price.
R = the number of miles per gallon you get on your auto during a family vacation.
Y = the amount of medication in a blood pressure pill.
V = the speed of an auto registered on a radar detector used on I-20.
Saturday, April 07, 2012 Dr. S. Jain 7.3
Experiment                                 Random Variable      Possible Values
Children of One Gender                     # Girls in Family
Open Check in Lines                        # Open
Answer 33 Questions                        # Correct            0, 1, 2, ..., 33
Count Cars at Toll Between 11:00 & 1:00    # Cars Arriving      0, 1, 2, ...
Mutually Exclusive (No Overlap)
Collectively Exhaustive (Nothing Left Out)
$0 \le p(x) \le 1$
$\sum p(x) = 1$
Marilyn says: It may sound strange, but more families of 4 children have 3 of one gender and one of the other than any other combination. Explain this.
Construct a sample space and look at the total number of ways each event can occur out of the total number of combinations that can occur, and calculate frequencies. Are all 16 combinations equally likely? Is the sex of each child independent of the other three?
P(girl) = 1/2 and P(boy) = 1/2, so P(BBBB) = 1/2 × 1/2 × 1/2 × 1/2 = 1/16
Sample Space BBBB GBBB BGBB BBGB BBBG GGBB GBGB GBBG BGGB BGBG BBGG BGGG GBGG GGBG GGGB GGGG
If you have a family of four children, what is the probability of each split?
P(all girls or all boys) = 2/16 = 1/8
P(2 boys, 2 girls) = 6/16 = 3/8 (six different ways to have 2 boys and 2 girls)
P(3 boys, 1 girl or 3 girls, 1 boy) = 8/16 = 1/2
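The counting argument can be checked by brute force. The sketch below is only an illustration, assuming (as the slides do) that each child is a girl or a boy with probability 1/2 independently of the others; the helper name `gender_split` is made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^4 = 16 birth orders, taken as equally likely under the assumption
# P(girl) = P(boy) = 1/2 with independent births.
sample_space = ["".join(kids) for kids in product("BG", repeat=4)]

def gender_split(outcome):
    """Return the split as (larger count, smaller count), e.g. (3, 1)."""
    girls = outcome.count("G")
    boys = len(outcome) - girls
    return (max(girls, boys), min(girls, boys))

# Count how many of the 16 outcomes fall into each split.
counts = Counter(gender_split(outcome) for outcome in sample_space)
probabilities = {s: Fraction(n, len(sample_space)) for s, n in counts.items()}

for s in sorted(probabilities):
    print(s, probabilities[s])
```

Under these assumptions the 3-and-1 split covers 8 of the 16 outcomes, which is why it is more common than the 2-and-2 split (6 outcomes) or the 4-and-0 split (2 outcomes).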
Assume the random variable X represents the number of girls in a family of 4 kids. (Lower case x is a particular value of X, for example x=3 girls in the family.)
Sample Space                       Random Variable X
BBBB                               x=0
GBBB BGBB BBGB BBBG                x=1
GGBB GBGB GBBG BGGB BGBG BBGG      x=2
BGGG GBGG GGBG GGGB                x=3
GGGG                               x=4

What is the probability of exactly 3 girls in 4 kids? P(X=3) = 4/16
What is the probability of at least 3 girls in 4 kids? $P(X \ge 3) = 5/16$
Table
Number of Girls, x    Probability, P(x)
0                     1/16
1                     4/16
2                     6/16
3                     4/16
4                     1/16
Total                 16/16 = 1.00
X is random and x is fixed. We can calculate the probability that different values of X will occur and make a probability distribution.
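As a rough sketch, the distribution of X can also be computed directly rather than by listing outcomes. This assumes the same model as above (each child is a girl with probability 1/2, independently), so that P(X = x) is the number of ways to choose x girls out of 4, divided by 16. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n_children = 4

# P(X = x) = C(4, x) / 2^4 under the equal-probability, independence assumption.
pmf = {
    x: Fraction(comb(n_children, x), 2 ** n_children)
    for x in range(n_children + 1)
}

# A valid probability distribution has every p(x) in [0, 1] and sums to 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

p_exactly_3 = pmf[3]                                      # 4/16
p_at_least_3 = sum(p for x, p in pmf.items() if x >= 3)   # 4/16 + 1/16 = 5/16
print(p_exactly_3, p_at_least_3)
```

`Fraction` keeps the arithmetic exact, so 4/16 is shown in lowest terms as 1/4.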
Probability Distributions
[Graph: probability histogram of Probability, P(x), against Number of Girls, x]
Probability distributions can be written as probability histograms. Cumulative probabilities: Adding up probabilities of a range of values.
[Graph: probability histogram with P(x) = 1/6 for each of x = 1, 2, 3, 4, 5, 6]
$\sum_{\text{all } x} P(x) = 1$
[Graph: cumulative probabilities, rising to 1.0]
$P(x \le 4) = 4/6$
$P(x \le 5) = 5/6$
$P(x \le 6) = 6/6$
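Cumulative probabilities are just running totals of P(x). The following is a minimal sketch for the six-value example, assuming P(x) = 1/6 for each of x = 1, ..., 6; the names `values`, `pmf` and `cdf` are chosen for illustration.

```python
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * len(values)   # assumed: every value equally likely

# accumulate() yields the running sum, i.e. P(X <= x) for each x in order.
cdf = dict(zip(values, accumulate(pmf)))

print(cdf[4], cdf[5], cdf[6])
```

The last cumulative value is always 1, because it adds up the probabilities of all x.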
Random Variable    Possible Values
Weight             45.1, 78, ...
Hours              900, 875.9, ...
Spending           54.12, 42, ...
Properties
Area under the curve f(x) sums to 1
Can add up areas under f(x) to get the probability of a value less than a specific value
[Graph: f(x) against Value, with the area between c and d shaded as $P(c \le x \le d)$]
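To make the "probability is area" idea concrete, here is a small numerical sketch. The density used is an exponential curve, which is purely an assumption for illustration and does not come from the slides; the function names `f` and `area` are likewise made up.

```python
import math

def f(x, rate=1.0):
    """Hypothetical density: exponential with the given rate, zero for x < 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def area(func, lo, hi, steps=100_000):
    """Approximate the area under func between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

total = area(f, 0.0, 50.0)          # whole curve: very close to 1
p_less_than_2 = area(f, 0.0, 2.0)   # P(X < 2): area to the left of 2
p_between = area(f, 1.0, 2.0)       # P(1 <= X <= 2): area between c=1 and d=2

print(round(total, 4), round(p_less_than_2, 4), round(p_between, 4))
```

For a continuous random variable the probability of any single exact value is zero, so P(c ≤ X ≤ d) and P(c < X < d) give the same area.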
Consider the following table of sales, divided into intervals of 1000 units each,
interval: (0,1000], (1000,2000], (2000,3000], (3000,4000], (4000,5000], (5000,6000], (6000,7000]
We're going to divide the relative frequencies by the width of the cells (which here is 1000). This will make the graph have a total area of 1.
[Graph: histogram of f(x) = relative freq. / cell width against sales, over the intervals (0,1000] to (6000,7000]; vertical axis marked at 0.00005, 0.00010 and 0.00025]
The area of each bar is the relative frequency of the category, so the total area is 1.
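The rescaling can be sketched in a few lines. The counts below are hypothetical, invented only to show the arithmetic, because the sales frequencies from the original table are not available here; only the interval edges follow the text.

```python
# Interval edges from the text: (0,1000], (1000,2000], ..., (6000,7000].
edges = [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]

# Hypothetical counts per interval (made up for illustration only).
counts = [5, 15, 25, 25, 15, 10, 5]

total = sum(counts)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
relative_freqs = [c / total for c in counts]

# Bar height f(x) = relative frequency / cell width.
densities = [rf / w for rf, w in zip(relative_freqs, widths)]

# Each bar's area is height * width = relative frequency, so the areas sum to 1.
total_area = sum(d * w for d, w in zip(densities, widths))

for lo, hi, d in zip(edges, edges[1:], densities):
    print(f"({lo},{hi}]  f(x) = {d:.5f}")
print("total area:", round(total_area, 10))
```

Dividing by the cell width is what lets histograms with different cell widths be compared on the same vertical scale.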
If we make the intervals 500 units instead of 1000, the graph would probably look something like this:
[Graph: histogram of f(x) = p(x) against sales, with intervals of 500 units]
If we made the intervals infinitesimally small, the bars and the frequency polygon would become smooth, looking something like this:
[Graph: smooth curve of f(x) = p(x) against sales]
This is what the distribution of a continuous random variable looks like. This curve is denoted f(x) or p(x) and is called the probability density function.
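The effect of shrinking the intervals can be explored with simulated data. This is only a sketch: the "sales" figures are random draws from a normal distribution, an assumption made purely for illustration, and `density_histogram` is a hypothetical helper. For simplicity its cells are closed on the left, [a, b), rather than on the right as in the table above.

```python
import random

def density_histogram(samples, width):
    """Return {left edge: relative frequency / width} for cells of the given width."""
    counts = {}
    for s in samples:
        left = (s // width) * width   # left edge of the cell containing s
        counts[left] = counts.get(left, 0) + 1
    n = len(samples)
    return {left: c / n / width for left, c in sorted(counts.items())}

random.seed(0)
# Hypothetical sales data (assumed normal, mean 3500, standard deviation 1000).
samples = [random.gauss(3500, 1000) for _ in range(10_000)]

for width in (1000, 500, 100):
    hist = density_histogram(samples, width)
    area = sum(height * width for height in hist.values())
    # The number of bars grows as the width shrinks, but the total area stays 1.
    print(width, len(hist), round(area, 6))
```

With narrower cells the outline of the bars follows the underlying curve more closely, while the total area remains 1 at every cell width.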