4️⃣
Lecture 4 - Conditional Probability
Probability Distribution
Discrete random variable - Its probability distribution is characterised by a
Probability Mass Function (pmf).
Properties of probability mass function:
1. Non-negativity: p(x) ≥ 0 for all x in the sample space.
2. Normalization: The sum of p(x) over all possible values of x equals 1.
3. Range: 0 ≤ p(x) ≤ 1 for all x.
4. p(x) represents the probability that the random variable X takes the value x.
5. For any subset A of the sample space, P(X ∈ A) = Σ p(x) where x ∈ A.
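These properties can be checked directly on a concrete pmf. The following is a minimal Python sketch using a fair six-sided die (the die and the event A = {even outcomes} are illustrative choices, not from the lecture); `Fraction` keeps the arithmetic exact:

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 for x in {1, ..., 6}
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Properties 1 and 3: every p(x) lies in [0, 1]
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1
total = sum(pmf.values())
print(total)  # 1

# Property 5: P(X ∈ A) = Σ p(x) over x in A, here A = even outcomes
A = {2, 4, 6}
p_A = sum(pmf[x] for x in A)
print(p_A)  # 1/2
```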
Continuous random variable - Its probability distribution is characterised by a
Probability Density Function (pdf).
Properties of probability density function:
1. Non-negativity: f(x) ≥ 0 for all x in the sample space.
2. Normalization: The integral of f(x) over the entire sample space equals 1.
3. Unlike pmf, the pdf at a specific point doesn't represent probability directly.
Rather, the probability of X falling within an interval [a,b] is given by the
integral of f(x) from a to b.
4. For continuous random variables, P(X = x) = 0 for any specific value x.
5. The probability is only positive for intervals, not individual points.
The probability density function (pdf) describes the relative likelihood for a
continuous random variable to take on a given value. While the pmf gives actual
probabilities for discrete values, the pdf gives a "density" that must be integrated
over an interval to obtain probabilities.
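The "integrate the density over an interval" idea can be sketched numerically. The example below assumes an exponential pdf f(x) = λe^(−λx) with λ = 1 and a midpoint Riemann sum; both are illustrative choices, not from the lecture:

```python
import math

lam = 1.0  # rate of an illustrative Exp(1) distribution

def f(x):
    """pdf of Exp(lam): f(x) = lam * exp(-lam * x) for x >= 0."""
    return lam * math.exp(-lam * x)

def prob(a, b, n=10_000):
    """P(a <= X <= b) approximated by a midpoint Riemann sum of the pdf."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Numerical integral vs. the closed form e^{-a} - e^{-b}
p_num = prob(1.0, 2.0)
p_exact = math.exp(-1.0) - math.exp(-2.0)
print(p_num, p_exact)  # the two agree to many decimal places

# Property 4: P(X = x) = 0 for any single point (zero-width interval)
print(prob(1.5, 1.5))  # 0.0
```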
Distribution Function
The distribution function, also known as the cumulative distribution function
(CDF), is a function that gives the probability that a random variable X takes on a
value less than or equal to x.
For a discrete random variable X with probability mass function p(x), the CDF is
defined as:
F_X(x) = P(X ≤ x) = ∑_{t ≤ x} p(t)
For a continuous random variable X with probability density function f(x), the CDF
is defined as:
F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
Properties of Distribution Function:
1. Non-decreasing: If x₁ ≤ x₂, then F(x₁) ≤ F(x₂).
2. Right-continuous: lim_{h→0⁺} F(x + h) = F(x).
3. Limits: lim F(x) = 0 as x approaches -∞, and lim F(x) = 1 as x approaches +∞.
4. For any a < b: P(a < X ≤ b) = F(b) - F(a).
5. For discrete random variables, F(x) has jumps at the values that X can take.
6. For continuous random variables with pdf f(x), F'(x) = f(x) where F is
differentiable.
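Several of these properties can be verified on a concrete CDF. This sketch assumes the CDF of an Exp(1) distribution, F(x) = 1 − e^(−x) for x ≥ 0 (an illustrative choice, not from the lecture):

```python
import math

def F(x):
    """CDF of an illustrative Exp(1) distribution: 1 - e^{-x} for x >= 0, else 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

# Property 1: non-decreasing along any increasing sequence of points
xs = [-1.0, 0.0, 0.5, 1.0, 2.0, 10.0]
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))

# Property 3: F tends to 0 toward -inf and to 1 toward +inf
print(F(-100.0))  # 0.0
print(F(100.0))   # 1.0 (to machine precision)

# Property 4: P(a < X <= b) = F(b) - F(a)
a, b = 1.0, 2.0
print(F(b) - F(a))  # equals e^{-1} - e^{-2}
```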
Expected Value
The expected value (also known as expectation, mean, or first moment) of a
random variable is a weighted average of all possible values the random variable
can take, with the weights being the probabilities of those values.
For Discrete Random Variables:
For a discrete random variable X with probability mass function p(x), the expected
value is given by:
E [X ] = ∑ x ⋅ p(x)
where the sum is taken over all possible values of X.
For Continuous Random Variables:
For a continuous random variable X with probability density function f(x), the
expected value is given by:
E[X] = ∫_{−∞}^{∞} x · f(x) dx
Properties of Expected Value:
1. Linearity: E[aX + b] = aE[X] + b for constants a and b.
2. Additivity: E[X + Y] = E[X] + E[Y] for any random variables X and Y.
3. For independent random variables X and Y, E[XY] = E[X]E[Y].
4. The expected value of a constant c is the constant itself: E[c] = c.
5. If X ≥ 0 for all outcomes, then E[X] ≥ 0.
The expected value represents the "average" or "central" value of a random
variable after many repetitions of the experiment.
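The discrete formula and the linearity property can be checked exactly in Python. The fair six-sided die and the constants a = 2, b = 3 below are illustrative choices, not from the lecture:

```python
from fractions import Fraction

# pmf of a fair six-sided die (illustrative example)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] = Σ x · p(x) over all possible values of X
E_X = sum(x * p for x, p in pmf.items())
print(E_X)  # 7/2

# Property 1 (linearity): E[aX + b] = a·E[X] + b, checked directly on Y = 2X + 3
a, b = 2, 3
E_Y = sum((a * x + b) * p for x, p in pmf.items())
print(E_Y == a * E_X + b)  # True
```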
Variance of a Random Variable
Variance is a measure of the dispersion or spread of a random variable around its
expected value. It quantifies how far the values of a random variable typically
deviate from the mean.
For Discrete Random Variables:
For a discrete random variable X with probability mass function p(x), the variance
is given by:
Var(X) = E[(X − E[X])²] = ∑_x (x − E[X])² · p(x)
An alternative formula that is often easier to compute is:
Var(X) = E[X²] − (E[X])² = ∑_x x² · p(x) − (∑_x x · p(x))²
For Continuous Random Variables:
For a continuous random variable X with probability density function f(x), the
variance is given by:
Var(X) = E[(X − E[X])²] = ∫_{−∞}^{∞} (x − E[X])² · f(x) dx
Similarly, the alternative formula is:
Var(X) = E[X²] − (E[X])² = ∫_{−∞}^{∞} x² · f(x) dx − (∫_{−∞}^{∞} x · f(x) dx)²
Properties of Variance:
1. Non-negativity: Var(X) ≥ 0 for any random variable X. Var(X) = 0 if and only if
X is constant (has no randomness).
2. For a constant c: Var(c) = 0.
3. Scaling: Var(aX) = a² · Var(X) for any constant a.
4. Shift invariance: Var(X + b) = Var(X) for any constant b.
5. For independent random variables X and Y: Var(X + Y) = Var(X) + Var(Y).
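Both variance formulas, and the scaling/shift properties, can be checked exactly on a small example. The fair six-sided die and Y = 2X + 3 below are illustrative choices, not from the lecture:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die, illustrative

E_X = sum(x * p for x, p in pmf.items())         # E[X]   = 7/2
E_X2 = sum(x * x * p for x, p in pmf.items())    # E[X^2] = 91/6

# Definition: Var(X) = E[(X - E[X])^2]
var_def = sum((x - E_X) ** 2 * p for x, p in pmf.items())

# Shortcut: Var(X) = E[X^2] - (E[X])^2
var_alt = E_X2 - E_X ** 2

print(var_def, var_alt)        # both 35/12
print(var_def == var_alt)      # True

# Scaling and shift invariance together: Var(aX + b) = a^2 · Var(X)
a, b = 2, 3
E_Y = a * E_X + b
var_Y = sum((a * x + b - E_Y) ** 2 * p for x, p in pmf.items())
print(var_Y == a ** 2 * var_def)  # True
```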
Standard Deviation:
The standard deviation is the square root of the variance:
σ_X = √Var(X)
Standard deviation is often preferred over variance because it has the same unit
as the random variable itself, making it more intuitive to interpret.
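Continuing the illustrative fair-die example (not from the lecture), its variance is 35/12, so the standard deviation is simply the square root of that value:

```python
import math

# Variance of a fair six-sided die: Var(X) = 35/12 (illustrative example)
var_X = 35 / 12

# Standard deviation is the square root of the variance,
# and carries the same unit as X itself
sigma_X = math.sqrt(var_X)
print(sigma_X)  # ≈ 1.708
```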