MLT Assignment 1
1. What are random variables? Explain different types of the random variables.
Ans: Random Variables. A random variable, usually written X, is a variable whose
possible values are numerical outcomes of a random phenomenon. There are two types of
random variables, discrete and continuous.
A discrete random variable is one which may take on only a countable number of distinct values
such as 0, 1, 2, 3, 4, …. Discrete random variables are usually (but not necessarily) counts. If a
random variable can take only a finite number of distinct values, then it must be discrete.
Examples of discrete random variables include the number of children in a family, the Friday
night attendance at a cinema, the number of patients in a doctor's surgery, the number of
defective light bulbs in a box of ten.
A continuous random variable is one which takes an uncountably infinite number of possible
values, typically any value within an interval. Continuous random variables are usually
measurements. Examples include height, weight, the amount of sugar in an orange, and the time
required to run a mile.
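For illustration only (a small Python sketch that is not part of the original answer; the ranges and counts are made-up examples), a discrete random variable can be simulated as a count while a continuous one is sampled from an interval:

import random

# Discrete random variable: number of heads in 10 coin flips (countable values 0..10)
num_heads = sum(random.randint(0, 1) for _ in range(10))

# Continuous random variable: a measurement such as height in cm (any real value in a range)
height_cm = random.uniform(150.0, 200.0)

print("Discrete outcome (count of heads):", num_heads)
print("Continuous outcome (height in cm):", height_cm)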
2. Explain Probability Density function with suitable example.
Ans: Probability Density Function
The probability density function is the probability function which is defined for a continuous
random variable. It is also called the probability distribution function or probability function,
and it is denoted by f(x).
Let X be a continuous random variable with density function f(x). Then, for any interval [a, b],
P(a ≤ X ≤ b) = ∫[a, b] f(x) dx,
where f(x) must satisfy two conditions: f(x) ≥ 0 for all x, and ∫(−∞, ∞) f(x) dx = 1.
Example: Let f(x) = 4x³ for 0 ≤ x ≤ 1 and f(x) = 0 otherwise.
Here, the function 4x³ is greater than or equal to 0 on [0, 1]. Hence, the condition f(x) ≥ 0 is satisfied.
Consider ∫[0, 1] 4x³ dx = [x⁴] evaluated from 0 to 1 = 1, so the total probability is 1 and f(x) is a
valid probability density function.
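As a quick numerical check (a sketch assuming SciPy is available; not part of the original answer), the example density can be integrated to confirm the conditions above:

from scipy.integrate import quad

# Probability density function from the example: f(x) = 4x^3 on [0, 1]
f = lambda x: 4 * x**3

# Total probability over the support should be 1
total, _ = quad(f, 0.0, 1.0)
print("Integral of f over [0, 1]:", total)   # ~1.0

# P(0 <= X <= 0.5) = integral of f from 0 to 0.5 = 0.5^4 = 0.0625
prob, _ = quad(f, 0.0, 0.5)
print("P(0 <= X <= 0.5):", prob)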
There are several different ways to write the formula for Bayes' theorem. The most common
form is:
P(A | B) = P(B | A) · P(A) / P(B)
where P(A | B) is the probability of A given that B has occurred, and P(B | A) is the probability
of B given that A has occurred.
P(A) and P(B) are the probabilities of A and B occurring independently of one another (the
marginal probability).
Example
You might wish to find a person's probability of having rheumatoid arthritis if they have hay
fever. In this example, "having hay fever" is the test for rheumatoid arthritis (the event).
A would be the event "patient has rheumatoid arthritis." Data indicates 10 percent of
patients in a clinic have this type of arthritis. P(A) = 0.10
B is the test "patient has hay fever." Data indicates 5 percent of patients in a clinic have
hay fever. P(B) = 0.05
The clinic's records also show that of the patients with rheumatoid arthritis, 7 percent
have hay fever. In other words, the probability that a patient has hay fever, given they
have rheumatoid arthritis, is 7 percent. P(B | A) = 0.07
Applying Bayes' theorem: P(A | B) = P(B | A) · P(A) / P(B) = (0.07 × 0.10) / 0.05 = 0.14.
So, if a patient has hay fever, their chance of having rheumatoid arthritis is 14 percent. It's still
unlikely that a random patient with hay fever has rheumatoid arthritis.
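The 14 percent figure can be reproduced with a minimal sketch of the Bayes calculation (the variable names are illustrative):

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_arthritis = 0.10       # P(A): patient has rheumatoid arthritis
p_hay_fever = 0.05       # P(B): patient has hay fever
p_hay_given_arth = 0.07  # P(B | A): hay fever given arthritis

p_arth_given_hay = p_hay_given_arth * p_arthritis / p_hay_fever
print("P(arthritis | hay fever) =", p_arth_given_hay)  # ≈ 0.14, i.e. 14 percent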
An explanation of logistic regression can begin with an explanation of the standard logistic
function. The logistic function is a sigmoid function, which takes any real-valued input and maps
it to a value between zero and one. It is defined as
σ(x) = 1 / (1 + e^(−x)).
Now, when a logistic regression model comes across an outlier, it handles it reasonably well,
but sometimes the fitted sigmoid curve will shift to the left or right along the x-axis depending on
the position of the outliers.
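A minimal sketch of the logistic function (assuming NumPy is available; illustrative only):

import numpy as np

def sigmoid(x):
    # Logistic function: maps any real value to the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))    # 0.5
print(sigmoid(5.0))    # close to 1
print(sigmoid(-5.0))   # close to 0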
Ans: The difference between regression machine learning algorithms and classification machine
learning algorithms sometimes confuses data scientists, which can lead them to apply the wrong
methodology when solving their prediction problems.
Andreybu, who is from Germany and has more than 5 years of machine learning experience, says
that “understanding whether the machine learning task is a regression or classification problem is
key for selecting the right algorithm to use.”
Let’s start by talking about the similarities between the two techniques.
Regression and classification are categorized under the same umbrella of supervised machine
learning. Both share the same concept of utilizing known datasets (referred to as training datasets)
to make predictions.
In supervised learning, an algorithm is employed to learn the mapping function from the input
variable (x) to the output variable (y); that is, y = f(x).
The objective of such a problem is to approximate the mapping function (f) as accurately as
possible, such that whenever there is new input data (x), the corresponding output variable (y)
can be predicted.
The main difference between them is that the output variable in regression is numerical (or
continuous) while that for classification is categorical (or discrete).
In machine learning, regression algorithms attempt to estimate the mapping function (f) from the
input variables (x) to numerical or continuous output variables (y).
In this case, y is a real value, which can be an integer or a floating-point value. Therefore, the
outputs of regression prediction problems are usually quantities or sizes.
For example, when you are provided with a dataset about houses and asked to predict their prices,
that is a regression task because price is a continuous output.
Examples of the common regression algorithms include linear regression, Support Vector
Regression (SVR), and regression trees.
Some algorithms, such as logistic regression, have "regression" in their name but are not
regression algorithms; logistic regression is actually a classification algorithm.
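To make the regression/classification distinction concrete, here is a hedged sketch assuming scikit-learn and NumPy are installed; the toy house data is invented purely for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Toy feature: house size in square metres (illustrative values)
X = np.array([[50], [80], [120], [160], [200]])

# Regression: target is a continuous quantity (price in thousands)
y_price = np.array([150, 220, 310, 400, 480])
reg = LinearRegression().fit(X, y_price)
print("Predicted price:", reg.predict([[100]]))   # a real number

# Classification: target is a discrete category (0 = cheap, 1 = expensive)
y_label = np.array([0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y_label)
print("Predicted class:", clf.predict([[100]]))   # 0 or 1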
A perceptron is a linear classifier (binary) used in supervised learning. It helps to classify the
given input data.
a. All the inputs x are multiplied with their weights w. Let's call these products k.
b. Add all the multiplied values and call the result the Weighted Sum.
Fig: Adding with Summation
c. Apply an activation function (for example, a unit step) to the Weighted Sum to produce the
binary output.
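A minimal sketch of this forward pass (plain Python; the unit-step activation, weights and inputs are illustrative assumptions):

# Perceptron forward pass: weighted sum of inputs, then a step activation
def perceptron(inputs, weights, bias=0.0):
    # a. multiply each input x by its weight w
    products = [x * w for x, w in zip(inputs, weights)]
    # b. add all the multiplied values: the weighted sum
    weighted_sum = sum(products) + bias
    # c. apply the activation function (unit step) to get the binary output
    return 1 if weighted_sum > 0 else 0

print(perceptron([1.0, 0.5], [0.4, -0.2], bias=-0.1))  # 1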
Mathematically, squaring something and multiplying something by itself are the same. Because
of this we can rewrite our Variance equation as:
Var(X) = E(XX) − E(X)E(X)
This version of the Variance equation would have been much messier to illustrate, even though it
means the same thing. But now we can ask the question "What if one of the Xs were another
Random Variable?", so that we would have:
E(XY) − E(X)E(Y)
And that, simpler than any drawing could express, is the definition of Covariance, Cov(X, Y).
If Variance is a measure of how a Random Variable varies with itself, then Covariance is the
measure of how one variable varies with another.
Covariance is a great tool for describing the variance between two Random Variables. But this
new measure we have come up with is only really useful when talking about these variables in
isolation. Imagine we define 3 different Random Variables on a coin toss:
A(ω) = 2 if ω is Heads, 1 if ω is Tails
B(ω) = 20 if ω is Heads, 10 if ω is Tails
C(ω) = 200 if ω is Heads, 100 if ω is Tails
Now visualize that each of these is attached to the same Sampler, such that each receives the
same event at the same point in the process.
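A small worked check of Cov(X, Y) = E(XY) − E(X)E(Y) for the coin-toss variables above, assuming a fair coin so that each outcome has probability 0.5 (illustrative sketch):

# Fair coin: each outcome (Heads, Tails) has probability 0.5 (assumption)
outcomes = [("Heads", 0.5), ("Tails", 0.5)]

A = {"Heads": 2,  "Tails": 1}
B = {"Heads": 20, "Tails": 10}

def expectation(rv):
    return sum(p * rv[w] for w, p in outcomes)

def covariance(rv1, rv2):
    # Cov(X, Y) = E(XY) - E(X)E(Y)
    e_xy = sum(p * rv1[w] * rv2[w] for w, p in outcomes)
    return e_xy - expectation(rv1) * expectation(rv2)

print(covariance(A, A))  # 0.25, which is just Var(A)
print(covariance(A, B))  # 2.5, A and B vary together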
a. A die is rolled, find the probability that an even number is obtained.
b. Two coins are tossed, find the probability that two heads are obtained.
Note: Each coin has two possible outcomes H (heads) and T (Tails).
c. Two dice are rolled, find the probability that the sum is
i) equal to 1
ii) equal to 4
iii) less than 13
d. A card is drawn at random from a deck of cards. Find the probability of getting
the 3 of diamond.
e. A card is drawn at random from a deck of cards. Find the probability of getting a
queen.
f. A jar contains 3 red marbles, 7 green marbles and 10 white marbles. If a marble is
drawn from the jar at random, what is the probability that this marble is white?
Ans: a) The possible outcomes are 1, 2, 3, 4, 5, 6, so the number of possible outcomes is 6. The
favourable outcomes (even numbers) when a die is rolled are 2, 4, 6, so the number of favourable
outcomes is 3.
probability = number of favourable outcomes / number of possible outcomes = 3/6 = 1/2
Answer: 1/2
b) When two coins are tossed simultaneously, the possible outcomes are {HH, HT, TH, TT}.
Here H denotes head and T denotes tail.
Therefore, a total of 4 equally likely outcomes are obtained on tossing two coins simultaneously.
Consider the complementary event of obtaining at most one head, meaning either no head or
exactly one head is obtained. This event has 3 favourable outcomes: TT, HT and TH.
The only remaining outcome is HH, so the event "two heads are obtained" has 1 favourable
outcome, and its probability is 1/4.
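The same answer can be confirmed by enumerating the sample space (an illustrative Python sketch):

from itertools import product

# All outcomes of tossing two coins
outcomes = list(product("HT", repeat=2))   # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
two_heads = [o for o in outcomes if o == ("H", "H")]

print(len(two_heads), "/", len(outcomes))  # 1 / 4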
c) i) The smallest possible sum when two dice are rolled is 1 + 1 = 2, so the sum can never equal 1.
probability = 0
ii) There are six faces on each of the two dice, giving 36 possible outcomes. If the two dice are fair,
each of the 36 outcomes is equally likely. The outcomes with a sum of 4 are (1, 3), (2, 2) and (3, 1),
so the probability is 3/36 = 1/12.
iii) With two fair six-sided dice marked 1 through 6, the highest possible total is 6 + 6 = 12, which
is still short of 13. A sum of 13 or more would only be possible with non-standard dice (for
example two d8s, d10s or d12s, or dice with unusual markings), but for the usual case of two fair
dice marked 1 through 6, every one of the 36 outcomes gives a sum of at most 12. Therefore the
sum is always less than 13, and the probability is 36/36 = 1.
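The three dice answers can likewise be verified by enumeration (an illustrative sketch for two fair six-sided dice):

from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice
rolls = list(product(range(1, 7), repeat=2))

sum_equal_1  = sum(1 for a, b in rolls if a + b == 1)
sum_equal_4  = sum(1 for a, b in rolls if a + b == 4)
sum_below_13 = sum(1 for a, b in rolls if a + b < 13)

print(sum_equal_1,  "/ 36")   # 0 / 36  -> probability 0
print(sum_equal_4,  "/ 36")   # 3 / 36  -> probability 1/12
print(sum_below_13, "/ 36")   # 36 / 36 -> probability 1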
d) There is exactly one 3 of diamonds in a deck of 52 cards. Therefore, the probability of drawing
the 3 of diamonds is 1/52.
e) Given that there are 4 Queens in a deck of 52 cards, the probability of drawing a Queen on a
single draw is 4/52 = 0.07692, meaning there is a 7.692% chance of drawing a Queen.
f) There are 3 + 7 + 10 = 20 marbles in total, of which 10 are white. Thus, the probability of
drawing a white marble is 10C1/20C1 = 10/20 = 1/2.
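For completeness, the card and marble probabilities can be written as exact fractions (an illustrative sketch using Python's fractions module):

from fractions import Fraction

p_three_of_diamonds = Fraction(1, 52)    # d) one such card in a deck of 52
p_queen             = Fraction(4, 52)    # e) four queens in a deck of 52
p_white_marble      = Fraction(10, 20)   # f) 10 white marbles out of 3 + 7 + 10 = 20

print(p_three_of_diamonds, float(p_three_of_diamonds))
print(p_queen, float(p_queen))            # 1/13 ≈ 0.0769
print(p_white_marble)                     # 1/2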