Machine Learning
Machine learning is the systematic study of algorithms and systems that improve their knowledge or
performance on a task (i.e., learn a model for accomplishing it) with experience (from available data/examples).
Examples:
Given a URL, decide whether it is a sports website or not.
Given that a buyer is purchasing a book at an online store, suggest some related products for that
buyer.
Given an ultrasound image of the abdomen of a pregnant woman, predict the weight of the
baby.
Unlike humans, who learn from past experiences, a computer does not have “experiences” of its own.
A computer system instead learns from data, which represent “past experiences” of an
application domain.
Objective of machine learning: learn a target function that can be used to predict the
values of a discrete class attribute, e.g., approved or not approved, and high risk or low
risk.
The task is commonly called supervised learning, classification, or inductive learning.
Supervised Learning
The computer is presented with example inputs and their desired outputs, given by a "teacher",
and the goal is to learn a general rule that maps inputs to outputs.
Supervised learning is a machine learning technique for learning a function from training
data.
The training data consist of pairs of input objects (typically vectors), and desired outputs.
The output of the function can be a continuous value (called regression) or a
class label of the input object (called classification).
The task of the supervised learner is to predict the value of the function for any valid
input object after having seen a number of training examples (i.e. pairs of input and target
output).
To achieve this, the learner has to generalize from the presented data to unseen situations
in a "reasonable" way.
When the output is a class label, supervised learning is also called classification.
Classifier performance depends greatly on the characteristics of the data to be classified.
There is no single classifier that works best on all given problems.
Determining a suitable classifier for a given problem is, however, still more an art than a
science.
The most widely used classifiers are the neural network (multi-layer perceptron),
support vector machines, k-nearest neighbors, Gaussian mixture model,
Gaussian naive Bayes, decision tree, and RBF classifiers.
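Of the classifiers listed above, k-nearest neighbors is the simplest to write down. A minimal sketch (the toy data and the function name knn_predict are made up for this illustration):

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; squared Euclidean
    distance is used for ranking (the square root is not needed for sorting).
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters in 2-D.
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 2)))  # A
print(knn_predict(train, (9, 9)))  # B
```

Note that k-NN has no training phase in the usual sense: it simply memorizes the examples and generalizes at query time, which is the "reasonable generalization" requirement stated above in its most literal form.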
Information gain is defined as the difference between the original information requirement
(i.e., based on just the proportion of classes) and the new requirement (i.e., obtained after
partitioning on A). That is,
Gain(A) = Info(D) - Info_A(D)
where Info(D) = - sum_{i=1..m} p_i log2(p_i) is the expected information needed to classify a
tuple in D, and Info_A(D) = sum_{j=1..v} (|D_j|/|D|) x Info(D_j) is the information still needed
after partitioning D on A into the subsets D_1, ..., D_v.
In other words, Gain(A) tells us how much would be gained by branching on A. It is the
expected reduction in the information requirement caused by knowing the value of A. The
attribute A with the highest information gain, Gain(A), is chosen as the splitting attribute at
node N.
Hence, the gain in information from such a partitioning would be
Gain(age) = Info(D) - Info_age(D) = 0.940 - 0.694 = 0.246 bits.
Similarly, we can compute Gain(income) = 0.029 bits, Gain(student) = 0.151 bits, and
Gain(credit rating) = 0.048 bits. Because age has the highest information gain among the
attributes, it is selected as the splitting attribute. Node N is labeled with age, and branches are
grown for each of the attribute’s values. The tuples are then partitioned accordingly, as shown
in Figure 6.5. Notice that the tuples falling into the partition for age = middle aged all belong
to the same class. Because they all belong to class “yes,” a leaf should therefore be created at
the end of this branch and labeled with “yes.” The final decision tree returned by the
algorithm is shown in Figure 6.5.
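The entropy and information-gain computations used above can be sketched in a few lines (a toy illustration with made-up data, not the training table of Figure 6.5; the function names entropy and info_gain are chosen for this example):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Info(D): expected number of bits needed to classify a tuple in D."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Gain(A) = Info(D) - Info_A(D), where A is the attribute at index `attr`."""
    n = len(labels)
    partitions = {}
    for row, y in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(y)
    info_a = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - info_a

# Made-up toy data: attribute 0 separates the classes perfectly,
# attribute 1 carries no class information at all.
rows = [("youth", "high"), ("youth", "low"), ("senior", "low"), ("senior", "high")]
labels = ["no", "no", "yes", "yes"]
print(info_gain(rows, labels, 0))  # 1.0
print(info_gain(rows, labels, 1))  # 0.0
```

A decision-tree builder such as ID3 simply evaluates info_gain for every candidate attribute and splits on the one with the highest value, exactly as done for age above.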
Gain Ratio:
C4.5, a successor of ID3, uses an extension of information gain known as the gain ratio, which
attempts to overcome the bias of information gain toward attributes with many values.
It applies a kind of normalization to information gain using a “split information” value:
SplitInfo_A(D) = - sum_{j=1..v} (|D_j|/|D|) x log2(|D_j|/|D|)
The gain ratio is then defined as
GainRatio(A) = Gain(A) / SplitInfo_A(D)
and the attribute with the maximum gain ratio is selected as the splitting attribute.
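The split information and gain ratio can be checked numerically (a toy sketch with made-up values and function names, assuming the standard C4.5 definitions SplitInfo_A(D) = - sum_j (|D_j|/|D|) log2(|D_j|/|D|) and GainRatio(A) = Gain(A) / SplitInfo_A(D)):

```python
from collections import Counter
from math import log2

def split_info(values):
    """SplitInfo_A(D): entropy of the partition sizes produced by attribute A."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(gain, values):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D)."""
    return gain / split_info(values)

# Four tuples split evenly into two branches: SplitInfo = 1 bit.
values = ["youth", "youth", "senior", "senior"]
print(split_info(values))       # 1.0
print(gain_ratio(0.5, values))  # 0.5
```

The normalization matters most when an attribute has many distinct values (e.g. a product ID): its SplitInfo is large, so its gain ratio is penalized even if its raw information gain is high.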
Gini Index:
The Gini index is used in CART. Using the earlier notation, the Gini index measures the impurity of D, a
data partition or set of training tuples, as
Gini(D) = 1 - sum_{i=1..m} p_i^2
where p_i is the probability that a tuple in D belongs to class C_i. For a binary split of D on
attribute A into partitions D_1 and D_2,
Gini_A(D) = (|D_1|/|D|) x Gini(D_1) + (|D_2|/|D|) x Gini(D_2)
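These two quantities can be sketched directly from their definitions, Gini(D) = 1 - sum_i p_i^2 and Gini_A(D) = |D_1|/|D| Gini(D_1) + |D_2|/|D| Gini(D_2) (a toy illustration; the data and function names are made up):

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2, with p_i the fraction of class C_i in D."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(d1, d2):
    """Gini_A(D) for a binary split of D into the partitions d1 and d2."""
    n = len(d1) + len(d2)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)

labels = ["yes", "yes", "no", "no"]
print(gini(labels))                              # 0.5 (maximally impure for 2 classes)
print(gini_split(["yes", "yes"], ["no", "no"]))  # 0.0 (both partitions are pure)
```

CART evaluates gini_split for every candidate binary split and chooses the one with the lowest value, i.e. the largest reduction in impurity.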
PREDICTION
Numeric prediction is the task of predicting continuous (or ordered) values for given input.
For example, we may wish to predict the salary of college graduates with 10 years of work
experience, or the potential sales of a new product given its price. By far, the most widely
used approach for numeric prediction is regression.
Regression analysis can be used to model the relationship between one or more independent
or predictor variables and a dependent or response variable.
The response variable is what we want to predict.
Regression analysis is a good choice when all of the predictor variables are continuous
valued as well.
1) Linear Regression
Straight-line linear regression involves a response variable, y, and a single predictor
variable, x. It models y as a linear function of x:
y = w0 + w1 x
where the regression coefficients w0 and w1 are solved for by the method of least squares.
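A minimal least-squares fit for the straight-line model y = w0 + w1 x can be sketched as follows (the data and the function name fit_line are made up for this illustration; the closed-form formulas for w0 and w1 are the standard one-predictor least-squares solution):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w0 + w1*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # w1 = covariance(x, y) / variance(x); w0 shifts the line through the means.
    w1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    w0 = mean_y - w1 * mean_x
    return w0, w1

# Toy data generated from y = 1 + 2x, so the fit is exact.
w0, w1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(w0, w1)  # 1.0 2.0
```

On noisy data the same formulas return the line minimizing the sum of squared residuals rather than an exact fit.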
2) Nonlinear Regression
Polynomial regression is often of interest when there is just one predictor variable. It can
be modeled by adding polynomial terms to the basic linear model. By applying transformations
to the variables, we can convert the nonlinear model into a linear one that can then be solved by
the method of least squares.
The nonlinear (polynomial) regression model is as follows:
y = w0 + w1 x + w2 x^2 + w3 x^3
To convert this equation to linear form, we define new variables:
x1 = x, x2 = x^2, x3 = x^3
which yields the linear model y = w0 + w1 x1 + w2 x2 + w3 x3, solvable by least squares.
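The transformation above can be sketched end to end: expand each x into the new variables (1, x, x^2, x^3) and solve the resulting linear least-squares problem via the normal equations (a toy illustration with made-up data and function names; the Gaussian-elimination solver is a generic textbook routine, not something from the source):

```python
def poly_features(x, degree=3):
    """The variable transformation: x -> (1, x1=x, x2=x^2, x3=x^3)."""
    return [x ** d for d in range(degree + 1)]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_poly(xs, ys, degree=3):
    """Least squares on the linearized model: solve (X^T X) w = X^T y."""
    X = [poly_features(x, degree) for x in xs]
    k = degree + 1
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * ys[i] for i in range(len(xs))) for a in range(k)]
    return solve(XtX, Xty)

# Toy data generated from y = 1 + x^3, so the recovered w is (1, 0, 0, 1).
w = fit_poly([-2, -1, 0, 1, 2], [-7, 0, 1, 2, 9])
```

The key point is that fit_poly never fits anything nonlinear: after the change of variables the problem is an ordinary linear least-squares problem in (x1, x2, x3).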
Unsupervised learning
In unsupervised learning, no labels are given to the learning algorithm, leaving it on its own
to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden
patterns in data) or a means towards an end.
Example
Suppose you have a basket filled with several different types of fruit, and your task is to
arrange them into groups.
This time you know nothing about the fruits; suppose this is the first time you have seen
them, so you have no prior clue about them.
So, how will you arrange them? What will you do first?
You will take a fruit and arrange the fruits by considering some physical characteristic of
each fruit.
Suppose you consider color first.
Then you will group them using color as the base condition.
Then the groups will be something like this.
RED COLOR GROUP: apples & cherry fruits.
GREEN COLOR GROUP: bananas & grapes.
So now you will take another physical character such as size.
RED COLOR AND BIG SIZE: apple.
RED COLOR AND SMALL SIZE: cherry fruits.
GREEN COLOR AND BIG SIZE: bananas.
GREEN COLOR AND SMALL SIZE: grapes.
Job done: the fruits are grouped.
Here you did not learn anything beforehand; there was no training data and no response variable.
This type of learning is known as unsupervised learning.
Clustering comes under unsupervised learning.
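The grouping procedure above, done automatically on numeric features, is clustering. A minimal sketch of the classic k-means algorithm on 1-D points (a toy illustration; the data and the function name kmeans_1d are made up):

```python
def kmeans_1d(points, k=2, iters=20):
    """Minimal 1-D k-means: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its cluster; repeat."""
    pts = sorted(points)
    # Initialize centroids with k roughly evenly spaced sample points.
    centroids = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[j].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two obvious groups, discovered without any labels.
centroids, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], k=2)
print(sorted(centroids))  # [2.0, 11.0]
```

As in the fruit example, no labels are used: the structure (two groups) is discovered purely from the similarity of the inputs to each other.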
Clustering