Machine Learning (AI)

Short notes of machine learning

Uploaded by

Amritesh Kumar
Copyright: © All Rights Reserved
Machine Learning

Module 1

Introduction: Basic definitions, Linear Algebra, Statistical learning theory,
types of learning, hypothesis space and Inductive bias, evaluation and cross
validation, Optimization.

Module 2

Statistical Decision Theory, Bayesian Learning (ML, MAP, Bayes estimates,
Conjugate priors), Linear Regression, Ridge Regression, Lasso, Principal
Component Analysis, Partial Least Squares.

Module 3

Linear Classification, Logistic Regression, Linear Discriminant Analysis, Quadratic
Discriminant Analysis, Perceptron, Support Vector Machines + Kernels, Artificial
Neural Networks + Back Propagation, Decision Trees, Bayes Optimal Classifier,
Naive Bayes.

Module 4

Hypothesis testing, Ensemble Methods, Bagging, AdaBoost, Gradient Boosting,
Clustering, K-means, K-medoids, Density-based, Hierarchical, Spectral.

Module 5

Expectation Maximization, GMMs, Learning theory, Intro to Reinforcement
Learning, Bayesian Networks.

Machine learning
Machine learning (ML): Machine learning is a subset of AI, which enables the
machine to automatically learn from data, improve performance from past
experiences, and make predictions.

Machine learning is a method by which a computer program can “automatically
learn and improve from experience without being explicitly programmed.”

A more technical definition was given by Tom M. Mitchell (1997): “A computer
program is said to learn from experience E with respect to some class of tasks
T and performance measure P, if its performance at tasks in T, as measured by
P, improves with experience E.”

Example:

A handwriting recognition learning problem.

Task T: recognizing and classifying handwritten words within images.
Performance measure P: percentage of words correctly classified (accuracy).
Training experience E: a data-set of handwritten words with given
classifications.

Based on the methods and way of learning, ML is divided into four main types.

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning

Shortcomings of Machine Learning

1. Machine learning is not based on knowledge. Machines are driven by data,
not human knowledge. As a result, “intelligence” is dictated by the volume
and quality of the data available for training. Machine learning has not
attained general human-level intelligence.
2. Machine learning models are difficult to train. Time, resources and massive
data sets are needed to build models, and the process often involves
manually pre-tagging and categorizing data sets.
3. Machine learning is prone to data issues. Data quality, data labelling and
model building are some of the data-related obstacles to ML success.
4. Machine learning is often biased. Many machine learning systems operate
as a black box, meaning you have little visibility into how the machine
learns and makes decisions. Thus, if you identify an instance of bias, it
can be hard to identify what caused it. A common recourse is to retrain
the algorithm with additional data, but that is no guarantee that the
issue will be resolved.

Applications of Machine learning


1. Image Recognition: It is used to identify objects, persons, places, images etc.
2. Speech Recognition: Google assistant, Siri, Cortana, Alexa etc.
3. Virtual Personal Assistant: Google assistant, Alexa, Siri, Cortana etc.
4. Traffic prediction: Google maps
5. Product recommendations
6. Self-driving cars
7. Email Spam and Malware Filtering
8. Online Fraud Detection
9. Medical Diagnosis
10. Stock Market trading

Types of Machine learning

1. Supervised Machine Learning

Supervised learning: Supervised learning is the type of machine learning in
which machines are trained using well "labelled" training data, and on the
basis of that data, machines predict the output.

o Labelled Data: Labelled data means the input data is already tagged
with the correct output.
o In supervised learning, the training data provided to the machines works as
the supervisor that teaches the machines to predict the output correctly.

Supervised learning is classified into two categories of algorithms:

1. Regression

2. Classification

1. Regression: Regression algorithms are used when the output variable is
continuous and there is a relationship between the input variable and the
output variable. They are used for the prediction of continuous variables, such
as weather forecasting, market trends, etc. Below are some regression
algorithms.

o Linear Regression
o Polynomial Regression
o Lasso Regression
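
As an illustration of regression, the following is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The data set (hours studied against exam score) and the function names `fit_line` and `predict` are made up for this example and are not part of the notes.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a continuous output for a new input."""
    return slope * x + intercept


# Hypothetical training data: hours studied -> exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)
print(slope, intercept, predict(6.0, slope, intercept))
```

The labelled pairs play the role of the supervisor: the line is chosen so that its predictions match the known outputs as closely as possible, and it is then used on an unseen input.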

2. Classification: Classification algorithms are used when the output variable is
categorical, which means it takes one of a fixed set of classes, such as Yes-No,
Male-Female, True-False, etc. Below are some classification algorithms.

o Logistic Regression
o Support vector Machines
o Naive Bayes
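
As an illustration of classification, here is a minimal sketch of logistic regression with one input feature, trained by gradient descent in plain Python. The toy data (a centred feature with labels 0 and 1), the learning rate and the function names are assumptions chosen for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, labels, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def classify(x, w, b):
    """Return class 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical labelled data: negative feature values are class 0, positive are class 1.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
labels = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, labels)
print(classify(-2.5, w, b), classify(2.5, w, b))
```

Unlike regression, the output here is a class label; the model first estimates a probability and then thresholds it.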

Advantages of Supervised learning

o With the help of supervised learning, the model can predict the output on
the basis of prior experiences.
o In supervised learning, we can have an exact idea about the classes of
objects.
o Supervised learning models help us to solve various real-world problems
such as fraud detection, spam filtering, risk assessment, image
classification, etc.
o Helps to optimize performance criteria with the help of experience.

Disadvantages of supervised learning

o Supervised learning models struggle with tasks for which labelled examples
are scarce or the correct output is hard to define.
o Supervised learning may not predict the correct output if the test data is
distributed differently from the training dataset.
o In supervised learning, we need enough knowledge about the classes of
objects.
o Classifying big data can be challenging.
o Training for supervised learning can need a lot of computation time.

Applications of Supervised Learning

o Image Segmentation: Supervised learning algorithms are used in image
segmentation. In this process, image classification is performed on different
image data with pre-defined labels.
o Medical Diagnosis: Supervised algorithms are also used in the medical field
for diagnosis purposes. It is done by using medical images and past data
labelled with disease conditions. With such a process, the machine
can identify a disease in new patients.
o Fraud Detection: Supervised learning classification algorithms are used for
identifying fraudulent transactions, fraudulent customers, etc. It is done by
using historic data to identify the patterns that can lead to possible fraud.
o Spam detection: In spam detection & filtering, classification algorithms are
used. These algorithms classify an email as spam or not spam. The spam
emails are sent to the spam folder.
o Speech Recognition: Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data, and various
identifications can be done using the same, such as voice-activated
passwords, voice commands, etc.
o Biometrics: Supervised learning is used to identify people from stored
biological traits such as fingerprints, iris texture, earlobe shape and so on.

2. Unsupervised Machine Learning

Unsupervised learning: Unsupervised learning is a type of machine learning in
which models are trained using an unlabeled dataset and are allowed to act on
that data without any supervision.

Unsupervised learning is classified into two categories of algorithms:

1. Clustering
2. Association

1. Clustering: Clustering is a method of grouping objects into clusters such
that objects with the most similarities remain in one group and have few or no
similarities with the objects of another group.

o K-means clustering
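o Principal Component Analysis (strictly a dimensionality-reduction method,
often applied before clustering rather than a clustering algorithm itself)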
o DBSCAN Algorithm
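
To make the grouping idea concrete, here is a minimal sketch of K-means clustering in plain Python. The points, the choice of two starting centroids and the function name `kmeans` are assumptions for this example; practical implementations usually pick starting centroids at random and stop when assignments no longer change.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from p to every centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the points.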

2. Association: Association rule learning is an unsupervised learning method which
is used for finding relationships between variables in a large database. A
typical example of association rules is Market Basket Analysis, such as: people
who buy item X also tend to purchase item Y.

o Apriori Algorithm
o FP-Growth algorithm
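
The two quantities that algorithms such as Apriori rely on are support and confidence. The sketch below computes them directly for one rule in plain Python; it is not an implementation of Apriori or FP-Growth, and the shopping baskets are made up for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent (rule: antecedent -> consequent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical market baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

print(support({"bread", "butter"}, baskets))
print(confidence({"bread"}, {"butter"}, baskets))
```

Here three of the five baskets contain both bread and butter, and three of the four bread baskets also contain butter, so the rule "bread -> butter" has support of about 0.6 and confidence of about 0.75.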

Advantages of Unsupervised Learning

o Unsupervised learning can be applied to tasks where supervised learning
cannot, because it does not need labeled input data.
o It is easy to get unlabeled data in comparison to labeled data.
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to how a human learns to think from their
own experiences, which makes it closer to real AI.

Disadvantages of Unsupervised Learning


o Unsupervised learning is intrinsically more difficult than supervised learning
as it does not have corresponding output.
o It is less accurate as input data is not labeled, and algorithms do not know
the exact output in advance.

Difference b/w Supervised and Unsupervised Learning

                          SUPERVISED LEARNING                    UNSUPERVISED LEARNING

Input Data                Uses known and labeled data as input   Uses unknown and unlabeled data as input

Computational Complexity  Very complex                           Less computational complexity

Real Time                 Uses off-line analysis                 Uses real-time analysis of data

Number of Classes         Number of classes is known             Number of classes is not known

Accuracy of Results       Accurate and reliable results          Moderately accurate and reliable results

3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that represents
the intermediate ground between Supervised and Unsupervised learning
algorithms. It uses the combination of labeled and unlabeled datasets during the
training period.

Assumptions followed by Semi-Supervised Learning


To work with the unlabeled dataset, there must be a relationship between the
objects. To understand this, semi-supervised learning uses any of the following
assumptions:

o Continuity assumption: As per the continuity assumption, objects near
each other tend to share the same group or label.
o Cluster assumption: In this assumption, data are divided into different
discrete clusters, and points in the same cluster share the output
label.
o Manifold assumption: The data lie on a manifold of fewer dimensions than
the input space, which makes it possible to use distances and densities
defined on that manifold.
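
The continuity assumption can be sketched in a few lines of plain Python: starting from a handful of labelled points, each unlabeled point takes the label of its nearest already-labelled neighbour. This is a simplified illustration on made-up one-dimensional data, not S3VM or any specific published algorithm, and `propagate_labels` is a hypothetical helper name.

```python
def propagate_labels(labeled, unlabeled):
    """labeled: dict mapping a point (float) to its label.
    unlabeled: list of points without labels.
    Repeatedly label the unlabeled point closest to any labelled point."""
    labeled = dict(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the (distance, unlabeled point, labelled neighbour) with the smallest distance.
        _, point, neighbour = min(
            ((abs(u - l), u, l) for u in remaining for l in labeled),
            key=lambda t: t[0],
        )
        labeled[point] = labeled[neighbour]
        remaining.remove(point)
    return labeled


# Hypothetical data: only two points carry labels.
known = {1.0: "A", 10.0: "B"}
unknown = [2.0, 3.0, 4.0, 9.0, 8.0, 7.0]

labels = propagate_labels(known, unknown)
print(labels)
```

With only two labelled examples, the labels spread outwards through neighbouring points, so the points near 1.0 end up in group A and the points near 10.0 in group B.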

Semi-Supervised Learning Algorithms


o Semi-supervised support vector machines (S3VM)
o Transductive support vector machines (TSVM)
o Graph-based techniques

Applications of Semi-supervised Learning


o Machine translation: Teaching algorithms to translate language based on
less than a full dictionary of words.
o Fraud detection: Identifying cases of fraud when you only have a few
positive examples.
o Labeling data: Algorithms trained on small data sets can learn to apply data
labels to larger sets automatically.
o Speech Analysis
o Web content classification
o Text document classifier
o Protein(DNA) sequence classification

4. Reinforcement Learning

Reinforcement Learning is a feedback-based machine learning technique in which
an agent learns to behave in an environment by performing actions and
seeing the results of those actions. The agent learns automatically from this
feedback and improves its performance. The agent learns by trial and error,
based on experience. Finding the shortest route between two points on a
map can be framed as a reinforcement learning use case.

Main points in Reinforcement learning:

o Input: The input should be an initial state from which the model will start.
o Output: There are many possible outputs, as there are a variety of
solutions to a particular problem.
o Training: Training is based upon the input. The model will return a state, and
the user will decide to reward or punish the model based on its output.
o The model continues to learn.
o The best solution is decided based on the maximum reward.

There are mainly two types of reinforcement learning, which are:

1. Positive Reinforcement
2. Negative Reinforcement

1. Positive Reinforcement: Positive reinforcement means adding
something to increase the tendency that the expected behavior will occur
again. It impacts positively on the behavior of the agent and increases the
strength of the behavior.
This type of reinforcement can sustain the changes for a long time, but too
much positive reinforcement may lead to an overload of states that can
diminish the results.

2. Negative Reinforcement: Negative reinforcement is the opposite of
positive reinforcement, as it increases the tendency that the specific
behavior will occur again by removing or avoiding a negative condition.

It can be more effective than positive reinforcement depending on the
situation and behavior, but it provides reinforcement only to meet the minimum
behavior.

Reinforcement Learning Algorithms


o Q-Learning
o State-Action-Reward-State-Action (SARSA)
o Deep Q-Network (DQN)
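
As an illustration of the first algorithm, here is a minimal sketch of tabular Q-learning in plain Python. The environment (a corridor of five states with a reward of 1 at the right-hand end), the hyperparameters and all names are assumptions made for this example. The agent explores with uniformly random actions, which is allowed because Q-learning is off-policy.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # trial and error: pick a random action
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move Q(s, a) towards reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
# Greedy policy: in each non-goal state, take the action with the highest Q-value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the greedy policy moves right in every state, which is the route that reaches the reward in the fewest steps.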

Advantages of Reinforcement Learning


o It helps you to find which situation needs an action
o Helps you to discover which action yields the highest reward over the
longer period.
o Reinforcement Learning also provides the learning agent with a reward
function.
o It also allows it to figure out the best method for obtaining large rewards.

Disadvantages of Reinforcement Learning


o Feature and reward design can be very involved.
o Parameters may affect the speed of learning.
o Realistic environments can have partial observability.
o Too much reinforcement may lead to an overload of states, which can
diminish the results.
o Realistic environments can be non-stationary.

Reinforcement Learning Applications


o Robotics: RL is used in robot navigation, walking, juggling, etc. Robots can
learn to perform tasks in the physical world using this technique.

o Control: RL can be used for adaptive control, such as factory processes,
admission control in telecommunication, and helicopter piloting.

o Game Playing: RL has been used to teach bots to play a number of
games such as tic-tac-toe, chess, etc.

o Chemistry: RL can be used for optimizing chemical reactions.

o Business: RL is now used for business strategy planning.

o Resource management: Given finite resources and a defined goal, RL can
help enterprises plan out how to allocate resources.
o Manufacturing: In various automobile manufacturing companies, robots
use deep reinforcement learning to pick goods and put them in
containers.
o Finance Sector: RL is currently used in the finance sector for evaluating
trading strategies.

Artificial Neural Network


Biological neural networks: The human brain is composed of about 86 billion nerve
cells called neurons. Each is connected to thousands of other cells
by axons. Inputs from sensory organs are accepted by dendrites. These inputs
create electric impulses, which quickly travel through the neural network. A
neuron can then either send the message on to other neurons or not
send it forward.

Artificial Neural Network: ANNs are composed of multiple nodes, which imitate
the biological neurons of the human brain. The neurons are connected by links in
various layers of the network and they interact with each other. The nodes can
take input data and perform simple operations on the data. The result of these
operations is passed to other neurons. The output at each node is called
its activation or node value. Each link is associated with a weight. ANNs are
capable of learning, which takes place by altering weight values.

The typical Artificial Neural Network looks something like the given figure.

o There are around 86 billion neurons in the human brain. Each neuron has
somewhere in the range of 1,000 to 100,000 connection points. In the
human brain, data is stored in a distributed manner, and we
can extract more than one piece of this data from our memory in parallel
when necessary. We can say that the human brain is made up of
incredibly powerful parallel processors.

o Dendrites in a Biological Neural Network represent inputs in Artificial
Neural Networks, the cell nucleus represents nodes, synapses represent
weights, and the axon represents the output.

Relationship between Biological neural network and artificial neural network:

Biological Neural Network    Artificial Neural Network

Dendrites                    Inputs

Cell nucleus                 Nodes

Synapse                      Weights

Axon                         Output

The architecture of an artificial neural network
An artificial neural network consists of a large number of artificial neurons, which
are termed units, arranged in a sequence of layers. An Artificial Neural Network
primarily consists of three layers.

Input Layer: As the name suggests, it accepts inputs in several different formats
provided by the programmer.

Hidden Layer: The hidden layer sits in between the input and output layers. It
performs all the calculations to find hidden features and patterns.

Output Layer: The input goes through a series of transformations using the
hidden layer, which finally results in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes their weighted sum,
and adds a bias. This computation is represented in the form of a
transfer function.

The weighted total is then passed as an input to an activation function to
produce the output. Activation functions decide whether a node should fire or
not. Only the nodes that fire pass their signal on towards the output layer.
There are distinct activation functions available, and the choice depends on the
sort of task we are performing.
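
The computation performed by a single node can be sketched in plain Python as follows. The input values, weights and bias are made up for this example, and the two activation functions shown (sigmoid and a simple step function) are only two of the many in use.

```python
import math


def sigmoid(z):
    """Smooth activation that squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def step_activation(z):
    """Hard threshold: the node fires (1.0) if the weighted total is non-negative."""
    return 1.0 if z >= 0 else 0.0


def neuron_output(inputs, weights, bias, activation=sigmoid):
    # Transfer function: weighted sum of the inputs plus the bias.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The activation function turns the weighted total into the node's output.
    return activation(weighted_sum)


# Hypothetical values: weighted sum = 1.0*0.8 + 0.5*(-0.2) - 0.1 = 0.6
inputs = [1.0, 0.5]
weights = [0.8, -0.2]
bias = -0.1

smooth = neuron_output(inputs, weights, bias)
fired = neuron_output(inputs, weights, bias, activation=step_activation)
silent = neuron_output(inputs, weights, -1.0, activation=step_activation)
print(smooth, fired, silent)
```

With the step activation the node either fires or stays silent, whereas the sigmoid gives a graded output between 0 and 1.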

Types of Artificial Neural Networks


There are two types of Artificial Neural Network:

1. Feed-Forward ANN
2. Feedback ANN

1. Feed-Forward ANN: In this ANN, the information flow is unidirectional. A unit
sends information to another unit from which it does not receive any information.
There are no feedback loops. They are used in pattern generation, recognition or
classification. They have fixed inputs and outputs.

2. Feedback ANN: Here, feedback loops are allowed. In this type of ANN, the
output returns into the network to arrive at the best-evolved results internally.
Feedback networks feed information back into themselves and are well suited to
solving optimization problems. Internal system error correction makes use of
feedback ANNs. They are also used in content-addressable memories.
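
A Hopfield network is one well-known feedback ANN that behaves as a content-addressable memory. The sketch below, in plain Python, stores a single pattern with a Hebbian rule and recovers it from a corrupted copy. The notes do not name Hopfield networks, so this is offered only as one possible illustration; the pattern and function names are made up.

```python
def train_hopfield(patterns):
    """Hebbian learning: weight[i][j] grows when units i and j agree in a stored pattern."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / len(patterns)
    return weights


def recall(weights, state, sweeps=5):
    """Feed the state back into the network until it settles."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            total = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if total >= 0 else -1
    return state


# Hypothetical stored pattern of +1/-1 values, and a copy with the third value flipped.
stored = [1, -1, 1, -1, 1, -1]
noisy = [1, -1, -1, -1, 1, -1]

weights = train_hopfield([stored])
print(recall(weights, noisy))
```

The output of each unit is fed back as input to the others, and the network settles on the stored pattern even though it was shown only a damaged version of it.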

Artificial Neural Network Algorithms
o Bayesian Networks
o Genetic Algorithm
o Back Propagation Algorithm
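
The back propagation algorithm can be sketched for a tiny network with two inputs, two hidden nodes and one output node, all using the sigmoid activation. The weights, the training example and the learning rate below are made-up values, and the loss is assumed to be the squared error 0.5 * (output - target)^2.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def loss(x, y, params):
    return 0.5 * (forward(x, params)[1] - y) ** 2


def backprop(x, y, params):
    """Gradients of the loss for one example, computed with the chain rule."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden, output = forward(x, params)
    # Error term at the output node: d(loss)/d(weighted sum of the output node).
    delta_out = (output - y) * output * (1.0 - output)
    # Error terms at the hidden nodes, propagated back through the output weights.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_w_hidden = [[d * xi for xi in x] for d in delta_hidden]
    grad_w_out = [delta_out * h for h in hidden]
    return grad_w_hidden, delta_hidden, grad_w_out, delta_out


def gradient_step(params, grads, lr):
    """Learning by altering the weights: move each one against its gradient."""
    w_hidden, b_hidden, w_out, b_out = params
    g_w_hidden, g_b_hidden, g_w_out, g_b_out = grads
    return (
        [[w - lr * g for w, g in zip(ws, gs)] for ws, gs in zip(w_hidden, g_w_hidden)],
        [b - lr * g for b, g in zip(b_hidden, g_b_hidden)],
        [w - lr * g for w, g in zip(w_out, g_w_out)],
        b_out - lr * g_b_out,
    )


# Hypothetical starting weights and a single training example.
params = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)
x, y = [1.0, 0.0], 1.0

grads = backprop(x, y, params)
loss_before = loss(x, y, params)
new_params = gradient_step(params, grads, lr=0.1)
loss_after = loss(x, y, new_params)
print(loss_before, loss_after)
```

One update already lowers the error on this example; training repeats the same forward pass, backward pass and weight update over many examples.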

Advantages of Artificial Neural Network


o Parallel processing capability: Artificial neural networks have the numerical
strength to perform more than one task simultaneously.
o Storing data on the entire network: Unlike traditional programming, where
data is stored in a database, an ANN stores information across the whole
network. The disappearance of a couple of pieces of data in one place
doesn't prevent the network from working.

o Capability to work with incomplete knowledge: After training, an ANN
may produce output even with incomplete data. The loss of
performance here depends on the significance of the missing data.
o Having fault tolerance: Corruption of one or more cells of an ANN does not
prevent it from generating output, and this feature makes the network
fault tolerant.

Disadvantages of Artificial Neural Network


o Assurance of proper network structure: There is no particular guideline for
determining the structure of artificial neural networks. The appropriate
network structure is arrived at through experience and trial and error.
o Hardware dependence: Artificial neural networks need processors with
parallel processing power, as per their structure. Their realization
therefore depends on suitable hardware.

o Unrecognized behavior of the network: It is the most significant issue of
ANNs. When an ANN produces a solution, it does not provide insight
concerning why and how. This decreases trust in the network.

o The duration of the network is unknown: Training is considered complete
when the error is reduced to a specific value, and this value does not
guarantee optimum results.

o Difficulty of showing the issue to the network: ANNs work with
numerical data. Problems must be converted into numerical values before
being introduced to an ANN. The representation chosen here
will directly impact the performance of the network. It relies on the user's
abilities.

Applications of Neural Networks


They can perform tasks that are easy for a human but difficult for a machine −
o Aerospace − Aircraft autopilots, aircraft fault detection.
o Automotive − Automobile guidance systems.
o Military − Weapon orientation and steering, target tracking, object
discrimination, facial recognition, signal/image identification.
o Speech − Speech recognition, speech classification, text to speech
conversion.
o Transportation − Truck Brake system diagnosis, vehicle scheduling, routing
systems.
o Software − Pattern Recognition in facial recognition, optical character
recognition, etc
o Signal Processing − Neural networks can be trained to process an audio
signal and filter it appropriately in the hearing aids.
o Industrial − Manufacturing process control, product design and analysis,
quality inspection systems, welding quality analysis, paper quality
prediction, chemical product design analysis, dynamic modeling of
chemical process systems, machine maintenance analysis, project bidding,
planning, and management.
o Medical − Cancer cell analysis, EEG and ECG analysis, prosthetic design,
transplant time optimizer.
o Electronics − Code sequence prediction, IC chip layout, chip failure
analysis, machine vision, voice synthesis.
o Financial − Real estate appraisal, loan advisor, corporate bond rating,
portfolio trading program, corporate financial analysis, currency value
prediction, document readers, credit application evaluators.
o Telecommunications − Image and data compression, automated
information services, real-time spoken language translation.
o Control − ANNs are often used to make steering decisions of physical
vehicles.
