Lecture 1 - Intro

Machine Learning

Dr. Dinesh Kumar Vishwakarma


Professor,
Department of Information Technology,
Delhi Technological University, Delhi-110042
dinesh@dtu.ac.in
http://www.dtu.ac.in/Web/Departments/InformationTechnology/faculty/dkvishwakarma.php
Course Detail
• Faculty: Dinesh K. Vishwakarma, Ph.D. in Computer Vision
  Email: dinesh@dtu.ac.in
  Webpage: http://www.dtu.ac.in/Web/Departments/InformationTechnology/faculty/dkvishwakarma.php
• Course Code:
  Credit: L T P: 3 0 2 : 4C

2
Evaluation Schedule

3
Evaluation Criteria
• CWS (15%) and PRS (25%)
 Assignments
 Tutorials
 Quizzes / random questions

• MTE (20%)
 One piece of innovative work in the form of a small project, startup idea, collaborative project, automation, simulation, case study, or a solution to a real-time social, economic or technical problem, etc. (groups of at most 2 students): graphical abstract.

• ETE (40%)
 (15x2 = 30%) 3 class tests after every 4 weeks; the best 2 will be considered for evaluation.
 (10x1 = 10%) Minor tests in the form of quizzes, short-answer questions, MCQs, open-ended/essay questions, etc.; the better of the two will be considered for evaluation.

4
Course Content
Unit 1 (8 contact hours). Introduction to Machine Learning: overview of different tasks: classification, regression, clustering, control; concept learning; information theory and decision trees; data representation; diversity of data; data table; forms of learning; basic linear algebra in machine learning techniques.
Unit 2 (12 contact hours). Supervised Learning: decision trees, nearest neighbours, linear classifiers and kernels, neural networks, linear regression, logistic regression, Support Vector Machines.
Unit 3 (10 contact hours). Unsupervised Learning: clustering, Expectation Maximization, K-Means clustering, dimensionality reduction, feature selection, PCA, factor analysis, manifold learning.
Unit 4 (8 contact hours). Reinforcement Learning: elements of reinforcement learning, basics of dynamic programming; finding optimal policies, value iteration, policy iteration, TD learning, Q-learning, actor-critic.
Unit 5 (4 contact hours). Recent Applications & Research Topics: applications in the fields of web and data mining, text recognition, speech recognition, finance.
Total contact hours: 42
5
Books
Textbooks
1. Introduction to Machine Learning, E. Alpaydin, MIT Press, 2004.
2. Machine Learning, Tom Mitchell, McGraw Hill, 1997.
3. Elements of Machine Learning, Pat Langley, Morgan Kaufmann Publishers.
4. Applied Machine Learning, M. Gopal, McGraw Hill, 2018.

Reference Books
1. The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, Springer Series in Statistics, Springer, 2001.
2. Machine Learning: A Probabilistic Approach, David Barber.
3. Pattern Recognition and Machine Learning, Christopher Bishop, Springer-Verlag, 2006.
4. An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics), 1st ed. 2013, corr. 7th printing 2017 edition.

6
Resources: Journals
1 IEEE Transactions on Pattern Analysis and Machine
Intelligence
2 IEEE Transactions on Neural Networks and Learning
Systems
3 Pattern Recognition
4 International Journal of Computer Vision
5 IEEE Transactions on Fuzzy Systems
6 Journal of Machine Learning Research
7 Expert Systems with Applications
8 Fuzzy Sets and Systems
9 Information Sciences
10 Artificial Intelligence
11 Machine Learning
12 Pattern Recognition Letters
7
Resources: Conferences

http://www.guide2research.com/topconf/machine-learning
8
A Few Quotes
• “A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Chairman, Microsoft)
• “Machine learning is the hot new thing” (John Hennessy, President, Stanford)
• “Web rankings today are mostly a matter of machine learning” (Prabhakar Raghavan, Dir. Research, Yahoo)
• “Machine learning is going to result in a real revolution”
(Greg Papadopoulos, CTO, Sun)

• Machine learning (ML) is the study of computer algorithms that improve automatically through experience.

9
What is Machine Learning?
• A branch of artificial intelligence, concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data.
• Machine Learning is the science (and art) of programming computers so they can learn from data.
• As intelligence requires knowledge, it is necessary for computers to acquire knowledge.
• Getting computers to program themselves.
• Writing software is the bottleneck.
• Let the data do the work instead!

“Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.” (Arthur Samuel, 1959)

The term machine learning was coined in 1959 by Arthur Samuel.
10
What is Machine Learning?
• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. (Tom Mitchell, Machine Learning, 1997)

Diagram: experience E → process task T → measure performance P → improve.
11
What is Machine Learning?

E T P
Experience Task Performance

Having Labelled Data:


Measuring
No. of students (male, Processing
Performance
female), etc.

Classification, Accuracy, Precession,


Supervised Learning
Regression Recall

12
What is Machine Learning?
T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing hand-written words


P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors


P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.

T: Categorize email messages as spam or legitimate.


P: Percentage of email messages correctly classified.
E: Database of emails, some with human-given labels

13
Example 1: Class of ML Analysis

• Typical customer: admin/instructor.

• Database:
  Currently registered students
  Basic parameters (height, weight)
  Basic classification.

• Goal: predict/decide whether the student is FIT.
14
Example 2: Credit Risk Analysis

• Typical customer: bank.

• Database:
  Current clients' data, including:
  basic profile (income, house ownership, delinquent accounts, etc.)
  Basic classification.

• Goal: predict/decide whether to grant credit.
15
Example 2: Credit Risk Analysis

• Rules learned from data:

  IF Other-Delinquent-Accounts > 2 AND Number-Delinquent-Billing-Cycles > 1
  THEN DENY CREDIT

  IF Other-Delinquent-Accounts = 0 AND Income > $30k
  THEN GRANT CREDIT
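Rules like these can be induced automatically from data. Below is a minimal, hedged sketch (not the slide's actual system): a scikit-learn decision tree is fit on a handful of made-up applicant records and prints similar threshold rules. The feature names and records are hypothetical.

```python
# Sketch: inducing credit rules from (made-up) applicant records
# with a decision tree; scikit-learn is assumed to be available.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: [other_delinquent_accounts, delinquent_billing_cycles, income_k]
X = [[3, 2, 25], [4, 3, 40], [0, 0, 45], [0, 0, 32], [2, 2, 28], [0, 1, 35]]
y = ["deny", "deny", "grant", "grant", "deny", "grant"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=[
    "other_delinquent_accounts", "delinquent_billing_cycles", "income_k"]))
```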

16
Example 3: Clustering news

• Data: Reuters news / web data

• Goal: basic category classification:
  business, sports, politics, etc.
  Classify into subcategories (unspecified).

• Methodology:
  Consider “typical words” for each category.
  Classify using a “distance” measure.
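A hedged sketch of this methodology: the “typical words” of each category are captured by TF-IDF weights, and a new article is assigned to the closest category centroid. scikit-learn and the tiny documents are assumptions made for illustration, not part of the slides.

```python
# Sketch: "typical words" per category + distance-based classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid

docs = [
    "stocks rally as markets rise", "quarterly earnings beat forecast",
    "team wins the championship final", "striker scores twice in derby",
    "parliament passes the new bill", "minister announces election date",
]
labels = ["business", "business", "sports", "sports", "politics", "politics"]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)             # typical-word weights per document
clf = NearestCentroid().fit(X, labels)  # one centroid per category

print(clf.predict(vec.transform(["markets fall after earnings report"])))
```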

17
What is Machine Learning?
Traditional Programming: Data + Program → Computer → Output

Machine Learning: Data + Output → Computer → Program
18
Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
• Kaggle: https://www.kaggle.com/notebook

19
Why Machine Learning?
• Consider the example of spam filtering.
 First, we look at how spam typically looks, e.g. words such as “4U,” “credit card,” “free,” and “amazing”.
 Then, we write a detection algorithm for each pattern and flag the email if a pattern is detected.
 We test our program and repeat steps 1 and 2 until it is good enough.

Traditional
Approach

20
Since the problem is not trivial, your program will likely become a long list of complex rules—pretty hard to maintain
Why Machine Learning?...
• ML techniques automatically learn which words and phrases are good predictors of spam by detecting unusually frequent word patterns in the spam examples compared to the ham examples (see the sketch below).
• The program is much shorter, easier to maintain, and most likely more accurate.
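As a concrete illustration of this point, here is a minimal sketch of a learned spam filter. scikit-learn, the tiny corpus and its labels are assumptions made for the example; they are not prescribed by the slides.

```python
# Sketch: learn spam predictors from example messages instead of
# hand-written rules; the corpus below is made up for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "Amazing offer 4U, claim your free credit card now",
    "Free prize!!! Click to win amazing rewards",
    "Meeting moved to 3pm, see the agenda attached",
    "Can you review my draft before tomorrow?",
]
labels = ["spam", "spam", "ham", "ham"]

# Words that are unusually frequent in spam get high weight automatically.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["Claim your free credit card"]))    # likely 'spam'
print(model.predict(["Agenda for tomorrow's meeting"]))  # likely 'ham'
```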

21
Why Machine Learning?...
• ML algorithms can be inspected to see what has been learned. For
instance, once the spam filter has been trained on enough spam, it can
easily be inspected to reveal the list of words and combinations of words
that it believes are the best predictors of spam.
• Sometimes this will reveal unsuspected correlations or new trends, and
thereby lead to a better understanding of the problem.

Applying ML techniques to dig into large amounts of data can help discover patterns that were not immediately apparent. This is called data mining.

22
Why Machine Learning?...
• No human experts
 industrial/manufacturing control
 mass spectrometer analysis, drug design, astronomic discovery
• Black-box human expertise
 face/handwriting/speech recognition
 driving a car, flying a plane
• Rapidly changing phenomena
 credit scoring, financial modeling
 diagnosis, fraud detection
• Need for customization/personalization
 personalized news reader
 movie/book recommendation

23
Benefits of ML over Rule-Based Approaches
• Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one ML algorithm can often simplify code and perform better.
• Complex problems for which there is no good solution at all using a traditional approach: the best ML techniques can find a solution.
• Fluctuating environments: an ML system can adapt to new data.
• Getting insights about complex problems and large amounts of data.

24
Applications

• Traffic Alerts
• Image Recognition
• Video Surveillance
• Sentiment Analysis
• Product Recommendation
• Online Support using Chatbots
• Google Translate
• Online Video Streaming Applications
• Virtual Professional Assistants
• Machine Learning Usage in Social Media
• Stock Market Signals Using Machine Learning
• Auto-Driven Cars
• Fraud Detection
25
Related Fields

Diagram: machine learning overlaps with and draws on data mining, control theory, statistics, decision theory, information theory, cognitive science, databases, psychological models, evolutionary models and neuroscience.

Machine learning is primarily concerned with the accuracy and effectiveness of the computer system.
Machine Learning System
Diagram: a training set is passed through feature extraction into a machine learning algorithm. In the unsupervised case the output is a grouping of objects; in the supervised case annotated data is used to build a predictive model, which is then applied to new data.
27
Machine Learning System

28
Machine Learning in a Nutshell
• Tens of thousands of machine learning algorithms.
• Hundreds of new ones every year.
• Every machine learning algorithm has three components:
 Representation
 Evaluation
 Optimization
29
Representation
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models (Bayes/Markov nets)
• Neural networks
• Support vector machines
• Model ensembles
• Etc.

30
Evaluation
• Confusion Matrix
• Accuracy
• Recall / Sensitivity / True Positive Rate
• Specificity
• Error Rate
• ROC
• Squared Error
• Likelihood
• Posterior Probability
• Cost / Utility
• Margin
• F-Score
• etc.
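Many of the listed quantities can be read off a single confusion matrix. A small sketch (scikit-learn assumed; the labels below are made up):

```python
# Sketch: common evaluation metrics from one set of predictions.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # hypothetical ground truth
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # hypothetical predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Confusion matrix:", [[tn, fp], [fn, tp]])
print("Accuracy   :", accuracy_score(y_true, y_pred))   # (TP+TN)/total
print("Recall/TPR :", recall_score(y_true, y_pred))     # TP/(TP+FN)
print("Precision  :", precision_score(y_true, y_pred))  # TP/(TP+FP)
print("Specificity:", tn / (tn + fp))                   # TN/(TN+FP)
print("F-score    :", f1_score(y_true, y_pred))
print("Error rate :", 1 - accuracy_score(y_true, y_pred))
```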
31
Optimization
• Combinatorial optimization
 E.g. greedy search: finding an optimal object from a finite set of objects.
• Convex optimization
 E.g. gradient descent: finding the minimum of a function.
• Constrained optimization
 E.g. linear programming: optimizing an objective function with respect to some variables in the presence of constraints on those variables.
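A minimal sketch of convex optimization by gradient descent; the objective f(x) = (x - 3)^2, the starting point and the step size are all chosen here purely for illustration:

```python
# Sketch: gradient descent on the convex function f(x) = (x - 3)^2.
def f(x):
    return (x - 3.0) ** 2

def grad_f(x):
    return 2.0 * (x - 3.0)   # derivative of f

x = 0.0      # arbitrary starting point
lr = 0.1     # learning rate (step size)
for _ in range(100):
    x -= lr * grad_f(x)      # move against the gradient

print(x, f(x))   # x approaches 3 (the minimiser), f(x) approaches 0
```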

32
Examples of Machine Learning
Problems
• Pattern Recognition
 Facial identities or facial expressions
 Handwritten or spoken words (e.g., Siri)
 Medical images
 Sensor Data/IoT
• Optimization
 Many parameters have “hidden” relationships that can be the basis of
optimization
• Pattern Generation
 Generating images or motion sequences
• Anomaly Detection
 Unusual patterns in the telemetry from physical and/or virtual plants (e.g., data
centers)
 Unusual sequences of credit card transactions
 Unusual patterns of sensor data from a nuclear power plant
• or unusual sound in your car engine or …
• Prediction
 Future stock prices or currency exchange rates

33
Web-Based Examples of ML
• Web data is huge, and tasks that have to be performed on very big datasets often use ML,
 especially if the data is noisy or non-stationary.
• Spam filtering, fraud detection:
 The enemy adapts so we must adapt too.
• Recommendation systems:
Lots of noisy data. Million dollar prize!
• Information retrieval:
Find documents or images with similar content.
• Data Visualization:
Display a huge database in a revealing way
34
Domain of
ML

35
Types of Learning
• Supervised (inductive) learning
Training data includes desired outputs
• Unsupervised learning
Training data does not include desired
outputs
• Semi-supervised learning
Training data includes a few desired outputs
• Reinforcement learning
Rewards from sequence of actions
36
Inductive Learning
• The learner discovers rules by observing examples.
• Given examples of a function (X, F(X)), predict F(X) for new examples X (see the sketch below):
 Discrete F(X): classification
 Continuous F(X): regression
 F(X) = Probability(X): probability estimation
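A hedged sketch of the (X, F(X)) view: the same learn-then-predict interface covers a discrete-valued F (classification) and a continuous-valued F (regression). The data and the choice of scikit-learn models are illustrative assumptions, not part of the slides.

```python
# Sketch: learn F from examples (X, F(X)), then predict F(X) for new X.
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

X = [[1.0], [2.0], [3.0], [4.0], [5.0]]

# Discrete F(X): classification
y_class = [0, 0, 0, 1, 1]
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
print(clf.predict([[4.5]]))          # predicted class label

# Continuous F(X): regression
y_reg = [1.9, 4.1, 6.0, 8.2, 9.9]    # roughly F(x) = 2x
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[4.5]]))          # approximately 9
```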

37
Learning Algorithms

Supervised learning Unsupervised learning

Semi-supervised learning
38
Machine learning structure

• Supervised learning

39
Supervised Learning

40
E.g. Supervised Learning

41
Document Classifier E.g. Supervised Learning

42
Spectrum of Supervision

Unsupervised “Weakly” supervised Fully supervised

Definition depends on task


Slide credit: L. Lazebnik
43
Machine learning structure
• Unsupervised learning

44
Unsupervised Learning

45
E.g. Unsupervised Learning

46
Reinforcement Learning

47
Reinforcement Learning

48
Reinforcement Learning

49
Reinforcement Learning

50
E.g. Reinforcement Learning

51
Why is Machine Learning Hard?

52
What We’ll Cover
• Fundamentals of Linear Algebra and Probability
• Supervised learning
 Linear Regression
 Logistic Regression
 Decision tree induction
 Instance-based learning
 Bayesian learning
 Neural networks
 Support vector machines
 Model ensembles
• Unsupervised learning
 Clustering
 Dimensionality reduction
• Reinforcement Learning

53
Data Representation
• Information systems:
 Represent knowledge extracted from raw data, which is used for decision making.
• Data warehousing:
 Provides integrated, consistent and cleaned data to machine learning algorithms.
• Data table:
 Used to represent the information.
54
DATA TABLE
• Each row represents a measurement/observation, and each column gives the value of one attribute of the information system for all measurements/observations.
• Rows are variously referred to as “instances, examples, samples, measurements, observations, records, patterns, objects, cases, events”.
• Similarly, columns are referred to as “attributes” or “features”.
55
E.G. DATA TABLE
• Consider patient information in the data table below.
• Features/attributes: Headache, Muscle Pain, Temperature. These attributes are represented in linguistic form.
Patient Headache Muscle Pain Temperature Flu
1 NO YES HIGH YES
2 YES YES HIGH YES
3 YES YES VERY HIGH YES
4 NO YES NORMAL NO
5 YES NO HIGH NO
6 NO YES VERY HIGH YES
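The same decision table can be held directly as a data structure. A sketch using pandas (an assumption; the slides do not prescribe any tool):

```python
# Sketch: the patient decision table as a pandas DataFrame.
import pandas as pd

patients = pd.DataFrame({
    "Headache":    ["NO", "YES", "YES", "NO", "YES", "NO"],
    "Muscle Pain": ["YES", "YES", "YES", "YES", "NO", "YES"],
    "Temperature": ["HIGH", "HIGH", "VERY HIGH", "NORMAL", "HIGH", "VERY HIGH"],
    "Flu":         ["YES", "YES", "YES", "NO", "NO", "YES"],  # decision attribute
}, index=pd.RangeIndex(1, 7, name="Patient"))

print(patients)
X = patients.drop(columns="Flu")   # condition attributes
y = patients["Flu"]                # decision attribute
```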

56
E.G. DATA TABLE
• For directed/supervised learning, an outcome for each observation is known a priori.
• Decision attribute: one distinguished attribute that represents the knowledge; an information system of this kind is called a decision system.
• E.g. ‘Flu’ is the decision attribute: {Flu: Yes}, {Flu: No}.
• Flu is a decision attribute with respect to the condition attributes: Headache, Muscle Pain, Temperature.
57
E.G. DATA TABLE
• A data file represents the inputs as $N$ instances: $S^{(1)}, S^{(2)}, S^{(3)}, \dots, S^{(N)}$.
• Each individual instance $S^{(i)},\ i = 1, 2, \dots, N$, that provides the input to the machine learning tool is characterized by its values for a set of features/attributes $x_1, x_2, x_3, \dots, x_n$, i.e. $x_j,\ j = 1, 2, 3, \dots, n$.

58
E.G. DATA TABLE
Table: rows are the instances $S^{(1)}, S^{(2)}, S^{(3)}, S^{(4)}, \dots, S^{(N)}$; columns are the features $x_1, x_2, x_3, \dots, x_n$ and the decision $y$.

Training experience is available in the form of $N$ examples $S^{(i)} \in S,\ i = 1, 2, 3, \dots, N$, where $S$ is the set of possible instances, which come from the real world.

59
DATA REPRESENTATION
• An instance can be represented by $n$ attributes/features: $x_j,\ j = 1, 2, 3, \dots, n$.
• These $n$ numerical features can be visualized as a point in the $n$-dimensional state space $\Re^n$:
 $\mathbf{x} = [x_1\ x_2\ x_3\ x_4\ \dots\ x_n]^T \in \Re^n$.
• The set $X$ is the finite set of feature vectors $x^{(i)}$ over all possible instances.
• $X$ can also be visualized as the region of the state space $\Re^n$ to which the instances belong, i.e. $X \subset \Re^n$.
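A small numeric sketch of this representation (numpy assumed; the values are invented): each instance is a point in $\Re^n$, and $N$ instances stack into an $N \times n$ matrix.

```python
# Sketch: instances as feature vectors in an n-dimensional space.
import numpy as np

# One instance: x = [x1, x2, ..., xn]^T in R^n (here n = 3, made-up values)
x = np.array([1.2, 0.7, 3.5])

# N instances stacked row-wise give an N x n data matrix X
X = np.array([
    [1.2, 0.7, 3.5],
    [0.9, 1.1, 2.8],
    [1.5, 0.4, 3.9],
])
print(x.shape, X.shape)   # (3,) and (3, 3): n = 3 features, N = 3 instances
```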

60
DATA REPRESENTATION
• Here, $x^{(i)}$ is a representation of $s^{(i)}$, and $X$ is the representation space.
• The pair $(S, X)$ constitutes the information system, where $S$ is a non-empty set of instances and $X$ is a non-empty set of features.
• Here, index $i$ runs over instances and $j$ over features:
 $s^{(i)} \in S,\ i = 1, 2, 3, \dots, N$
 $x^{(i)} \in X,\ i = 1, 2, 3, \dots, N$ (set of feature vectors)
 $x_j^{(i)},\ j = 1, 2, 3, \dots, n$ together form $x^{(i)}$
 Features $x_j,\ j = 1, 2, \dots, n$, may be viewed as state variables and the feature vector $\mathbf{x}$ as a state vector in $n$-dimensional space.
61
DATA REPRESENTATION
• For every feature $x_j$, the set of values it can take is written $V_{x_j} \subset \mathbb{R}$ and is called the domain of $x_j,\ j = 1, 2, \dots, n$.
• $V_{x_j}^{(i)} \in V_{x_j},\ i = 1, 2, \dots, N$ (the value feature $x_j$ takes for instance $i$ lies in its domain).
• The tuple $(S, X, Y)$ may be constituted, and this is called a decision system.

62
Thank You

63
