Lecture 1
Kernel Methods
Natural Language Processing
Image Processing
Signal Processing
Instructors
• Lectures:
Radu Ionescu (raducu.ionescu@gmail.com)
• Labs:
Alin Croitoru (alincroitoru97@gmail.com)
Eduard Poesina (eduardgabriel.poe@gmail.com)
Vlad Hondru (vlad.hondru25@gmail.com)
• Website:
https://practical-ml-fmi.github.io/ML/
Grading System
• Your final grade is composed of:
50% for Project 1
50% for Project 2
• Both projects are individual!
• Each project consists of employing machine learning methods
on a specific data set
• Project 1 is about participating in a Kaggle competition
The competition will be launched in a couple of weeks
• Project 2 is about comparing two unsupervised approaches
There are many datasets out there, so no overlap allowed
among students!
Methods and data sets must be chosen beforehand!
• Project 1 must be presented no later than week 10
• Project 2 must be presented no later than the day of the “exam”
• There will be no paper exam, only oral exam!
• The average grade of projects 1 and 2 must be >= 5
• The project consists of the code implementation in Python (any
library is allowed) and a PDF report including (2 points):
a description of the data set (for project 2 only)
a description of the implemented machine learning methods
figures and / or tables with results / hyperparameter tuning
comments / interpretation for the results
conclusion
The first project consists of implementing machine learning
method(s) for the proposed Kaggle challenge (TBA)
The grades will be proportional to your model’s accuracy:
- Top 1-20 => your grade can be up to 10
- Top 21-50 => your grade can be up to 9
- Top 51-80 => your grade can be up to 8
- Top 81-100 => your grade can be up to 7
- Top 101-120 => your grade can be up to 6
- Others => your grade can be up to 5
Ranks can change depending on the final number of
participants
Submit projects to: practical.ml.fmi@gmail.com
Submit .py files only! (.ipynb not accepted)
We will set deadlines (during every evaluation session) for:
choosing the project
submitting the project
presenting the project
• If you don’t know the dates, please ask! Don’t wait until the
presentation day!
• Extra points during lectures / labs
awarded only in the first round of evaluation
• Lectures:
awarded based on the ranking of answers on Kahoot
top 3 get up to 0.3 points per lecture, the next 3 up to 0.2 points,
and so on
• Labs:
the first student to solve an exercise gets 0.2 points
maximum 0.4 points per lab for each student
• Up to 1 bonus point during lectures (added to final grade)
• Up to 1 bonus point during labs (added to final grade)
• Maybe up to 2 bonus points for some data annotation (TBD)
(NO) Collaboration Policy
• Collaboration
Each student must write their own code for the project(s)
Borrowing code from web sources with copy & paste is
not permitted under any circumstances
• No tolerance on plagiarism
Neither ethical nor in your best interest
Code will be checked automatically and manually!
Don’t cheat. We will find out!
We are serious about this!
Examples of unacceptable plagiarism
Examples of acceptable code
What is artificial intelligence (AI)?
• The ultimate goal of artificial intelligence is to build systems
able to reach human intelligence levels
• Turing test: a computer is said to possess human-level
intelligence if a remote human interrogator, within a fixed time
frame, cannot distinguish between the computer and a human
subject based on their replies to various questions posed by the
interrogator
Perhaps we are going in the right direction?
What is machine learning (ML)?
• Traditional programming: Data + Program => Computer => Output
• Machine Learning: Data + Output => Computer => Program
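The contrast above can be made concrete with a toy sketch (the height data below is hypothetical, not from the lecture): in traditional programming a human writes the rule, while in machine learning the rule (here, a simple threshold) is derived from labeled examples.

```python
# Traditional programming: Data + Program => Output.
# A human hand-writes the rule:
def classify_by_rule(height_cm):
    return "tall" if height_cm > 180 else "short"  # threshold chosen by a human

# Machine learning: Data + Output => Program.
# The computer derives the rule (the threshold) from labeled examples:
def learn_threshold(heights, labels):
    # try every midpoint between consecutive sorted samples, keep the best split
    pairs = sorted(zip(heights, labels))
    candidates = [(a[0] + b[0]) / 2 for a, b in zip(pairs, pairs[1:])]
    best_t, best_acc = None, -1.0
    for t in candidates:
        acc = sum((h > t) == (y == "tall") for h, y in pairs) / len(pairs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

heights = [160, 170, 175, 185, 190, 200]
labels  = ["short", "short", "short", "tall", "tall", "tall"]
threshold = learn_threshold(heights, labels)  # the learned "program"
print(threshold)  # -> 180.0, the midpoint that separates the two groups
```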
A well-posed machine learning problem
• What problems can be solved* with machine learning?
• Well-posed machine learning problem:
"A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure
P, if its performance at tasks in T, as measured by P,
improves with experience E.” – Tom Mitchell
(*) implies a certain degree of accuracy
• Arthur Samuel (1959) wrote a program for playing checkers
(perhaps the first program based on the concept of learning, as
defined by Tom Mitchell)
• The program played 10K games against itself
• The program was designed to find the good and bad positions
on the board from the current state, based on the probability of
winning or losing
• In this example:
E = 10000 games
T = play checkers
P = win or lose
Strong AI versus Weak AI
• Strong / generic / true AI
(see the Turing test and its extensions)
• Weak / narrow AI
(focuses on a specific well-posed problem)
When do we use machine learning?
• We use ML when it is hard (impossible) to define a set of
rules by hand / to write a program based on explicit rules
• [Figure: model performance vs. amount of training data, improving with more compute power and better algorithms / models]
ML in a nutshell
• Data, …
• At the intersection of: Computer Science, Statistics, Applied Maths, Biology, Neuroscience
• Non-standard paradigms:
Active learning
Transfer learning
Transductive learning
Supervised learning
• We have a set of labeled training samples
• Example 1: object recognition in images annotated
with corresponding class labels
[Images of scenes annotated with class labels: Car, Person, Dog]
• Example 2: handwritten digit recognition (on the MNIST data
set)
• Images of 28 x 28 pixels
• We can represent each image as a vector x of 784 components
• We train a classifier f such that f(x) predicts the correct digit (0-9)
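A minimal sketch of this representation, using random pixels as a stand-in for a real MNIST image:

```python
import numpy as np

# A 28 x 28 grayscale image (random pixels as a stand-in for an MNIST digit)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28))

# Flatten to a vector x with 784 components
x = image.reshape(-1)
print(x.shape)  # -> (784,)

# A classifier is then a function f : R^784 -> {0, 1, ..., 9}
# mapping the pixel vector to a digit class.
```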
• Example 2 (continued): handwritten digit recognition (on the
MNIST data set)
• Starting with a training set of about 60K images (about 6000 images
per class)
• … the error rate can go down to 0.23% (using convolutional neural
networks)
• Among the first (learning-based) systems used in a large-scale
commercial setting for postal code and bank cheque processing
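To illustrate the training pipeline (a sketch, not the convolutional network that reaches 0.23% error), here is a simple linear classifier on scikit-learn's small 8x8 digits dataset, used as a lightweight stand-in for MNIST:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small 8x8 digits dataset bundled with scikit-learn
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# Train a linear classifier on the flattened pixel vectors
clf = LogisticRegression(max_iter=2000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")  # a simple linear model already does well
```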
• Example 3: face detection
[Slide annotations: Confidence / performance guarantee? Why a linear combination? Why these words? Where do the weights come from?]
• Example 5: predicting stock prices on the market
• Regression
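A toy regression sketch on a synthetic "price" series (illustrative only; real stock prediction is far harder):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic noisy upward trend standing in for a price series
rng = np.random.default_rng(42)
days = np.arange(100).reshape(-1, 1)                        # feature: time step
prices = 50 + 0.3 * days.ravel() + rng.normal(0, 2, 100)    # noisy linear trend

# Regression: the model outputs a real number, not a class label
model = LinearRegression().fit(days, prices)
next_day = model.predict([[100]])
print(model.coef_[0], next_day[0])  # recovered slope ~0.3, next value ~80
```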
Age estimation in images
• Classification?
• Regression?
What age?
The supervised learning paradigm
Supervised learning models
• Naive Bayes (lecture 2)
• k-Nearest Neighbors (lecture 3)
• Decision trees and random forests (lecture 4)
• Support Vector Machines (lectures 5, 6)
• Kernel methods (lecture 5)
• Kernel Ridge Regression (lecture 5)
• Neural networks (lectures 7, 8, 9)
• Many others…
Unsupervised learning
• We have an unlabeled training set of samples
• Example 1: clustering images based on similarity
• Example 1: clustering MNIST images based on
similarity [Georgescu et al. ICIP2019]
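A small-scale sketch of the idea, clustering scikit-learn's 8x8 digits (a stand-in for MNIST) without ever showing the labels to the model:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

# Group the digit images purely by similarity of their pixel vectors
digits = load_digits()
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(digits.data)  # labels are never used
print(cluster_ids[:10])  # cluster index assigned to the first 10 images
```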
• Example 2: unsupervised feature learning
• Example 2: unsupervised feature learning for
abnormal event detection [Ionescu et al. CVPR2019]
• Example 3: clustering mammals by family, species, etc.
• Dimensionality Reduction
Unsupervised learning models
• K-means clustering (lectures 10, 11)
• DBScan (lecture 12)
• Hierarchical clustering (lecture 12)
• Principal Component Analysis (lecture 13)
• t-Distributed Stochastic Neighbor Embedding
(lecture 13)
• Hidden Markov Models
• Many others…
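As a taste of dimensionality reduction from the list above, a minimal PCA sketch projecting the 64-dimensional digit vectors down to 2 components:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Project 64-dimensional digit vectors onto their top 2 principal components
digits = load_digits()
pca = PCA(n_components=2)
embedded = pca.fit_transform(digits.data)
print(embedded.shape)                  # -> (1797, 2)
print(pca.explained_variance_ratio_)   # fraction of variance kept per component
```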
Semi-supervised learning
• We have a training set of samples that are partially
annotated with class labels
• Example 1: object recognition in images, some of
which are annotated with corresponding class labels
[Images, only some of which are annotated: Car, Dog, Person]
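One way to sketch this setting is with scikit-learn's label propagation family, hiding about 90% of the digit labels (marked -1, the library's convention for "unlabeled"):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

# Keep labels for only ~10% of the samples; mark the rest as unlabeled (-1)
digits = load_digits()
rng = np.random.default_rng(0)
y_partial = digits.target.copy()
hidden = rng.random(len(y_partial)) > 0.1
y_partial[hidden] = -1

# Propagate the few known labels through the data's neighborhood graph
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(digits.data, y_partial)
accuracy = (model.transduction_ == digits.target).mean()
print(f"accuracy on all samples: {accuracy:.3f}")
```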
Reinforcement learning
• How does it work?
• The system learns intelligent behavior using a
reinforcement signal (reward)
• The reward is given after several actions are taken (it
does not come after every action)
• Time matters (data is sequential, not i.i.d.)
• The actions of the system can influence the data
• Example 1: learning to play Go
• +/- reward for winning / losing the game
• Example 2: teaching a robot to ride a bike
• +/- reward for moving forward / falling
• Example 3: learning to play Pong from image pixels
• +/- reward for increasing the personal / adversary score
Reinforcement learning paradigm
Formalizing as Markov Decision Process
• Solution based on dynamic programming (small
graphs) or approximation (large graphs)
• Goal: select the actions that maximize the total final
reward
• The actions can have long-term consequences
• Sacrificing the immediate reward can lead to higher
rewards in the long term
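A dynamic-programming sketch (value iteration) on a tiny, made-up MDP: the transition table and rewards below are hypothetical, chosen so that sacrificing the immediate reward pays off.

```python
# Deterministic toy MDP: transitions[s][a] = (next_state, reward)
transitions = {
    0: {"a": (1, 0.0), "b": (2, 1.0)},   # "b" gives an immediate reward of 1
    1: {"a": (2, 10.0), "b": (0, 0.0)},  # but "a" leads to a bigger delayed reward
    2: {"a": (2, 0.0), "b": (2, 0.0)},   # absorbing terminal-like state
}
gamma = 0.9  # discount factor

# Value iteration: repeatedly apply the Bellman optimality update
V = {s: 0.0 for s in transitions}
for _ in range(100):
    V = {s: max(r + gamma * V[s2] for (s2, r) in acts.values())
         for s, acts in transitions.items()}

# In state 0, forgoing the immediate reward (action "a") is optimal
best_action = max(transitions[0],
                  key=lambda a: transitions[0][a][1] + gamma * V[transitions[0][a][0]])
print(best_action, round(V[0], 2))  # -> a 9.0
```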
• AlphaGo example:
Narrator 1: “That’s a very strange move”
Narrator 2: “I thought it was a mistake”
But actually, “the move turned the course of the
game. AlphaGo went on to win Game Two, and at
the post-game press conference, Lee Sedol was in
shock.”
https://www.wired.com/2016/03/two-moves-alphago-lee-sedol-redefined-future/
Active learning
• Given a large set of unlabeled samples, we have to
choose a small subset for annotation in order to
obtain a good classification model
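One common concrete strategy is uncertainty sampling; the sketch below (assuming an initial pool of 30 labeled digits) queries the samples the current model is least confident about:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

digits = load_digits()
X, y = digits.data, digits.target

# Start with a tiny labeled pool; the rest is unlabeled
labeled = list(range(30))
unlabeled = list(range(30, len(X)))

# Train on the labeled pool, then pick the least confident unlabeled samples
clf = LogisticRegression(max_iter=2000).fit(X[labeled], y[labeled])
confidence = clf.predict_proba(X[unlabeled]).max(axis=1)  # top-class probability
query = [unlabeled[i] for i in np.argsort(confidence)[:10]]
print(query)  # the 10 samples the model most wants labeled next
```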
Transfer learning
• Starting with a model trained for a certain task /
domain, use the model for a different task / domain
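A minimal flavor of the idea (a sketch under simplifying assumptions, not a full fine-tuning recipe): learn a representation on a "source" task (digits 0-4), then reuse it, frozen, for a different "target" task (digits 5-9).

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

digits = load_digits()
source = digits.target < 5   # "source" domain: digits 0-4
target = ~source             # "target" domain: digits 5-9

# Learn a representation on the source domain only
pca = PCA(n_components=16).fit(digits.data[source])

# Reuse the frozen representation for the target task
X_target = pca.transform(digits.data[target])
clf = LogisticRegression(max_iter=2000).fit(X_target, digits.target[target])
acc = clf.score(X_target, digits.target[target])
print(f"target-task accuracy: {acc:.3f}")
```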