Machine Learning in Science and Engineering
Gunnar Rätsch
Friedrich Miescher Laboratory
Max Planck Society
Tübingen, Germany
http://www.tuebingen.mpg.de/~raetsch
CCC Berlin, December 27, 2004
Roadmap
Motivating Examples
Some Background
Boosting & SVMs
Applications
Rationale: let computers learn to automate processes and to understand highly complex data
Example 1: Spam Classification
From: [email protected]
Subject: Congratulations
Date: 16 December 2004 02:12:54 CET

LOTTERY COORDINATOR,
INTERNATIONAL PROMOTIONS/PRIZE AWARD DEPARTMENT.
SMARTBALL LOTTERY, UK.
DEAR WINNER,
WINNER OF HIGH STAKES DRAWS
Congratulations to you as we bring to your notice, the results of the end of year HIGH STAKES DRAWS of SMARTBALL LOTTERY UNITED KINGDOM. We are happy to inform you that you have emerged a winner under the HIGH STAKES DRAWS SECOND CATEGORY, which is part of our promotional draws. The draws were held on 15th DECEMBER 2004 and results are being officially announced today. Participants were selected through a computer ballot system drawn from 30,000 names/email addresses of individuals and companies from Africa, America, Asia, Australia, Europe, Middle East, and Oceania as part of our International Promotions Program.

From: [email protected]
Subject: ML Positions in Santa Cruz
Date: 4 December 2004 06:00:37 CET

We have a Machine Learning position at Computer Science Department of the University of California at Santa Cruz (at the assistant, associate or full professor level).
Current faculty members in related areas:
Machine Learning: DAVID HELMBOLD and MANFRED WARMUTH
Artificial Intelligence: BOB LEVINSON
DAVID HAUSSLER was one of the main ML researchers in our department. He now has launched the new Biomolecular Engineering department at Santa Cruz.
There is considerable synergy for Machine Learning at Santa Cruz:
- New department of Applied Math and Statistics with an emphasis on Bayesian Methods: http://www.ams.ucsc.edu/
- New department of Biomolecular Engineering: http://www.cbse.ucsc.edu/
Goal: Classify emails into spam / no spam
How? Learn from previously classified emails!
Training: analyze previous emails
Application: classify new emails
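To make the training/application split concrete, here is a minimal sketch of such a spam filter in Python, assuming scikit-learn is available; the example emails, the bag-of-words features, and the linear SVM are illustrative choices, not the setup used in the talk.

    # Minimal spam-filter sketch: learn from labeled emails, then classify new ones.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.svm import LinearSVC

    # Hypothetical training data: email texts with labels (+1 = spam, -1 = no spam).
    emails = ["WINNER OF HIGH STAKES DRAWS ...", "We have a Machine Learning position ..."]
    labels = [+1, -1]

    # Training: turn previous emails into bag-of-words vectors and fit a classifier.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(emails)
    classifier = LinearSVC()
    classifier.fit(X, labels)

    # Application: classify a new, unseen email.
    new_email = ["Congratulations, you have won the lottery"]
    prediction = classifier.predict(vectorizer.transform(new_email))
    print(prediction)  # +1 -> spam, -1 -> no spam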
Example 2: Drug Design
(figure: a chemist together with sets of active and inactive compounds, shown as chemical structures)
The Drug Design Cycle
(figure: the chemist in a cycle with sets of active and inactive compounds, shown as chemical structures; former CombiChem technology)
The Drug Design Cycle
(figure: the chemist is replaced by a learning machine in the cycle with the sets of active and inactive compounds; former CombiChem technology)
Example 3: Face Detection
Premises for Machine Learning
Supervised Machine Learning
Observe N training examples x_1, ..., x_N with labels y_1, ..., y_N
Learn a function f that maps examples to labels
Predict the label f(x) of an unseen example x
The examples are generated by a statistical process
There is a relationship between the features and the label
Assumption: unseen examples are generated from the same or a similar process
Problem Formulation
The World: data, e.g. natural apples (label +1) and plastic apples (label -1)
Unknown target function relating examples to labels
Unknown distribution generating the data
Objective: find a function that predicts the labels of new examples well
Problem: the target function and the distribution are unknown
Example: Natural vs. Plastic Apples
AdaBoost (Freund & Schapire, 1996)
Idea:
Use many simple rules of thumb
Simple hypotheses are not perfect!
Combining hypotheses => increased accuracy
Problems:
How to generate different hypotheses?
How to combine them?
Method:
Compute a distribution d_t on the examples
Find a hypothesis h_t on the weighted sample
Combine the hypotheses linearly: f(x) = Σ_t α_t h_t(x)
Boosting: 1st iteration (simple hypothesis)
Boosting: recompute weighting
Boosting: 2nd iteration
Boosting: 2nd hypothesis
Boosting: recompute weighting
Boosting: 3rd hypothesis
Boosting: 4th hypothesis
Boosting: combination of hypotheses
Boosting: decision
AdaBoost Algorithm
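A minimal sketch of the standard AdaBoost procedure with decision stumps as base hypotheses: compute a distribution on the examples, fit a hypothesis on the weighted sample, update the weights, and combine the hypotheses linearly with weights α_t = ½ ln((1−ε_t)/ε_t). The stump learner is a simple placeholder; this is an illustration of the published algorithm, not the code behind the slides.

    import numpy as np

    def fit_stump(X, y, d):
        """Assumed helper: find the best threshold on a single feature
        under the example weights d; returns (weighted error, hypothesis)."""
        best = None
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (+1, -1):
                    pred = sign * np.where(X[:, j] > thr, 1, -1)
                    err = np.sum(d[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        return err, lambda Z: sign * np.where(Z[:, j] > thr, 1, -1)

    def adaboost(X, y, T=20):
        N = len(y)
        d = np.ones(N) / N                  # uniform initial distribution on examples
        hypotheses, alphas = [], []
        for t in range(T):
            eps, h = fit_stump(X, y, d)     # base hypothesis on the weighted sample
            eps = max(eps, 1e-10)
            alpha = 0.5 * np.log((1 - eps) / eps)
            d *= np.exp(-alpha * y * h(X))  # increase weight of misclassified examples
            d /= d.sum()                    # recompute (renormalize) the distribution
            hypotheses.append(h)
            alphas.append(alpha)
        # linear combination of the base hypotheses
        return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hypotheses)))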
AdaBoost algorithm
Combination of
Decision stumps/trees
Neural networks
Heuristic rules
Further reading
http://www.boosting.org
http://www.mlss.cc
Linear Separation
(figure: two classes plotted against property 1 and property 2, then separated by a line)
Linear Separation with Margins
(figure: the same two classes separated by a hyperplane with a margin on either side)
large margin => good generalization
Large Margin Separation
Idea:
Find the hyperplane ⟨w, x⟩ + b = 0 that maximizes the margin (with ‖w‖ = 1)
Use f(x) = sign(⟨w, x⟩ + b) for prediction
Solution:
w is a linear combination of the examples: w = Σ_i α_i y_i x_i
Many of the α_i are zero
=> Support Vector Machines
Demo
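A minimal sketch of the large-margin classifier using scikit-learn; the toy data and the nearly hard-margin setting (large C) are assumptions for illustration. The examples with non-zero α are returned as the support vectors.

    import numpy as np
    from sklearn.svm import SVC

    # Toy 2-class data (hypothetical values).
    X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
    y = np.array([-1, -1, -1, +1, +1, +1])

    # Linear SVM; a large C approximates the maximum-margin hyperplane.
    svm = SVC(kernel="linear", C=1e6)
    svm.fit(X, y)

    # Only a few alphas are non-zero: these examples are the support vectors.
    print("support vectors:", svm.support_vectors_)
    # Prediction uses sign(<w, x> + b).
    print("prediction:", svm.predict([[3.0, 3.0]]))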
Kernel Trick
Linear in input space
Non-linear in input space, but linear in feature space: the kernel computes inner products in the feature space without constructing it explicitly
Example: Polynomial Kernel
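The commonly used polynomial kernel is k(x, z) = (⟨x, z⟩ + 1)^d. A small numeric check for d = 2 in two dimensions, using one standard explicit feature map (an illustrative choice, not necessarily the one on the slide): the kernel value equals the inner product of the mapped vectors.

    import numpy as np

    def poly_kernel(x, z, d=2):
        return (np.dot(x, z) + 1) ** d

    def phi(x):
        # Explicit feature map for d = 2, x = (x1, x2):
        # all monomials up to degree 2, weighted so that <phi(x), phi(z)> = (x.z + 1)^2.
        x1, x2 = x
        return np.array([1.0,
                         np.sqrt(2) * x1, np.sqrt(2) * x2,
                         x1 ** 2, x2 ** 2,
                         np.sqrt(2) * x1 * x2])

    x = np.array([1.0, 2.0])
    z = np.array([3.0, 4.0])
    print(poly_kernel(x, z))        # 144.0
    print(np.dot(phi(x), phi(z)))   # 144.0 -- same value, without forming phi for the kernel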
Support Vector Machines
Demo: Gaussian Kernel
Many other algorithms can use kernels
Many other application-specific kernels
Capabilities of Current Techniques
Theoretically & algorithmically well understood:
Classification with few classes
Regression (real valued)
Novelty Detection
Bottom Line: Machine Learning works well for relatively simple objects with simple properties
Current Research
Complex objects
Many classes
Complex learning setup (active learning)
Prediction of complex properties
Many Applications
Handwritten Letter/Digit recognition
Face/Object detection in natural scenes
Brain-Computer Interfacing
Gene Finding
Drug Discovery
Intrusion Detection Systems (unsupervised)
Document Classification (by topic, spam mails)
Non-Intrusive Load Monitoring of electric appliances
Company Fraud Detection (Questionnaires)
Fake Interviewer identification in social studies
Optimized Disk caching strategies
Optimal Disk-Spin-Down prediction
MNIST Benchmark
SVM with polynomial kernel
(considers d-th order correlations of pixels)
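A sketch of this kind of experiment, using scikit-learn's small built-in digits dataset as a stand-in for MNIST (an assumption for illustration; MNIST itself is much larger and yields the error rates reported on the next slide):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # 8x8 digit images as a small stand-in for the 28x28 MNIST images.
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=0)

    # A polynomial kernel of degree d considers d-th order correlations of pixels.
    svm = SVC(kernel="poly", degree=4, coef0=1, gamma="scale")
    svm.fit(X_train, y_train)
    print("test error: %.2f%%" % (100 * (1 - svm.score(X_test, y_test))))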
MNIST Error Rates
Face Detection
1. Classifier: face / non-face
2. Search: scan the image at every position and scale; with each scale shrunk by a factor of 0.7, roughly
   Σ_{l=1}^{L} 600 · 450 · 0.7^(2(l−1)) ≈ 525,820 patches have to be classified
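A quick check of that patch count; the number of scale levels is an assumption chosen to match the figure on the slide.

    # Number of patches when scanning a 600 x 450 image at L scales shrinking by 0.7 per level.
    L = 7                                # number of scale levels (assumed)
    total = sum(600 * 450 * 0.7 ** (2 * (l - 1)) for l in range(1, L + 1))
    print(round(total))                  # 525821, i.e. about the 525,820 patches quoted above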
Fast Face Detection
Note: for easy patches, a quick and inaccurate classification is sufficient.
Method: sequential approximation of the classifier in a Hilbert space.
Result: a set of face detection filters.
Romdhani, Blake, Schölkopf, & Torr, 2001
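A sketch of the resulting detection cascade: cheap filters are applied first and discard most patches, so only a small fraction reaches the expensive full classifier. The filters and thresholds below are random placeholders, not the filters obtained by the method above.

    import numpy as np

    def run_cascade(patches, filters, thresholds):
        """Apply a sequence of increasingly expensive filters.
        patches: array of shape (n_patches, n_features)
        filters: list of scoring functions, cheapest first (placeholders here)
        thresholds: a patch is discarded as soon as its score falls below the threshold."""
        remaining = np.arange(len(patches))
        for f, t in zip(filters, thresholds):
            scores = f(patches[remaining])
            remaining = remaining[scores >= t]        # keep only patches that survive this filter
            print("%6.3f%% patches left" % (100.0 * len(remaining) / len(patches)))
        return remaining                              # candidates for the full face classifier

    # Hypothetical usage with random data and dummy linear filters:
    rng = np.random.default_rng(0)
    patches = rng.normal(size=(525820, 16))
    filters = [(lambda P, w=rng.normal(size=16): P @ w) for _ in range(5)]
    thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]
    survivors = run_cascade(patches, filters, thresholds)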
Example: 1280x1024 Image
1 filter: 19.8% of patches left
10 filters: 0.74% of patches left
20 filters: 0.06% of patches left
30 filters: 0.01% of patches left
70 filters: 0.007% of patches left
Single Trial Analysis of EEG: Towards BCI
Gabriel Curio, Neurophysics Group, Dept. of Neurology, Klinikum Benjamin Franklin, Freie Universität Berlin, Germany
Benjamin Blankertz, Klaus-Robert Müller, Intelligent Data Analysis Group, Fraunhofer FIRST, Berlin, Germany
Cerebral Cocktail Party Problem
The Cocktail Party Problem
How to decompose superimposed signals?
The signal processing problem is analogous to the cocktail party problem
The Cocktail Party Problem
input: 3 mixed signals
algorithm: enforce independence (independent component analysis) via temporal de-correlation
output: 3 separated signals
"Imagine that you are on the edge of a lake and a friend challenges you to play a game. The game
is this: Your friend digs two narrow channels up from the side of the lake []. Halfway up each one,
your friend stretches a handkerchief and fastens it to the sides of the channel. As waves reach the
side of the lake they travel up the channels and cause the two handkerchiefs to go into motion. You
are allowed to look only at the handkerchiefs and from their motions to answer a series of
questions: How many boats are there on the lake and where are they? Which is the most powerful
one? Which one is closer? Is the wind blowing? (Auditory Scene Analysis, A. Bregman )
(Demo: Andreas Ziehe, Fraunhofer FIRST, Berlin)
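A minimal sketch of the separation step using scikit-learn's FastICA; this is a standard ICA estimator and therefore an illustrative substitute for the temporal-decorrelation method used in the demo.

    import numpy as np
    from sklearn.decomposition import FastICA

    # Three hypothetical source signals ...
    t = np.linspace(0, 8, 2000)
    rng = np.random.default_rng(0)
    sources = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=len(t))]

    # ... observed only as three mixtures (the "handkerchiefs").
    mixing = np.array([[1.0, 0.5, 0.3], [0.6, 1.0, 0.4], [0.4, 0.7, 1.0]])
    mixed = sources @ mixing.T

    # Enforce independence to recover the sources (up to permutation and scaling).
    ica = FastICA(n_components=3, random_state=0)
    separated = ica.fit_transform(mixed)
    print(separated.shape)  # (2000, 3): three separated signals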
Minimal Electrode Configuration
coverage: bilateral primary
sensorimotor cortices
27 scalp electrodes
reference: nose
bandpass: 0.05 Hz - 200 Hz
ADC 1 kHz
downsampling to 100 Hz
EMG (forearms bilaterally):
m. flexor digitorum
EOG
event channel:
keystroke timing
(ms precision)
Single Trial vs. Averaging
(figure: EEG amplitude in µV over the interval −600 to 0 ms before a keystroke; left-hand trials at channel C4 and right-hand trials at channel C3, single trials compared with averages)
BCI Setup
ACQUISITION modes:
- few single electrodes
- 32-128 channel electrode caps
- subdural macroelectrodes
- intracortical multi-single-units
EEG parameters:
- slow cortical potentials
- µ/β rhythm amplitude modulations
- Bereitschafts-/motor-potential
TASK alternatives:
- feedback control
- imagined movements
- movement (preparation)
- mental state diversity
Finding Genes on Genomic DNA
Splice sites: at the boundaries between exons and introns
Exons (may code for protein)
Introns (noncoding)
Coding region starts with Translation Initiation Site (TIS: ATG)
Application: TIS Finding
Engineering Support Vector Machine (SVM) Kernels
That Recognize Translation Initiation Sites (TIS)
GMD.SCAI, Institute for Algorithms and Scientific Computing: Alexander Zien, Thomas Lengauer
GMD.FIRST, Institute for Computer Architecture and Software Technology: Gunnar Rätsch, Sebastian Mika, Bernhard Schölkopf, Klaus-Robert Müller
TIS Finding: Classification Problem
Select candidate positions
for TIS by looking for ATG
A (1,0,0,0,0)
C (0,1,0,0,0)
G (0,0,1,0,0)
T (0,0,0,1,0)
N (0,0,0,0,1)
Build fixed-length sequence
representation of candidates
(...,0,1,0,0,0,0,...)
Transform the sequence into a representation in real space:
1000-dimensional real space
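A sketch of this sparse-bit encoding: each nucleotide is mapped to a 5-dimensional indicator vector, so a 200-nucleotide candidate window (the length implied by the dimensionality) becomes a 1000-dimensional real vector. The helper below is illustrative.

    import numpy as np

    # Mapping from the slide: one indicator bit per symbol.
    CODE = {"A": (1, 0, 0, 0, 0),
            "C": (0, 1, 0, 0, 0),
            "G": (0, 0, 1, 0, 0),
            "T": (0, 0, 0, 1, 0),
            "N": (0, 0, 0, 0, 1)}

    def encode_window(seq):
        """Turn a fixed-length nucleotide window around a candidate ATG
        into a real-valued vector (length = 5 * window length)."""
        return np.array([bit for nt in seq for bit in CODE.get(nt, CODE["N"])], dtype=float)

    window = "GCCATGGCT" + "N" * 191          # hypothetical 200 nt candidate window
    x = encode_window(window)
    print(x.shape)                            # (1000,) -- the 1000-dimensional representation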
2-class Splice Site Detection
Window of 150 nt around known splice sites
Positive examples: fixed window around a true splice site
Negative examples: generated by shifting the window
Design of new Support Vector Kernel
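One simple example of a kernel on such sequence windows is the spectrum kernel, which counts shared k-mers; it is shown here only as an illustration of a string kernel, not as the kernel designed in this work.

    from collections import Counter

    def spectrum_kernel(s1, s2, k=3):
        """Count-based similarity between two sequences:
        inner product of their k-mer count vectors."""
        c1 = Counter(s1[i:i + k] for i in range(len(s1) - k + 1))
        c2 = Counter(s2[i:i + k] for i in range(len(s2) - k + 1))
        return sum(c1[kmer] * c2[kmer] for kmer in c1)

    # Hypothetical 150 nt windows around candidate splice sites:
    w1 = "GTAAGT" * 25
    w2 = "GTGAGT" * 25
    print(spectrum_kernel(w1, w2, k=3))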
The Drug Design Cycle
(figure: the learning machine in a cycle with sets of active and inactive compounds; former CombiChem technology)
Three types of Compounds/Points
few actives
more inactives
plenty of untested compounds
Shape/Feature Descriptor
Each bit in the signature encodes a combination of shape, feature type, and feature location (e.g. bit number 254230).
Shape/Feature Signature: a binary vector of roughly 10^5 bits per compound, covering shapes i, j, ...
S. Putta, A Novel Shape/Feature Descriptor, 2001
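Such bit signatures can be compared, for example, with the Tanimoto coefficient (shared bits over total bits set), a common similarity measure for chemical fingerprints; it is an illustrative choice, not necessarily the one used with this descriptor.

    import numpy as np

    def tanimoto(a, b):
        """Similarity of two binary fingerprints: |a AND b| / |a OR b|."""
        a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
        union = np.logical_or(a, b).sum()
        return np.logical_and(a, b).sum() / union if union else 1.0

    # Two hypothetical (very short) shape/feature signatures:
    sig_i = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    sig_j = np.array([1, 1, 1, 0, 0, 0, 1, 0])
    print(tanimoto(sig_i, sig_j))   # 0.6: 3 shared bits out of 5 set in total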
Maximizing the Number of Hits
Total number of active examples selected after each batch, on the Thrombin dataset (Largest Selection Strategy)
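A sketch of such a selection strategy in the batch setting: train on the tested compounds, score the untested pool, and send the compounds with the largest classifier outputs to the next round of testing. The fingerprint data and the linear SVM are assumptions for illustration.

    import numpy as np
    from sklearn.svm import LinearSVC

    def select_next_batch(X_tested, y_tested, X_untested, batch_size=50):
        """Largest Selection Strategy: pick the untested compounds
        with the largest decision values (most confidently predicted active)."""
        model = LinearSVC()
        model.fit(X_tested, y_tested)                 # +1 = active, -1 = inactive
        scores = model.decision_function(X_untested)  # larger score = more likely active
        return np.argsort(scores)[::-1][:batch_size]  # indices of compounds to test next

    # Hypothetical fingerprint data:
    rng = np.random.default_rng(0)
    X_tested, y_tested = rng.integers(0, 2, size=(200, 64)), rng.choice([-1, 1], size=200)
    X_untested = rng.integers(0, 2, size=(5000, 64))
    print(select_next_batch(X_tested, y_tested, X_untested)[:10])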
Concluding Remarks
Computational Challenges
Algorithms can work with 100,000s of examples (at considerable computational cost)
Usually several model parameters have to be tuned
(cross-validation is computationally expensive; see the sketch after this list)
Need computer clusters and job-scheduling systems (PBS, Grid Engine)
Often use MATLAB (to be replaced by Python: help!)
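A minimal sketch of why the tuning is expensive: a 4 x 4 parameter grid with 5-fold cross-validation already means 80 SVM trainings; the dataset and grid below are arbitrary examples.

    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)

    # 4 x 4 parameter grid with 5-fold cross-validation = 80 SVM trainings.
    grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2, 1e-1]}
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5, n_jobs=-1)  # n_jobs=-1: use all cores of a cluster node
    search.fit(X, y)
    print(search.best_params_, search.best_score_)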
Machine learning is an exciting research area involving Computer Science, Statistics & Mathematics, with
a large number of present and future applications (in all situations where data is available but explicit knowledge is scarce),
an elegant underlying theory,
and an abundance of questions to study.
New computational biology group in Tübingen: looking for people to hire
Thanks for Your Attention!
Gunnar Rätsch
http://www.tuebingen.mpg.de/~raetsch
[email protected]
Colleagues & Contributors: K. Bennett, G. Dornhege, A. Jagota, M. Kawanabe, J. Kohlmorgen, S. Lemm, C. Lemmen, P. Laskov, J. Liao, T. Lengauer, R. Meir, S. Mika, K.-R. Müller, T. Onoda, A. Smola, C. Schäfer, B. Schölkopf, R. Sommer, S. Sonnenburg, J. Srinivasan, K. Tsuda, M. Warmuth, J. Weston, A. Zien
Special Thanks: Nora Toussaint, Julia Lüning, Matthias Noll