STAT/CSE 416: Intro to Machine Learning


Welcome
Emily Fox
University of Washington
March 27, 2018

©2018 Emily Fox STAT/CSE 416: Intro to Machine Learning

Machine learning is
changing the world


Old view of ML

Data → ML Algorithm → “My curve is better than your curve” → Write a paper

Disruptive companies differentiated by INTELLIGENT APPLICATIONS using Machine Learning

Search, Coupons, Retail, Movie Distribution, Networking, Music, Campaigning, Advertising, Real Estate, Human Resources, Legal Advice, Dating, Wearables, CRM, Taxis

What is machine learning?


Generically…
Study of algorithms that
improve their performance
at some task
with experience


The machine learning pipeline

Data → ML Method → Intelligence

ML case studies


Case Study 1:
Predicting house prices

Data → ML Method: Regression → Intelligence

[Figure: price ($) plotted against house size; house features are used to predict the price of a new house ($ = ??)]

What is regression?
From features to predictions

Data → Regression → Intelligence

Input x: features derived from data
Learn x → y relationship
Predict y: continuous “output” or “response” to input
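
The x → y relationship above can be illustrated with a minimal sketch, assuming a single feature (house size) and a straight-line model fit by least squares. The data values and the function names `fit_line` and `predict` are hypothetical and chosen only for illustration; the slides do not prescribe this particular model.

```python
# Hypothetical toy data: house sizes (sq. ft.) and sale prices ($).
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200000.0, 300000.0, 400000.0, 500000.0]


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict the continuous output y for a new input x."""
    return intercept + slope * x


intercept, slope = fit_line(sizes, prices)
predicted_price = predict(intercept, slope, 1800.0)
```

Here the learned line is then used to predict the price of a house whose size was not in the training data.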


Salary after STAT/CSE 416

hard work

• How much will your salary be? (y = $$)


• Depends on x = performance in courses, quality of
programming assignments, # of discussion responses, …


Stock prediction
• Predict the price of a stock (y)
• Depends on x =
- Recent history of stock price
- News events
- Related commodities


Tweet popularity
• How many people will retweet your tweet?
• Depends on # followers, # of followers of followers,
features of text tweeted,
popularity of hashtag,
# of past retweets,…


Reading your mind

Inputs x are brain region intensities
Output y: from very sad to very happy

Case Study 2:
Sentiment analysis

Data → ML Method: Classification → Intelligence

Example review: “Sushi was awesome, the food was awesome, but the service was awful.”

All reviews:
Score(x) > 0: “awesome”
Score(x) < 0: “awful”
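
The Score(x) rule above can be sketched as follows, assuming the score is a sum of per-word weights. The weights in `WORD_WEIGHTS` are hypothetical, hand-picked values (a real classifier would learn them from labeled reviews), and `score` and `classify` are illustrative names, not functions from any course library.

```python
# Hypothetical word weights; in practice these would be learned from labeled reviews.
WORD_WEIGHTS = {"awesome": 1.0, "awful": -1.5}


def score(review):
    """Sum the weights of the words in the review (unknown words count as 0)."""
    words = review.lower().replace(",", " ").replace(".", " ").split()
    return sum(WORD_WEIGHTS.get(word, 0.0) for word in words)


def classify(review):
    """Predict +1 (positive) if Score(x) > 0, otherwise -1 (negative)."""
    return 1 if score(review) > 0 else -1


review = "Sushi was awesome, the food was awesome, but the service was awful."
label = classify(review)
```

With these assumed weights, two occurrences of “awesome” outweigh one “awful”, so the example review is classified as positive.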

What is classification?
From features to predictions

Data → ML Method: Classifier → Intelligence

Input x: features derived from data
Learn x → y relationship
Predict y: categorical “output”, class or label

Spam filtering
Input x: text of email, sender, IP, …
Output y: Spam or Not spam

Multiclass classifier
Output y has more than 2 categories

Input x: webpage
Output y: Education, Finance, or Technology

Image classification

Input x: image pixels
Output y: predicted object

Personalized medical diagnosis


Input x → Disease Classifier MODEL → Output y: Healthy, Cold, Flu, or Pneumonia

Reading your mind


Inputs x are brain region intensities
Output y: “Hammer” or “House”

Case Study 3:
Document retrieval

Data → ML Method: Nearest neighbor → Intelligence

What is retrieval?
Search for related items

Data → ML Method: Nearest Neighbor → Intelligence

Input x, {x’}: features for query point + features of all other datapoints
Compute distances to other x’
Output xNN: “nearest” point or set of points to query
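
The retrieval step above can be sketched in a few lines, assuming Euclidean distance between feature vectors (the slide does not say which distance is used). The article features and the names `euclidean_distance` and `nearest_neighbors` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_neighbors(query, datapoints, k=1):
    """Return the indices of the k datapoints closest to the query."""
    order = sorted(range(len(datapoints)),
                   key=lambda i: euclidean_distance(query, datapoints[i]))
    return order[:k]


# Hypothetical 2-D feature vectors for four articles.
articles = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
query_article = (1.0, 1.1)

single = nearest_neighbors(query_article, articles, k=1)
top_two = nearest_neighbors(query_article, articles, k=2)
```

Setting `k=1` returns the single nearest point, while a larger `k` returns a set of nearest points, matching the two kinds of output described above.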

Retrieve “nearest neighbor” article


[Figure: space of all articles, organized by similarity of text, with the query article and its nearest neighbor marked]

Or set of nearest neighbors


[Figure: space of all articles, organized by similarity of text, with the query article and its set of nearest neighbors marked]

Retrieval applications
Just about everything…
• Products
• Images
• News articles
• Streaming content:
- Songs
- Movies
- TV shows
- …
• Social networks (people you might want to connect with)

Case Study 3++:
Document structuring for retrieval

Data → ML Method: Clustering → Intelligence

Example clusters: SPORTS, WORLD NEWS, ENTERTAINMENT, SCIENCE

What is clustering?
Discover groups of similar inputs

Data → ML Method: Clustering → Intelligence

Input {x}: features for points in dataset
Separate points into disjoint sets
Output {z}: cluster labels per datapoint
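
One way to separate points into disjoint sets is k-means, which appears in the course's list of algorithms. The following is a minimal sketch, assuming squared Euclidean distance, fixed initial centers, and a fixed number of iterations; the data and all function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_clusters(points, centers):
    """Label each point with the index z of its closest center."""
    return [min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points]


def update_centers(points, labels, centers):
    """Move each center to the mean of the points assigned to it."""
    new_centers = []
    for j, old_center in enumerate(centers):
        members = [p for p, z in zip(points, labels) if z == j]
        if not members:
            new_centers.append(old_center)  # keep an empty cluster's center in place
            continue
        dim = len(old_center)
        new_centers.append(tuple(sum(p[d] for p in members) / len(members)
                                 for d in range(dim)))
    return new_centers


def kmeans(points, initial_centers, iterations=10):
    """Alternate between assigning points and updating centers."""
    centers = list(initial_centers)
    labels = assign_clusters(points, centers)
    for _ in range(iterations):
        centers = update_centers(points, labels, centers)
        labels = assign_clusters(points, centers)
    return labels, centers


# Hypothetical 2-D data with two visibly separate groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
```

The returned `labels` list plays the role of {z}: one cluster label per datapoint.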


Clustering images
For search, group as:
- Ocean
- Pink flower
- Dog
- Sunset
- Clouds
- …

Or users on websites…
Discover groups of
users for better
targeting of content


Embedding
Example: Embedding images to visualize data
Data → ML Method: PCA → Intelligence

Input: images with thousands or millions of pixels
Can we give each image a coordinate, such that similar images are near each other? [Saul & Roweis ‘03]
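
As a rough illustration of giving each image a coordinate, the sketch below projects each image onto the first principal component, found by power iteration on the sample covariance matrix. This is a simplified stand-in for PCA under stated assumptions: a one-dimensional embedding, a fixed iteration count, and a starting vector that is not orthogonal to the top component. The tiny 3-pixel "images" and the function names are hypothetical.

```python
import math


def top_principal_direction(points, iterations=100):
    """Approximate the first principal component with power iteration."""
    n, dim = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    centered = [[p[d] - means[d] for d in range(dim)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    direction = [1.0] * dim  # assumed not orthogonal to the top component
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in product))
        direction = [v / norm for v in product]
    return means, direction


def embed(point, means, direction):
    """1-D coordinate: projection of the centered point onto the direction."""
    return sum((x - m) * d for x, m, d in zip(point, means, direction))


# Hypothetical 3-pixel "images"; real images would have thousands or millions of pixels.
images = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (3.0, 3.0, 0.0)]
means, direction = top_principal_direction(images)
coordinates = [embed(image, means, direction) for image in images]
```

Images that are similar pixel by pixel end up with nearby coordinates, which is the property the slide asks for.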

Case Study 4:
Product recommendation
Data → ML Method: Matrix Factorization → Intelligence

Input: your past purchases + purchase histories of all customers
Model: features for each customer and features for each product
Output: recommended items
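
Once customer and product features are available, a recommendation can be sketched as ranking products by the dot product of the two feature vectors. The sketch below covers only this scoring step, not the factorization that learns the features; the feature values, product names, and function names are hypothetical.

```python
def predicted_score(customer_features, product_features):
    """Dot product of a customer's and a product's feature vectors."""
    return sum(c * p for c, p in zip(customer_features, product_features))


def recommend(customer_features, products, already_purchased, top_n=2):
    """Rank products the customer has not bought by predicted score."""
    candidates = [name for name in products if name not in already_purchased]
    candidates.sort(key=lambda name: predicted_score(customer_features, products[name]),
                    reverse=True)
    return candidates[:top_n]


# Hypothetical learned features (two latent dimensions); real values would come from
# factorizing the customer-by-product purchase matrix.
customer = (0.9, 0.1)
products = {
    "stroller": (1.0, 0.0),
    "diapers": (0.8, 0.1),
    "headphones": (0.0, 1.0),
    "crib": (0.9, 0.2),
}

recommended = recommend(customer, products, already_purchased={"stroller"})
```

Products whose features align with the customer's features score highest, so they are recommended first.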


Recommender systems applications

Movies Songs Friends, apps, …


Case Study 5:
Visual product recommender

Data → ML Method: Deep Learning → Intelligence

Input images → Nearest neighbors

[Figure: neural network with inputs x1 and x2, Layer 1 producing hidden units z1 and z2, and Layer 2 producing output y]

What is (supervised) deep learning?


Flexible method for performing classification or regression

Data → ML Method: Deep Learning → Intelligence

Input {x}: raw data or extracted features for points in dataset
Nonlinear feature representation
Output {z}: label or value
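
A forward pass through a small two-layer network gives a feel for what "nonlinear feature representation" means. The sketch below follows the notation of the network diagram in Case Study 5 (inputs x, hidden units z, output y) and assumes sigmoid nonlinearities; the weights are hypothetical, hand-picked values rather than learned ones, and training is not shown.

```python
import math


def sigmoid(v):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid nonlinearity."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, w1, b1, w2, b2):
    """Layer 1 maps inputs x to hidden features z; layer 2 maps z to output y."""
    z = layer(x, w1, b1)
    y = layer(z, w2, b2)[0]
    return z, y


# Hypothetical, hand-picked weights (not learned) for a 2-input, 2-hidden-unit network.
w1 = [[1.0, -1.0], [-1.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[2.0, 2.0]]
b2 = [-2.0]

z, y = forward([0.0, 0.0], w1, b1, w2, b2)
```

The hidden values `z` are the nonlinear features, and `y` is the predicted value computed from them.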

ImageNet 2012 competition:
1.2M training images, 1000 categories

[Figure: bar chart of error (best of 5 guesses), axis from 0 to 0.3, for the top 3 teams: SuperVision, ISI, OXFORD_VGG. Annotations: “Huge gain”; “Exploited hand-coded features like SIFT”]
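
The "best of 5 guesses" error in the chart can be read as follows: a prediction counts as correct if the true category is among the model's five highest-ranked guesses. A minimal sketch under that reading, with hypothetical predictions and an illustrative function name `top5_error`:

```python
def top5_error(ranked_guesses, true_labels):
    """Fraction of examples whose true label is not among the first 5 guesses."""
    misses = sum(1 for guesses, truth in zip(ranked_guesses, true_labels)
                 if truth not in guesses[:5])
    return misses / len(true_labels)


# Hypothetical predictions for three images, each a ranked list of guessed categories.
ranked_guesses = [
    ["cat", "dog", "fox", "wolf", "lynx", "bear"],
    ["car", "truck", "bus", "van", "tram", "bike"],
    ["rose", "tulip", "daisy", "lily", "iris", "orchid"],
]
true_labels = ["fox", "bike", "rose"]

error = top5_error(ranked_guesses, true_labels)
```

In this toy example only the second image is missed, because its true label is the sixth guess.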


Examples of deep learning success stories


• Image classification
• Image segmentation
• Image captioning
• Object detection
• Speech recognition
• Speech synthesis
• Machine translation
• Handwriting recognition
• …


Other ML topics we won’t cover


• Reinforcement learning
• Learning theory
• Active learning
• Multi-task and transfer learning
• Spectral methods
• …


Syllabus


Will learn about the ML pipeline…

Data → ML Method → Intelligence

Task → Model & parameters → Optimization algorithm → Evaluation

Use pre-specified (black box)

Level of the course


Motto:
tough concepts made intuitive and applicable

minimize prereq knowledge


maximize ability to develop and deploy
learn concepts through case studies


Detailed topics

Models
• Linear regression, regularized approaches (ridge, Lasso)
• Linear classifiers: logistic regression
• Non-linear models: decision trees
• Nearest neighbors, clustering
• Recommender systems
• Deep learning

Algorithms
• Gradient descent
• Boosting
• K-means

Concepts
• Point estimation, MLE
• Loss functions, bias-variance tradeoff, cross-validation
• Sparsity, overfitting, model selection
• Decision boundaries

Course logistics


Prerequisites
• Formally:
- Either CSE 143 or CSE 160; either STAT 311 or STAT/MATH 390 or STAT 391
• Basic Probability + Statistics
- Distributions, densities, independence, marginalization, conditioning,
expectation, variance…
• Programming
- Python will be very useful, but we’ll help you get started

• We provide some background, but the class will be fast paced!

• Ability to deal with “abstract mathematical concepts”


Computing needs
• Everything will be on JupyterHub
- Just need to log in
- No need to install and run Python locally
- Email with username/password

iPython notebooks are the thing!!!


(Real tool people use)
JupyterHub will make things seamless


Course staff + office hours


Instructor:
• Emily Fox
- Office hours: Thursdays, 11:30am – 12:30pm, CSE 568

TAs:
• Devin Didericksen
- Office hours: Tuesday 3:30 – 5:00pm, 3rd floor CSE breakout

• Varun Mahadevan
- Office hours: Wednesdays, 12:30 – 2pm, 5th floor CSE breakout

• John Kaltenbach
- Office hours: TBA

• Hunter Schafer
- Office hours: Mondays, 12:30 – 2pm; Tuesdays 12:30 – 1:30pm, CSE 220

• Patrick Spieker
- Office hours: Wednesdays and Fridays, 10:30 – 11:30am, 3rd floor CSE breakout


Quiz Sections
• Important to attend weekly
• Topics:
- Intros to and demos of running things in Python
- Reinforcing concepts from lecture
- Bonus material to supplement lectures


Communication channels – us to you


• Course email list
- Announcements from us. Please check your email!

• Course website
- https://courses.cs.washington.edu/courses/cse416/18sp/
- Lecture slides, quiz section handouts, high-level (static) course info

• Canvas
- Discussion board, access to concept quizzes, submissions of work, and grades

• Google calendar
- Live updates to schedules (also via email to course mailing list)
- Shared url to be announced…stay tuned


Communication channels – you to us/each other


• Canvas discussion board
- For all non-personal questions
- Answering your question will help others
- Feel free to (and please do!) chime in
- Guidelines and expectations:
• Look through threads before posting a new one
• Reflect on question before posting
• Our goal is to respond within 24 hrs

• Instructor email list: cse416-staff@cs.washington.edu
- Only for personal issues

Textbooks
• None! Come to lectures and quiz sections
- Annotated slides will be posted
- Quiz section handouts will be posted
- Blog posts and other sources will sometimes be referenced, too

• Optional Books:
- A Course in Machine Learning; Hal Daumé III
http://ciml.info
- Machine Learning: A Probabilistic Perspective; Kevin Murphy
- Pattern Recognition and Machine Learning; Chris Bishop
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction;
Trevor Hastie, Robert Tibshirani, Jerome Friedman
https://web.stanford.edu/~hastie/ElemStatLearn/


Programming assignments
Programming assignments are hands-on experience with ML methods on
real data. The assignments are hard, so start early :)

Submission procedure and late policy:


• Use Canvas to submit code and answers related to running the code
• 2 late days per quarter, and then 33% subtracted per late day
• All assignments must be handed in, even for zero credit

Collaboration policy:
• You may discuss the questions
• Each student must write their own code and submit their own answers
- We will be using cheating-detection software
• Submit the names of anyone with whom you collaborate
• Please don’t search for answers on the web, Google, etc.
- Please ask us if you are not sure whether you can use a particular reference

Exams
• Concept quizzes
- Online!!!
- Spread throughout the quarter
- At least one per major topic
- Primary purpose is to make sure you are following content
- Must be completed 100% individually

• Final
- Finals week
- Monday, June 4, 10:30-12:20 in MLR 301


Grading
• Programming assignments (60%)
- Start early, Start early, Start early, Start early, Start early, Start early,
Start early, Start early, Start early, Start early, Start early, Start early,
Start early, Start early, Start early, Start early, Start early, Start early
- Bonus Assignment 0 to get setup with tools (0%)

• Concept quizzes (15%)


- Bonus Concept quiz 0 to refresh prob/stat background (0%)

• Final (25%)
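
The weights above combine into a course grade as a weighted average. A minimal sketch, assuming each component score is expressed as a fraction between 0 and 1 and ignoring late penalties; the student scores and the name `course_grade` are hypothetical.

```python
# Weights taken from the grading slide; component scores are fractions in [0, 1].
WEIGHTS = {"programming": 0.60, "quizzes": 0.15, "final": 0.25}


def course_grade(scores):
    """Weighted average of the three graded components."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores.
grade = course_grade({"programming": 0.90, "quizzes": 0.80, "final": 0.70})
```

Because programming assignments carry 60% of the weight, they dominate the result in this example.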


Getting started in CSE 416


• Concept quiz 0
- Recall basic prob/stat topics
• Programming assignment 0
- Intro to iPython notebooks and Turi Create tutorial

• Resources:
- Java-to-Python guide (thanks to Hunter!)
- Videos on Python and Turi Create fundamentals
- Quiz section intro to running things on JupyterHub



You’ll be able to do
amazing things…
