00 Course Introduction
Sprint 0 - Week 0
2024 H1
Copyright © 2021 ML6. All rights reserved.
Copyright © 2023 ML6. All rights reserved.
ML6 Confidential
ML6 Public Information
ML6 at a glance.
One of the largest and fastest growing AI business & engineering teams
in Europe since 2013.
MACHINE LEARNING
AI techniques that give machines the ability to learn from data without being explicitly programmed, i.e. to automatically improve through experience.
13 trillion USD
Most of it will be outside the
consumer internet industry
…
Why do we need ML Systems Design?
Building an ML application means implementing much more than just your ML model.
INFO 9023 - Machine Learning Systems Design
ML System: All the components responsible for the implementation and management of the data
and models powering an ML application.
ML Systems Design: The act of designing the architecture and implementing an ML System.
Output:
● Jupyter Notebook
● Single model working on static dataset
Output:
● Deployed model (e.g. API in the Cloud)
● Monitor live model performance
● Directly connected to data source
● Fully automated pipeline to train and deploy new models
● …
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
Key concept: Data preparation
It all starts with data. How to go through all these steps efficiently and effectively.
Key concept: ML model serving
How to efficiently serve an ML model to clients.
(Labeled images: ImageNet, Captcha, …)
= Latency
Key concept: ML model deployment
How to efficiently deploy your model for serving.
Key concept: Containerisation
● Application code
● configuration files
● libraries
● dependencies
https://www.ibm.com/topics/containerization
Key concept: APIs
Allow other services to call your model or application.
https://www.postman.com/what-is-an-api/
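To make the idea of exposing a model behind an API concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the `predict` function is a hypothetical stand-in for a trained model, and the `/predict` endpoint name, the `features` field and port 8000 are arbitrary choices, not something prescribed by the course.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Tuple


def predict(features: List[float]) -> float:
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


def handle_prediction_request(body: bytes) -> Tuple[int, bytes]:
    """Map a JSON request body to an HTTP status code and a JSON response body."""
    try:
        payload = json.loads(body)
        features = [float(x) for x in payload["features"]]
        if not features:
            raise ValueError("features must not be empty")
    except (ValueError, KeyError, TypeError) as error:
        # Malformed JSON, missing field or non-numeric values: client error.
        return 400, json.dumps({"error": str(error)}).encode()
    return 200, json.dumps({"prediction": predict(features)}).encode()


class PredictionHandler(BaseHTTPRequestHandler):
    """Exposes the model on POST /predict (endpoint name is an assumption)."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_prediction_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), PredictionHandler).serve_forever()
```

Calling `serve()` starts the server, after which any other service can send a POST request with a body such as `{"features": [1, 2, 3]}` to `/predict`. Keeping the request handling in a plain function separate from the HTTP class makes it easy to test without opening a socket.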
Key concept: Cloud infrastructure
Cloud infrastructure allows for data storage, compute allocation, model training and deployment, monitoring, …
Key concept: ML Pipeline
Orchestrates components to prepare data, train, evaluate and deploy ML models
(among other things)
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
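As a rough illustration of what "orchestrating components" can mean, the sketch below chains four plain Python functions: prepare data, train, evaluate and deploy. It is a toy under stated assumptions, not how a production orchestrator works: the data is synthetic, the "model" is a one-variable least-squares fit, the "registry" is a dictionary, and the `max_mse` deployment gate is a hypothetical threshold.

```python
import random


def prepare_data(n_samples=200, seed=0):
    """Generate synthetic (x, y) pairs with y close to 2x + 1, split 80/20."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n_samples)]
    data = [(x, 2 * x + 1 + rng.gauss(0, 0.1)) for x in xs]
    split = int(0.8 * n_samples)
    return data[:split], data[split:]


def train(train_set):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(train_set)
    mean_x = sum(x for x, _ in train_set) / n
    mean_y = sum(y for _, y in train_set) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in train_set)
    var = sum((x - mean_x) ** 2 for x, _ in train_set)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def evaluate(model, test_set):
    """Return the mean squared error of the model on held-out data."""
    errors = [
        (model["slope"] * x + model["intercept"] - y) ** 2 for x, y in test_set
    ]
    return sum(errors) / len(errors)


def deploy(model, registry):
    """Stand-in for deployment: store the model under a 'production' key."""
    registry["production"] = model


def run_pipeline(registry, max_mse=0.05):
    """Run all steps in order; deploy only if the evaluation gate passes."""
    train_set, test_set = prepare_data()
    model = train(train_set)
    mse = evaluate(model, test_set)
    deployed = mse <= max_mse
    if deployed:
        deploy(model, registry)
    return {"mse": mse, "deployed": deployed}
```

The point of the sketch is the shape of `run_pipeline`: each step consumes the output of the previous one, and the evaluation result decides whether deployment happens at all. Real pipelines add scheduling, caching, retries and tracking on top of this structure.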
Key concept: Monitoring
Ensuring that models in production are performing well.
● Is the model still making accurate predictions with the new data coming in?
● Is the data distribution changing?
● Is the target variable changing?
● Are concepts around the model changing?
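One way to approach the question "is the data distribution changing?" is to compare a feature's values at training time with its recent live values. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) in plain Python. The `threshold` of 0.3 is a hypothetical value chosen for illustration; in practice it would be tuned per feature, and this is only one of several possible drift measures.

```python
from bisect import bisect_right


def ks_statistic(reference, live):
    """Largest absolute gap between the empirical CDFs of two samples."""
    ref, cur = sorted(reference), sorted(live)
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the maximum.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in ref + cur
    )


def has_drifted(reference, live, threshold=0.3):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, live) > threshold
```

A statistic of 0 means the two samples have identical empirical distributions, and 1 means they do not overlap at all. A monitoring job could run `has_drifted` on each input feature at a regular interval and raise an alert when it returns `True`.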
Ethical + Robust = Trustworthy AI
1. Human agency & oversight
2. Technical robustness & safety
3. Privacy & data governance
4. Transparency
5. Diversity, non-discrimination, fairness
6. Societal & environmental well-being
7. Accountability
https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
Roles & organisation of ML projects
Typical ML project lifecycle
https://martinfowler.com/articles/cd4ml.html
Different set of skills per role
https://www.linkedin.com/posts/maria-vechtomova_data-ai-mlops-activity-7125095266403143680-9X-U/
In reality it’s a bit blurry
https://www.linkedin.com/posts/maria-vechtomova_data-ai-mlops-activity-7125095266403143680-9X-U/
ML Engineering skills are in high demand
Teams can adopt different MLOps maturity levels
Level 0: No MLOps
Highlights
● Difficult to manage full ML model lifecycle
● Teams are disparate and releases are painful
● "Black boxes," little feedback during/post deployment
Technology
● Manual training, builds and deployments
● Manual testing of model and application
● No centralized tracking of model performance
Level 2: Automated Training
Highlights
● Training environment is fully managed and traceable
● Easy to reproduce model
● Releases are manual, but low friction
Technology
● Automated model training
● Centralized tracking of model training performance
● Model management
Level 3: Automated Deployment
Highlights
● Releases are low friction and automatic
● Full traceability from deployment back to original data
● Entire environment managed: dev > test > production
Technology
● Integrated A/B testing of model performance
● Automated tests for all code
● Centralized tracking of model training performance
Level 4: Full MLOps
Highlights
● Full system automated and easily monitored
● Automated feedback collection and retraining
● Close to zero-downtime
Technology
● Automated model training and testing
● Verbose, centralized metrics from deployed model
https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/mlops-maturity-model#level-0-no-mlops
Study on demanded skills for MLOps engineers.
Looking at 310 job offers on MLOps in Q4 2023.
https://marvelousmlops.substack.com/p/what-do-teams-really-want-from-an?utm_source=profile&utm_medium=reader2
Going from standard ML Engineer to MLOps master…
Real-life example of an MLOps system
Post features
Member features
Engagement features
https://www.linkedin.com/blog/engineering/trust-and-safety/viral-spam-content-detection-at-linkedin
LinkedIn integrates many ML applications
Personalised LinkedIn News Feed
Select personalised content for users…
… Using boosted tree algorithm on the following features:
Identity: Who are you? Where do you work? What are your skills? Who are you connected with?
Behavior: What have you liked and shared in the past? Who do you interact with most frequently? Where do you spend the most time in your news feed?
https://www.linkedin.com/blog/engineering/feed/a-look-behind-the-ai-that-powers-linkedins-feed-sifting-through
LinkedIn’s Productivity Machine Learning (Pro-ML) platform.
Teams of data scientists
https://www.linkedin.com/blog/engineering/machine-learning/one-stop-mlops-portal-at-linkedin
LinkedIn Pro-ML platform.
Step: Model authoring, creation, and evaluation
https://www.linkedin.com/blog/engineering/machine-learning/one-stop-mlops-portal-at-linkedin
LinkedIn Pro-ML platform.
Step: Model productionisation
https://www.linkedin.com/blog/engineering/machine-learning/one-stop-mlops-portal-at-linkedin
LinkedIn Pro-ML platform.
Step: “Health insurance” (aka monitoring)
https://www.linkedin.com/blog/engineering/machine-learning/one-stop-mlops-portal-at-linkedin
Course organisation
Objective for this course.
We want to equip you with practical skills to make a positive impact with ML 🚀
We’ll cover key concepts of MLSD during our classes. We will also host Labs to show how to use key tools to develop ML applications.
We’re happy if you learn useful things and can go apply your own ideas.
Structure of the course
Learning streams and pillars
Learning streams
● Lectures
● Labs
● Group Project
Pillars
● Concrete concepts of building ML applications. Tailored choice of current best practices. …
● Labs, resources, real life examples, time to experiment, support line, …
● Healthy tempo (break out exercises, QA, …). Lots of memes.
We’re making history
This is the first version of this class
Class organisation
● We meet every Monday from 9:00 to 12:30 in B28 R.75 (0/75) [Liège Sart-Tilman - Polytech].
● Typically you’ll have about 2h of lecture + labs. The remaining time can be spent working on your project.
Useful links:
Build one ML system throughout the course. You pick the application yourselves.
● Teams: 3 - 5 students
○ Try to form a group by next week!
○ Let the teaching staff know if you don’t have a group and you’ll be assigned one
● Structure
○ The building blocks to be implemented in the project follow the course’s 5 sprints.
● Handovers
○ There will be 3 milestone meetings where you can present your results
○ Code submission - make sure to document anything you want the teaching staff to read
● Support
○ Lectures/labs will often be shorter than the time slot for this course. You can spend the extra time working with your team. Teaching staff will be in the room to provide support.
○ Open office hours on Monday afternoon in office number I 77 B in Montefiore
○ Feel free to reach out by email if you have any questions or struggles
● You’re in the driving seat!
○ Many building blocks are optional. You are free to choose the overall design and tools used for your project. Experiment and ask questions if you have any.
Project
Guiding principles