Using Django, Docker and Scikit-Learn To Bootstrap Your Machine Learning Project
http://bit.ly/2s5R01V
How I’ll approach today’s chat
1. Review of machine learning
2. Anatomy of a data science team
3. Engineering a machine learning problem
4. Iterating on machine learning engineering with Docker, Django, and scikit-learn (sklearn)
What is machine learning?
Machine Learning
Task: Classify a piece of data. Is a pizza request successful? Is it altruistic or not?
Experience: Labeled training data
Request_id | No
Request_id | Yes
Performance Measurement: Is the label correct?
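The performance measurement ("Is the label correct?") can be sketched as a plain accuracy score: the fraction of predictions that match the true labels. This is a minimal illustration, not code from the talk; the function name accuracy and the Yes/No sample values are made up for the example. scikit-learn's sklearn.metrics.accuracy_score computes the same quantity.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of labels")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for four pizza requests ("Yes" = the request succeeded).
y_true = ["No", "Yes", "No", "No"]
y_pred = ["No", "Yes", "Yes", "No"]
score = accuracy(y_true, y_pred)  # three of the four labels match
```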
Why Python has been adopted by the scientific community
Feature engineering is expensive; it takes time to:
- Shape the data
- Select which features to use
- Collect data!
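As a rough sketch of the "shape the data" and "select features" steps, the snippet below turns raw pizza requests into fixed-length numeric vectors. The field names (request_text, account_age_days), the chosen features, and the sample records are all hypothetical and were picked only for illustration; a real project would select features by experiment.

```python
def extract_features(request):
    """Turn one raw request (a dict) into a fixed-length numeric feature vector."""
    words = request.get("request_text", "").lower().split()
    return [
        len(words),                                    # request length in words
        1 if any("thank" in w for w in words) else 0,  # does it express gratitude?
        float(request.get("account_age_days", 0)),     # hypothetical metadata field
    ]


# Made-up records standing in for collected data.
raw_requests = [
    {"request_text": "Lost my job, would love a pizza. Thanks!", "account_age_days": 120},
    {"request_text": "pizza pls", "account_age_days": 2},
]
X = [extract_features(r) for r in raw_requests]
```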
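from sklearn.model_selection import train_test_split

# get_xy() is assumed to return the feature matrix X and the labels y
X, y = get_xy()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1111)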
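To make the hold-out split concrete, here is a stdlib-only sketch of the idea behind train_test_split: shuffle the row indices with a fixed seed, then reserve a fraction of them for testing. It is not scikit-learn's implementation and will not reproduce its exact shuffle order; holdout_split and the toy data are hypothetical. The 0.25 test fraction mirrors scikit-learn's default.

```python
import math
import random


def holdout_split(X, y, test_size=0.25, seed=1111):
    """Shuffle rows reproducibly and split them into train and test parts."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)     # a fixed seed gives a repeatable split
    n_test = math.ceil(len(X) * test_size)   # round the test set size up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy data: eight rows whose label is the parity of the single feature.
X = [[i] for i in range(8)]
y = [i % 2 for i in range(8)]
X_train, X_test, y_train, y_test = holdout_split(X, y)
```

Fixing the seed matters for team work: everyone who reruns the notebook evaluates against the same held-out rows.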
Also, according to Kelsey Hightower, “the first rule of Python is you don’t
use the system installed version of Python”
Example Dockerfile
FROM python:3
RUN pip install virtualenv
WORKDIR /home/jupyter
Model versioning
Example Dockerfile with a volume
A Docker volume provides a mountable data directory, so each team member can check notebooks in and out as they see fit
...
VOLUME ["/home/jupyter/notebooks"]
When a data scientist or another team member is ready to save their work, a pickled model written to the volume directory inside the container is stored on the mounted data volume, so it outlives the container
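A minimal sketch of saving a pickled model into the volume directory might look like the following. The default path comes from the VOLUME line above; the helper names save_model and load_model and the timestamp-based file naming are assumptions made for illustration, not the talk's versioning scheme.

```python
import pickle
import time
from pathlib import Path


def save_model(model, name, volume_dir="/home/jupyter/notebooks"):
    """Pickle a model into the mounted volume directory under a timestamped name."""
    directory = Path(volume_dir)
    directory.mkdir(parents=True, exist_ok=True)
    version = time.strftime("%Y%m%dT%H%M%S")      # hypothetical versioning scheme
    path = directory / f"{name}-{version}.pkl"
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    return path


def load_model(path):
    """Load a previously pickled model; only unpickle files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Because each save gets its own file name, earlier versions stay on the volume and can be reloaded for comparison.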
Django-izing Docker + sklearn model
Process for updating a model
Now the process becomes:
Wrap docker-py into Django endpoint

urlpatterns = [
    url(r'^create/image/(?P<model>\w{0,50})',
        create_image, name='create_image'),
]

from docker import APIClient
from io import BytesIO
try:
    with open(path, 'r') as d:
        dockerfile = [x.strip() for x in d.readlines()]
    # Join with newlines so each Dockerfile instruction stays on its own line
    dockerfile = '\n'.join(dockerfile)
    dockerfile = dockerfile.encode('utf-8')
    f = BytesIO(dockerfile)
    # Point to the Docker instance
    cli = APIClient(base_url='tcp://192.168.99.100:2376')

For more information on the Docker Python SDK, reference the docs on the low level API
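One way the steps above could fit together is sketched below: read the Dockerfile into an in-memory file object, then hand it to the client's build call. The helper names dockerfile_to_fileobj and build_image are hypothetical, and the client is passed in as a parameter rather than created here, so the snippet makes no assumption about where the Docker daemon lives. With docker-py, cli would be a docker.APIClient, whose build() method accepts fileobj, tag, rm, and decode and returns a generator of build output. The Django view wiring is left out.

```python
from io import BytesIO


def dockerfile_to_fileobj(path):
    """Read a Dockerfile and return it as an in-memory binary file object."""
    with open(path, "r", encoding="utf-8") as fh:
        lines = [line.strip() for line in fh]
    # Newlines keep every instruction on its own line.
    return BytesIO("\n".join(lines).encode("utf-8"))


def build_image(cli, dockerfile_path, tag):
    """Build an image through a docker-py style client and return its build log."""
    fileobj = dockerfile_to_fileobj(dockerfile_path)
    # rm=True removes intermediate containers; decode=True yields dicts, not raw bytes.
    return list(cli.build(fileobj=fileobj, tag=tag, rm=True, decode=True))
```

Passing the client in also makes the function easy to exercise with a stand-in object when no Docker daemon is available.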
Want to learn more?
Talks: