ndenStanford/ml-mesh

 Onclusive ML Mesh

This repo contains the codebase for the machine learning APIs service mesh.

Repository organisation

This repository contains the modular implementation of the logic powering Onclusive's ML stack:

  1. internal libraries
  2. internally maintained core images
  3. applications serving internally maintained models
  4. applications serving externally maintained models

Libraries

A top-level doc on the libraries can be found here

All internal libraries can be found here. See individual library for detailed documentation.

Core images

An overview of developer utilities and existing images can be found here

All internal core images can be found here. See individual core image for detailed documentation.

Projects

ML projects are decomposed into multiple pre-defined steps that represent an abstraction of a model lifecycle at Onclusive.

An overview of developer utilities and existing images can be found here

  • ingest: if the data needed for training is external to Onclusive, an ingest step is needed to bring data into our internal storage.
  • register: registers the features to be used by the training component.
  • train: trains the model and registers it in the internal model registry.
  • compile: compiles the model (optimized for serving) and registers it in the internal model registry.
  • serve: model served as a REST API.
  • backfill: backfilling.

Strict abstraction boundaries help express the invariants and logical consistency of each component's behaviour (input, processing and output). This allows us to create well-defined patterns that can be applied to implement each of these steps on new projects. Not all of these steps are mandatory: for instance, a pre-trained model used for zero-shot learning will not have a prepare and train step.

Apps

An overview of developer utilities and existing images can be found here

All apps can be found here. See individual app for detailed documentation.

Developing

Setting up your local environment

If you are on macOS, you can run the script ./bin/bootstrap/darwin, which will set up your local machine for development. If you are using Linux, use ./bin/bootstrap/linux.

Windows setup is not supported yet and remains to be explored. If you want to contribute to this, please reach out to the MLOps team.

Set up AWS credentials

Set up your AWS credentials for the dev and prod ML accounts (the default region name should be us-east-2 and the default output format should be json). Ask @mlops on Slack to get your credentials created if you don't have them already.

For your dev credentials:

aws configure --profile dev

For your prod credentials:

aws configure --profile prod

You can also switch profiles at any time by updating the AWS_PROFILE environment variable as follows:

export AWS_PROFILE=dev
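As a small convenience, the switch above could be wrapped in a shell function that rejects anything other than the two profiles configured earlier. This is only a sketch: the function name `use_aws_profile` is hypothetical and is not something the repository provides.

```shell
# Hypothetical helper (not part of the repository): switch the active AWS
# profile, accepting only the two profiles configured above.
use_aws_profile() {
  case "$1" in
    dev|prod)
      export AWS_PROFILE="$1"
      ;;
    *)
      echo "unknown profile '$1' (expected dev or prod)" >&2
      return 1
      ;;
  esac
}

use_aws_profile dev
echo "Active AWS profile: ${AWS_PROFILE}"
```

After switching, running `aws sts get-caller-identity` is one way to confirm which account the credentials resolve to.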

Build all base images

All images used in projects and apps are based on our core Docker images, so building the base images up front saves time. Run the following commands:

make docker.build/python-base
make docker.build/gpu-base
make docker.build/neuron-inference

The builds take about 10 minutes to run, so go stretch your legs, get a coffee, or consult our Contribution Guide.
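The three build commands can also be run as a single loop that stops at the first failure. The sketch below assumes the same three make targets listed above; the function name `build_base_images` and the `MAKE_CMD` override are hypothetical and exist only to allow a dry run.

```shell
# Hypothetical wrapper (not part of the repository). MAKE_CMD can be
# overridden, e.g. MAKE_CMD=echo, to preview the targets without building.
build_base_images() {
  for image in python-base gpu-base neuron-inference; do
    "${MAKE_CMD:-make}" "docker.build/${image}" || {
      echo "build failed for ${image}" >&2
      return 1
    }
  done
}

# Dry run in a subshell: print the targets that would be built.
(MAKE_CMD=echo; build_base_images)
```

Calling `build_base_images` without the override runs the real `make` targets in order.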

Contributing to the codebase

See here for a detailed step-by-step guide on how to contribute to the mesh.

Dependency management

Python is our language of choice. To manage versions effectively, we recommend using pyenv to set up your environment with the repository's official Python version.

Common issues

Poetry commands take a long time to run

If poetry commands take longer than usual to run, it's a good idea to clear the PyPI cache:

poetry cache clear pypi --all

Docker-compose tries to download images instead of building (macOS)

The error message is:

Failed to solve with frontend dockerfile.v0: failed to create LLB definition: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

Run the following command:

export DOCKER_BUILDKIT=0
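The export above affects every later command in the same shell session. If you would rather disable BuildKit for one command only, a wrapper along these lines is one option. It is a sketch: the function name `without_buildkit` is hypothetical and is not part of the repository.

```shell
# Hypothetical helper (not part of the repository): run a single command with
# BuildKit disabled, leaving the rest of the shell session unchanged.
without_buildkit() {
  env DOCKER_BUILDKIT=0 "$@"
}

# Demonstration with a harmless command; in practice the argument would be
# the docker-compose invocation that failed.
without_buildkit sh -c 'echo "DOCKER_BUILDKIT=${DOCKER_BUILDKIT}"'
```

Because the variable is set through `env`, it applies only to the wrapped command and its children.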

No space left on disk (remote instance)

If you run into this error, you can use the make command:

make clean

Resources
