
🦄️ Yatai: Model Deployment at scale on Kubernetes


Yatai helps ML teams deploy large-scale model serving workloads on Kubernetes. It standardizes BentoML deployment on Kubernetes, provides a UI for managing all your ML models and deployments in one place, and enables advanced GitOps and CI/CD workflows.

👉 Pop into our Slack community! We're happy to help with any issue you face or even just to meet you and hear what you're working on :)

Core features:

  • Deployment Automation - deploy Bentos as auto-scaling API endpoints on Kubernetes and easily roll out new versions
  • Bento Registry - manage all your team's Bentos and Models, backed by cloud blob storage (S3, MinIO)
  • Observability - a monitoring dashboard that helps users identify model performance issues
  • CI/CD - flexible APIs for integrating with your training and CI pipelines (see the sketch below)
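
The CI/CD integration uses the same CLI you would use interactively. A minimal sketch of a CI job step, assuming a BentoML 1.x CLI and that the job is given YATAI_API_TOKEN and YATAI_ENDPOINT values (hypothetical variable names for your own secrets):

    # log in non-interactively, then build and push the Bento from the CI workspace
    bentoml yatai login --api-token "$YATAI_API_TOKEN" --endpoint "$YATAI_ENDPOINT"
    bentoml build
    bentoml push iris_classifier:latest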

[Screenshot: Yatai overview page]

See more product screenshots: deployment creation, Bento repositories, model detail, cluster components, deployment details, and activities.

Why Yatai

  • Yatai is built upon BentoML, the unified model serving framework that is high-performing and feature-rich
  • Yatai focuses on the model serving and deployment part of your MLOps stack and works well with any ML training/monitoring platform, such as AWS SageMaker or MLflow
  • Yatai is Kubernetes-native and integrates well with other cloud-native tools in the K8s ecosystem
  • Yatai is human-centric, providing an easy-to-use Web UI and APIs for ML scientists, MLOps engineers, and project managers

Getting Started

1. Install Yatai locally with Minikube
  • Prerequisites: minikube, Helm, and kubectl installed locally
  • Start a minikube Kubernetes cluster: minikube start --cpus 4 --memory 4096
  • Install Yatai Helm Chart:
    helm repo add yatai https://bentoml.github.io/yatai-chart
    helm repo update
    helm install yatai yatai/yatai -n yatai-system --create-namespace
  • Wait for the installation to complete (this may take a few minutes): helm status yatai -n yatai-system
  • Start minikube tunnel for accessing Yatai UI: sudo minikube tunnel
  • Get initialization link for creating your admin account:
    export YATAI_INITIALIZATION_TOKEN=$(kubectl get secret yatai --namespace yatai-system -o jsonpath="{.data.initialization_token}" | base64 --decode)
    echo "Visit: http://yatai.127.0.0.1.sslip.io/setup?token=$YATAI_INITIALIZATION_TOKEN"
2. Get an API token and log in with the BentoML CLI
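  • Create an API token from the Web UI, then log in from the CLI. A minimal example, assuming a BentoML 1.x CLI and the local endpoint from step 1; replace the placeholder token with your own:
    bentoml yatai login --api-token $YOUR_API_TOKEN --endpoint http://yatai.127.0.0.1.sslip.io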
3. Push a Bento to Yatai
  • Train a sample ML model and build a Bento using code from the BentoML Quickstart Project:
    git clone https://github.com/bentoml/gallery.git && cd ./gallery/quickstart
    pip install -r ./requirements.txt
    python train.py
    bentoml build
  • Push your newly built Bento to Yatai:
    bentoml push iris_classifier:latest
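  • To find the exact tag of the Bento you just built, list your local Bento store; bentoml list prints each Bento with its tag, size, and creation time:
    bentoml list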
4. Create your first deployment!
  • A Bento Deployment can be created via the Web UI or with a kubectl command:

    • Deploy via the Web UI

    • Deploy directly via a kubectl command:

      • Define your Bento deployment in a my_deployment.yaml file:
          apiVersion: serving.yatai.ai/v1alpha2
          kind: BentoDeployment
          metadata:
            name: my-bento-deployment
            namespace: my-namespace
          spec:
            bento_tag: iris_classifier:3oevmqfvnkvwvuqj
            ingress:
              enabled: true
            resources:
              limits:
                  cpu: "500m"
                  memory: "512m"
              requests:
                  cpu: "250m"
                  memory: "128m"
            autoscaling:
              max_replicas: 10
              min_replicas: 2
            runners:
            - name: iris_clf
              resources:
                limits:
                  cpu: "1000m"
                  memory: "1Gi"
                requests:
                  cpu: "500m"
                  memory: "512m"
                autoscaling:
                  max_replicas: 4
                  min_replicas: 1
      • Apply the deployment to your minikube cluster:
        kubectl apply -f my_deployment.yaml
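      • Check that the deployment was created; assuming the BentoDeployment CRD registers the plural resource name bentodeployments, a quick status check looks like:
        kubectl -n my-namespace get bentodeployments
        kubectl -n my-namespace get pods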
  • Monitor the deployment process on the Web UI and test the endpoint once the deployment is ready:

    curl \
        -X POST \
        -H "content-type: application/json" \
        --data "[[5, 4, 3, 2]]" \
        https://demo-default-yatai-127-0-0-1.apps.yatai.dev/classify
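
    If the deployment is healthy, the endpoint returns the model's prediction as JSON; for the quickstart iris classifier that is a predicted class index such as [2], though the exact value depends on the trained model. Note that the hostname above is generated by the deployment's ingress and will differ in your cluster.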
5. Moving to production
  • See the Administrator's Guide for a comprehensive overview of deploying and configuring Yatai for production use.

Community

Contributing

There are many ways to contribute to the project:

  • If you have any feedback on the project, share it with the community in GitHub Discussions under the BentoML repo.
  • Report issues you're facing and "Thumbs up" on issues and feature requests that are relevant to you.
  • Investigate bugs and review other developers' pull requests.
  • Contribute code or documentation to the project by submitting a GitHub pull request. See the development guide.

License

Elastic License 2.0 (ELv2)
