A sample chatbot built with Streamlit to interact with an open-source model hosted on AKS via KAITO.
To test this app, you need to have an AKS cluster with KAITO installed.
To run this example, you will need the following installed on your development machine:
Follow the instructions in the KAITO documentation to create an AKS cluster and install KAITO.
To provision a phi-3-mini workspace, run the following command.
kubectl apply -f https://raw.githubusercontent.com/kaito-project/kaito/main/examples/inference/kaito_workspace_phi_3.yaml
To test the app locally, the app must be able to communicate with the KAITO workspace. To do this, expose the workspace using LoadBalancer service type.
The sample manifest does not include a public IP, however KAITO docs provide instructions for provisioning a workspace with a public IP.
To add a public IP to an existing workspace, run the following command.
kubectl patch service workspace-phi-3-mini -p '{"spec":{"type":"LoadBalancer"}}'
Wait for the public IP to be provisioned then run the following command to set the environment variables.
export MODEL_ENDPOINT="http://$(kubectl get svc workspace-phi-3-mini -o jsonpath='{.status.loadBalancer.ingress[0].ip}')/v1/chat/completions"
export MODEL_NAME="phi-3-mini-128k-instruct"
Create a virtual environment and install the dependencies.
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
Run the app.
streamlit run main.py
To test this app on an AKS cluster, run the following command to deploy the kaito chat demo app in the same AKS cluster that is hosting the workspace.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: kaitochatdemo
name: kaitochatdemo
spec:
replicas: 1
selector:
matchLabels:
app: kaitochatdemo
template:
metadata:
labels:
app: kaitochatdemo
spec:
containers:
- name: kaitochatdemo
image: ghcr.io/pauldotyu/kaitochat/kaitochatdemo:latest
resources: {}
env:
- name: MODEL_ENDPOINT
value: "http://workspace-phi-3-mini:80/v1/chat/completions"
- name: MODEL_NAME
value: "phi-3-mini-128k-instruct"
ports:
- containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
labels:
app: kaitochatdemo
name: kaitochatdemo
spec:
type: LoadBalancer
ports:
- port: 80
protocol: TCP
targetPort: 8501
selector:
app: kaitochatdemo
EOF
Wait for the public IP to be provisioned then run the following command to get the public IP.
echo "http://$(kubectl get svc kaitochatdemo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
Click the link that is outputted in the terminal to open the app in your browser.
Note
Be sure to remove the resources publicly accessible service created in this step after you are done testing the app.
After you are done testing the app, you can follow the instructions in the KAITO documentation to remove the KAITO installation from your AKS cluster and/or delete the AKS cluster.