A real-time conversational AI bot powered by Pipecat and deployed on Modal. This project features an interactive RAG (Retrieval-Augmented Generation) system with real-time speech-to-speech interaction.
```bash
git clone git@github.com:modal-projects/open-source-av-ragbot.git
cd open-source-av-ragbot
```

This project uses uv for Python package management:
```bash
# Install dependencies
uv sync

# Activate the virtual environment
source .venv/bin/activate
```

Go to modal.com and make an account if you don't have one.
```bash
# Authenticate your Modal installation
modal setup
```

Build the frontend client:

```bash
cd client
npm i
npm run build

# Return to the root dir
cd ..
```

The project consists of multiple Modal services that need to be deployed:
```bash
# From the root dir of the project

# Deploy an LLM service:
# either the vLLM inference server for optimized TTFT
modal deploy -m server.llm.vllm_server
# or the SGLang server for
# faster cold starts with GPU snapshots
modal deploy -m server.llm.sglang_server

# Deploy the Parakeet STT service
modal deploy -m server.stt.parakeet_stt

# Deploy the Kokoro TTS service
modal deploy -m server.tts.kokoro_tts

# Deploy the main bot application with its frontend
modal deploy -m app
```

We can speed up the cold start time of our bot (which matters most) and of our Parakeet and LLM services (if using SGLang) by using snapshots. The trade-off is extra startup time for the first few containers after an app is (re-)deployed. To warm up the snapshots, run these files as Python scripts:
```bash
python -m server.stt.parakeet_stt
python -m server.llm.sglang_server
python -m app
```
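For context on how a service opts into snapshotting, here is a minimal sketch of Modal's memory snapshot pattern. The class and method names below are illustrative, not taken from this repo, and the sketch assumes Modal's `enable_memory_snapshot` / `@modal.enter(snap=True)` API:

```python
import modal

app = modal.App("snapshot-sketch")

# enable_memory_snapshot=True asks Modal to snapshot the container's memory
# after the @modal.enter(snap=True) steps complete, so later cold starts can
# restore from the snapshot instead of re-running the expensive setup.
@app.cls(enable_memory_snapshot=True)
class SketchService:
    @modal.enter(snap=True)
    def load_model(self):
        # Hypothetical expensive setup captured in the snapshot,
        # e.g. loading model weights into memory.
        self.model = "loaded"

    @modal.method()
    def infer(self, prompt: str) -> str:
        return f"{self.model}: {prompt}"
```

Running a service's module as a script (as in the warmup commands above) triggers at least one container start, which gives Modal a chance to build and cache the snapshot before real traffic arrives.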