Starred repositories
A high-throughput and memory-efficient inference and serving engine for LLMs
⚡ TabPFN: Foundation Model for Tabular Data ⚡
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
✨✨Freeze-Omni: A Smart and Low-Latency Speech-to-Speech Dialogue Model with a Frozen LLM
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A generative world for general-purpose robotics & embodied AI learning.
Scalable and user-friendly neural 🧠 forecasting algorithms.
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Enable your Claude to think
Investment Research for Everyone, Everywhere.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs.
Simple, unified interface to multiple Generative AI providers
Speech-to-Speech: an effort toward an open-source and modular GPT-4o
peilongchencc / My-GLM-4-Voice
Forked from THUDM/GLM-4-Voice. Notes on deploying GLM-4-Voice on Ubuntu.
A curated list of reinforcement learning with human feedback resources (continually updated)
High-quality, streaming speech-to-speech interactive agent in a single file (a streaming, full-duplex voice interaction prototype agent implemented in one file!)
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
🤖 Build voice-based LLM agents. Modular + open source.
🔊 Text-Prompted Generative Audio Model
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone