Stars
Make bilingual EPUB books using AI translation
Python tool for converting files and office documents to Markdown.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
zero-shot voice conversion & singing voice conversion, with real-time support
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
A high-quality PDF to Markdown tool based on large language model visual recognition.
https://hf.co/hexgrad/Kokoro-82M
A self-hosted Telegram file downloader for continuous, stable, and unattended downloads.
Clean your macOS with a script, not an expensive app
An end-to-end, high-speed, low-footprint free version of Google Translate that can be privately self-hosted
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
An autoregressive character-level language model for making more things
Effortlessly run LLM backends, APIs, frontends, and services with one command.
1 minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!
Officially recommended ChatTTS resource collection project, gathering related resources from across the web and frequently asked questions
🎨 Refly is an open-source AI-native creation engine. Its intuitive free-form canvas interface combines multi-threaded dialogues, artifacts, AI knowledge base integration, chrome extension clip & sa…
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A Python program that turns an LLM, running on Ollama, into an automated researcher, which will with a single query determine focus areas to investigate, do web searches and scrape content from vari…
RooVetGit / Roo-Code
Forked from cline/cline
Roo Code (prev. Roo Cline) gives you a whole dev team of AI agents in your code editor.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
AI wearables. Put it on, speak, transcribe, automatically