Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

Python 235 38 Updated Jan 21, 2025

OpenAI compatible TTS for Sesame CSM:1b - Voice Cloning from File/YT

Python 278 46 Updated Mar 25, 2025
Python 822 48 Updated Mar 24, 2025

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,937 700 Updated Apr 12, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,573 541 Updated Mar 24, 2025

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

TypeScript 49,029 4,629 Updated Apr 15, 2025

Towards Human-Sounding Speech

Python 4,065 327 Updated Apr 15, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 13,231 1,518 Updated Apr 16, 2025

Official repository for LTX-Video

Python 3,316 292 Updated Mar 5, 2025

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 970 125 Updated Mar 31, 2025
Python 968 59 Updated Mar 22, 2025

A Conversational Speech Generation Model

Python 12,522 1,127 Updated Mar 27, 2025
Python 48 25 Updated Jul 22, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,587 202 Updated Apr 15, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 249 29 Updated Mar 12, 2025

Enjoy the magic of Diffusion models!

Python 8,341 745 Updated Apr 16, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 9,976 1,088 Updated Apr 2, 2025

TorchCFM: a Conditional Flow Matching library

Python 1,638 133 Updated Mar 11, 2025

A cross-platform Markdown note-taking application dedicated to using AI to bridge recording and writing, organizing fragmented knowledge into a readable note.

TypeScript 1,347 115 Updated Apr 16, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 11,124 754 Updated Apr 15, 2025

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction

Python 176 12 Updated Feb 28, 2025

Witness the aha moment of VLM with less than $3.

Python 3,530 275 Updated Mar 1, 2025

Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 37,121 2,893 Updated Apr 14, 2025

A novel approach to hunyuan image-to-video sampling

Python 296 16 Updated Feb 5, 2025
Python 4,169 338 Updated Mar 12, 2025

A Fast TTS Engine

Python 487 37 Updated Jan 23, 2025

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Python 226 39 Updated Apr 8, 2025

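Fanqiang (bypassing the firewall): circumvention and open internet access, free circumvention, fanqiang, YouTube/video downloads, software, VPN, one-click circumvention browser, scripts/tutorials for one-click VPS setup of a circumvention server, free shadowsocks/ss/ssr/v2ray/goflyway accounts/nodes, circumvention proxies, circumvention for computers, phones, iOS, Android, Windows, Mac, Linux, and routers, YouTube video downloads, YouTube mirrors/access without circumvention…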

Python 60,468 9,889 Updated Apr 16, 2025
Python 554 54 Updated Apr 11, 2025