National Taiwan University
Taipei, Taiwan
https://leo19941227.github.io
@leo19941227
Stars
Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"
A multi-voice TTS system trained with an emphasis on quality
LLaSA: Scaling Train-time and Test-time Compute for LLaMA-based Speech Synthesis
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11037
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
A toolkit to calculate speech audio quality. Not affiliated with the original authors.
UT-Sarulab MOS prediction system using SSL models
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
FLOPs counter for convolutional networks in the PyTorch framework
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Speech Human Evaluation Estimation Toolkit (SHEET)
A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
A summary of related works on flow matching and stochastic interpolants