Stars
Ultra-low-bitrate neural audio codec (0.31 to 1.40 kbps) with better semantics in the latent space.
Fast algorithm for determined blind source separation that updates the demixing filters while jointly adjusting the remaining sources.
[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement
Paderwasn is a collection of methods for acoustic signal processing in wireless acoustic sensor networks (WASNs).
A lightweight library for portable low-level GPU computation using WebGPU.
Reference implementation for DPO (Direct Preference Optimization)
Generate synthetic wind noise signals based on a wind speed profile.
Stable Diffusion web UI
Synthesizes a room impulse response using a ray tracing simulation engine.
Graph Neural Networks for Sound Source Localization
Pitch detection and pitch tracking, voiced/unvoiced detection (VAD), fundamental-frequency detection
A Python algorithm to change the pitch of a voice in real time
Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)
An optimized neural network operator library for chips based on Xuantie CPUs.
Music Style Transfer with Time-Varying Inversion of Diffusion Models
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
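Chinese personal name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.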
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
Code and dataset for photorealistic Codec Avatars driven from audio
The official implementation of GTCRN, an ultra-lite speech enhancement model.
Fast Independent Vector Extraction: Code and data to reproduce the results from the paper.