Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Python 4 Updated Mar 5, 2025

Spark-TTS Inference Code

Python 1,084 82 Updated Mar 5, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,169 775 Updated Mar 1, 2025

This is the official implementation of the LiSenNet

Python 56 7 Updated Nov 15, 2024

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

Python 315 18 Updated Mar 6, 2025

Ultra-low-bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.

Python 190 14 Updated Aug 25, 2024

Fast algorithm for determined blind source separation with update of demixing filters with joint adjustment of the remaining sources.

Python 33 8 Updated Mar 22, 2021

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,671 616 Updated Mar 6, 2025

Target Speaker Extraction Toolkit

Python 146 16 Updated Mar 5, 2025

[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement

Python 36 2 Updated Dec 2, 2024

Paderwasn is a collection of methods for acoustic signal processing in wireless acoustic sensor networks (WASNs).

Python 17 7 Updated Aug 13, 2024

A lightweight library for portable low-level GPU computation using WebGPU.

C++ 3,830 188 Updated Feb 21, 2025

Reference implementation for DPO (Direct Preference Optimization)

Python 2,425 202 Updated Aug 11, 2024

Database of 100,000 Chinese song lyrics

466 77 Updated Jun 13, 2021

Generate synthetic wind noise signals based on a wind speed profile.

Python 31 5 Updated Apr 23, 2024

LLM training in simple, raw C/CUDA

Cuda 25,939 2,970 Updated Oct 2, 2024

Stable Diffusion web UI

Python 148,966 27,824 Updated Mar 4, 2025

Synthesizes a room impulse response using a ray tracing simulation engine.

C 12 3 Updated Mar 22, 2017

Graph Neural Networks for Sound Source Localization

Jupyter Notebook 16 7 Updated Oct 31, 2023

Pitch detection and pitch tracking, voiced/unvoiced detection (VAD), fundamental-frequency detection

MATLAB 93 21 Updated Apr 21, 2022

A Python algorithm to change the pitch of the voice in real time

Python 13 1 Updated Dec 13, 2020

Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)

Python 60 4 Updated Apr 4, 2024
Cuda 109 29 Updated Apr 11, 2024

An optimized neural network operator library for chips based on Xuantie CPUs.

C 87 38 Updated Jun 26, 2024

"Music Style Transfer with Time-Varying Inversion of Diffusion Models"

Jupyter Notebook 41 4 Updated Jul 23, 2024

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,313 126 Updated Jul 11, 2024

Official implementation of Self-Remixing

Python 13 Updated Feb 3, 2024

C++ inference library for multiple SVC/TTS models

C 1,049 124 Updated Feb 27, 2025

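Chinese personal-name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.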

4,075 1,006 Updated Mar 27, 2024