Starred repositories by wanghua-lei

Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 733 20 Updated Jan 6, 2025

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 220 14 Updated Dec 16, 2024

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python 3,676 305 Updated Oct 28, 2024

UniSpeech - Large Scale Self-Supervised Learning for Speech

Python 444 74 Updated Apr 5, 2024

An open-source framework for training large multimodal models.

Python 3,789 289 Updated Aug 31, 2024

Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model"

Python 88 8 Updated Mar 20, 2024
Python 59 7 Updated Jul 17, 2024

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,062 223 Updated Dec 12, 2024

Acoustic mosquito detection code with Bayesian Neural Networks

Jupyter Notebook 50 16 Updated Oct 4, 2021
Jupyter Notebook 2 Updated Dec 16, 2022

Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner

Python 44 6 Updated Jun 22, 2021

This repository contains the code to set up the experiments for the ComParE 2022 mosquito event detection sub-challenge.

Python 5 3 Updated Oct 25, 2022

A library built for easier audio self-supervised training and downstream task evaluation

Python 110 10 Updated Aug 27, 2024

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Jupyter Notebook 108 13 Updated Oct 15, 2024

Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/

Python 484 78 Updated Jul 11, 2023

LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

43 1 Updated Dec 28, 2024

(NeurIPS 2024) Learning to Visual Question Answering, Asking and Assessment

Python 79 2 Updated Nov 7, 2024

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Python 399 38 Updated Dec 26, 2024

This repository contains the code and metadata of the How2 dataset

Python 169 17 Updated Dec 30, 2024

Prompting Large Language Models with Audio for General-Purpose Speech Summarization

Python 13 4 Updated Dec 28, 2024

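"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment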

Jupyter Notebook 10,749 1,237 Updated Jan 5, 2025

batch processing of Llama-2 7B

Python 1 Updated Nov 8, 2023

Learning vLLM: deploy the Qwen2-0.5B model with vLLM, and deploy it using Docker.

Jupyter Notebook 10 1 Updated Jun 22, 2024

A curated list of awesome Multimodal studies.

HTML 118 10 Updated Jan 4, 2025

[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 877 90 Updated Dec 28, 2024

Chinese speech pretrained models

Shell 1,055 89 Updated Aug 23, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 38,206 4,323 Updated Jan 2, 2025

Inference and training library for high-quality TTS models.

Python 4,861 503 Updated Dec 10, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,096 434 Updated Aug 10, 2024