
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 7,961 1,002 Updated Dec 13, 2024

💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈

Jupyter Notebook 434 44 Updated Jun 26, 2024

A language model programming library.

Python 5,424 319 Updated Nov 21, 2024

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,483 225 Updated Dec 9, 2024

TigerBot: A multi-language multi-task LLM

Python 2,248 194 Updated Jun 7, 2024

Recommendations and reviews of "airport" (proxy subscription) services

4,209 108 Updated Nov 18, 2024

Multimodal Chinese LLaMA & Alpaca large language models (VisualCLA)

Python 429 36 Updated Jul 27, 2023

Phase two of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models

Python 7,116 576 Updated Sep 23, 2024

Chinese LLaMA & Alpaca LLMs, with local CPU/GPU training and deployment

Python 18,502 1,875 Updated Apr 30, 2024

Utilities intended for use with Llama models.

Python 5,285 882 Updated Dec 10, 2024

The official repository for "One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts"

Python 161 9 Updated Nov 12, 2024

[ICLR 2024] FairSeg: A Large-Scale Medical Image Segmentation Dataset for Fairness Learning Using Segment Anything Model with Fair Error-Bound Scaling

Python 97 16 Updated Aug 2, 2024

A collection of datasets for the purpose of emotion recognition/detection in speech.

HTML 306 42 Updated Sep 30, 2024

A collection of 8 English speech datasets for SER

Jupyter Notebook 12 Updated Oct 11, 2024

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Python 1,899 193 Updated Dec 9, 2024

Predicting depression from acoustic features of speech using a Convolutional Neural Network.

Python 291 93 Updated Oct 29, 2018

A PyTorch-based Speech Toolkit

Python 9,051 1,412 Updated Dec 9, 2024

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Python 680 114 Updated Oct 23, 2023

Permanent Apple Intelligence + Xcode Predictive Code Completion for Chinese-market Mac computers

Shell 758 30 Updated Jul 31, 2024

Unlock the full set of Apple features and integrated services

9,660 361 Updated Nov 20, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,675 6,426 Updated Oct 18, 2024

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,459 266 Updated Nov 8, 2024

Mental health large language model (LLM), The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2, LLama3.1

Python 897 127 Updated Oct 21, 2024

[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language

Python 600 33 Updated Oct 22, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,789 117 Updated Oct 30, 2024

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Python 3,335 336 Updated Nov 3, 2024

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 796 64 Updated Aug 27, 2024

✨✨Latest Advances on Multimodal Large Language Models

13,059 835 Updated Dec 13, 2024

Awesome speech/audio LLMs, representation learning, and codec models

777 43 Updated Dec 12, 2024