[ICLR 2024] FairSeg: A Large-Scale Medical Image Segmentation Dataset for Fairness Learning Using Segment Anything Model with Fair Error-Bound Scaling

Python 97 16 Updated Aug 2, 2024

SuperKogito / SER-datasets

A collection of datasets for the purpose of emotion recognition/detection in speech.

HTML 307 42 Updated Sep 30, 2024

standing-o / Combined_Dataset_for_Speech_Emotion_Recognition

A combined collection of 8 English speech datasets for SER

Jupyter Notebook 12 Updated Oct 11, 2024

iver56 / audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Python 1,899 193 Updated Dec 9, 2024

kykiefer / depression-detect

Predicting depression from acoustic features of speech using a Convolutional Neural Network.

Python 291 93 Updated Oct 29, 2018

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 9,050 1,412 Updated Dec 9, 2024

openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Python 680 114 Updated Oct 23, 2023

CatMe0w / zouxian

Permanent Apple Intelligence + Xcode Predictive Code Completion for Chinese-market Mac computers

Shell 755 29 Updated Jul 31, 2024

VirgilClyne / iRingo

解锁完整的 Apple功能和集成服务

9,659 361 Updated Nov 20, 2024

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,675 6,426 Updated Oct 18, 2024

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,459 266 Updated Nov 8, 2024

SmartFlowAI / EmoLLM

Mental health large language model, LLM, The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2, LLama3.1

Python 897 127 Updated Oct 21, 2024

csuhan / OneLLM

[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language

Python 600 33 Updated Oct 22, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,789 117 Updated Oct 30, 2024

NExT-GPT / NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Python 3,333 336 Updated Nov 3, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 796 64 Updated Aug 27, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

13,056 835 Updated Dec 13, 2024

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

776 43 Updated Dec 12, 2024

Xiangyu Zhao hsiangyuzhao

Stars

SWivid / F5-TTS

Emotional-Text-to-Speech / dl-for-emo-tts

MadcowD / ell

haoheliu / AudioLDM

TigerResearch / TigerBot

hwanz / SSR-V2ray-Trojan

airaria / Visual-Chinese-LLaMA-Alpaca

ymcui / Chinese-LLaMA-Alpaca-2

ymcui / Chinese-LLaMA-Alpaca

meta-llama / llama-models

annoymity2022 / Chinese-Dataset

zhaoziheng / SAT

Harvard-Ophthalmology-AI-Lab / FairSeg