🏠
Working at home

Organizations

@QHU-HDACP @Meta2ML

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,595 158 Updated Dec 12, 2024

PyTorch video decoding

Python 135 10 Updated Dec 14, 2024

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 231 15 Updated Dec 13, 2024

Efficient 2:4 sparse training algorithms and implementations

Python 44 Updated Dec 8, 2024

Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

Python 295 3 Updated May 24, 2024

need more time for construction

Python 1 3 Updated Sep 10, 2020
18 Updated Nov 12, 2024

Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation

Python 4 Updated Dec 11, 2024

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

Python 529 25 Updated Dec 9, 2024

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 209 11 Updated Dec 10, 2024

Audio Large Language Models

183 9 Updated Dec 13, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 1,716 113 Updated Dec 13, 2024

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 463 22 Updated Sep 6, 2024
Python 5 Updated Nov 5, 2024

Official PyTorch implementation of "Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images".

Python 11 Updated Nov 25, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,751 169 Updated Sep 25, 2024

OpenAI Whisper Prompt Examples

48 2 Updated Jul 17, 2023

This is a Python package for NISQA.

Python 8 2 Updated Apr 9, 2024
Python 27 1 Updated Dec 10, 2024

A Survey of Spoken Dialogue Models (60 pages)

193 9 Updated Nov 28, 2024

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 57 2 Updated Dec 13, 2024

A resource for learning about Machine learning & Deep Learning

Python 7,755 2,705 Updated Aug 17, 2024

A curated list for Efficient Large Language Models

Python 1,323 94 Updated Dec 9, 2024

Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 442 33 Updated Dec 9, 2024

UTokyo-SaruLab MOS Prediction System

Python 108 9 Updated Dec 9, 2024

ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation

Python 8 Updated Dec 10, 2024

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.

Python 71 6 Updated Nov 14, 2024

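🚀 Next Generation AI One-Stop Internationalization Solution. 🚀 Next-generation AI one-stop solution for business and consumer (B/C-end) users, supporting OpenAI, Midjourney, Claude, iFlytek Spark, Stable Diffusion, DALL·E, ChatGLM, Tongyi Qianwen, Tencent Hunyuan, 360 Zhinao, Baichuan AI, Volcano Ark, New Bing, Gemini, Moonshot …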

TypeScript 7,495 966 Updated Dec 8, 2024

Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra

Python 11 4 Updated Dec 10, 2024