Kai Li (李凯) JusperLee

🏠

Working at home

940 followers · 242 following

@thu-ml
Tsinghua University
14:37 (UTC +08:00)
cslikai.cn
@cs_kai_li

Achievements

x3 x2

Highlights

Developer Program Member
Pro

Lists (1)

✨ Inspiration

1 repository

Stars

InternLM / InternLM-XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,595 158 Updated Dec 12, 2024

pytorch / torchcodec

PyTorch video decoding

Python 135 10 Updated Dec 14, 2024

FunAudioLLM / InspireMusic

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 231 15 Updated Dec 13, 2024

huyz2023 / 2by4-pretrain

Efficient 2:4 sparse training algorithms and implementations

Python 44 Updated Dec 8, 2024

YUCHEN005 / STAR-Adapt

Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

Python 295 3 Updated May 24, 2024

asdlei99 / WaveSplit-pytorch-incomplete

need more time for construction

Python 1 3 Updated Sep 10, 2020

FreesiaGPT / Embodied-AI

18 Updated Nov 12, 2024

XavierJiezou / KTDA

Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation

Python 4 Updated Dec 11, 2024

IDEA-Research / DINO-X-API

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

Python 529 25 Updated Dec 9, 2024

VITA-MLLM / Freeze-Omni

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 209 11 Updated Dec 10, 2024

AudioLLMs / AudioLLM

Audio Large Language Models

183 9 Updated Dec 13, 2024

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 1,716 113 Updated Dec 13, 2024

urgent-challenge / urgent2025_challenge

Python 30 4 Updated Dec 10, 2024

nyrahealth / CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 463 22 Updated Sep 6, 2024

lifeiteng / encodec

Python 5 Updated Nov 5, 2024

XavierJiezou / Cloud-Adapter

Official PyTorch implementation of "Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images".

Python 11 Updated Nov 25, 2024

deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,751 169 Updated Sep 25, 2024

alphacep / whisper-prompts

OpenAI Whisper Prompt Examples

48 2 Updated Jul 17, 2023

kale4eat / nisqalib

This is a Python package for NISQA.

Python 8 2 Updated Apr 9, 2024

THUNLP-MT / StreamingBench

Python 27 1 Updated Dec 10, 2024

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

193 9 Updated Nov 28, 2024

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 57 2 Updated Dec 13, 2024

aladdinpersson / Machine-Learning-Collection

A resource for learning about Machine learning & Deep Learning

Python 7,755 2,705 Updated Aug 17, 2024

horseee / Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

Python 1,323 94 Updated Dec 9, 2024

Haiyang-W / TokenFormer

Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 442 33 Updated Dec 9, 2024

sarulab-speech / UTMOSv2

UTokyo-SaruLab MOS Prediction System

Python 108 9 Updated Dec 9, 2024

thkkk / manibox

ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation

Python 8 Updated Dec 10, 2024

WangHelin1997 / SoloAudio

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.

Python 71 6 Updated Nov 14, 2024

coaidev / coai

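🚀 Next Generation AI One-Stop Internationalization Solution. 🚀 A next-generation, one-stop AI solution for business and consumer (B/C) use, supporting OpenAI, Midjourney, Claude, iFlytek Spark, Stable Diffusion, DALL·E, ChatGLM, Tongyi Qianwen, Tencent Hunyuan, 360 Zhinao, Baichuan AI, Volcano Ark, New Bing, Gemini, Moonshot …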

TypeScript 7,495 966 Updated Dec 8, 2024

voidful / MMLM

Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra

Python 11 4 Updated Dec 10, 2024