cleardusk
:octocat:
Researching & Coding

Organizations

@CBSR-CASIA @Westlake-AI @KwaiVGI

Starred repositories

You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale

Python 282 6 Updated Dec 11, 2024

Efficient Track Anything

Python 348 9 Updated Dec 9, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 5,298 366 Updated Dec 11, 2024

⏬ Dumb downloader that scrapes the web

Python 54,124 9,663 Updated Dec 10, 2024

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 1,557 74 Updated Dec 10, 2024

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,088 51 Updated Nov 22, 2024

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 5,825 340 Updated Dec 5, 2024

Official repository for LTX-Video

Python 1,855 124 Updated Nov 24, 2024

Unifying 3D Mesh Generation with Language Models

Python 771 35 Updated Dec 5, 2024

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥

Python 479 21 Updated Nov 29, 2024

CUDA Python Low-level Bindings

Python 995 81 Updated Dec 10, 2024

Google Research

Jupyter Notebook 34,469 7,946 Updated Dec 6, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 6,757 720 Updated Nov 22, 2024

Official code of the WaLa paper

Python 84 5 Updated Nov 13, 2024

A Video Tokenizer Evaluation Dataset

Python 54 3 Updated Nov 23, 2024

A suite of image and video neural tokenizers

Python 963 23 Updated Nov 13, 2024

Gradio WebUI for AdvancedLivePortrait

Python 388 32 Updated Dec 7, 2024

Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation

Python 2,413 181 Updated Dec 6, 2024

Inference script for Oasis 500M

Python 1,627 135 Updated Nov 8, 2024

[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Python 4,027 635 Updated Dec 1, 2024

GLM-4-Voice | End-to-end Chinese-English speech dialogue model

Python 2,433 196 Updated Dec 5, 2024

Python 405 21 Updated Nov 28, 2024

The best OSS video generation models

Python 2,396 246 Updated Dec 6, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,656 2,276 Updated Aug 12, 2024

NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Python 406 24 Updated Oct 20, 2024

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,666 197 Updated Nov 6, 2024

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 1,221 57 Updated Nov 13, 2024

Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen

344 19 Updated Oct 19, 2024

LAPA: Latent Action Pretraining from Videos

Python 97 5 Updated Nov 22, 2024