- KwaiVGI, Kuaishou Tech. << PhD@CASIA
- Beijing, China
- (UTC +08:00)
- https://guojianzhu.com
- in/guojianzhu
- https://huggingface.co/cleardusk
Starred repositories
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
HunyuanVideo: A Systematic Framework For Large Video Generation Model
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Unifying 3D Mesh Generation with Language Models
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
A suite of image and video neural tokenizers
gradio WebUI for AdvancedLivePortrait
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
[ECCV2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Janus-Series: Unified Multimodal Understanding and Generation Models
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
LAPA: Latent Action Pretraining from Videos