Stars
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
🔥🔥🔥 Latest Papers, Codes and Datasets on Vid-LLMs.
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Inpaint anything using Segment Anything and inpainting models.
arielnlee / LLaVA-1.6-ft
Forked from haotian-liu/LLaVA. [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Open-Sora: Democratizing Efficient Video Production for All
[CVPR'23] Hyperspherical Embedding for Point Cloud Completion
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
"Structure-Aware Sparse-View X-ray 3D Reconstruction" (CVPR 2024)
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
An unofficial implementation of the paper "ObjectStitch: Object Compositing with Diffusion Model", CVPR 2023.
PyTorch Implementation of "SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation" (CVPR 2024)
Official PyTorch Implementation for Diffusion Hyperfeatures, NeurIPS 2023
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
Code release for Image Sculpting: Precise Object Editing with 3D Geometry Control [CVPR 2024]
[NeurIPS'23] Emergent Correspondence from Image Diffusion