song630
🎯 Focusing
  • Purdue University
  • Beijing


Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 413 12 Updated May 24, 2024
Python 11 Updated Dec 8, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,668 85 Updated Dec 12, 2024

📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.

266 11 Updated Dec 7, 2024

Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.

Python 299 8 Updated Sep 24, 2024

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Python 3,649 314 Updated Oct 11, 2024

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 6,691 565 Updated Feb 29, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 33 3 Updated Apr 2, 2024

4M: Massively Multimodal Masked Modeling

Python 1,638 99 Updated Oct 7, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,606 2,216 Updated Nov 28, 2024

Kolors Team

Python 3,999 289 Updated Nov 13, 2024

[CVPR'23] Hyperspherical Embedding for Point Cloud Completion

Python 18 1 Updated Aug 23, 2023

Bring portraits to life!

Python 13,314 1,419 Updated Nov 12, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 2,157 170 Updated Dec 13, 2024

The official Meta Llama 3 GitHub site

Python 27,493 3,131 Updated Aug 12, 2024

Code for PhysDreamer

Python 515 25 Updated Sep 15, 2024

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,470 423 Updated Sep 21, 2024

[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

Jupyter Notebook 147 7 Updated Oct 3, 2024

"Structure-Aware Sparse-View X-ray 3D Reconstruction" (CVPR 2024)

Python 472 21 Updated Nov 22, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,533 585 Updated May 31, 2024

An unofficial implementation of the paper "ObjectStitch: Object Compositing with Diffusion Model", CVPR 2023.

Python 64 2 Updated Nov 30, 2024
Python 296 7 Updated Jan 27, 2024

PyTorch Implementation of "SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation" (CVPR 2024)

Python 94 8 Updated Jul 22, 2024

Official PyTorch Implementation for Diffusion Hyperfeatures, NeurIPS 2023

Jupyter Notebook 95 9 Updated Oct 21, 2024

The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"

Python 186 10 Updated Apr 10, 2024

A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/

JavaScript 2,273 345 Updated Sep 10, 2024

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Python 297 16 Updated Mar 14, 2024

Code release for Image Sculpting: Precise Object Editing with 3D Geometry Control [CVPR 2024]

Python 278 19 Updated Mar 4, 2024

[NeurIPS'23] Emergent Correspondence from Image Diffusion

Python 629 35 Updated May 14, 2024