
Starred repositories
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Enjoy the magic of Diffusion models!
A minimal and universal controller for FLUX.1.
FastVideo is a lightweight framework for accelerating large video diffusion models.
Scripts and doc for https://www.dolthub.com/repositories/chenditc/investment_data
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
A PyTorch native library for large model training
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25).
You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
🎬 人人影视 (YYeTs) bot and website, including all 人人影视 resources and many cloud-drive shares from users
HunyuanVideo: A Systematic Framework For Large Video Generation Model
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
InstantIR: Blind Image Restoration with Instant Generative Reference 🔥
Example models using DeepSpeed
Unofficial PyTorch Implementation for paper FlashFace
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
A general fine-tuning kit geared toward diffusion models.
In 2024, the strongest open-source implementation of asymmetric magvit_v2 supports inference code but excludes VQVAE. It supports the joint encoding of images and videos, accommodating arbitrary vi…
ControlNet++: All-in-one ControlNet for image generation and editing!
🚀 Cross attention map tools for huggingface/diffusers
Official Implementation of "Learning Inclusion Matching for Animation Paint Bucket Colorization"