Tsinghua University
Beijing
Stars
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A plugin from ECMWF/ai models, with models sourced from PuYun Large AI-based Meteorological Model in Macarbon (Hangzhou)
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
🎥 Python and OpenCV-based scene cut/transition detection program & library.
ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model
Better Aligning Text-to-Image Models with Human Preference. ICCV 2023
Image to prompt with BLIP and CLIP
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
A collection of resources and papers on Diffusion Models
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
EVA Series: Visual Representation Fantasies from BAAI
2024 up-to-date list of DATASETS, CODEBASES and PAPERS on Multi-Task Learning (MTL), from a Machine Learning perspective.
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
General Vision Benchmark, GV-B, a project from OpenGVLab
Official repository for the General Robust Image Task (GRIT) Benchmark
[ICCV 2023 & AAAI 2023] Binary Adapters & FacT, [Tech report] Convpass
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119