amusi/CVPR2025-Papers-with-Code

CVPR 2025 Papers and Open-Source Projects Collection (Papers with Code)

CVPR 2025 decisions are now available on OpenReview! Acceptance rate: 22.1% = 2878 / 13008

Note 1: Everyone is welcome to open an issue to share CVPR 2025 papers and open-source projects!

Note 2: For papers from previous top CV conferences, other high-quality CV papers, and comprehensive roundups, see: https://github.com/amusi/daily-paper-computer-vision

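Scan the QR code to join the [CVer Academic Exchange Group] for the latest work from CVPR 2025 and beyond! This is the largest computer vision AI community on Knowledge Planet (知识星球). It is updated daily and shares the newest learning materials on computer vision, AIGC, diffusion models, multimodal learning, deep learning, autonomous driving, medical imaging, remote sensing, and more as soon as they appear. Join now and start learning!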

[CVPR 2025 Papers with Open-Source Code: Table of Contents]

3DGS (Gaussian Splatting)

Avatars

Backbone

CLIP

Mamba

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Embodied AI

GAN

OCR

NeRF

DETR

Prompt

Multimodal Large Language Models (MLLM)

LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences

Large Language Models (LLM)

NAS

ReID (Re-Identification)

Diffusion Models

TinyFusion: Diffusion Transformers Learned Shallow

Vision Transformer

Vision-Language

Object Detection

LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models

Anomaly Detection

Object Tracking

Multiple Object Tracking as ID Prediction

Medical Image

Medical Image Segmentation

Autonomous Driving

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

3D Point Cloud

3D Object Detection

3D Semantic Segmentation

Image Editing

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Video Editing

Low-level Vision

Super-Resolution

AESOP: Auto-Encoded Supervision for Perceptual Image Super-Resolution

Denoising

Image Denoising

3D Human Pose Estimation

Image Generation

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

PAR: Parallelized Autoregressive Visual Generation

Video Generation

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models

X-Dyna: Expressive Dynamic Human Image Animation

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

3D Generation

Video Understanding

Embodied Intelligence (Embodied AI)

Universal Actions for Enhanced Embodied Foundation Models

Knowledge Distillation

Stereo Matching

Low-light Image Enhancement

HVI: A New Color Space for Low-light Image Enhancement

Scene Graph Generation

Style Transfer

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Video Quality Assessment

Datasets

Others