- @MCG-NJU, Nanjing University
- Shanghai, China
- https://scholar.google.com/citations?user=Z9yWFA0AAAAJ&hl=en
Starred repositories
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
[NeurIPS 2024] Behavioral Topology (BeTop), a multi-agent behavior formulation for interactive motion prediction and planning
[NeurIPS 2024] OPUS: Occupancy Prediction Using a Sparse Set
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
Official implementation of the SIGGRAPH 2024 paper "A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets"
[ECCV 2024 Oral] SPLAM: Accelerating Image Generation with Sub-path Linear Approximation Model
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
[NeurIPS 2024] VFIMamba: Video Frame Interpolation with State Space Models
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
[CVIU 2024] End-to-end dense video grounding via parallel regression
FastPillars: A Deployment-friendly Pillar-based 3D Detector
[CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Advanced traffic lights for Cities: Skylines II
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook
[CVPR 2024] NARUTO: Neural Active Reconstruction
[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion
[IJCV 2024] Logit Normalization for Long-Tail Object Detection
[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
💫 [CVPR 2024] LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis
[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
[CVPR 2024] Official PyTorch Code of SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects
[ECCV 2024] Ray Denoising (RayDN): Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection