- Xi'an Jiaotong University
- Xi'an, Shaanxi
Starred repositories
🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull request…
Official codes for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM2024 (Oral)
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
An open-source academic paper management tool.
YOLO-UniOW: Efficient Universal Open-World Object Detection
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
Code for "Pre-training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction" (https://arxiv.org/abs/2301.12068)
Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure
Code implementation of "Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem" (https://arxiv.org/abs/2206.04119)
Data and code required to reach the main conclusions of the fastsmcg paper
An open source implementation of CLIP.
A curated list for Efficient Large Language Models
[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV 2024] Official code for the paper "Open-Vocabulary SAM"
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight)