- Xi'an Jiaotong University
- Xi'an, Shaanxi
Highlights
- Pro
Starred repositories
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Fast and memory-efficient exact attention
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
An open source implementation of CLIP.
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
PyTorch extensions for high performance and large scale training.
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
Dual Attention Network for Scene Segmentation (CVPR 2019)
PyTorch implementation of "Deep Flow-Guided Video Inpainting" (CVPR'19)
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
A curated list for Efficient Large Language Models
Real-time and accurate open-vocabulary end-to-end object detection
[ECCV 2024] Official code for the paper "Open-Vocabulary SAM".
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
A third-party implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection".
Build high-performance AI models with modular building blocks
Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight)
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure