Open-world (OW) models show strong zero- and few-shot adaptation abilities, which has inspired their use as initializations in continual learning methods to improve accuracy. Despite promising results on seen classes, these open-world abilities on unseen classes largely degenerate due to catastrophic forgetting. To formulate and tackle this challenge, we propose open-world continual object detection, which requires detectors to generalize to old, new, and unseen categories in continual learning scenarios. Based on this task, we present OW-COD, a challenging yet practical benchmark for assessing these detection abilities. The goal is to motivate OW detectors to simultaneously preserve learned classes, adapt to new classes, and maintain open-world capabilities under few-shot adaptation. To mitigate forgetting on unseen categories, we propose MR-GDINO, a strong, efficient, and scalable baseline built on memory and retrieval mechanisms within a highly scalable memory pool. Experimental results show that existing continual open-world detectors suffer from severe forgetting on both seen and unseen categories. In contrast, MR-GDINO mitigates forgetting with only 0.1% extra activated parameters, achieving state-of-the-art performance on old, new, and unseen categories.
We use PyTorch 2 and CUDA 12.1 to run our code on four RTX A6000 GPUs (for continual learning) and one RTX A6000 GPU (for custom data continual learning).
Our code is based on Grounding-DINO and Open-Grounding-DINO.
We sincerely thank the authors for their great work.
```bash
pip install -r requirements.txt
```
We release the 5-shot and 10-shot continual learning checkpoints. Download here.
Note that, for simplicity, we directly store the full parameters of the VL enhancer rather than the LoRA parameters alone. We ensure that only the LoRA parameters are optimized during training.
If you encounter a loading issue, try the following code:

```python
import pickle

# 'latin1' encoding avoids decoding errors when loading the pickled checkpoint
a = pickle.loads(open('subtasks_lora_wo_coco_10_shot.pkl', 'rb').read(), encoding='latin1')
```
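As a sanity check for the note above, here is a minimal sketch of how one might confirm that only LoRA parameters are trainable. It assumes LoRA weights are identifiable by the substring "lora" in their parameter names, which may not match the actual naming in this codebase:

```python
import torch

def freeze_all_but_lora(model: torch.nn.Module) -> None:
    """Enable gradients only for parameters whose name contains 'lora'."""
    # assumption: LoRA weights are identifiable by name
    for name, param in model.named_parameters():
        param.requires_grad = "lora" in name.lower()

def trainable_param_names(model: torch.nn.Module) -> list:
    """List the parameters that will actually be optimized."""
    return [n for n, p in model.named_parameters() if p.requires_grad]
```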
The configuration of the OW-COD training and evaluation data is coming soon. To enable deployment of MR-GDINO in the meantime, we first provide the code for continual learning on custom data. The configuration of the OW-COD benchmark is broadly similar.
- Ensure your training/testing dataset annotations follow the COCO-style (JSON) format.
- Modify the `CLASS_NUM` variable in `tools/custom2odvg.py` to match the number of new classes in your task.
- Convert your COCO-style annotations (see the ODVG format sketch after this list):

  ```bash
  python tools/custom2odvg.py --input YOUR_CUSTOM_COCO_STYLE_TRAIN_ANNO_FILE_PATH --output TRANSFERD_ANNO_FILE_NAME.json
  ```

- Move the converted annotation file to the `config` folder.
- Create `CUSTOM_label_map.json` in the `config` folder with your class mapping:
  ```json
  {
    "0": "dog",
    "1": "cat",
    "2": "bird",
    ...
  }
  ```
- Create `train_CUSTOM.json` in the `config` folder with the dataset configuration:
  ```json
  {
    "train": [
      {
        "root": "TRAIN_IMAGES_ROOT",
        "anno": "/ABSOLUTE/PATH/OF/TRANSFERD_ANNO_FILE_NAME.json",
        "label_map": "/ABSOLUTE/PATH/OF/CUSTOM_label_map.json",
        "dataset_mode": "odvg"
      }
    ],
    "val": [
      {
        "root": "TEST_IMAGES_ROOT",
        "anno": "/ABSOLUTE/PATH/OF/YOUR_CUSTOM_COCO_STYLE_TEST_ANNO_FILE_PATH",
        "label_map": "/ABSOLUTE/PATH/OF/CUSTOM_label_map.json",
        "dataset_mode": "coco"
      }
    ]
  }
  ```
- Modify `start_train.sh` with your configuration file path and output directory.
- Start training:

  ```bash
  bash start_train.sh
  ```

- For multiple tasks, maintain separate annotation files and configurations for each task, and use `start_train_more_task.sh` as a reference.
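For reference, below is a minimal sketch of the ODVG-style record that the conversion step is expected to produce, one JSON line per image. The exact schema (field names, xyxy boxes, integer label ids) follows the Open-Grounding-DINO ODVG convention but should be treated as an assumption; `tools/custom2odvg.py` is the authoritative converter.

```python
import json

# Sketch only: the exact ODVG schema is an assumption; see tools/custom2odvg.py.
def coco_image_to_odvg(image, anns, label_map):
    """Turn one COCO image entry plus its annotations into an ODVG record."""
    instances = []
    for ann in anns:
        x, y, w, h = ann["bbox"]  # COCO boxes are xywh
        instances.append({
            "bbox": [x, y, x + w, y + h],  # ODVG-style xyxy boxes
            "label": ann["category_id"],
            "category": label_map[str(ann["category_id"])],
        })
    return {
        "filename": image["file_name"],
        "height": image["height"],
        "width": image["width"],
        "detection": {"instances": instances},
    }

# each record would be serialized as one line of the converted file:
# f.write(json.dumps(coco_image_to_odvg(img, anns, label_map)) + "\n")
```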
After training, the model checkpoint will be saved in the specified `OUTPUT_DIR`. For inference:
- Generate task keys:
  - Extract features for each task using `extract_feat_scripts/extract_feat.sh`
  - Compute the mean feature per task using `cal_mean_feature_per_task.py`
- Extract PEFT modules:
  - Run `save_prompt_pth.py` to get `multi_model_prompt_params.pkl`
  - Run `save_lora_pth.py` to get `multi_model_lora_params.pkl`
- Run inference (see the retrieval sketch after this list):
  - Modify `eval_one_image.sh` with your checkpoint path, image path, and text prompt
  - Run `bash eval_one_image.sh`
  - Adjust the `retrieval_tau` parameter in `tools/inference_on_a_image.py` to fine-tune results
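To make the retrieval step concrete, here is a minimal sketch of how the task keys and `retrieval_tau` could interact at inference time. It assumes task keys are the L2-normalized mean features produced by `cal_mean_feature_per_task.py`, and that a below-threshold match falls back to the zero-shot Grounding-DINO weights; the helpers below are illustrative, not the actual implementation in `tools/inference_on_a_image.py`.

```python
import torch
import torch.nn.functional as F

def build_task_key(features: torch.Tensor) -> torch.Tensor:
    """Mean-pool the per-image features of one task into a single key."""
    return F.normalize(features.mean(dim=0), dim=-1)

def retrieve_task_id(image_feat, task_keys, retrieval_tau):
    """Return the index of the most similar task key, or None to fall back
    to zero-shot weights (no task-specific LoRA/prompt params activated)."""
    sims = F.cosine_similarity(image_feat.unsqueeze(0), task_keys, dim=-1)
    best_sim, best_task = sims.max(dim=0)
    return best_task.item() if best_sim.item() >= retrieval_tau else None

# toy usage: three task keys of dimension 256 and one query feature
keys = torch.stack([build_task_key(torch.randn(10, 256)) for _ in range(3)])
query = F.normalize(torch.randn(256), dim=-1)
task_id = retrieve_task_id(query, keys, retrieval_tau=0.7)  # None => zero-shot
```

A larger `retrieval_tau` makes retrieval more conservative (more images fall back to the zero-shot detector), while a smaller value activates task-specific parameters more aggressively.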
If you find our work useful, please cite:

```bibtex
@article{owcod,
  author = {Dong, Bowen and Huang, Zitong and Yang, Guanglei and Zhang, Lei and Zuo, Wangmeng},
  title  = {MR-GDINO: Efficient Open-World Continual Object Detection},
  year   = {2024},
  url    = {https://dongsky.github.io/owcod/data/preprint.pdf}
}
```