[arXiv] | [PDF]
This repository contains the code for the CVPR 2024 paper:
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
Shuting He, Henghui Ding
CVPR 2024
Please see INSTALL.md, then install the remaining dependencies:
pip install -r requirements.txt
python3 -m spacy download en_core_web_sm
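To sanity-check the setup, the optional snippet below (my own sketch, not part of the repo) just confirms that the spaCy model downloaded above can be loaded and run on a referring expression:

```python
import spacy

# Raises OSError if en_core_web_sm was not downloaded successfully.
nlp = spacy.load("en_core_web_sm")
doc = nlp("the person riding a bike")
print([(token.text, token.pos_) for token in doc])  # tokens and POS tags
```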
Obtain the output masks of the Val_u set:
python train_net_dshmp.py \
--config-file configs/dshmp_swin_tiny.yaml \
--num-gpus 8 --dist-url auto --eval-only \
MODEL.WEIGHTS [path_to_weights] \
OUTPUT_DIR [output_dir]
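This writes the predicted segmentation masks under OUTPUT_DIR. A quick way to check that inference produced output, assuming the masks are saved as PNG files (the exact directory layout is determined by the inference code; the path below is a placeholder):

```python
from pathlib import Path

output_dir = Path("output_dir")  # placeholder for your OUTPUT_DIR
masks = sorted(output_dir.rglob("*.png"))
print(f"found {len(masks)} mask files under {output_dir}")
```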
Obtain the J&F results on the Val_u set:
python tools/eval_mevis.py
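J&F is the standard video-segmentation metric: the mean of region similarity J (mask IoU) and boundary accuracy F. tools/eval_mevis.py computes both; the sketch below shows only the J part for a single frame pair, assuming binary NumPy masks:

```python
import numpy as np

def region_similarity(pred: np.ndarray, gt: np.ndarray) -> float:
    """J metric: intersection-over-union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:  # both masks empty: count as a perfect match
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```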
Obtain the output masks of the Val set for CodaLab online evaluation:
python train_net_dshmp.py \
--config-file configs/dshmp_swin_tiny.yaml \
--num-gpus 8 --dist-url auto --eval-only \
MODEL.WEIGHTS [path_to_weights] \
OUTPUT_DIR [output_dir] DATASETS.TEST '("mevis_test",)'
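CodaLab submissions are typically uploaded as a single zip archive of the prediction folder. A minimal packaging sketch, assuming the masks were written to [output_dir]/inference (check the competition page for the exact layout the server expects):

```python
import shutil

# Zips output_dir/inference into submission.zip (illustrative paths only).
shutil.make_archive("submission", "zip", root_dir="output_dir/inference")
```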
First, download the backbone weights (model_final_86143f.pkl) and convert them using the scripts below:
wget https://dl.fbaipublicfiles.com/maskformer/mask2former/coco/instance/maskformer2_swin_tiny_bs16_50ep/model_final_86143f.pkl
python tools/process_ckpt.py
python tools/get_refer_id.py
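For context, tools/process_ckpt.py adapts the Mask2Former checkpoint to this codebase. The pattern below is only a rough illustration of how a Detectron2-style .pkl checkpoint is typically loaded and re-saved with remapped keys; the actual renaming rules live in the script, and the key substitution shown here is purely hypothetical:

```python
import pickle

# Detectron2 .pkl checkpoints are pickled dicts, usually with a "model" key.
with open("model_final_86143f.pkl", "rb") as f:
    ckpt = pickle.load(f, encoding="latin1")
weights = ckpt.get("model", ckpt)

# Hypothetical remapping: rename parameters to the names the target model expects.
converted = {k.replace("sem_seg_head.", "head."): v for k, v in weights.items()}

with open("model_converted.pkl", "wb") as f:
    pickle.dump({"model": converted}, f)
```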
Then start training:
python train_net_dshmp.py \
--config-file configs/dshmp_swin_tiny.yaml \
--num-gpus 8 --dist-url auto \
MODEL.WEIGHTS [path_to_weights] \
OUTPUT_DIR [output_dir]
Note: we train on a machine with 8 NVIDIA RTX 3090 GPUs (1 sample per GPU); training takes about 17 hours.
Trained model weights are available on Google Drive.
This project is based on MeViS. Many thanks to the authors for their great work!
Please consider citing DsHmp if it helps your research.
@inproceedings{DsHmp,
title={Decoupling static and hierarchical motion perception for referring video segmentation},
author={He, Shuting and Ding, Henghui},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={13332--13341},
year={2024}
}