NeurIPS, 2024
Tao Zhang · Xiangtai Li · Hao Fei · Haobo Yuan · Shengqiong Wu · Shunping Ji · Chen Change Loy · Shuicheng Yan
Wuhan University, Skywork AI, S-Lab, MMlab@NTU, Bytedance
Xiangtai Li is the project leader and corresponding author.
For installation instructions, please refer to INSTALL.md.
We provide scripts for building a Gradio demo so you can deploy OMG-LLaVA locally.
We recommend a GPU with at least 32 GB of memory for the 7B models.
python omg_llava/tools/app.py \
${PATH_TO_CONFIG} \
${PATH_TO_PTH}
# for example
python omg_llava/tools/app.py omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth
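Once the app is running, Gradio prints a local URL in the terminal (typically http://127.0.0.1:7860, unless the script configures a different port); open it in your browser to interact with OMG-LLaVA.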
PYTHONPATH=. NPROC_PER_NODE=${GPUS_NUMBER} xtuner train \
${PATH_TO_CONFIG} \
--deepspeed deepspeed_zero2
# after training, use the tool below to convert the DeepSpeed checkpoint to pth format
PYTHONPATH=. python omg_llava/tools/convert_deepspeed2pth.py \
${PATH_TO_CONFIG} \
${PATH_TO_DeepSpeed_PTH} \
--save-path ./pretrained/omg_llava/${PTH_NAME}.pth
# examples
# OMG-LLaVA pretrain
PYTHONPATH=. NPROC_PER_NODE=8 xtuner train \
omg_llava/configs/pretrain/omg_llava_7b_pretrain_8gpus.py \
--deepspeed deepspeed_zero2
# OMG-LLaVA finetune
PYTHONPATH=. NPROC_PER_NODE=8 xtuner train \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
--deepspeed deepspeed_zero2
# finetune on specific tasks, such as RES and GCG
PYTHONPATH=. NPROC_PER_NODE=8 xtuner train \
omg_llava/configs/finetune/specific_tasks_finetune/finetune_refseg.py \
--deepspeed deepspeed_zero2
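For example, converting the finetuned model could look like the following; the checkpoint name under ./work_dirs is only an assumption here (xtuner names checkpoints after the config and training iteration), so substitute the actual checkpoint produced by your run.
# example: convert the finetune run (the iter_XXXX.pth name is illustrative)
PYTHONPATH=. python omg_llava/tools/convert_deepspeed2pth.py \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./work_dirs/omg_llava_7b_finetune_8gpus/iter_XXXX.pth \
--save-path ./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth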
# for chat
python omg_llava/tools/chat_omg_llava.py \
${PATH_TO_CONFIG} \
${PATH_TO_PTH} \
--image ${PATH_TO_IMAGE}
# the corresponding segmentation masks will be saved at ./output.png
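For example, chatting with the released 7B finetuned model (the image path below is just a placeholder for your own image):
# for example
python omg_llava/tools/chat_omg_llava.py \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth \
--image ./example.jpg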
# for evaluating referring expression segmentation (RES)
NPROC_PER_NODE=8 xtuner refcoco_omg_seg_llava \
${PATH_TO_CONFIG} \
${PATH_TO_PTH} \
--dataset ${refcoco or refcoco_plus or refcocog} \
--split ${val or testA or testB}
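For example, evaluating the released 7B model on the refCOCO validation split mirrors the template above:
# for example
NPROC_PER_NODE=8 xtuner refcoco_omg_seg_llava \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth \
--dataset refcoco \
--split val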
# for evaluating grounded conversation generation (GCG)
NPROC_PER_NODE=8 xtuner gcg_omg_seg_llava \
${PATH_TO_CONFIG} \
${PATH_TO_PTH} \
--output-name gcg_pred
python omg_llava/tools/evaluate_gcg.py \
--prediction_dir_path ./work_dirs/gcg_pred/ \
--gt_dir_path ./data/glamm_data/annotations/gcg_val_test/ \
--split ${val or test}
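As a concrete example, the two-step GCG evaluation with the released 7B model on the validation split might look like:
# for example
NPROC_PER_NODE=8 xtuner gcg_omg_seg_llava \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth \
--output-name gcg_pred
python omg_llava/tools/evaluate_gcg.py \
--prediction_dir_path ./work_dirs/gcg_pred/ \
--gt_dir_path ./data/glamm_data/annotations/gcg_val_test/ \
--split val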
# for evaluating region captioning
NPROC_PER_NODE=8 xtuner region_cap_mask_omg_seg_llava \
${PATH_TO_CONFIG} \
${PATH_TO_PTH} \
--output-path ./work_dirs/region_cap_pred.json
python omg_llava/tools/evaluate_region_cap.py \
--results_dir ./work_dirs/region_cap_pred.json
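For example, with the released 7B model:
# for example
NPROC_PER_NODE=8 xtuner region_cap_mask_omg_seg_llava \
omg_llava/configs/finetune/omg_llava_7b_finetune_8gpus.py \
./pretrained/omg_llava/omg_llava_7b_finetune_8gpus.pth \
--output-path ./work_dirs/region_cap_pred.json
python omg_llava/tools/evaluate_region_cap.py \
--results_dir ./work_dirs/region_cap_pred.json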