DeepLab v3+ in MXNet Gluon
- train_multi_gpu.py: multi-GPU training on the Pascal VOC dataset, with validation.
- train.py: single-GPU training on the Pascal VOC dataset, with validation.
- evaluate.py: single-GPU evaluation on Pascal VOC validation.
- extract_weights.py: converts the weights from the official model release.
- mylib: library-style clean code.
- workspace: the notebooks where I ran experiments, with messy stuff (ignore them).

- GPU version only, but it should be easy to modify into a CPU version.
- My running environment (not tested with other environments):
  - Python==3.6
  - MXNet>=1.2.0 (MXNet==1.3.0 for multi-GPU SyncBatchNorm)
  - gluoncv==0.3.0
  - TensorFlow==1.4.0, Keras==2.1.5 (for converting the weights)
- Download the dataset
git clone https://github.com/dmlc/gluon-cv
cd gluon-cv/scripts/datasets
python pascal_voc.py
Results of my port on Pascal VOC validation:
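Model | EvalOS (w/ or w/o inference tricks) | mIoU (%) |
---|---|---|
xception_coco_voc_trainaug (TF release) | 16 (w/o) | 82.20 |
xception_coco_voc_trainaug (TF release) | 8 (w/) | 83.58 |
xception_coco_voc_trainaug (MXNet porting) | 16 (w/o) | 79.19 |
xception_coco_voc_trainaug (MXNet porting) | 8 (w/o) | 81.82 |
xception_coco_voc_trainaug (MXNet finetune TrainOS=16) | 16 (w/o) | 82.75 |
xception_coco_voc_trainaug (MXNet finetune TrainOS=16) | 8 (w/o) | 82.56 |
xception_coco_voc_trainaug (MXNet finetune TrainOS=8) | 16 (w/o) | 82.02 |
xception_coco_voc_trainaug (MXNet finetune TrainOS=8) | 8 (w/o) | 83.14 |
xception_voc_trainaug (ImageNet pretrained only, without MSCOCO pretraining) | 16 (w/o) | 77.06 |
xception_voc_trainaug (ImageNet pretrained only, without MSCOCO pretraining) | 8 (w/o) | 76.44 |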
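
For readers unfamiliar with the metric, here is a minimal sketch of how mIoU can be computed from a confusion matrix. It is pure Python and only illustrative: the function name `mean_iou` and the toy matrix are assumptions, and the evaluation script in this repository may differ in details such as ignored labels and how statistics are accumulated over the dataset.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j (hypothetical layout for this sketch).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                                  # correctly labelled pixels
        fn = sum(confusion[c]) - tp                           # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp      # pixels wrongly labelled as c
        union = tp + fp + fn
        if union > 0:                                         # skip classes that never appear
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 3-class example (made-up numbers, not taken from the results table).
confusion = [
    [50, 5, 5],
    [10, 20, 0],
    [0, 0, 10],
]
miou = mean_iou(confusion)
print("mIoU: {:.2f}%".format(100 * miou))
```

The per-class IoU values are averaged with equal weight, so a rare class counts as much as a frequent one.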
Measured with batch statistics fixed (use_global_stats=True), just for reference.
Instance | GPUs | Pricing | Train OS | Speed | Train on train_aug | Eval on val | Time per epoch | Cost per epoch |
---|---|---|---|---|---|---|---|---|
p2.8xlarge | K80x8 | $7.20/h | 16 | 1.5s/b16 | 17.0min | 3.5min | 20.5min | $2.5 |
p2.8xlarge | K80x8 | $7.20/h | 8 | 3.4s/b16 | 37.5min | 10min (bug: GPUs are not fully utilized during eval) | 47.5min | $5.7 |
p3.8xlarge | V100x4 | $12.24/h | 16 | 0.5s/b16 | 5.5min | 0.7min | 6.2min | $1.3 |
p3.8xlarge | V100x4 | $12.24/h | 8 | 3.0s/b12 | 44.5min | 1.3min | 45.8min | $9.3 |
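
The last two columns follow from the others: time per epoch is training time plus evaluation time, and cost per epoch is that time multiplied by the hourly price. A small sketch that reproduces the figures, where the helper name `cost_per_epoch` is only illustrative and not part of this repository:

```python
def cost_per_epoch(train_min, eval_min, price_per_hour):
    """Return (total minutes per epoch, cost in dollars per epoch)."""
    total_min = train_min + eval_min
    return total_min, total_min / 60.0 * price_per_hour


# (instance, train OS, train minutes, eval minutes, price in $/h), copied from the table.
rows = [
    ("p2.8xlarge", 16, 17.0, 3.5, 7.20),
    ("p2.8xlarge", 8, 37.5, 10.0, 7.20),
    ("p3.8xlarge", 16, 5.5, 0.7, 12.24),
    ("p3.8xlarge", 8, 44.5, 1.3, 12.24),
]

for instance, train_os, train_min, eval_min, price in rows:
    total_min, cost = cost_per_epoch(train_min, eval_min, price)
    print("{} OS={}: {:.1f} min/epoch, ${:.1f}/epoch".format(
        instance, train_os, total_min, cost))
```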
- transfer all the weights
- add OS=8
- test IoU on Pascal VOC val
- add training scripts
- add multi-gpu training scripts
- train more and open source the best models
- VOCAug dataset pull request
- Model pull request
- Finish pull request to gluoncv
This repository is part of the MXNet summer code hosted by AWS, TuSimple and Jiangmen. Specifically, I would like to thank Hang Zhang (@AWS) and Hengchen Dai (@TuSimple) for their kind suggestions on tuning and implementation. I would also like to thank AWS for providing generous credits for tuning the computationally intensive models.