Official code of MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth
- Releasing training and testing code
- Adding dynamic point cloud fusion for T&T
- Releasing pre-trained models
```bash
git clone https://github.com/ewrfcas/MVSFormer.git
cd MVSFormer
pip install -r requirements.txt
```
We also highly recommend installing fusibile (https://github.com/YoYo000/fusibile) for depth fusion.
```bash
git clone https://github.com/YoYo000/fusibile.git
cd fusibile
cmake .
make
```
Tips: You should revise `CUDA_NVCC_FLAGS` in CMakeLists.txt according to the GPU device you use. We set `-gencode arch=compute_70,code=sm_70` instead of `-gencode arch=compute_60,code=sm_60` for V100 GPUs.
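If you are unsure which `-gencode` value to use, the flag can be derived mechanically from the GPU's compute capability. The helper below is a hypothetical illustration (it is not part of fusibile or this repo) of that mapping, e.g. V100 reports capability 7.0:

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build the CUDA_NVCC_FLAGS entry for a given compute capability.

    For example, V100 GPUs have compute capability (7, 0), which maps to
    '-gencode arch=compute_70,code=sm_70'.
    """
    cc = f"{major}{minor}"
    return f"-gencode arch=compute_{cc},code=sm_{cc}"

# V100 (capability 7.0)
print(gencode_flag(7, 0))  # -gencode arch=compute_70,code=sm_70
```

With PyTorch installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` tuple to feed into such a helper.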
- Download the preprocessed camera poses from the DTU training data, and the depth maps from Depths_raw.
- We also need the original rectified images from the official website.
- The DTU testing set can be downloaded from MVSNet.
```
dtu_training
├── Cameras
├── Depths
├── Depths_raw
└── DTU_origin/Rectified (downloaded from the official website with original image size)
```
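To catch path mistakes before a long training run, a small sanity check can verify that the expected DTU subdirectories exist. This helper is a hypothetical convenience, not part of the repo:

```python
import os

def check_dtu_layout(root: str) -> list:
    """Return the expected DTU subdirectories that are missing under `root`."""
    required = ["Cameras", "Depths", "Depths_raw", os.path.join("DTU_origin", "Rectified")]
    return [d for d in required if not os.path.isdir(os.path.join(root, d))]

# Example usage before launching training:
missing = check_dtu_layout("./dtu_training")
if missing:
    print("Missing directories:", missing)
```

The same pattern applies to the BlendedMVS and T&T layouts below, with their respective directory lists.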
Download the high-resolution images from BlendedMVS.
```
BlendedMVS_raw
├── 57f8d9bbe73f6760f10e916a
│   └── 57f8d9bbe73f6760f10e916a
│       └── 57f8d9bbe73f6760f10e916a
│           ├── blended_images
│           ├── cams
│           └── rendered_depth_maps
```
Download the preprocessed T&T dataset. Note that users should use the short depth range of the cameras. Run the evaluation script to produce the point clouds.
```
tankandtemples
├── advanced
│   ├── Auditorium
│   ├── Ballroom
│   ├── ...
│   └── Temple
└── intermediate
    ├── Family
    ├── Francis
    ├── ...
    ├── Train
    └── short_range_cameras
```
- DINO-small (https://github.com/facebookresearch/dino): Weight Link
- Twins-small (https://github.com/Meituan-AutoML/Twins): Weight Link
Training MVSFormer (Twins-based) on DTU with two 32GB V100 GPUs takes about 2 days. We set the max epoch to 15 on DTU, but our implementation achieved its best results around epoch 10. You are free to adjust the max epoch, but note that the learning-rate decay schedule may be affected.
```bash
CUDA_VISIBLE_DEVICES=0,1 python train.py --config configs/config_mvsformer.json \
                                         --exp_name MVSFormer \
                                         --data_path ${YOUR_DTU_PATH} \
                                         --DDP
```
For MVSFormer-P (frozen DINO-based):
```bash
CUDA_VISIBLE_DEVICES=0,1 python train.py --config configs/config_mvsformer-p.json \
                                         --exp_name MVSFormer-p \
                                         --data_path ${YOUR_DTU_PATH} \
                                         --DDP
```
The model should be fine-tuned on BlendedMVS before testing on T&T.
```bash
CUDA_VISIBLE_DEVICES=0,1 python train.py --config configs/config_mvsformer_blendmvs.json \
                                         --exp_name MVSFormer-blendedmvs \
                                         --data_path ${YOUR_BLENDEDMVS_PATH} \
                                         --dtu_model_path ${YOUR_DTU_MODEL_PATH} \
                                         --DDP
```
Our pretrained models will be released soon!
For testing on DTU:
```bash
CUDA_VISIBLE_DEVICES=0 python test.py --dataset dtu --batch_size 1 \
                                      --testpath ${dtu_test_path} \
                                      --testlist ./lists/dtu/test.txt \
                                      --resume ${MODEL_WEIGHT_PATH} \
                                      --outdir ${OUTPUT_DIR} \
                                      --fusibile_exe_path ./fusibile/fusibile \
                                      --interval_scale 1.06 --num_view 5 \
                                      --numdepth 192 --max_h 1152 --max_w 1536 --filter_method gipuma \
                                      --disp_threshold 0.1 --num_consistent 2 --prob_threshold 0.5,0.5,0.5,0.5 \
                                      --combine_conf \
                                      --tmps 5.0,5.0,5.0,1.0
```
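The comma-separated `--prob_threshold` and `--tmps` values are per-stage settings, presumably one per stage of the coarse-to-fine cascade (four values here). The sketch below shows how such arguments can be split into per-stage floats; it is a hypothetical illustration, not the repo's actual argument-parsing code:

```python
def parse_stage_floats(arg: str, num_stages: int = 4) -> list:
    """Split a comma-separated CLI value like '0.5,0.5,0.5,0.5' into per-stage floats."""
    values = [float(v) for v in arg.split(",")]
    if len(values) != num_stages:
        raise ValueError(f"expected {num_stages} values, got {len(values)}")
    return values

prob_thresholds = parse_stage_floats("0.5,0.5,0.5,0.5")  # confidence filter per stage
temperatures = parse_stage_floats("5.0,5.0,5.0,1.0")     # softmax temperature per stage
print(prob_thresholds, temperatures)
```

Note that `--tmps 5.0,5.0,5.0,1.0` sharpens the depth distribution only at the last value, consistent with the paper's temperature-based depth prediction.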
For testing on T&T:
```bash
CUDA_VISIBLE_DEVICES=0 python test.py --dataset tt --batch_size 1 \
                                      --testpath ${tt_test_path}/intermediate(or advanced) \
                                      --testlist ./lists/tanksandtemples/intermediate.txt(or advanced.txt) \
                                      --resume ${MODEL_WEIGHT_PATH} \
                                      --outdir ${OUTPUT_DIR} \
                                      --interval_scale 1.0 --num_view 20 --numdepth 256 \
                                      --max_h 1088 --max_w 1920 --filter_method dpcd \
                                      --prob_threshold 0.5,0.5,0.5,0.5 \
                                      --use_short_range --combine_conf --tmps 5.0,5.0,5.0,1.0
```
Our code is partially based on CDS-MVSNet, DINO, and Twins.