Code repository for the paper:
Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation
Shanyan Guan, Jingwei Xu, Michelle Z. He, Yunbo Wang†, Bingbing Ni, Xiaokang Yang
We focus on reconstructing human meshes from out-of-domain videos. In our experiments, we train a source model (termed BaseModel) on Human 3.6M. To produce accurate human meshes on out-of-domain images, we optimize the BaseModel on target images via DynaBOA at test time. Below are comparison results between the BaseModel and the adapted model on Internet videos with varying camera parameters, motion, etc.
DynaBOA has been implemented and tested on Ubuntu 18.04 with Python 3.6.
Clone this repo:
git clone https://github.com/syguan96/DynaBOA.git
Install required packages:
conda create -n DynaBOA-env python=3.6
conda activate DynaBOA-env
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia
pip install -r requirements.txt
Install spacepy following https://spacepy.github.io/install_linux.html
Download the required files from File 1 and File 2. After unzipping them, rename File 1 to data and move the files in File 2 to data/retrieval_res. The final layout should look like this:
|-- data
| |--dataset_extras
| | |--3dpw_0_0.npz
| | |--3dpw_0_1.npz
| | |--...
| |--retrieval_res
| | |--...
| |--smpl
| | |--...
| |--spin_data
| | |--gmm_08.pkl
| |--basemodel.pt
| |--J_regressor_extra.npy
| |--J_regressor_h36m.npy
| |--smpl_mean_params.npz
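Before running anything, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_data_entries` is hypothetical, and the list of entries is copied from the tree (contents elided with `...` are not checked).

```python
from pathlib import Path

# Entries copied from the directory tree above; names ending in "/" are folders.
EXPECTED = [
    "dataset_extras/",
    "retrieval_res/",
    "smpl/",
    "spin_data/gmm_08.pkl",
    "basemodel.pt",
    "J_regressor_extra.npy",
    "J_regressor_h36m.npy",
    "smpl_mean_params.npz",
]


def missing_data_entries(root="data"):
    """Return the expected entries that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    missing = []
    for entry in EXPECTED:
        path = root / entry
        found = path.is_dir() if entry.endswith("/") else path.is_file()
        if not found:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    missing = missing_data_entries("data")
    print("data/ layout looks complete" if not missing else "missing: %s" % missing)
```

An empty result means every listed file and folder was found; anything else names the entries to download or move.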
Download Human 3.6M using this tool, and then extract images by:
python process_data.py --dataset h36m
Download the 3DPW dataset. Then edit PW3D_ROOT in config.py.
Then, run:
bash run_on_3dpw.sh
| Method | Protocol | PA-MPJPE (mm) | MPJPE (mm) | PVE (mm) |
|---|---|---|---|---|
| SPIN | #PS | 59.2 | 96.9 | 135.1 |
| PARE | #PS | 46.4 | 79.1 | 94.2 |
| Mesh Graphormer | #PS | 45.6 | 74.7 | 87.7 |
| DynaBOA (Ours) | #PS | 40.4 | 65.5 | 82.0 |
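For readers new to these metrics: MPJPE is the mean Euclidean distance between predicted and ground-truth 3D joints, PA-MPJPE is the same error after a rigid (Procrustes) alignment of the prediction to the ground truth, and PVE is the analogous error over mesh vertices. Lower is better for all three. The sketch below illustrates MPJPE only; it is not the repository's evaluation code, and the function name and toy coordinates are made up for illustration.

```python
import math


def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding 3D joints (illustrative helper)."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = 0.0
    for p, g in zip(pred, gt):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(p, g)))
    return total / len(pred)


# Toy example with two joints; coordinates are invented, in millimetres.
pred = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
gt = [(0.0, 30.0, 40.0), (100.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # 25.0: one joint is off by 50 mm, the other is exact
```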
Place the videos in a folder, and set InternetData_ROOT in config.py to that folder's path.
Then extract images by:
python vid2img.py
The images are saved to InternetData_ROOT/images.
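If you want to see what frame extraction involves, or need to do it outside the repository, the sketch below builds one `ffmpeg` command per video. It is an assumption-laden illustration rather than a description of vid2img.py: it assumes `ffmpeg` is installed, and the per-video subfolder, the `%06d.png` naming, and the list of extensions are all hypothetical choices.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to your data.
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}


def ffmpeg_commands(video_dir, image_dir):
    """Build one ffmpeg command per video in video_dir (hypothetical helper).

    Frames for <name>.<ext> are written to image_dir/<name>/000001.png, ...
    This layout is an assumption, not necessarily what vid2img.py produces.
    """
    commands = []
    for video in sorted(Path(video_dir).iterdir()):
        if video.suffix.lower() not in VIDEO_EXTS:
            continue  # skip non-video files and sub-folders
        out_dir = Path(image_dir) / video.stem
        commands.append(["ffmpeg", "-i", str(video), str(out_dir / "%06d.png")])
    return commands
```

Each command can be executed with `subprocess.run(cmd, check=True)` after creating its output folder, since ffmpeg does not create missing directories.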
We need 2D keypoint annotations to compute a bounding box around the person and to constrain the optimization process. We use AlphaPose to detect the person's 2D keypoints; see the AlphaPose installation instructions. After installing AlphaPose, you can use it to detect 2D keypoints. For example:
# go to the AlphaPose directory
python scripts/demo_inference.py --indir $IMAGES_DIR --outdir $RES_DIR --cfg configs/coco/resnet/256x192_res152_lr1e-3_1x-duc.yaml --checkpoint pretrained_models/fast_421_res152_256x192.pth --save_video --save_img --flip --min_box_area 300
$IMAGES_DIR is the directory of images to be evaluated, and $RES_DIR is the directory in which the detected 2D keypoints are saved.
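To illustrate how detected 2D keypoints yield a bounding box, here is a minimal sketch. It is not the repository's implementation: the function name, the confidence threshold of 0.3, and the 1.2 padding factor are assumptions chosen for illustration.

```python
def bbox_from_keypoints(keypoints, conf_thresh=0.3, padding=1.2):
    """Square box around confident keypoints (illustrative helper).

    keypoints: iterable of (x, y, confidence) triples.
    Returns ((center_x, center_y), side_length).
    """
    points = [(x, y) for x, y, c in keypoints if c > conf_thresh]
    if not points:
        raise ValueError("no keypoint above the confidence threshold")
    xs, ys = zip(*points)
    center = ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)
    # Use the longer side so the box stays square, then pad it slightly.
    side = padding * max(max(xs) - min(xs), max(ys) - min(ys))
    return center, side


# The third keypoint has zero confidence and is ignored.
keypoints = [(100, 50, 0.9), (140, 250, 0.8), (0, 0, 0.0)]
center, side = bbox_from_keypoints(keypoints)
print(center, side)
```

Filtering by confidence keeps undetected joints, which detectors often report at the origin, from inflating the box.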
OpenPose can also detect accurate 2D keypoints. If you use OpenPose, detect keypoints in the BODY_25 format.
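OpenPose writes one JSON file per frame, in which each detected person has a flat `pose_keypoints_2d` list of x, y, confidence values (75 numbers for BODY_25). The sketch below reshapes that list into 25 triples. The helper name is hypothetical, taking only the first detected person is an assumption, and how the repository itself reads these files is not shown here.

```python
import json


def load_body25(json_text):
    """Return the first person's BODY_25 keypoints as 25 (x, y, confidence) tuples.

    Returns None when no person was detected. Using only the first person is an
    assumption made for this sketch.
    """
    people = json.loads(json_text).get("people", [])
    if not people:
        return None
    flat = people[0]["pose_keypoints_2d"]
    if len(flat) != 25 * 3:
        raise ValueError("expected 75 values for BODY_25, got %d" % len(flat))
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Synthetic frame: 25 keypoints with made-up coordinates.
frame = json.dumps({"people": [{"pose_keypoints_2d": [float(i) for i in range(75)]}]})
keypoints = load_body25(frame)
print(len(keypoints), keypoints[0])  # 25 (0.0, 1.0, 2.0)
```

The length check catches output produced with a different pose model, such as the 18-keypoint COCO format, before it reaches later steps.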
python process_data.py --dataset internet
bash run_on_internet.sh
- DynaBOA for the internet data.
- DynaBOA for MPI-INF-3DHP and SURREAL
We borrow some code from SPIN and VIBE. Learn2learn is useful for implementing bilevel optimization.