Multi-view Hand Reconstruction with a Point-Embedded Transformer

Lixin Yang · Licheng Zhong · Pengxiang Zhu · Xinyu Zhan · Junxiao Kong . Jian Xu . Cewu Lu

POEM is a generalizable multi-view hand mesh reconstruction (HMR) model designed for practical use in real-world hand motion capture scenerios. It embeds a static basis point within the multi-view stereo space to serve as medium for fusing features across different views. To infer accurate 3D hand mesh from multi-view images, POEM introduce a point-embedded transformer decoder. By employing a combination of five large-scale multi-view datasets and sufficient data augmentation, POEM demonstrates superior generalization ability in real-world applications.

🕹️ Instructions

See docs/installation.md to setup the environment and install all the required packages.
See docs/datasets.md to download all the datasets and additional assets required.

🏃 Training and Evaluation

Available models

We provide four models with different configurations for training and evaluation. We have evaluated the models on multiple datasets.

set ${MODEL} as one in [small, medium, medium_MANO, large].
set ${DATASET} as one in [HO3D, DexYCB, Arctic, Interhand, Oakink, Freihand].

Download the pretrained checkpoints at 🔗ckpt_release and move the contents to ./checkpoints.

Command line arguments

-g, --gpu_id, visible GPUs for training, e.g. -g 0,1,2,3. evaluation only supports single GPU.
-w, --workers, num_workers in reading data, e.g. -w 4.
-p, --dist_master_port, port for distributed training, e.g. -p 60011, set different -p for different training processes.
-b, --batch_size, e.g. -b 32, default is specified in config file, but will be overwritten if -b is provided.
--cfg, config file for this experiment, e.g. --cfg config/release/train_${MODEL}.yaml.
--exp_id specify the name of experiment, e.g. --exp_id ${EXP_ID}. When --exp_id is provided, the code requires that no uncommitted change is remained in the git repo. Otherwise, it defaults to 'default' for training and 'eval_{cfg}' for evaluation. All results will be saved in exp/${EXP_ID}*{timestamp}.
--reload, specify the path to the checkpoint (.pth.tar) to be loaded.

Evaluation

Specify the ${PATH_TO_CKPT} to ./checkpoints/${MODEL}.pth.tar. Then, run the following command. Note that we essentially modify the config file in place to suit different configuration settings. view_min and view_max specify the range of views fed into the model. Use --draw option to render the results, note that it is incompatible with the computation of auc metric.

$ python scripts/eval_single.py --cfg config/release/eval_single.yaml
                                -g ${gpu_id}
                                --reload ${PATH_TO_CKPT}
                                --dataset ${DATASET}
                                --view_min ${MIN_VIEW}
                                --view_max ${MAX_VIEW}
                                --model ${MODEL}

The evaluation results will be saved at exp/${EXP_ID}_{timestamp}/evaluations.

Training

We have used the mixature of multiple datasets packed by webdataset for training. Excecute the following command to train a specific model on the provided dataset.

$ python scripts/train_ddp_wds.py --cfg config/release/train_${MODEL}.yaml -g 0,1,2,3 -w 4

Tensorboard

$ cd exp/${EXP_ID}_{timestamp}/runs/
$ tensorboard --logdir .

Checkpoint

All the checkpoints during training are saved at exp/${EXP_ID}_{timestamp}/checkpoints/, where ../checkpoints/checkpoint records the most recent checkpoint.

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms in the LICENSE.

Citation

@misc{yang2024multiviewhandreconstructionpointembedded,
      title={Multi-view Hand Reconstruction with a Point-Embedded Transformer}, 
      author={Lixin Yang and Licheng Zhong and Pengxiang Zhu and Xinyu Zhan and Junxiao Kong and Jian Xu and Cewu Lu},
      year={2024},
      eprint={2408.10581},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.10581}, 
}

For more questions, please contact Lixin Yang: [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
assets		assets
config		config
docs		docs
lib		lib
prepare		prepare
scripts		scripts
server_tool		server_tool
transform		transform
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
ddp_python		ddp_python
environment.yml		environment.yml
pyrightconfig.json		pyrightconfig.json
requirements.txt		requirements.txt
server.py		server.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-view Hand Reconstruction with a Point-Embedded Transformer

🕹️ Instructions

🏃 Training and Evaluation

Available models

Command line arguments

Evaluation

Training

Tensorboard

Checkpoint

License

Citation

About

Releases

Packages

Languages

License

kelvin34501/POEM-v2

Folders and files

Latest commit

History

Repository files navigation

Multi-view Hand Reconstruction with a Point-Embedded Transformer

🕹️ Instructions

🏃 Training and Evaluation

Available models

Command line arguments

Evaluation

Training

Tensorboard

Checkpoint

License

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages