3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann, CGIT Lab at University of Southern California
[paper] [video] [project page]
News [Dec 11, 2024]: Added details on training the UV-texture GAN. See the section "Training UV-texture GAN".
News [Jul 10, 2022]: Added a simplified API for getting 3D landmarks, face mesh, and face pose in only one line. See "Simplified API". It is convenient if you simply want to plug this method into your work.
News: Our new work [Cross-Modal Perceptionist], which is based on this SynergyNet project, was accepted to CVPR 2022.
👍 SOTA on 3D facial alignment, face orientation estimation, and 3D face modeling.
👍 Fast inference at 3000 fps on a laptop RTX 2080.
👍 Simple implementation with only widely used operations.
(This project is built/tested on Python 3.8 and PyTorch 1.9 on a compatible GPU)
-
Clone
git clone https://github.com/choyingw/SynergyNet
cd SynergyNet
-
Use conda
conda create --name SynergyNet
conda activate SynergyNet
-
Install pre-requisite common packages
PyTorch 1.9 (should also be compatible with 1.0+ versions), Torchvision, OpenCV, SciPy, Matplotlib, Cython
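A quick way to confirm that the packages listed above are importable is to probe for them before running anything else. The snippet below is a minimal sketch; the helper name `missing_modules` is hypothetical and not part of the repository, and the module names are the usual import names for these packages (OpenCV is imported as `cv2`).

```python
import importlib.util

# Usual import names for the packages listed above (OpenCV is imported as "cv2").
REQUIRED_MODULES = ("torch", "torchvision", "cv2", "scipy", "matplotlib", "Cython")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment.

    Hypothetical helper for illustration; it only checks that a module can be
    located, not that its version is compatible.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: print(missing_modules()) lists whatever still needs to be installed.
```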
-
Download data [here] and [here]. Extract these data under the repo root.
These data are processed from [3DDFA] and [FSA-Net].
Download pretrained weights [here]. Put the model under 'pretrained/'
-
Compile Sim3DR and FaceBoxes:
cd Sim3DR
./build_sim3dr.sh
cd ../FaceBoxes
./build_cpu_nms.sh
cd ..
-
Inference
python singleImage.py -f img
The default inference requires a compatible GPU. To run on a CPU, comment out the .cuda() calls and load the pretrained weights onto the CPU.
We provide a simple API for convenient usage if you want to plug this method into your work.
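For CPU-only runs, loading the weights onto the CPU can be done with the `map_location` argument of `torch.load`. This is a minimal sketch, assuming the weights are a regular PyTorch checkpoint file; the helper name `load_checkpoint_cpu` is hypothetical, and the layout of the checkpoint's contents is not assumed here.

```python
import torch


def load_checkpoint_cpu(path):
    """Load a PyTorch checkpoint with every tensor mapped to CPU memory.

    Hypothetical helper, not part of the repository.
    """
    # map_location remaps tensors that were saved from a GPU onto the CPU,
    # so no CUDA device is needed to read the file.
    return torch.load(path, map_location=torch.device("cpu"))


# Example (assumed path; adjust to where the pretrained weights were placed):
# checkpoint = load_checkpoint_cpu("pretrained/best.pth.tar")
```

The model itself must also stay on the CPU, which is why the .cuda() calls need to be commented out.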
import cv2
from synergy3DMM import SynergyNet
model = SynergyNet()
I = cv2.imread("<your image path>")  # replace the placeholder with the path to your image
# get landmark [[y, x, z], 68 (points)], mesh [[y, x, z], 53215 (points)], and face pose (Euler angles [yaw, pitch, roll] and translation [y, x, z])
lmk3d, mesh, pose = model.get_all_outputs(I)
We provide a simple script in singleImage_simple.py.
We also provide a setup.py file. Run pip install -e . and you can then use from synergy3DMM import SynergyNet in other directories. Note that [3dmm_data] and the [pretrained weight] (put the model under 'pretrained/') need to be present.
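Since the import only works when the 3DMM data and the pretrained weights are in place, a small pre-flight check can save a confusing error. The sketch below assumes the data archive extracts to a folder named `3dmm_data` and that the weights file is `pretrained/best.pth.tar` (the path used in the benchmarking command); both names and the helper `missing_assets` are assumptions for illustration.

```python
from pathlib import Path

# Assumed locations, relative to the directory the code is run from.
REQUIRED_ASSETS = ("3dmm_data", "pretrained/best.pth.tar")


def missing_assets(root="."):
    """Return the required relative paths that do not exist under `root`.

    Hypothetical helper, not part of the repository.
    """
    root = Path(root)
    return [rel for rel in REQUIRED_ASSETS if not (root / rel).exists()]


# Example: print(missing_assets(".")) lists anything that still needs downloading.
```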
-
Follow Single Image Inference Demo: Step 1-4
-
Benchmarking
python benchmark.py -w pretrained/best.pth.tar
Printed results and visualizations of the first 50 examples are stored under 'results/' (see 'demo/' for some pre-generated samples as references).
Update: the best head pose estimation [pretrained model] reaches a mean MAE of 3.31, which is better than the number reported in the paper (3.35). Use -w to load different pretrained models.
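For readers unfamiliar with the two metrics, the sketch below shows how a mean MAE over Euler angles and a normalized mean error (NME) over landmarks are typically computed. It is not the code in benchmark.py: the function names are hypothetical, angle wrap-around is ignored, and the normalization length passed to `nme` is left to the caller (for AFLW2000-3D it is commonly the square root of the face bounding-box area, which is an assumption here).

```python
import math


def mean_euler_mae(pred, gt):
    """Return ([MAE_yaw, MAE_pitch, MAE_roll], mean of the three).

    `pred` and `gt` are equally long sequences of (yaw, pitch, roll) in degrees.
    """
    n = len(pred)
    per_angle = [
        sum(abs(p[k] - g[k]) for p, g in zip(pred, gt)) / n for k in range(3)
    ]
    return per_angle, sum(per_angle) / 3.0


def nme(pred_pts, gt_pts, norm):
    """Mean Euclidean landmark error divided by a normalization length."""
    dists = [math.dist(p, g) for p, g in zip(pred_pts, gt_pts)]
    return sum(dists) / len(dists) / norm


# Toy example with made-up numbers (not results from the paper):
per_angle, mean_mae = mean_euler_mae([(10, 0, 0), (0, 4, 0)], [(8, 0, 0), (0, 0, 2)])
toy_nme = nme([(0, 0), (3, 4)], [(0, 0), (0, 0)], norm=10.0)
```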
-
Follow Single Image Inference Demo: Step 1-4.
-
Download the training data train_aug_120x120.zip from [3DDFA] and extract the zip file under the root folder (it contains about 680K images).
-
bash train_script.sh
-
Please refer to train_script.sh for hyperparameters such as learning rate, epochs, or GPU device. The default settings take ~19G of memory on a 3090 GPU and about 6 hours of training. If your GPU has less memory than this, please decrease the batch size and learning rate proportionally.
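Decreasing the batch size and learning rate proportionally amounts to keeping their ratio fixed. A minimal sketch follows; the function name is hypothetical and the numbers in the example are illustrative only, not the defaults in train_script.sh.

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate linearly with the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Illustrative values only: a quarter of the batch size gives a quarter of the rate.
new_lr = scale_learning_rate(base_lr=0.08, base_batch_size=1024, new_batch_size=256)
```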
-
Follow Single Image Inference Demo: Step 1-5.
-
Download the artistic faces data [here], which are from [AF-Dataset]. Download our UV maps [here], predicted by the UV-texture GAN. Extract them under the root folder.
-
python artistic.py -f art-all --png (whole folder)
python artistic.py -f art-all/122.png (single image)
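The two invocations above differ only in whether -f points at a folder or at a single file. The sketch below shows one way a script might expand such an argument into a list of images. It is not the code in artistic.py; `collect_images` and `IMAGE_SUFFIXES` are hypothetical names, and the set of accepted extensions is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions for illustration.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_images(target):
    """Return a sorted list of image paths for a folder, or [path] for a file."""
    path = Path(target)
    if path.is_dir():
        # Folder mode: keep only files with a known image extension.
        return sorted(p for p in path.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    if path.is_file():
        # Single-image mode.
        return [path]
    raise FileNotFoundError(f"No such file or folder: {target}")
```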
Note that this artistic face dataset contains faces at many different levels and styles of abstraction. If a testing image is close to a real face, the result is much better than for highly abstract samples.
-
Follow Single Image Inference Demo: Step 1-5.
-
Download our UV maps, predicted by the UV-texture GAN, and the real face images for AFLW2000-3D [here]. Extract them under the root folder.
-
python uv_texture_realFaces.py -f texture_data/real --png (whole folder)
python uv_texture_realFaces.py -f texture_data/real/image00002_real_A.png (single image)
The results (3D meshes and renderings) are stored under 'inference_output'
-
Acquire the AFLW2000-3D dataset and use the [MGC-Net] test pipeline to get UV textures for the AFLW2000 images.
-
Use [Pix2Pix] and train an LSGAN with the un-paired loss, following their training recipe. In the input layer, concatenate the mean UV texture with the image, and add the mean texture back through a shortcut at the output of the generator.
-
The mean UV texture can be obtained from the original BFM set or from [face3d].
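The generator modification described in the steps above (concatenating the mean UV texture with the image at the input, then adding it back at the output) can be sketched as a thin wrapper around any image-to-image generator. This is an illustration under assumptions, not the authors' training code: `MeanTextureGenerator` and `lsgan_loss` are hypothetical names, `backbone` stands in for a Pix2Pix-style generator that maps 6 channels to 3, and the image and mean texture are assumed to share the same spatial size.

```python
import torch
import torch.nn as nn


class MeanTextureGenerator(nn.Module):
    """Wrap a generator so it predicts a residual on top of the mean UV texture.

    Hypothetical wrapper: `backbone` maps 6 input channels to 3 output channels.
    """

    def __init__(self, backbone, mean_texture):
        super().__init__()
        self.backbone = backbone
        # Buffer, not parameter: the mean texture of shape (3, H, W) stays fixed.
        self.register_buffer("mean_texture", mean_texture)

    def forward(self, image):
        # Repeat the mean texture over the batch dimension.
        mean = self.mean_texture.unsqueeze(0).expand(image.size(0), -1, -1, -1)
        # Input layer: concatenate image and mean texture along channels.
        x = torch.cat([image, mean], dim=1)
        # Output: shortcut-add the mean texture to the generator prediction.
        return self.backbone(x) + mean


def lsgan_loss(prediction, is_real):
    """Least-squares GAN loss: MSE against a target of 1 (real) or 0 (fake)."""
    target = torch.full_like(prediction, 1.0 if is_real else 0.0)
    return nn.functional.mse_loss(prediction, target)
```

With this arrangement the backbone only has to learn the deviation from the mean texture, which is the usual motivation for a residual shortcut.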
We show a comparison with [DECA] using the top-3 largest roll angle samples in AFLW2000-3D.
Facial alignment on AFLW2000-3D (NME of facial landmarks):
Face orientation estimation on AFLW2000-3D (MAE of Euler angles):
Results on artistic faces:
Related Project
[Cross-Modal Perceptionist] (analysis on relation for voice and 3D face)
Bibtex
If you find our work useful, please consider citing it.
@INPROCEEDINGS{wu2021synergy,
author={Wu, Cho-Ying and Xu, Qiangeng and Neumann, Ulrich},
booktitle={2021 International Conference on 3D Vision (3DV)},
title={Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry},
year={2021}
}
Acknowledgement
The project is developed on [3DDFA] and [FSA-Net]. We thank the authors for their wonderful work. We also thank [3DDFA-V2] for the face detector and rendering code.