This is a simple demo that includes face detection and face alignment, with some optimizations made to improve the results.
The keypoint model encodes and decodes the x and y coordinates using a heatmap plus x and y offsets, achieving SOTA on the WFLW dataset. As in object detection, the heatmap predicts which points on the feature map are positive samples, shown as highlighted areas, while the x and y offsets predict the exact coordinates of those positive samples. It achieves an NME of 3.95 on WFLW with no external data.
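The heatmap-plus-offset scheme described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code used in this repo: the function names, the `(K, H, W)` tensor layout, the stride of 4, and the one-hot heatmap (a real training target would typically be a smoothed blob) are all hypothetical.

```python
import numpy as np


def encode_keypoints(points, height, width, stride=4):
    """Turn (K, 2) image-space points into heatmap and offset maps.

    Assumed layout: one (height, width) map per keypoint, stride = image / feature map.
    """
    num_kp = len(points)
    heatmap = np.zeros((num_kp, height, width), dtype=np.float32)
    offset_x = np.zeros_like(heatmap)
    offset_y = np.zeros_like(heatmap)
    for k, (x, y) in enumerate(points):
        gx, gy = x / stride, y / stride      # position in feature-map units
        col, row = int(gx), int(gy)          # cell that becomes the positive sample
        heatmap[k, row, col] = 1.0           # one-hot here for simplicity
        offset_x[k, row, col] = gx - col     # fractional part lost by the grid
        offset_y[k, row, col] = gy - row
    return heatmap, offset_x, offset_y


def decode_keypoints(heatmap, offset_x, offset_y, stride=4):
    """Recover (K, 2) image-space points from heatmap and offset maps."""
    num_kp, height, width = heatmap.shape
    # Strongest response per keypoint on the flattened map.
    flat_idx = heatmap.reshape(num_kp, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    kp = np.arange(num_kp)
    # Add the predicted sub-cell offset, then scale back to image pixels.
    x = (cols + offset_x[kp, rows, cols]) * stride
    y = (rows + offset_y[kp, rows, cols]) * stride
    return np.stack([x, y], axis=1)
```

The heatmap alone can only locate a point to the nearest feature-map cell; reading the offsets at the peak is what restores sub-cell precision.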
Click the GIF to see the video:
- PyTorch
- onnxruntime
- opencv
- easydict
Refer to TRAIN/face_landmark/README.md to train the model.
WFLW | Input size | NME | FLOPs (G) | Params (M) | Pose | Exp. | Ill. | Mu. | Occ. | Blur | Pretrained |
---|---|---|---|---|---|---|---|---|---|---|---|
Student | 128x128 | 4.80 | 0.35 | 3.25 | 8.53 | 5.00 | 4.61 | 4.81 | 5.80 | 5.36 | skps |
Teacher | 128x128 | 4.17 | 1.38 | 11.53 | 7.14 | 4.32 | 4.01 | 4.03 | 4.98 | 4.68 | skps |
Student | 256x256 | 4.35 | 1.39 | 3.25 | 7.53 | 4.52 | 4.16 | 4.21 | 5.34 | 4.93 | skps |
Teacher | 256x256 | 3.95 | 5.53 | 11.53 | 7.00 | 4.00 | 3.81 | 3.78 | 4.85 | 4.54 | skps |
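For readers unfamiliar with the metric, here is a minimal sketch of how an NME value like those in the table is typically computed for one face. It assumes normalization by the inter-ocular distance and a result expressed in percent; the function name and the index arguments are hypothetical, and the evaluation code in this repo may differ. For the 98-point WFLW annotation, the outer eye corners are commonly taken as indices 60 and 72, but treat that as an assumption to verify.

```python
import numpy as np


def nme(pred, gt, left_idx, right_idx):
    """Normalized mean error for one face, in percent.

    pred, gt: (N, 2) arrays of predicted and ground-truth landmarks.
    left_idx, right_idx: indices of the two landmarks used for normalization
    (assumed here to be the outer eye corners).
    """
    norm = np.linalg.norm(gt[left_idx] - gt[right_idx])  # inter-ocular distance
    per_point = np.linalg.norm(pred - gt, axis=1)        # Euclidean error per landmark
    return float(per_point.mean() / norm * 100.0)
```

A dataset-level score would then be the average of this value over all test faces.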
I will release a new model when a better one is available. The 7.5K training samples are not enough for a very good model; please label more data if needed.
- Pretrained models are in ./pretrained; for ease of use, we have converted them to MNN.
- run
python demo.py --cam_id 0
to use a camera, or
python demo.py --video test.mp4
to run detection on a video, or
python demo.py --img_dir ./test
to run detection on a directory of images (no tracking), or
python demo.py --video test.mp4 --mask True
if you want a face mask.
# Usage from code:
from lib import FaceAna

facer = FaceAna()
# `image` is a frame you have already loaded (for example with OpenCV)
boxes, landmarks, _ = facer.run(image)