
PyTorch implementation of "Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters" predicting facial landmarks with up to 400 FPS


shapenet


This repository contains the PyTorch implementation of our paper "Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters".

Installation

From Binary:

pip install shapenet

From Source:

pip install git+https://github.com/justusschock/shapenet

Demo

Demonstration videos comparing our method to dlib are available, both as an overlay and as a side-by-side view.

Usage

By Scripts

For convenience we provide several scripts to preprocess the data, train networks, predict with trained networks, and export networks via torch.jit. To get a list of the required and accepted arguments, run any script with the -h flag.

Data Preprocessing

  • prepare_all_data: Prepares multiple datasets (the datasets to preprocess can be selected via arguments passed to this script)
  • prepare_cat_dset: Downloads and preprocesses the Cat dataset
  • prepare_helen_dset: Preprocesses an already downloaded ZIP file of the HELEN dataset (downloading a version that already contains the landmarks is recommended)
  • prepare_lfpw_dset: Preprocesses an already downloaded ZIP file of the LFPW dataset (downloading a version that already contains the landmarks is recommended)

Training

  • train_shapenet: Trains the shapenet with the configuration specified in a separate configuration file (example configurations for all available datasets are provided in the example_configs folder)

Prediction

  • predict_from_net: Predicts landmarks for all images in a given directory (assumes existing ground truths for cropping; otherwise, the cropping to the ground truth could be replaced by a detector)
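Since predict_from_net crops to the ground truth, the cropping step essentially amounts to computing a padded bounding box around the known landmarks. A minimal sketch of such a crop computation (a hypothetical helper for illustration, not the script's actual code):

```python
import numpy as np

def crop_box_from_landmarks(landmarks, padding=0.1):
    """Hypothetical helper: compute a square crop box around ground-truth
    landmarks, padded on each side by a fraction of the box size."""
    lm = np.asarray(landmarks, dtype=float)
    (x_min, y_min), (x_max, y_max) = lm.min(axis=0), lm.max(axis=0)
    # Square side covering both extents, enlarged by the padding fraction
    side = max(x_max - x_min, y_max - y_min) * (1.0 + 2.0 * padding)
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    half = side / 2.0
    return (cx - half, cy - half, cx + half, cy + half)
```

Without ground truths, the same box would instead come from a face (or cat) detector.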

JIT-Export

  • export_to_jit: Traces the given model and saves it as a jit-ScriptModule, which can be loaded from both Python and C++
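Conceptually, the export performs a torch.jit trace followed by a save. A minimal sketch with a toy stand-in model (the real script traces the actual trained shapenet; the tiny module below is purely illustrative):

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for a trained network -- NOT the actual shapenet architecture
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model.eval()

example_input = torch.rand(1, 1, 224, 224)      # dummy input of the expected shape
traced = torch.jit.trace(model, example_input)  # record the graph by tracing

out_path = os.path.join(tempfile.gettempdir(), "model.ptj")
traced.save(out_path)                 # ScriptModule file, loadable from Python and C++
reloaded = torch.jit.load(out_path)   # round-trip check from Python
```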

From Python

This implementation uses the delira framework for training and validation handling. It supports mixed-precision training and inference via NVIDIA/apex (which must be installed separately). Data handling is outsourced to shapedata.

The following gives a short overview of the packages and classes.

shapenet.networks

The networks subpackage contains the actual implementation of the shapenet, with bindings to integrate the ShapeLayer and other feature extractors (currently the ones registered in torchvision.models).

shapenet.layer

The layer subpackage contains the Python and C++ implementations of the ShapeLayer and the affine transformations. These are intended to be used as layers in shapenet.networks.
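To illustrate the kind of operation these layers perform, applying a 2D affine transform (scale/rotation matrix plus translation) to a set of landmark points can be sketched as follows (illustrative only, not the actual ShapeLayer or affine-layer implementation):

```python
import torch

def apply_affine(points, matrix, translation):
    """Apply a 2D affine transform to an (N, 2) set of landmark points.
    Illustrative sketch -- not the actual shapenet.layer implementation."""
    return points @ matrix.T + translation

points = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
scale = torch.tensor([[2.0, 0.0], [0.0, 2.0]])  # uniform scaling by 2
shift = torch.tensor([1.0, -1.0])               # translation
transformed = apply_affine(points, scale, shift)
```

In the network, such transform parameters are regressed per image and applied to the shape-model landmarks.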

shapenet.jit

The jit subpackage is a less flexible reimplementation of the subpackages shapenet.networks and shapenet.layer, used to export trained weights as a jit-ScriptModule.

shapenet.utils

The utils subpackage contains everything that does not fit into the scope of any other subpackage. Currently, it is mainly responsible for parsing configuration files.

shapenet.scripts

The scripts subpackage contains all scripts described above and their helper functions.

Pretrained Weights

Pretrained weights are currently available for grayscale faces and cats.

For these networks the image size is fixed to 224 × 224, and the pretrained weights can be loaded via torch.jit.load("PATH/TO/NETWORK/FILE.ptj"). Inputs must be torch.Tensors of dtype torch.float with shape (BATCH_SIZE, 1, 224, 224), normalized to the range (0, 1).
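Putting that together, inference with the pretrained weights can be sketched as below (the .ptj path is a placeholder for a downloaded weights file, so the load and forward calls are shown commented out):

```python
import torch

# Placeholder path -- substitute the downloaded pretrained weights file:
# model = torch.jit.load("PATH/TO/NETWORK/FILE.ptj")

# Build an input of the required form: float tensor of shape
# (BATCH_SIZE, 1, 224, 224) with values normalized to (0, 1),
# e.g. from an 8-bit grayscale image:
image = torch.randint(0, 256, (1, 1, 224, 224), dtype=torch.uint8)
net_input = image.float() / 255.0

# landmarks = model(net_input)  # predicted landmark coordinates
```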

Our Paper

If you use our code for your own research, please cite our paper:

@article{Kopaczka2019,
  title = {Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters},
  author = {Marcin Kopaczka and Justus Schock and Dorit Merhof},
  year = {2019},
  journal = {arXiv preprint}
}

The paper is available as a PDF on arXiv.