Welcome to HyperPose’s Documentation!¶
Installation¶
C++ Prediction Library Installation¶
Note that the C++ prediction library requires NVIDIA GPU acceleration. HyperPose is developed and frequently tested on Linux platforms (e.g., Ubuntu 18.04); hence, we recommend building HyperPose on Linux.
Container Installation (RECOMMENDED)¶
To ease installation, you can use the HyperPose library via our Docker image, where the environment (including pretrained models) is pre-installed.
Prerequisites¶
To test your docker environment compatibility and get related instructions:
wget https://raw.githubusercontent.com/tensorlayer/hyperpose/master/scripts/test_docker.py -qO- | python
Official Docker Image¶
NVIDIA docker support is required to execute our docker image.
The official image is on DockerHub.
# Pull the latest image.
docker pull tensorlayer/hyperpose
# Dive into the image’s interactive terminal. (Connect local camera and imshow window)
xhost +; docker run --rm --gpus all -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --device=/dev/video0:/dev/video0 --entrypoint /bin/bash tensorlayer/hyperpose
# For users without a camera or X11 server. You may simply run without cameras and imshow:
# docker run --rm --gpus all -it --entrypoint /bin/bash tensorlayer/hyperpose
Note that the entry point is the hyperpose-cli binary in the build directory (i.e., /hyperpose/build/hyperpose-cli).
Build docker image from source¶
# Enter the repository folder.
USER_DEF_NAME=my_hyperpose
docker build -t ${USER_DEF_NAME} .
docker run --rm --gpus all ${USER_DEF_NAME}
Build From Source¶
Prerequisites¶
C++ 17 Compiler. (g++7, clang++5.0, MSVC19.0 or newer)
CMake 3.5+
Third-Party
OpenCV3.2+. (OpenCV 4+ is highly recommended)
CUDA related:
(suggested) CUDA 10.2, CuDNN 8.2.0, TensorRT >= 7.1, <= 8.0.
(minimal) CUDA 10.0, CuDNN 7.6.5, TensorRT 7.0.
gFlags (for command-line tool/examples/tests)
Note
Packages of other versions might also work but are not tested.
TensorRT Tips
For Linux users, we highly recommend installing TensorRT system-wide. You can install TensorRT 7 via the Debian packages or the NVIDIA network repository (the CUDA and CuDNN dependencies will be installed automatically).
CUDA-CuDNN-TensorRT Compatibility
Each TensorRT version requires specific CUDA and CuDNN versions. For the CUDA and CuDNN requirements of TensorRT 7, please refer to this.
Build on Ubuntu 18.04¶
# >>> Install OpenCV3+ and other dependencies.
sudo apt -y install cmake libopencv-dev libgflags-dev
# !Note that the APT version of OpenCV (3.2) on Ubuntu 18.04 has some trouble with cameras; a newer version is suggested.
# You are highly recommended to build OpenCV 4+ from source, also for better performance.
# >>> Install dependencies to run the scripts in `${REPO}/scripts`
sudo apt install python3-dev python3-pip
# >>> Install CUDA/CuDNN/TensorRT: https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-debian
# >>> Build HyperPose
git clone https://github.com/tensorlayer/hyperpose.git
cd hyperpose
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release && cmake --build .
Build User Codes¶
You can directly write your own code and build it under the hyperpose repository.
Step 1: Write your own code in hyperpose/examples/user_codes with the suffix .cpp.
Step 2:
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_USER_CODES=ON # BUILD_USER_CODES is by default "ON"
cmake --build .
Step 3: Execute your codes!
Go to Quick Start to test your installation.
Python Training Library Installation¶
Configure CUDA environment¶
You can configure CUDA either via Anaconda or via your system-wide installation.
Using CUDA toolkits from Anaconda (RECOMMENDED)¶
Prerequisites
It is suggested to create a new conda environment matching the CUDA requirements.
# >>> create virtual environment
conda create -n hyperpose python=3.7 -y
# >>> activate the virtual environment, start installation
conda activate hyperpose
# >>> install cudatoolkit and cudnn library using conda
conda install cudatoolkit=10.0.130
conda install cudnn=7.6.0
Warning
It is also possible to install the CUDA dependencies without creating a new environment, but this might introduce environment conflicts.
conda install cudatoolkit=10.0.130
conda install cudnn=7.6.0
Using system-wide CUDA toolkits¶
Users may also directly depend on the system-wide CUDA and CuDNN libraries.
HyperPose has been tested on the environments below:
OS | NVIDIA Driver | CUDA Toolkit | GPU |
---|---|---|---|
Ubuntu 18.04 | 410.79 | 10.0 | Tesla V100-DGX |
Ubuntu 18.04 | 440.33.01 | 10.2 | Tesla V100-DGX |
Ubuntu 18.04 | 430.64 | 10.1 | TITAN RTX |
Ubuntu 18.04 | 430.26 | 10.2 | TITAN XP |
Ubuntu 16.04 | 430.50 | 10.1 | RTX 2080Ti |
Check CUDA/CuDNN versions
To check your CUDA version, run nvcc --version: the release line in the output indicates the installed CUDA version (in the example below, CUDA 11.2).
nvcc --version
# ========== Valid output looks like ==========
# nvcc: NVIDIA (R) Cuda compiler driver
# Copyright (c) 2005-2020 NVIDIA Corporation
# Built on Mon_Nov_30_19:08:53_PST_2020
# Cuda compilation tools, release 11.2, V11.2.67
# Build cuda_11.2.r11.2/compiler.29373293_0
To check your system-wide CuDNN version on Linux: the output (in the comment) shows that we have CuDNN 8.0.5.
ls /usr/local/cuda/lib64 | grep libcudnn.so
# === Valid output looks like ===
# libcudnn.so
# libcudnn.so.8
# libcudnn.so.8.0.5
Install HyperPose Python training library¶
Install with pip¶
To install a stable library from Python Package Index:
pip install -U hyperpose
Or you can install a specific release of hyperpose from GitHub, for example:
export HYPERPOSE_VERSION="2.2.0-alpha"
pip install -U https://github.com/tensorlayer/hyperpose/archive/${HYPERPOSE_VERSION}.zip
More GitHub releases and their versions can be found here.
Local installation¶
You can also install HyperPose directly from the GitHub repository; this is usually for developers.
# Install the source codes from GitHub
git clone https://github.com/tensorlayer/hyperpose.git
pip install -U -r hyperpose/requirements.txt
# Add `hyperpose/hyperpose` to `PYTHONPATH` to help python find it.
export HYPERPOSE_PYTHON_HOME=$(pwd)/hyperpose
export PYTHONPATH=$HYPERPOSE_PYTHON_HOME/python:${PYTHONPATH}
Check the installation¶
Let’s check whether HyperPose is installed by running the following commands:
python -c '
import tensorflow as tf # Test TensorFlow installation
import tensorlayer as tl # Test TensorLayer installation
assert tf.test.is_gpu_available() # Test GPU availability
import hyperpose # Test HyperPose import
'
Optional Setup¶
Extra configurations for exporting models¶
The HyperPose Python training library handles the whole pipeline for developing a pose estimation system, including training, evaluating and testing. Its goal is to produce a .npz file that contains the well-trained model weights.
For training alone, the environment configuration above is enough. However, most inference engines accept models in ProtoBuf or ONNX format. For example, the HyperPose C++ inference engine leverages TensorRT as the DNN engine, which takes ONNX models as inputs.
Thus, to deploy a trained model, one needs to convert it from the .npz weight file into the .pb or .onnx format, which requires the extra configuration below:
Converting a ProtoBuf model¶
To convert a model into ProtoBuf format, we use @tf.function to decorate the infer function of each model class; we can then use TensorFlow's get_concrete_function to construct the frozen computation graph and save it in ProtoBuf format.
We provide a command-line tool to facilitate the conversion. The only prerequisite of this tool is the TensorFlow library, which is installed as part of HyperPose's dependencies.
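For illustration, here is a minimal sketch of that freezing procedure (this is not the exact implementation of the provided tool; the model name, backbone and NCHW input shape below are assumptions for the example):

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from hyperpose import Config, Model

# Configure and build the model (in practice, the trained .npz weights would be restored first).
Config.set_model_name("MyLightweightOpenPose")
Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
config = Config.get_config()
model = Model.get_model(config)

@tf.function
def infer(x):
    # Decorated inference entry point; returns the model's output feature maps.
    return model.forward(x, is_train=False)

# Build a concrete function for a fixed (assumed) input shape, freeze the variables,
# and save the resulting graph in ProtoBuf format.
concrete = infer.get_concrete_function(tf.TensorSpec([1, 3, 368, 432], tf.float32))
frozen = convert_variables_to_constants_v2(concrete)
tf.io.write_graph(frozen.graph, "./save_dir", "frozen_model.pb", as_text=False)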
Converting an ONNX model¶
To convert a trained model into ONNX format, we first convert it into ProtoBuf format, and then convert the ProtoBuf model into ONNX format. This requires an additional library: tf2onnx, which converts TensorFlow's ProtoBuf models into ONNX.
To install tf2onnx, we simply run:
pip install -U tf2onnx
Extra configuration for distributed training with KungFu¶
The HyperPose Python training library can also perform distributed training with KungFu. To enable parallel training, please install KungFu according to its official instructions.
Get Started¶
Quick Start of Prediction Library¶
Prerequisites
Have the HyperPose Inference Library installed (HowTo).
Make sure python3 and python3-pip are installed. For Linux users, you can simply install them with apt:
sudo apt -y install subversion python3 python3-pip
Warning
The HyperPose Inference Library is mostly compatible with and tested under Linux, especially Ubuntu 18.04. Please use Ubuntu 18.04 or a Docker container for the best experience.
Data preparation¶
Test data¶
The script below downloads a folder called media/ and puts it in ${HyperPose_HOME}/data/media/.
# cd to the git repo.
sh scripts/download-test-data.sh
Manual installation
If you have trouble downloading the test data through the command line, you can manually download the data folder from LINK and put it in ${HyperPose_HOME}/data/media/.
Install test models¶
The following scripts will download pre-trained models into ${HyperPose_HOME}/data/models/.
# cd to the git repo. And download pre-trained models you want.
sh scripts/download-openpose-thin-model.sh # ~20 MB
sh scripts/download-tinyvgg-model.sh # ~30 MB (UFF model)
sh scripts/download-openpose-res50-model.sh # ~45 MB
sh scripts/download-openpose-coco-model.sh # ~200 MB
sh scripts/download-openpose-mobile-model.sh
sh scripts/download-tinyvgg-v2-model.sh
sh scripts/download-openpifpaf-model.sh # ~98 MB (OpenPifPaf)
sh scripts/download-ppn-res50-model.sh # ~50 MB (PoseProposal)
Manual installation
You can also manually download them from our Model Zoo on Google Drive.
Predict a sequence of images¶
Note for docker users
The following tutorial commands are based on the HyperPose command-line tool hyperpose-cli, which is also the entry point of the container (located at /hyperpose/build/hyperpose-cli). If you are using the container, please first enter it in interactive mode.
# Without imshow/camera functionality
docker run --rm --gpus all -it tensorlayer/hyperpose
# With imshow/camera functionality
xhost +; docker run --rm --gpus all -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --device=/dev/video0:/dev/video0 --entrypoint /bin/bash tensorlayer/hyperpose
# Once inside the container
cd /hyperpose/build
Using a fast model¶
# cd to your build directory.
# Predict all images in `../data/media`
./hyperpose-cli --source ../data/media --model ../data/models/lopps-resnet50-V2-HW=368x432.onnx --w 368 --h 432
# The source flag can be omitted as the default value is `../data/media`.
The output images will be dumped into the build folder by default. For more models, please look at /hyperpose/data/models; their names indicate their input tensor shapes.
Ignore error messages from uff models.
If you are using TinyVGG-V1-HW=256x384.uff (.uff models are going to be deprecated by TensorRT), you may see logging messages like ERROR: Tensor image cannot be both input and output. This is harmless; please just ignore it.
Table of flags for hyperpose-cli¶
Note that the entry point of our official Docker image is also hyperpose-cli in the /hyperpose/build folder.
Flag | Meaning | Default |
---|---|---|
model | Path to your model. | ../data/models/TinyVGG-V2-HW=342x368.onnx |
source | Path to your source. The source can be a folder path (automatically globs all images), a video path, an image path, or the keyword camera to open your camera. | ../data/media/video.avi |
post | Post-processing method. This can be paf or ppn. | paf |
keep_ratio | The DNN takes a fixed input size, so images must be resized to fit that resolution. To avoid distorting the original human scale, this flag resizes by padding, so inference keeps the original aspect ratio. (Good for accuracy) | true |
w | The input width of your model. Currently, the trained models we provide all have specific requirements for input resolution. | 432 (for the tiny-vgg model) |
h | The input height of your model. | 368 (for the tiny-vgg model) |
max_batch_size | Maximum batch size for the inference engine to execute. | 8 |
runtime | Which runtime type to use. This can be operator or stream. If you want to open your camera or produce an imshow window, please use operator. For better processing throughput on videos, please use stream. | operator |
imshow | Whether to open an imshow window. | true |
saving_prefix | The output media resource will be named after $(saving_prefix)_$(ID).$(format). | "output" |
alpha | The weight of the keypoint visualization. (from 0 to 1) | 0.5 |
logging | Whether to print the internal logging information. | false |
See also
Run ./hyperpose-cli --help for the usage.
Using OpenPose-based (PAF) models¶
./hyperpose-cli --model ../data/models/openpose-thin-V2-HW=368x432.onnx --w 432 --h 368
./hyperpose-cli --model ../data/models/openpose-coco-V2-HW=368x656.onnx --w 656 --h 368
Use PifPaf model¶
Set the --post flag to pifpaf to enable PifPaf post-processing.
./hyperpose-cli --model ../data/models/openpifpaf-resnet50-HW=368x432.onnx --w 368 --h 432 --post pifpaf
Convert models into TensorRT Engine Protobuf format¶
You may find that it takes minutes before the prediction really starts. This is because TensorRT profiles the model to build an optimized runtime engine.
You can pre-compile the model in advance to save this conversion time.
./example.gen_serialized_engine --model_file ../data/models/openpose-coco-V2-HW=368x656.onnx --input_width 656 --input_height 368 --max_batch_size 16
# You'll get ../data/models/openpose-coco-V2-HW=368x656.onnx.trt
# If you only want to do inference on single images (batch size = 1), please use `--max_batch_size 1`; this will improve the engine's performance.
# Use the converted model to do prediction
./hyperpose-cli --model ../data/models/openpose-coco-V2-HW=368x656.onnx.trt --w 656 --h 368
Caution
Currently, we support models in TensorRT float32 mode.
Other data types (e.g., int8) are not supported at this point (contributions are welcome!).
Predict a video using Operator API¶
./hyperpose-cli --runtime=operator --source=../data/media/video.avi
The output video will be in the build folder.
Predict a video using Stream API (higher throughput)¶
./hyperpose-cli --runtime=stream --source=../data/media/video.avi
# In stream API, the imshow functionality will be closed.
Play with the camera¶
./hyperpose-cli --source=camera
# Note that camera mode is not compatible with Stream API. If you want to do inference on your camera in real time, the Operator API is designed for it.
Quick Start of Training Library¶
Prerequisites
Make sure you have installed HyperPose Python Training Library (HowTo).
If you are using the command-line tools, make sure you are executing the scripts under the root directory of the project (to directly call train.py and eval.py).
Model training¶
The training procedure of HyperPose is as simple as 3 configuration steps:
choose the pose algorithm
set the model backbone
select target dataset
from hyperpose import Config, Model, Dataset
# set model name to distinguish different models
Config.set_model_name("MyLightweightOpenPose")
Config.set_model_type(Config.MODEL.LightweightOpenpose) # set pose algorithm
Config.set_model_backbone(Config.BACKBONE.Vggtiny) # set model backbone
Config.set_dataset_type(Config.DATA.MSCOCO) # set target dataset
# use one GPU for training
Config.set_train_type(Config.TRAIN.Single_train)
# configuration is done, get config object and assemble the system
config = Config.get_config()
model = Model.get_model(config)
dataset = Dataset.get_dataset(config)
Model.get_train(config)(model, dataset) # start training!
For each model, HyperPose will save all the related files in the directory ./save_dir/${MODEL_NAME}, where ${MODEL_NAME} is the name set via Config.set_model_name in the code sample above (i.e., “MyLightweightOpenPose”).
The directories regarding training results are listed below:
Folder Name | Path to what |
---|---|
model_dir | Model checkpoints. |
train_vis_dir | Visualized training samples for debugging. See debugging sample figure. |
eval_vis_dir | Visualized evaluation samples for debugging. See debugging sample figure. |
test_vis_dir | Visualized testing samples for debugging. See debugging sample figure. |
data_vis_dir | Visualized annotated dataset samples. See annotated sample figure. |
log.txt | Training logs (e.g., loss). |

Visualized training/evaluation/testing sample.¶

Visualized annotated dataset sample.¶
We also provide a helpful training command-line tool (train.py) to quickly train pose estimation models. For detailed usage, please refer to this.
Model evaluation¶
The evaluation procedure of HyperPose looks quite similar to the training one.
Given the model name, the model checkpoint will be loaded from ./save_dir/${MODEL_NAME}/model_dir/newest_model.npz.
from hyperpose import Config, Model, Dataset
Config.set_model_name("MyLightweightOpenPose")
Config.set_model_type(Config.MODEL.LightweightOpenpose) # set pose algorithm
Config.set_model_backbone(Config.BACKBONE.Vggtiny) # set model backbone
Config.set_dataset_type(Config.DATA.MSCOCO) # set target dataset
# configuration is done, get config object and assemble the system
config=Config.get_config()
model=Model.get_model(config)
dataset=Dataset.get_dataset(config)
Model.get_eval(config)(model, dataset) # start evaluation!
Then the integrated evaluation pipeline will start, and the final evaluation metrics will be output at the end.
Note
For the same model name, the algorithm, backbone, and dataset type are expected to be consistent between training and evaluation.
The evaluation metric follows the official evaluation metric of the given dataset.
Like the training command-line tool, we also have one for evaluation (eval.py).
Exporting a model¶
The trained model weights are saved as an NPZ (.npz) file. For further deployment, the NPZ weights can be converted into ONNX format.
To export a model trained by HyperPose, please follow these 2 steps:
Step 1: convert the trained NPZ model into ProtoBuf format¶
We first use the @tf.function decorator to produce the static computation graph and save it in ProtoBuf format.
We already provide a command-line script to facilitate the conversion, located at export_pb.py.
We can use export_pb.py to start the model conversion:
# FLAGS: --model_type=${ALGORITHM_TYPE} --model_name=${MODEL_NAME} --model_backbone={BACKBONE_TYPE}
python export_pb.py --model_name=MyLightweightOpenpose --model_type=LightweightOpenpose --model_backbone=Vggtiny
The ProtoBuf model will then be stored at ./save_dir/${MODEL_NAME}/frozen_${MODEL_NAME}.pb.
Step 2: convert the frozen ProtoBuf format model into ONNX format¶
Note
Make sure you have installed the extra dependencies for exporting models according to the training installation guide.
We use the tf2onnx library to convert the ProtoBuf format model into ONNX format.
However, to actually convert a model, we need to know its input/output node names.
After running Step 1, we should see output like:
...
Exported graph INPUT nodes: ['x']
Exported graph OUTPUT nodes: ['Identity', 'Identity_1']
In this example, we can see the names of the input/output nodes; we need to pass those names as arguments during the conversion.
# The input/output names of our example.
export INPUT_NODE_NAME=x
export OUTPUT_NODE_NAME0=Identity
export OUTPUT_NODE_NAME1=Identity_1
export OUTPUT_ONNX_MODEL=my_output_model.onnx
python -m tf2onnx.convert --graphdef frozen_${MODEL_NAME}.pb \
                          --output ${OUTPUT_ONNX_MODEL} \
                          --inputs ${INPUT_NODE_NAME}:0 \
                          --outputs ${OUTPUT_NODE_NAME0}:0,${OUTPUT_NODE_NAME1}:0
We will then see the converted ONNX model named ${OUTPUT_ONNX_MODEL} (‘my_output_model.onnx’ in our example).
Congratulations! Now you are able to use the ONNX model with the HyperPose prediction library.
Next step¶
For in-depth usage of HyperPose Training Library, please refer to our training tutorial.
Tutorials¶
Tutorial for Prediction Library¶
The prediction library of HyperPose provides 2 API styles:
Operator API: Imperative style. (more room for user manipulation)
Stream API: Declarative style. (faster and simpler)
This tutorial will show you how to use them in C++ step by step. For more detailed instructions, please refer to our C++ API documents.
End-2-end Prediction Using Stream API¶
In this section, we’ll try to process a video via Stream API.
First of all, please make sure you have the library successfully built (see installation). We encourage you to build the tutorial examples under the folder hyperpose/examples/user_codes.
cd examples/user_codes
touch main.cpp
# Open your editor and do coding.
cd ../..
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=RELEASE -DBUILD_USER_CODES=ON # BUILD_USER_CODES is by default on
cmake --build .
# Execute your codes.
Include the header¶
#include <hyperpose/hyperpose.hpp>
The file arrangement of include files:
hyperpose
hyperpose.hpp (include all necessary headers)
operator (headers about Operator API)
stream (headers about Stream API)
utility
Usually, you only need to include hyperpose/hyperpose.hpp.
Prepare Your Model¶
We support 3 types of model files:
Uff: Users need to specify the input/output nodes of the network compute graph.
ONNX: No input/output node information is required.
Cuda Engine Protobuf: When importing Uff / ONNX models, TensorRT will do profiling to build the "best" runtime engine. To save the building time, we can export the model to Cuda Engine Protobuf format and reload it in the next execution.
using namespace hyperpose;
// To use a Uff model, users need to specify the input/output nodes.
// Here, `image` is the input node name, and `outputs/conf` and `outputs/paf` are the output feature maps. (related to the PAF algorithm)
const dnn::uff uff_model{ "../data/models/TinyVGG-V1-HW=256x384.uff", "image", {"outputs/conf", "outputs/paf"} };
Create Input / Output Stream¶
We support std::vector<cv::Mat>, cv::Mat, and cv::VideoCapture as input stream sources.
We also support cv::VideoWriter and NameGenerator (a callable object which generates the next name for the output image) as output streams.
// For best performance, HyperPose only allows models that have a fixed input network resolution.
// What is "network resolution"? If the network input is in NCHW format, the "HW" part is the network resolution.
const cv::Size network_resolution{384, 256};
// * Input video.
auto capture = cv::VideoCapture("../data/media/video.avi");
// * Output video.
auto writer = cv::VideoWriter(
"output.avi",
capture.get(cv::CAP_PROP_FOURCC), capture.get(cv::CAP_PROP_FPS),
network_resolution); // Here we use the network resolution as the output video resolution.
Create DNN Engine and Post-processing Parser¶
// * Create TensorRT engine.
dnn::tensorrt engine(uff_model, network_resolution);
// * post-processing: Using paf.
parser::paf parser{};
Create Stream Scheduler¶
// * Create stream
auto stream = make_stream(engine, parser);
Connect the Stream¶
// * Connect input stream.
stream.async() << capture;
// * Connect output stream and wait.
stream.sync() >> writer;
We provide 2 ways of stream connection:
.async()
Input Stream: The stream scheduler pushes images asynchronously. (non-blocking)
Output Stream: The stream scheduler generates results asynchronously. (non-blocking)
.sync()
Input Stream: Blocking. This may cause a deadlock if you try to push a large number of images synchronously (the buffer queue has a fixed size).
Output Stream: Blocking until all outputs are generated.
We recommend setting inputs via async() and generating results via sync().
Full example¶
Full examples are available here.
Prediction Using Operator API¶
Preparation¶
#include <hyperpose/hyperpose.hpp>
int main() {
using namespace hyperpose;
const cv::Size network_resolution{384, 256};
const dnn::uff uff_model{ "../data/models/TinyVGG-V1-HW=256x384.uff", "image", {"outputs/conf", "outputs/paf"} };
// * Input video.
auto capture = cv::VideoCapture("../data/media/video.avi");
// * Output video.
auto writer = cv::VideoWriter(
"output.avi", capture.get(cv::CAP_PROP_FOURCC), capture.get(cv::CAP_PROP_FPS), network_resolution);
// * Create TensorRT engine.
dnn::tensorrt engine(uff_model, network_resolution);
// * post-processing: Using paf.
parser::paf parser{};
while (capture.isOpened()) {
// ..... Applying pose estimation in one batch.
}
}
Apply One Batch of Frames¶
Accumulate One Batch¶
std::vector<cv::Mat> batch;
// The .max_batch_size() of dnn::tensorrt is set in the initializer. (by default -> 8)
// initializer(model_config, input_size, max_batch_size = 8, dtype = float, factor = 1./255, flip_rgb = true)
for (int i = 0; i < engine.max_batch_size(); ++i) {
cv::Mat mat;
capture >> mat;
if (mat.empty()) // If the video ends, break.
break;
batch.push_back(mat);
}
// Now we got a batch of images. -> batch.
Get Feature/Activation Maps¶
// * TensorRT Inference.
std::vector<internal_t> feature_map_packets = engine.inference(batch);
// using internal_t = std::vector<feature_map_t>
// Here, feature_map_packets = batch_size * feature_map_count(conf and paf) * feature_map.
Get Human Topology¶
One image may contain many humans. So the return type is std::vector<human_t>
.
// * Paf.
std::vector<std::vector<human_t>> pose_vectors; // image_count * humans
for (auto& packet : feature_map_packets)
pose_vectors.push_back(parser.process(packet[0]/* conf */, packet[1] /* paf */));
Visualization¶
// * Visualization
for (size_t i = 0; i < batch.size(); ++i) {
cv::resize(batch[i], batch[i], network_resolution);
for (auto&& pose : pose_vectors[i])
draw_human(batch[i], pose); // Visualization.
writer << batch[i];
}
Full example¶
Full examples are available here.
Tutorial for Training Library¶
Overview¶
HyperPose Python Training library provides a one-step yet flexible platform for developing pose estimation models.
Based on the intended usage, there are two major categories of user requirements regarding developing a pose estimation system:
Adapting existing algorithms to specific deployment scenarios: e.g., selecting the pose estimation model architecture with the best accuracy under limited hardware resources.
Developing customized pose estimation algorithms: e.g., exploring new pose estimation algorithms with the help of the existing dataset generation pipeline and model architectures.
To meet these 2 kinds of user requirements, HyperPose provides both rich high-level APIs with integrated pipelines (for the first kind of requirement) and fine-grained APIs with in-depth customisation (for the second kind).
Model/Algorithm/Dataset Supports¶
5 model algorithm classes¶
Algorithm | Description |
---|---|
OpenPose | Original OpenPose algorithm. |
Lightweight OpenPose | A light-weight variant of OpenPose with an optimized prediction branch, designed for fast processing. |
MobilenetThin OpenPose | A light-weight variant of OpenPose with an adapted MobileNet backbone, featured with fast inference. |
Pose Proposal Network | A pose estimation algorithm which models keypoint detection as object detection with bounding boxes, featured with fast inference and post-processing. |
PifPaf | An accurate pose estimation algorithm that generates high-resolution confidence maps, featured with high accuracy on low-resolution images. |
10 model backbone classes¶
Model family | Backbone class APIs |
---|---|
Vgg Backbones | Config.BACKBONE.Vggtiny, Config.BACKBONE.Vgg16, Config.BACKBONE.Vgg19 |
Resnet Backbones | Config.BACKBONE.Resnet18, Config.BACKBONE.Resnet50 |
Mobilenet Backbones | Config.BACKBONE.Mobilenetv1, Config.BACKBONE.Mobilenetv2 |
2 dataset classes¶
Dataset name | Version | Size |
---|---|---|
MSCOCO | 2017 | 11’827 train images, 5’000 validation images, 40’670 test images |
MPII | 2014 (version published in 2014) | 25’000 train images, 3’000 validation images, 7’000 test images. |
Training Options¶
Parallel Training
Backbone Pretraining
Domain Adaptation
Extensions¶
Customized dataset
User-supplemented dataset
Users may supplement their self-collected data into the preset dataset generation pipeline for training and evaluation.
User-defined dataset
Users may define their own dataset class to take over the whole dataset generation pipeline.
Customized model
Users may define their own model class to take over the model forwarding and loss calculation procedure.
Customized pipeline
Users may use the provided pre-processors, post-processors and visualizers to assemble their own training or evaluation pipeline.
Integrated pipeline¶
HyperPose integrates the training, evaluating and testing pipelines with various high-level APIs for quickly adapting the existing pose estimation algorithms to customized usage scenarios.
The whole procedure can be divided into two parts:
In the first part, users use the set APIs of the Config module to set up the components of the pipeline, from general settings such as the algorithm type, network architecture and dataset type, to detailed settings including the training batch size, save interval and learning rate.
In the second part, users use the get APIs of the Model module and the Dataset module to assemble the system. After the configuration is finished, users get a config object containing all the configuration; passing the config object to the get APIs yields the configured model, dataset, and the train or evaluate pipeline.
The critical set APIs (necessary) are below:
Config.set_model_name
Receive a string, which is used to uniquely identify the model by name.
The model name is important, as all the files generated during the train, evaluate and test procedures of the specific model will be stored in the ./save_dir/${MODEL_NAME} directory.
Precisely, the related paths of the model named ${MODEL_NAME} are below:
Related paths to store files of the specific model.¶ Folder Name
Path to what
./save_dir/${MODEL_NAME}/model_dir
Model checkpoints.
./save_dir/${MODEL_NAME}/train_vis_dir
Directory to save the train visualization images.
./save_dir/${MODEL_NAME}/eval_vis_dir
Directory to save the evaluate visualization images.
./save_dir/${MODEL_NAME}/test_vis_dir
Directory to save the test visualization images.
./save_dir/${MODEL_NAME}/data_vis_dir
Directory to save the dataset label visualization images.
./save_dir/${MODEL_NAME}/frozen_${MODEL_NAME}.pb
The default path to save the exported ProtoBuf format model.
./save_dir/${MODEL_NAME}/log.txt
The default path to save the training logs (e.g., loss).
Config.set_model_type
Receive an Enum value from Config.MODEL, which is used to determine the algorithm type to use.
Available options are:
Available options of Config.set_model_type¶ Available option
Description
Config.MODEL.OpenPose
OpenPose algorithm
Config.MODEL.LightweightOpenpose
Lightweight OpenPose algorithm
Config.MODEL.MobilenetThinOpenpose
MobilenetThin OpenPose algorithm
Config.MODEL.PoseProposal
Pose Proposal Network algorithm
Config.MODEL.Pifpaf
Pifpaf algorithm
Config.set_model_backbone
Receive an Enum value from Config.BACKBONE, which is used to determine the network backbone to use.
Different backbones result in large differences in the required computation resources. Each algorithm type has a default model backbone, while HyperPose also provides other backbones as replacements.
Available options are:
Available options of Config.set_model_backbone¶ Available option
Description
Config.BACKBONE.Default
Use the default backbone of the preset algorithm
Config.BACKBONE.Vggtiny
Adapted Vggtiny backbone
Config.BACKBONE.Vgg16
Vgg16
Config.BACKBONE.Vgg19
Vgg19
Config.BACKBONE.Resnet18
Resnet18
Config.BACKBONE.Resnet50
Resnet50
Config.BACKBONE.Mobilenetv1
Mobilenetv1
Config.BACKBONE.Mobilenetv2
Mobilenetv2
Config.set_dataset_type
Receive an Enum value from Config.DATA, which is used to determine the dataset to use.
Different datasets result in different train and evaluation images and different evaluation metrics.
Available options are:
Available options of Config.set_dataset_type¶ Available option
Description
Config.DATA.MSCOCO
Use the MSCOCO dataset.
Config.DATA.MPII
Use the MPII dataset.
Config.DATA.USERDEF
Use a user-defined dataset.
Using the necessary set APIs above, the basic model and dataset configuration is done; users can then get the config object, which contains all the configuration, using the Config.get_config API:
Config.get_config
Receive nothing, return the config object.
Then users can get the model and dataset objects for either training or evaluating using the get APIs.
The critical get APIs are below:
Model.get_model
Receive the config object and return a configured model object.
The model object derives from the tensorlayer.models.Model class and should have the following functions:
Functions of the model object¶ Function name
Function utility
forward
Input image.
Return predicted heat map.
cal_loss
Input predicted heat map and ground truth heat map.
Return the calculated loss value.
save_weights
Input save path and save format.
Save the trained model weight.
Dataset.get_dataset
Receive the config object and return a configured dataset object.
The dataset object should have the following functions:
Functions of the dataset object¶ Function name
Function utility
get_parts
Input nothing.
Return pre-defined keypoints in the format of an Enum object.
get_colors
Input nothing.
Return pre-defined corresponding colors of keypoints in the format of a list.
generate_train_data
Input nothing.
Return a train image path list and the corresponding target list. Each target in the target list is a dict object with keys of kpt (keypoint), mask (mask of unwanted image area) and bbx (keypoint bounding box)
generate_eval_data
Input nothing.
Return an eval image path list and the corresponding image-id list.
generate_test_data
Input nothing.
Return a test image path list and the corresponding image-id list.
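For illustration, a small sketch of inspecting these objects (a hedged sketch; it assumes dataset was obtained via Dataset.get_dataset(config) as in the code sample above, and that generate_train_data returns the two lists described in the table):

parts = dataset.get_parts()                    # Enum of the pre-defined keypoints
colors = dataset.get_colors()                  # corresponding visualization colors
img_paths, targets = dataset.generate_train_data()
print(img_paths[0], targets[0]["kpt"])         # each target dict has "kpt", "mask" and "bbx" keys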
Whether using the train or the evaluate pipeline, the above set and get process is always necessary to obtain the specific model and config objects.
Users can either assemble their own pipeline using the model and dataset objects at hand, or use more fine-grained APIs to control the train and evaluate pipeline before calling the Config.get_config API, so that the config object yields a preset integrated train or evaluate pipeline for easy development.
How to use the integrated train and evaluate pipelines is described below.
Integrated train pipeline¶
As mentioned above, the usage of the set and get APIs to obtain the model and dataset objects is always necessary in HyperPose.
To use the integrated train pipeline, the extra (optional) configuration APIs are listed below; an illustrative snippet follows this list:
Config.set_batch_size
Receive an integer, which is used as the batch size in the training procedure.
Config.set_learning_rate
Receive a floating point number, which is used as the learning rate in the training procedure.
Config.set_log_interval
Receive an integer, which is used as the interval between logging loss information.
Config.set_train_type
Receive an Enum value from Config.TRAIN, which is used to determine the parallel training strategy.
Available options:
Config.TRAIN.Single_train
Use single GPU for training.
Config.TRAIN.Parallel_train
Use multiple GPUs for parallel training (using the KungFu distributed training library).
Config.set_kungfu_option
Receive an Enum value from Config.KUNGFU, which is used to determine the optimization strategy of parallel training.
Available options:
Config.KUNGFU.Sync_sgd
Config.KUNGFU.Sync_avg
Config.KUNGFU.Pair_avg
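For illustration, a minimal sketch of these optional settings (the concrete values are arbitrary examples, not recommended defaults):

from hyperpose import Config

Config.set_batch_size(8)                            # training batch size
Config.set_learning_rate(1e-4)                      # learning rate
Config.set_log_interval(100)                        # iterations between two loss logs
Config.set_train_type(Config.TRAIN.Single_train)    # or Config.TRAIN.Parallel_train
# Only relevant when using Config.TRAIN.Parallel_train:
# Config.set_kungfu_option(Config.KUNGFU.Sync_sgd)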
Then we need to use the get API to get the train pipeline from the Model module (necessary for the integrated train pipeline):
Model.get_train
Receive the config object, return a training function.
The training function takes the model object and the dataset object and automatically starts training.
The basic code to use the integrated training pipeline is below:
# set the train configuration using 'set' APIs
# import modules of hyperpose
from hyperpose import Config,Model,Dataset
# set model name
Config.set_model_name(args.model_name)
# set model architecture
Config.set_model_type(Config.MODEL.LightweightOpenpose)
# set model backbone
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
# set dataset to use
Config.set_dataset_type(Config.DATA.MSCOCO)
# set training type
Config.set_train_type(Config.TRAIN.Single_train)
# assemble the system using 'get' APIs
# get config object
config=Config.get_config()
# get model object
model=Model.get_model(config)
# get dataset object
dataset=Dataset.get_dataset(config)
# get train pipeline
train=Model.get_train(config)
# train!
train(model,dataset)
To enable parallel training, install the KungFu library according to the installation guide, and use the following command to run your program.
# Assuming we have 4 GPUs and train.py is the python script that contain your HyperPose code
CUDA_VISIBLE_DEVICES=0,1,2,3 kungfu-run -np 4 python train.py
Integrated evaluate pipeline¶
The usage of the integrated evaluate pipeline is similar to that of the integrated training pipeline.
The difference is that we use the get APIs to get the evaluate pipeline.
(Remember, we still need the set and get APIs to obtain the model and dataset objects, as in the integrated train pipeline.)
The get API for the evaluate pipeline is below:
Model.get_eval
Receive the config object, return an evaluate function.
The evaluate function takes the model object and the dataset object, and automatically starts evaluating.
The basic code to use the integrated evaluate pipeline is below:
# set the evaluate pipeline using 'set' APIs
# import modules of hyperpose
from hyperpose import Config,Model,Dataset
# set model name to be eval
Config.set_model_name(args.model_name)
# set the model architecture and backbone according to the training configuration of the model to be evaluated.
Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
# set dataset to use
Config.set_dataset_type(Config.DATA.MSCOCO)
# assemble the system using 'get' APIs
# get config object
config=Config.get_config()
# get model object
model=Model.get_model(config)
# get dataset object
dataset=Dataset.get_dataset(config)
# get evaluate pipeline
eval=Model.get_eval(config)
# evaluate!
eval(model,dataset)
It should be noticed that:
the model architecture, model backbone, and dataset type should be the same as the configuration under which the model was trained.
the model to evaluate will be loaded from ./save_dir/${MODEL_NAME}/model_dir/newest_model.npz.
the evaluation metrics will follow the official evaluation metrics of the dataset.
User-defined model architecture¶
HyperPose leaves users the freedom to define their own model architecture while still using the provided integrated model pipeline. The following points should be considered:
1. The model should be an object of the tensorlayer.models.Model class (or inherit from this class).
2. The model should have forward and cal_loss functions with exactly the same input and output format as the preset model architectures; one can refer to the Model.LightweightOpenpose class for reference.
To do this, users still need to set the model_type to determine the training pipeline; here the model_type should be the preset model whose data processing pipeline is most similar to the user's own model. They can then use the set_model_arch function to pass their own model object:
Config.set_model_name(your_model_name)
Config.set_model_type(similiar_ model_type)
Config.set_model_arch(your_model_arch)
The other configuration procedures are the same as for the integrated training pipeline.
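Below is a minimal sketch of such a user-defined model class (the layers and losses are illustrative placeholders, not a working pose estimator); the forward and cal_loss signatures follow the OpenPose-style contract documented for Config.set_model_arch:

import tensorflow as tf
import tensorlayer as tl

class MyCustomOpenposeModel(tl.models.Model):
    def __init__(self):
        super().__init__()
        # Illustrative layer; a real model would build its full backbone and prediction heads here.
        self.conv = tl.layers.Conv2d(n_filter=64, filter_size=(3, 3), in_channels=3,
                                     data_format="channels_first")

    def forward(self, x, is_train=False):
        feature = self.conv(x)
        # A real model would produce confidence and PAF maps for every stage.
        conf_map, paf_map = feature, feature
        stage_confs, stage_pafs = [conf_map], [paf_map]
        return conf_map, paf_map, stage_confs, stage_pafs

    def cal_loss(self, stage_confs, stage_pafs, gt_conf, gt_paf, mask):
        # Illustrative masked L2 losses accumulated over all stages.
        loss_confs = [tf.reduce_mean(tf.square((conf - gt_conf) * mask)) for conf in stage_confs]
        loss_pafs = [tf.reduce_mean(tf.square((paf - gt_paf) * mask)) for paf in stage_pafs]
        loss = tf.reduce_sum(loss_confs) + tf.reduce_sum(loss_pafs)
        return loss, loss_confs, loss_pafs

Config.set_model_arch(MyCustomOpenposeModel()) can then be used in place of your_model_arch above.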
User-defined dataset¶
HyperPose allows users to integrate their own dataset with the training and evaluating pipelines, as long as it has the following functions:
get_train_dataset:
Return a tensorflow dataset object where each element is an image path and a serialized dict (serialized with the _pickle library) which has at least the following three key-value pairs:
“kpt”: a list containing the keypoints of each labeled human; for example, [[kpt1,kpt2,…,kptn],[kpt1,kpt2,…,kptn]] is a list with two labeled humans, where each kpt is an [x,y] coordinate such as [234,526].
“bbx”: a list containing a bounding box for each labeled human; for example, [bbx1,bbx2] is a list with two labeled humans, where each bbx is an [x,y,w,h] array such as [234,526,60,80]. Necessary for the Pose Proposal Network; can be set to None for the others.
“mask”: a mask (in the MSCOCO polygon format) used to cover unlabeled people; can be set to None.
get_eval_dataset:
Return a tensorflow dataset object where each element is an image path and its image id.
get_input_kpt_cvter (optional):
Return a function which converts the kpt values in your dataset dict elements; used to bring your dataset's keypoint annotation in line with your model's keypoint setting, or to combine your dataset with other datasets that use different keypoint annotations.
get_output_kpt_cvter (optional):
Return a function which converts the model's prediction results into a format that is easy to evaluate; used to enable your dataset to be evaluated with the formal COCO standard (using MAP) or MPII standard (using MPCH).
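For illustration, a minimal sketch of such a user-defined dataset class (class and variable names are hypothetical), assuming in-memory lists of image paths and annotation dicts in the format described above:

import _pickle as cPickle
import tensorflow as tf

class MyDataset:
    def __init__(self, train_img_paths, train_targets, eval_img_paths, eval_img_ids):
        # train_targets: one dict per image with the "kpt", "bbx" and "mask" keys described above.
        self.train_img_paths = train_img_paths
        self.train_targets = train_targets
        self.eval_img_paths = eval_img_paths
        self.eval_img_ids = eval_img_ids

    def get_train_dataset(self):
        # Each element is (image_path, _pickle-serialized target dict).
        serialized_targets = [cPickle.dumps(target) for target in self.train_targets]
        return tf.data.Dataset.from_tensor_slices((self.train_img_paths, serialized_targets))

    def get_eval_dataset(self):
        # Each element is (image_path, image_id).
        return tf.data.Dataset.from_tensor_slices((self.eval_img_paths, self.eval_img_ids))

Such a class can then be supplied to HyperPose (e.g., via Config.set_dataset_type, which also accepts user-defined dataset classes per the API reference).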
User-defined dataset filter¶
HyperPose also leaves users the freedom to define their own dataset filter and filter the dataset as they want using the set_dataset_filter function.
To use this, a user should know the following:
HyperPose organizes the annotations of one image of a dataset in similar meta classes.
For the COCO dataset, it is COCOMeta; for the MPII dataset, it is MPIIMeta.
Meta classes have some common information such as image_id, joint_list etc.; they also have some dataset-specific information, such as mask, is_crowd, headbbx_list etc.
The dataset_filter operates on the Meta objects of the corresponding dataset: if it returns True, the image and annotations related to the Meta object will be kept, otherwise they will be filtered out. Please refer to the Dataset.COCOMeta and Dataset.MPIIMeta classes for details.
def my_dataset_filter(coco_meta):
    if len(coco_meta.joint_list) < 5 and (not coco_meta.is_crowd):
        return True
    else:
        return False
Config.set_dataset_filter(my_dataset_filter)
User-defined train pipeline¶
HyperPose also provides three low-level functions to help users construct their own pipeline. For each class of model, preprocess, postprocess and visualize functions are provided.
Model.get_preprocess Receive an Enum value from Config.MODEL, return a preprocess function.
The preprocess function is able to convert the annotations into targets used for training the model.
preprocess=Model.get_preprocess(Config.MODEL.LightweightOpenpose)
conf_map,paf_map=preprocess(annos,img_height,img_width,model_hout,model_wout,Config.DATA.MSCOCO,data_format="channels_first")
pd_conf_map,pd_paf_map=my_model.forward(input_image[np.newaxis,...])
my_model.cal_loss(conf_map,pd_conf_map,paf_map,pd_paf_map)
Model.get_postprocess Receive an Enum value from Config.MODEL, return a postprocess function.
The postprocess function is able to convert the model output into parsed human objects for evaluating and visualizing.
pd_conf_map,pd_paf_map=my_model.forward(input_image[np.newaxis,...])
postprocess=Model.get_postprocess(Config.MODEL.LightweightOpenpose)
pd_humans=postprocess(pd_conf_map,pd_paf_map,dataset_type,data_format="channels_first")
for pd_human in pd_humans:
    pd_human.draw(input_image)
Model.get_visualize Receive an Enum value from Config.MODEL, return a visualize function.
The visualize function is able to visualize the model's output feature maps.
pd_conf_map,pd_paf_map=my_model.forward(input_image[np.newaxis,...])
visualize=Model.get_visualize(Config.MODEL.LightweightOpenpose)
visualize(input_image,pd_conf_map,pd_paf_map,save_name="my_visual",save_dir="./vis_dir")
API Reference¶
C++ Prediction API¶
Python Training API¶
hyperpose package¶
Subpackages¶
hyperpose.Config package¶
-
class
hyperpose.Config.define.
BACKBONE
(value)¶ Bases:
enum.Enum
An enumeration.
-
Default
= 0¶
-
Mobilenetv1
= 1¶
-
Mobilenetv2
= 6¶
-
Resnet18
= 3¶
-
Resnet50
= 4¶
-
Vgg16
= 7¶
-
Vgg19
= 2¶
-
Vggtiny
= 5¶
-
-
class
hyperpose.Config.define.
DATA
(value)¶ Bases:
enum.Enum
An enumeration.
-
MPII
= 1¶
-
MSCOCO
= 0¶
-
MULTIPLE
= 3¶
-
USERDEF
= 2¶
-
-
class
hyperpose.Config.define.
KUNGFU
(value)¶ Bases:
enum.Enum
An enumeration.
-
Pair_avg
= 2¶
-
Sync_avg
= 1¶
-
Sync_sgd
= 0¶
-
-
class
hyperpose.Config.define.
MODEL
(value)¶ Bases:
enum.Enum
An enumeration.
-
LightweightOpenpose
= 1¶
-
MobilenetThinOpenpose
= 3¶
-
Openpose
= 0¶
-
Pifpaf
= 4¶
-
PoseProposal
= 2¶
-
-
hyperpose.Config.
get_config
()¶ get the config object with all the configuration information
get the config object based on the previous setting functions; the config object will be passed to the functions of the Model and Dataset modules to construct the system.
only the setting functions called before this get_config function are valid, thus use this function after all configuration is done.
- Parameters
- None
- Returns
- config object
an edict object contains all the configuration information.
-
hyperpose.Config.
set_batch_size
(batch_size)¶ set the batch size in training
- Parameters
- arg1int
batch_size
- Returns
- None
-
hyperpose.Config.
set_data_format
(data_format)¶ set model dataformat
set the channel order of current model:
“channels_first” dataformat is faster in deployment; “channels_last” dataformat is more common. The integrated pipeline will automatically adapt to the chosen data format.
- Parameters
- arg1string
available input:
- | ‘channels_first’: data_shape N*C*H*W
- | ‘channels_last’: data_shape N*H*W*C
- Returns
- None
-
hyperpose.Config.
set_dataset_filter
(dataset_filter)¶ set the user defined dataset filter
set the dataset filter to the given function. to uniformly format different datasets, HyperPose organizes the annotations of one image of a dataset in similar meta classes. for the COCO dataset, it is COCOMeta; for the MPII dataset, it is MPIIMeta. Meta classes have some common information such as image_id, joint_list etc.; they also have some dataset-specific information, such as mask, is_crowd, headbbx_list etc.
the dataset_filter operates on the Meta objects of the corresponding dataset: if it returns True, the image and annotations related to the Meta object will be kept, otherwise they will be filtered out. please refer to the Dataset.xxxMeta classes for details.
- Parameters
- arg1function
a function which receives a meta object as input and returns a bool value indicating whether the meta should be kept or filtered out. return True for keeping and False for discarding the object. default: None
- Returns
- None
-
hyperpose.Config.
set_dataset_path
(dataset_path)¶ set the path of the dataset
set the path of the directory where the dataset is; if the dataset doesn't exist in this directory, it will be automatically downloaded into this directory and decoded.
- Parameters
- arg1String
a string indicates the path of the dataset, default: ./data
- Returns
- None
-
hyperpose.Config.
set_dataset_type
(dataset_type)¶ set the dataset for train and evaluate
set which dataset to use; the process of downloading, decoding, and reformatting of different types of dataset is automatic. the evaluation metric of each dataset follows its official metric: for COCO it is MAP, for MPII it is MPCH.
This API also accepts a user-defined dataset class, which should implement the following functions:
__init__: take the config object with all configuration to init the dataset
get_parts: return an enum class which defines the keypoint definition of the dataset
get_limbs: return a [2*num_limbs] array which defines the limb definition of the dataset
get_colors: return a list which defines the visualization colors of the limbs
get_train_dataset: return a tensorflow dataset which contains elements for training. each element should contain an image path and a target dict encoded in bytes by _pickle
get_eval_dataset: return a tensorflow dataset which contains elements for evaluating. each element should contain an image path and an image id
official_eval: if you want to evaluate on this user-defined dataset, an evaluation function should be implemented. one can refer to Dataset.mpii_dataset and Dataset.mscoco_dataset for detailed information.
- Parameters
- arg1Config.DATA
a enum value of enum class Config.DATA or user-defined dataset available options:
- | Config.DATA.MSCOCO
- | Config.DATA.MPII
- | user-defined dataset
- Returns
- None
-
hyperpose.Config.
set_dataset_version
(dataset_version)¶
-
hyperpose.Config.
set_domainadapt_dataset
(domainadapt_train_img_paths, domainadapt_scale_rate=1)¶
-
hyperpose.Config.
set_kungfu_option
(kungfu_option)¶ set the optimizor of parallel training
the KungFu distributed training library needs to wrap the tensorflow optimizer in a KungFu optimizer; this function chooses the KungFu optimizer wrap type
- Parameters
- arg1Config.KUNGFU
a enum value of enum class Config.KUNGFU available options:
- | Config.KUNGFU.Sync_sgd (SynchronousSGDOptimizer, hyper-parameter-robus)
- | Config.KUNGFU.Sync_avg (SynchronousAveragingOptimizer)
- | Config.KUNGFU.Pair_avg (PairAveragingOptimizer, communication-efficient)
- Returns
- None
-
hyperpose.Config.
set_learning_rate
(learning_rate)¶ set the learning rate in training
- Parameters
- arg1float
learning rate
- Returns
- None
-
hyperpose.Config.
set_log_interval
(log_interval)¶ set the frequency of logging
set how many iterations there are between two logs of information
- Parameters
- arg1Int
an int value indicating the number of iterations between two logs. default: 1
- Returns
- None
-
hyperpose.Config.
set_model_arch
(model_arch)¶ set user defined model architecture
replace default model architecture with user-defined model architecture, use it in the following training and evaluation
- Parameters
- arg1tensorlayer.models.MODEL
An object of a model class inheriting from the tensorlayer.models.MODEL class; it should implement the forward function and the cal_loss function to make it compatible with the existing pipeline
The forward function should follow the signature below:
- | openpose models: def forward(self,x,is_train=False) ,return conf_map,paf_map,stage_confs,stage_pafs
- | poseproposal models: def forward(self,x,is_train=False), return pc,pi,px,py,pw,ph,pe
The cal_loss function should follow the signature below:
- | openpose models: def cal_loss(self,stage_confs,stage_pafs,gt_conf,gt_paf,mask), return loss,loss_confs,loss_pafs
- | poseproposal models: def cal_loss(self,tc,tx,ty,tw,th,te,te_mask,pc,pi,px,py,pw,ph,pe):
return loss_rsp,loss_iou,loss_coor,loss_size,loss_limb
- Returns
- None
-
hyperpose.Config.
set_model_backbone
(model_backbone)¶ set preset model backbones
set the current model backbone to one of the common backbones. different backbones have different computation complexity; this enables dynamically adapting the model architecture to an appropriate size.
- Parameters
- arg1Config.BACKBONE
a enum value of enum class Config.BACKBONE available options:
- | Config.BACKBONE.Default (default backbone of the architecture)
- | Config.BACKBONE.Mobilenetv1
- | Config.BACKBONE.Mobilenetv2
- | Config.BACKBONE.Vggtiny
- | Config.BACKBONE.Vgg16
- | Config.BACKBONE.Vgg19
- | Config.BACKBONE.Resnet18
- | Config.BACKBONE.Resnet50
- Returns
- None
-
hyperpose.Config.
set_model_limbs
(userdef_limbs)¶
-
hyperpose.Config.
set_model_name
(model_name)¶ set the name of model
the models are distinguished by their names, so it is necessary to set the model's name when training multiple models at the same time. each model's ckpt data and logs are saved in the 'save_dir/model_name' directory; the following directories are determined:
directory to save the model: ./save_dir/model_name/model_dir
directory to save train results: ./save_dir/model_name/train_vis_dir
directory to save evaluate results: ./save_dir/model_name/eval_vis_dir
directory to save dataset visualization results: ./save_dir/model_name/data_vis_dir
file path to save the train log: ./save_dir/model_name/log.txt
- Parameters
- arg1string
name of the model
- Returns
- None
-
hyperpose.Config.
set_model_parts
(userdef_parts)¶
-
hyperpose.Config.
set_model_type
(model_type)¶ set preset model architecture
configure the model architecture as one of the desired preset model architectures
- Parameters
- arg1Config.MODEL
a enum value of enum class Config.MODEL, available options:
- | Config.MODEL.Openpose (original Openpose)
- | Config.MODEL.LightweightOpenpose (lightweight variant version of Openpose,real-time on cpu)
- | Config.MODEL.PoseProposal (pose proposal network)
- | Config.MODEL.MobilenetThinOpenpose (lightweight variant version of openpose)
- Returns
- None
-
hyperpose.Config.
set_multiple_dataset
(multiple_dataset_configs)¶
-
hyperpose.Config.
set_official_dataset
(official_flag)¶
-
hyperpose.Config.
set_optim_type
(optim_type)¶
-
hyperpose.Config.
set_pretrain
(enable)¶
-
hyperpose.Config.
set_pretrain_dataset_path
(pretrain_dataset_path)¶
-
hyperpose.Config.
set_save_interval
(save_interval)¶
-
hyperpose.Config.
set_train_type
(train_type)¶ set single_train or parallel train
defaults to single train, which trains the model on one GPU. setting parallel train will use the KungFu library to accelerate training on multiple GPUs.
to make better use of parallel training, it is also possible to set the parallel training optimizer via set_kungfu_option.
- Parameters
- arg1Config.TRAIN
a enum value of enum class Config.TRAIN,available options:
- | Config.TRAIN.Single_train
- | Config.TRAIN.Parallel_train
- Returns
- None
-
hyperpose.Config.
set_useradd_data
(useradd_train_img_paths, useradd_train_targets, useradd_scale_rate=1)¶
-
hyperpose.Config.
set_userdef_dataset
(userdef_dataset)¶
hyperpose.Dataset package¶
-
class
hyperpose.Dataset.mpii_dataset.dataset.
MPII_dataset
(config, input_kpt_cvter=None, output_kpt_cvter=None, dataset_filter=None)¶ Bases:
hyperpose.Dataset.base_dataset.Base_dataset
a dataset class specified for mpii dataset, provides uniform APIs
Methods
get_eval_dataset
(self[, in_list])provide uniform tensorflow dataset for evaluating
get_train_dataset
(self[, in_list, …])provide uniform tensorflow dataset for training
official_eval
(self, pd_anns[, eval_dir])providing official evaluation of MPII dataset
prepare_dataset
(self)download, extract, and reformat the dataset; the official dataset is in .mat format and is formatted into json format automatically.
visualize
(self[, vis_num])visualize annotations of the train dataset
generate_eval_data
generate_test_data
generate_train_data
get_colors
get_dataset_type
get_eval_datasize
get_input_kpt_cvter
get_output_kpt_cvter
get_parts
get_test_dataset
get_test_datasize
get_train_datasize
official_test
set_dataset_version
set_input_kpt_cvter
set_output_kpt_cvter
-
generate_eval_data
(self)¶
-
generate_test_data
(self)¶
-
generate_train_data
(self)¶
-
get_colors
(self)¶
-
get_dataset_type
(self)¶
-
get_input_kpt_cvter
(self)¶
-
get_output_kpt_cvter
(self)¶
-
get_parts
(self)¶
-
official_eval
(self, pd_anns, eval_dir='./eval_dir')¶ providing official evaluation of MPII dataset
output model metrics of PCHs on the mpii evaluation dataset (split automatically)
- Parameters
- arg1String
A string path of the json file in the same format of cocoeval annotation file(person_keypoints_val2017.json) which contains predicted results. one can refer the evaluation pipeline of models for generation procedure of this json file.
- arg2String
A string path indicates where the result json file which contains MPII PCH metrics of various keypoint saves.
- Returns
- None
-
official_test
(self, pd_anns, test_dir='./test_dir')¶
-
prepare_dataset
(self)¶ download, extract, and reformat the dataset; the official dataset is in .mat format and is formatted into json format automatically.
- Parameters
- None
- Returns
- None
-
set_input_kpt_cvter
(self, input_kpt_cvter)¶
-
set_output_kpt_cvter
(self, output_kpt_cvter)¶
-
visualize
(self, vis_num=10)¶ visualize annotations of the train dataset
visualize the annotation points in the image to help understand and check the annotations. the visualized images will be saved in the “data_vis_dir” of the corresponding model directory (specified by the model name). the visualized annotations are from the train dataset.
- Parameters
- arg1Int
An integer indicates how many images with their annotations are going to be visualized.
- Returns
- None
-
-
hyperpose.Dataset.mpii_dataset.dataset.
init_dataset
(config)¶
-
class
hyperpose.Dataset.mpii_dataset.define.
MpiiPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Headtop
= 9¶
-
LAnkle
= 5¶
-
LElbow
= 14¶
-
LHip
= 3¶
-
LKnee
= 4¶
-
LShoulder
= 13¶
-
LWrist
= 15¶
-
Pelvis
= 6¶
-
RAnkle
= 0¶
-
RElbow
= 11¶
-
RHip
= 2¶
-
RKnee
= 1¶
-
RShoulder
= 12¶
-
RWrist
= 10¶
-
Thorax
= 7¶
-
UpperNeck
= 8¶
-
static
from_coco
(human)¶
-
-
hyperpose.Dataset.mpii_dataset.define.
opps_input_converter
(mpii_kpts)¶
-
hyperpose.Dataset.mpii_dataset.define.
opps_output_converter
(kpt_list)¶
-
hyperpose.Dataset.mpii_dataset.define.
ppn_input_converter
(coco_kpts)¶
-
hyperpose.Dataset.mpii_dataset.define.
ppn_output_converter
(kpt_list)¶
-
class
hyperpose.Dataset.mpii_dataset.format.
MPIIMeta
(image_path, annos_list)¶ Bases:
object
Methods
to_anns_list
-
to_anns_list
(self)¶
-
-
class
hyperpose.Dataset.mpii_dataset.format.
PoseInfo
(image_dir, annos_path, dataset_filter=None)¶ Bases:
object
Methods
get_center_list
get_headbbx_list
get_image_annos
get_image_id_list
get_image_list
get_kpt_list
get_scale_list
-
get_center_list
(self)¶
-
get_headbbx_list
(self)¶
-
get_image_annos
(self)¶
-
get_image_id_list
(self)¶
-
get_image_list
(self)¶
-
get_kpt_list
(self)¶
-
get_scale_list
(self)¶
-
-
hyperpose.Dataset.mpii_dataset.format.
generate_json
(mat_path, is_test=False)¶
-
hyperpose.Dataset.mpii_dataset.generate.
generate_eval_data
(eval_images_path, eval_annos_path, dataset_filter=None)¶
-
hyperpose.Dataset.mpii_dataset.generate.
generate_test_data
(test_images_path, test_annos_path)¶
-
hyperpose.Dataset.mpii_dataset.generate.
generate_train_data
(train_images_path, train_annos_path, dataset_filter=None, input_kpt_cvter=<function <lambda>>)¶
-
hyperpose.Dataset.mpii_dataset.prepare.
prepare_dataset
(dataset_path)¶
-
hyperpose.Dataset.mpii_dataset.utils.
affine_transform
(pt, t)¶
-
hyperpose.Dataset.mpii_dataset.utils.
get_3rd_point
(a, b)¶
-
hyperpose.Dataset.mpii_dataset.utils.
get_affine_transform
(center, scale, rot, output_size, shift=array([0.0, 0.0], dtype=float32), inv=0)¶
-
hyperpose.Dataset.mpii_dataset.utils.
get_dir
(src_point, rot_rad)¶
-
class
hyperpose.Dataset.mscoco_dataset.dataset.
MSCOCO_dataset
(config, input_kpt_cvter=None, output_kpt_cvter=None, dataset_filter=None)¶ Bases:
hyperpose.Dataset.base_dataset.Base_dataset
a dataset class specialized for the COCO dataset; provides uniform APIs
Methods
get_eval_dataset
(self[, in_list]) provide a uniform tensorflow dataset for evaluation
get_train_dataset
(self[, in_list, …]) provide a uniform tensorflow dataset for training
official_eval
(self, pd_anns[, eval_dir]) provide the official evaluation of the COCO dataset
prepare_dataset
(self) download, extract, and reformat the dataset; the official dataset comes in zip format and is extracted into JSON and image files
visualize
(self, vis_num) visualize annotations of the train dataset
generate_eval_data
generate_test_data
generate_train_data
get_colors
get_dataset_type
get_eval_datasize
get_input_kpt_cvter
get_output_kpt_cvter
get_parts
get_test_dataset
get_test_datasize
get_train_datasize
official_test
set_dataset_version
set_input_kpt_cvter
set_output_kpt_cvter
-
generate_eval_data
(self)¶
-
generate_test_data
(self)¶
-
generate_train_data
(self)¶
-
get_colors
(self)¶
-
get_dataset_type
(self)¶
-
get_input_kpt_cvter
(self)¶
-
get_output_kpt_cvter
(self)¶
-
get_parts
(self)¶
-
official_eval
(self, pd_anns, eval_dir='./eval_dir')¶ perform the official evaluation of the COCO dataset
use the pycocotools cocoeval class to perform the official evaluation and output the model's mAP metrics on the COCO evaluation dataset
- Parameters
- arg1String
A string path to a JSON file in the same format as the cocoeval annotation file (person_keypoints_val2017.json) that contains the predicted results. Refer to the evaluation pipeline of the models for how this JSON file is generated.
- arg2String
A string path indicating where the JSON files of the filtered intersection of the predicted results and the ground truth are stored: the filtered prediction file is stored in eval_dir/pd_ann.json and the filtered ground-truth file is stored in eval_dir/gt_ann.json.
- Returns
- None
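For illustration only, a minimal sketch of calling this method; the Config setters follow the training quick-start, and the model name and JSON path are placeholders rather than values documented here:
from hyperpose import Config, Dataset

# Assumed Config setters from the training quick-start; names and paths are placeholders.
Config.set_model_name("my_model")
Config.set_dataset_type(Config.DATA.MSCOCO)
config = Config.get_config()

dataset = Dataset.get_dataset(config)
# pd_anns is the path to a cocoeval-style JSON file of predicted keypoints.
dataset.official_eval(pd_anns="./eval_dir/pd_ann.json", eval_dir="./eval_dir")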
-
official_test
(self, pd_anns, test_dir='./test_dir')¶
-
prepare_dataset
(self)¶ download, extract, and reformat the dataset; the official dataset comes in zip format and is extracted into JSON and image files.
- Parameters
- None
- Returns
- None
-
set_input_kpt_cvter
(self, input_kpt_cvter)¶
-
set_output_kpt_cvter
(self, output_kpt_cvter)¶
-
visualize
(self, vis_num)¶ visualize annotations of the train dataset
visualize the annotation points in the image to help understand and check the annotations. The visualized images will be saved in the “data_vis_dir” of the corresponding model directory (specified by the model name). The visualized annotations are taken from the train dataset.
- Parameters
- arg1Int
An integer indicating how many images (with their annotations) will be visualized.
- Returns
- None
-
-
hyperpose.Dataset.mscoco_dataset.dataset.
init_dataset
(config)¶
-
class
hyperpose.Dataset.mscoco_dataset.define.
CocoPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
LAnkle
= 15¶
-
LEar
= 3¶
-
LElbow
= 7¶
-
LHip
= 11¶
-
LKnee
= 13¶
-
LShoulder
= 5¶
-
LWrist
= 9¶
-
Leye
= 1¶
-
Nose
= 0¶
-
RAnkle
= 16¶
-
REar
= 4¶
-
RElbow
= 8¶
-
RHip
= 12¶
-
RKnee
= 14¶
-
RShoulder
= 6¶
-
RWrist
= 10¶
-
Reye
= 2¶
-
-
hyperpose.Dataset.mscoco_dataset.define.
opps_input_converter
(coco_kpts)¶
-
hyperpose.Dataset.mscoco_dataset.define.
opps_output_converter
(kpt_list)¶
-
hyperpose.Dataset.mscoco_dataset.define.
pifpaf_input_converter
(coco_kpts)¶
-
hyperpose.Dataset.mscoco_dataset.define.
pifpaf_output_converter
(kpt_list)¶
-
hyperpose.Dataset.mscoco_dataset.define.
ppn_input_converter
(coco_kpts)¶
-
hyperpose.Dataset.mscoco_dataset.define.
ppn_output_converter
(kpt_list)¶
-
class
hyperpose.Dataset.mscoco_dataset.format.
CocoMeta
(image_id, img_url, img_meta, kpts_infos, masks, bbxs, is_crowd)¶ Bases:
object
Used in PoseInfo.
-
class
hyperpose.Dataset.mscoco_dataset.format.
PoseInfo
(image_base_dir, anno_path, with_mask=True, dataset_filter=None, eval=False)¶ Bases:
object
Use COCO for pose estimation, returns images with people only.
Methods
get_image_annos
(self) Read JSON file, and get and check the image list.
get_bbx_list
get_bbxs
get_image_id_list
get_image_list
get_keypoints
get_kpt_list
get_mask_list
load_images
-
get_bbx_list
(self)¶
-
static
get_bbxs
(annos_info)¶
-
get_image_annos
(self)¶ Read JSON file, and get and check the image list. Skip missing images.
-
get_image_id_list
(self)¶
-
get_image_list
(self)¶
-
static
get_keypoints
(annos_info)¶
-
get_kpt_list
(self)¶
-
get_mask_list
(self)¶
-
load_images
(self)¶
-
-
hyperpose.Dataset.mscoco_dataset.generate.
generate_eval_data
(val_imgs_path, val_anns_path, dataset_filter=None)¶
-
hyperpose.Dataset.mscoco_dataset.generate.
generate_test_data
(test_imgs_path, test_anns_path)¶
-
hyperpose.Dataset.mscoco_dataset.generate.
generate_train_data
(train_imgs_path, train_anns_path, dataset_filter=None, input_kpt_cvter=<function <lambda>>)¶
-
hyperpose.Dataset.mscoco_dataset.prepare.
prepare_dataset
(data_path='./data', version='2017', task='person')¶ Download the MSCOCO dataset. Both the 2014 and 2017 datasets have train, validation and test sets, but the 2017 version puts less data into the validation set (115k train, 5k validation), i.e. it has more training data.
- Parameters
- pathstr
The path that the data is downloaded to; the default is
data/mscoco...
- datasetstr
The MSCOCO dataset version, 2014 or 2017.
- taskstr
person for pose estimation, caption for image captioning, instance for segmentation.
- Returns
- train_image_pathstr
Folder path of all training images.
- train_ann_pathstr
File path of training annotations.
- val_image_pathstr
Folder path of all validation images.
- val_ann_pathstr
File path of validation annotations.
- test_image_pathstr
Folder path of all testing images.
- test_ann_pathNone
File path of testing annotations; since the test sets of MSCOCO 2014 and 2017 do not have annotations, this returns None.
Examples
>>> train_im_path, train_ann_path, val_im_path, val_ann_path, _, _ = \
...     tl.files.load_mscoco_dataset('data', '2017')
-
hyperpose.Dataset.common.
file_log
(log_file, msg)¶
-
hyperpose.Dataset.common.
get_domainadapt_targets
(domainadapt_img_paths)¶
-
hyperpose.Dataset.common.
imread_rgb_float
(image_path, data_format='channels_first')¶
-
hyperpose.Dataset.common.
imwrite_rgb_float
(image, image_path, data_format='channels_first')¶
-
hyperpose.Dataset.common.
unzip
(path_to_zip_file, directory_to_extract_to)¶
-
hyperpose.Dataset.common.
visualize
(vis_dir, vis_num, dataset, parts, colors, dataset_name='default')¶
-
hyperpose.Dataset.
enum2dataset
(dataset_type)¶
-
hyperpose.Dataset.
get_dataset
(config)¶ get a dataset object based on the config object
construct and return a dataset object based on the config. No matter what the underlying dataset type is, the APIs of the returned dataset object are uniform; they are the following APIs:
visualize: visualize annotations of the train dataset and save them in “data_vis_dir”. get_dataset_type: return the type of the underlying dataset. get_train_dataset: return a uniform tensorflow dataset object for training. get_val_dataset: return a uniform tensorflow dataset object for evaluation. official_eval: perform the official evaluation on this dataset.
The construction pipeline of this dataset object is as follows:
1. check whether the dataset file (official zip or mat) is under data_path; if it isn’t, download it automatically from the official website
2. decode the official dataset file and organize the annotations in the corresponding Meta classes, which is convenient for processing.
3. based on the annotations, split the train and evaluation parts for further use.
if the user defined their own dataset_filter, it will be executed while generating the train or evaluation dataset.
using the APIs of the returned dataset object, the differences between datasets are minimized.
- Parameters
- arg1config object
the config object returned by the Config.get_config() function, which includes all the configuration information.
- Returns
- dataset
a dataset object with uniform APIs: visualize, get_dataset_type, get_train_dataset, get_val_dataset, official_eval
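A minimal usage sketch, assuming the Config setters from the training quick-start; the model name is a placeholder:
from hyperpose import Config, Dataset

Config.set_model_name("my_model")               # placeholder model name
Config.set_dataset_type(Config.DATA.MSCOCO)     # choose the underlying dataset
config = Config.get_config()

dataset = Dataset.get_dataset(config)
dataset.visualize(vis_num=5)                    # saves annotated samples to data_vis_dir
train_dataset = dataset.get_train_dataset()     # uniform tensorflow dataset for training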
-
hyperpose.Dataset.
get_pretrain_dataset
(config)¶
hyperpose.Model package¶
-
hyperpose.Model.openpose.model.lw_openpose.
conv_block
(n_filter, in_channels, filter_size=(3, 3), strides=(1, 1), dilation_rate=(1, 1), W_init=tensorlayer.initializers.truncated_normal, b_init=tensorlayer.initializers.truncated_normal, padding='SAME', data_format='channels_first')¶
-
hyperpose.Model.openpose.model.lw_openpose.
dw_conv_block
(n_filter, in_channels, filter_size=(3, 3), strides=(1, 1), dilation_rate=(1, 1), W_init=tensorlayer.initializers.truncated_normal, b_init=tensorlayer.initializers.truncated_normal, data_format='channels_first')¶
-
hyperpose.Model.openpose.model.lw_openpose.
nobn_dw_conv_block
(n_filter, in_channels, filter_size=(3, 3), strides=(1, 1), W_init=tensorlayer.initializers.truncated_normal, b_init=tensorlayer.initializers.truncated_normal, data_format='channels_first')¶
-
hyperpose.Model.openpose.model.mbv2_sm_openpose.
conv_block
(n_filter=32, in_channels=3, filter_size=(3, 3), strides=(1, 1), act=tensorflow.nn.relu, padding='SAME', data_format='channels_first')¶
-
hyperpose.Model.openpose.model.mbv2_sm_openpose.
separable_block
(n_filter=32, in_channels=3, filter_size=(3, 3), strides=(1, 1), act=tensorflow.nn.relu, padding='SAME', data_format='channels_first')¶
-
hyperpose.Model.openpose.model.mbv2_th_openpose.
conv_block
(n_filter=32, in_channels=3, filter_size=(3, 3), strides=(1, 1), act=tensorflow.nn.relu, padding='SAME', data_format='channels_first')¶
-
hyperpose.Model.openpose.model.mbv2_th_openpose.
separable_block
(n_filter=32, in_channels=3, filter_size=(3, 3), strides=(1, 1), dilation_rate=(1, 1), act=tensorflow.nn.relu, data_format='channels_first')¶
-
class
hyperpose.Model.openpose.define.
CocoPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Background
= 18¶
-
LAnkle
= 13¶
-
LEar
= 17¶
-
LElbow
= 6¶
-
LEye
= 15¶
-
LHip
= 11¶
-
LKnee
= 12¶
-
LShoulder
= 5¶
-
LWrist
= 7¶
-
Neck
= 1¶
-
Nose
= 0¶
-
RAnkle
= 10¶
-
REar
= 16¶
-
RElbow
= 3¶
-
REye
= 14¶
-
RHip
= 8¶
-
RKnee
= 9¶
-
RShoulder
= 2¶
-
RWrist
= 4¶
-
-
class
hyperpose.Model.openpose.define.
MpiiPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Background
= 15¶
-
Center
= 14¶
-
Headtop
= 0¶
-
LAnkle
= 13¶
-
LElbow
= 6¶
-
LHip
= 11¶
-
LKnee
= 12¶
-
LShoulder
= 5¶
-
LWrist
= 7¶
-
Neck
= 1¶
-
RAnkle
= 10¶
-
RElbow
= 3¶
-
RHip
= 8¶
-
RKnee
= 9¶
-
RShoulder
= 2¶
-
RWrist
= 4¶
-
-
hyperpose.Model.openpose.define.
get_coco_flip_list
()¶
-
hyperpose.Model.openpose.define.
get_mpii_flip_list
()¶
-
hyperpose.Model.openpose.eval.
evaluate
(model, dataset, config, vis_num=30, total_eval_num=10000, enable_multiscale_search=True)¶ evaluate pipeline of Openpose class models
given the model and dataset, the evaluate pipeline starts automatically; it will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen evaluation dataset 3. visualize the model output during evaluation in the directory ./save_dir/model_name/eval_vis_dir 4. output the model metrics by calling dataset.official_eval()
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- arg3Int
an integer indicating how many model outputs should be visualized
- arg4Int
an integer indicating how many images should be evaluated
- Returns
- None
-
hyperpose.Model.openpose.eval.
infer_one_img
(model, post_processor, img, img_id=-1, enable_multiscale_search=False, is_visual=False, save_dir='./vis_dir')¶
-
hyperpose.Model.openpose.eval.
multiscale_search
(img, model)¶
-
hyperpose.Model.openpose.eval.
test
(model, dataset, config, vis_num=30, total_test_num=10000, enable_multiscale_search=True)¶ evaluate pipeline of Openpose class models
given the model and dataset, the evaluate pipeline starts automatically; it will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen evaluation dataset 3. visualize the model output during evaluation in the directory ./save_dir/model_name/eval_vis_dir 4. output the model metrics by calling dataset.official_eval()
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- arg3Int
an integer indicating how many model outputs should be visualized
- arg4Int
an integer indicating how many images should be evaluated
- Returns
- None
-
hyperpose.Model.openpose.eval.
visualize
(img, img_id, humans, conf_map, paf_map, save_dir)¶
-
hyperpose.Model.openpose.train.
get_paramed_map_fn
(augmentor, preprocessor, data_format='channels_first')¶
-
hyperpose.Model.openpose.train.
parallel_train
(train_model, dataset, config)¶ Parallel train pipeline of openpose class models
given the model and dataset, the train pipeline starts automatically; it will: 1. store and restore checkpoints in the directory ./save_dir/model_name/model_dir 2. log loss information in ./save_dir/model_name/log.txt 3. periodically visualize the model output during training in ./save_dir/model_name/train_vis_dir; the newest model is at path ./save_dir/model_name/model_dir/newest_model.npz
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- Returns
- None
-
hyperpose.Model.openpose.train.
single_train
(train_model, dataset, config)¶ Single train pipeline of Openpose class models
given the model and dataset, the train pipeline starts automatically; it will: 1. store and restore checkpoints in the directory ./save_dir/model_name/model_dir 2. log loss information in ./save_dir/model_name/log.txt 3. periodically visualize the model output during training in ./save_dir/model_name/train_vis_dir; the newest model is at path ./save_dir/model_name/model_dir/newest_model.npz
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- Returns
- None
-
hyperpose.Model.openpose.utils.
cal_vectormap_fast
(vectormap, countmap, i, v_start, v_end)¶
-
hyperpose.Model.openpose.utils.
cal_vectormap_ori
(vectormap, countmap, i, v_start, v_end)¶
-
hyperpose.Model.openpose.utils.
draw_results
(images, heats_ground, heats_result, pafs_ground, pafs_result, masks, save_dir, name='', data_format='channels_first')¶ Save results for debugging.
- Parameters
- imagesa list of RGB images
- heats_grounda list of keypoint heat maps or None
- heats_resulta list of keypoint heat maps or None
- pafs_grounda list of paf vector maps or None
- pafs_resulta list of paf vector maps or None
- masksa list of mask for people
-
hyperpose.Model.openpose.utils.
get_colors
(dataset_type)¶
-
hyperpose.Model.openpose.utils.
get_flip_list
(dataset_type)¶
-
hyperpose.Model.openpose.utils.
get_heatmap
(annos, height, width, hout, wout, parts, limbs, data_format='channels_first')¶
-
hyperpose.Model.openpose.utils.
get_limbs
(dataset_type)¶
-
hyperpose.Model.openpose.utils.
get_parts
(dataset_type)¶
-
hyperpose.Model.openpose.utils.
get_vectormap
(annos, height, width, hout, wout, parts, limbs, data_format='channels_first')¶
-
hyperpose.Model.openpose.utils.
postprocess
(conf_map, paf_map, img_h, img_w, parts, limbs, data_format='channels_first', colors=None)¶ postprocess function of openpose class models
take the model-predicted feature maps and output parsed human objects, each of which contains all detected keypoints of one person
- Parameters
- arg1numpy array
model predicted conf_map, heatmaps of keypoints, shape C*H*W(channels_first) or H*W*C(channels_last)
- arg2numpy array
model predicted paf_map, heatmaps of limbs, shape C*H*W(channels_first) or H*W*C(channels_last)
- arg3Config.DATA
an enum value of the enum class Config.DATA
- arg4string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- list
a list containing Human objects; see Model.Human for detailed information about the Human object
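As an illustration only, a sketch of calling postprocess on dummy feature maps; the shapes, channel counts, and helper imports below are assumptions based on the signatures documented in this module, and real conf_map/paf_map would come from a model's forward pass:
import numpy as np
from hyperpose import Config
from hyperpose.Model.openpose.utils import get_parts, get_limbs, postprocess

parts = get_parts(Config.DATA.MSCOCO)
limbs = get_limbs(Config.DATA.MSCOCO)

# Dummy maps standing in for real network output (channels_first: C*H*W);
# the channel counts simply follow the number of parts / limbs.
conf_map = np.zeros((len(parts), 46, 46), dtype=np.float32)
paf_map = np.zeros((2 * len(limbs), 46, 46), dtype=np.float32)

humans = postprocess(conf_map, paf_map, img_h=368, img_w=368,
                     parts=parts, limbs=limbs, data_format="channels_first")
for human in humans:
    print(human.get_partnum(), human.get_score())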
-
hyperpose.Model.openpose.utils.
preprocess
(annos, img_height, img_width, model_hout, model_wout, parts, limbs, data_format='channels_first')¶ preprocess function of openpose class models
take the keypoint annotations, the image height and width, the model output height and width, and the dataset type, and return the constructed conf_map and paf_map
- Parameters
- arg1list
a list of annotations; each annotation is a list of keypoints belonging to one person, each keypoint follows the format (x,y), and x<0 or y<0 if the keypoint is not visible or not annotated. the annotations must come from a known dataset_type, otherwise the keypoint and limb order will not be correct.
- arg2Int
height of the input image, needed to make use of the keypoint annotations
- arg3Int
width of the input image, needed to make use of the keypoint annotations
- arg4Int
height of the model output, will be the height of the generated maps
- arg5Int
width of the model output, will be the width of the generated maps
- arg6Config.DATA
an enum value of the enum class Config.DATA indicating which dataset_type the input annotation list comes from; to generate the correct conf_map and paf_map, the order of keypoints and limbs must be known.
- arg7string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- list
a list including two elements: conf_map: heatmaps of keypoints, shape C*H*W(channels_first) or H*W*C(channels_last); paf_map: heatmaps of limbs, shape C*H*W(channels_first) or H*W*C(channels_last)
-
hyperpose.Model.openpose.utils.
put_heatmap
(heatmap, plane_idx, center, stride, sigma)¶
-
hyperpose.Model.openpose.utils.
vis_annos
(image, annos, save_dir, name='')¶ Save results for debugging.
- Parameters
- imagessingle RGB image
- annosannotation, list of lists
-
hyperpose.Model.openpose.utils.
visualize
(img, conf_map, paf_map, save_name='maps', save_dir='./save_dir/vis_dir', data_format='channels_first', save_tofile=True)¶ visualize function of openpose class models
take the model-predicted feature maps and output a visualized image. the image will be saved at ‘save_dir’/’save_name’_visualize.png
- Parameters
- arg1numpy array
image
- arg2numpy array
model output conf_map, heatmaps of keypoints, shape C*H*W(channels_first) or H*W*C(channels_last)
- arg3numpy array
model output paf_map, heatmaps of limbs, shape C*H*W(channels_first) or H*W*C(channels_last)
- arg4String
specify the output image name to distinguish it.
- arg5String
specify which directory to save the visualized image in.
- arg6string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- None
-
class
hyperpose.Model.pose_proposal.define.
CocoPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Instance
= 1¶
-
LAnkle
= 13¶
-
LEar
= 17¶
-
LElbow
= 6¶
-
LEye
= 15¶
-
LHip
= 11¶
-
LKnee
= 12¶
-
LShoulder
= 5¶
-
LWrist
= 7¶
-
Nose
= 0¶
-
RAnkle
= 10¶
-
REar
= 16¶
-
RElbow
= 3¶
-
REye
= 14¶
-
RHip
= 8¶
-
RKnee
= 9¶
-
RShoulder
= 2¶
-
RWrist
= 4¶
-
-
class
hyperpose.Model.pose_proposal.define.
MpiiPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Center
= 14¶
-
Headtop
= 0¶
-
Instance
= 15¶
-
LAnkle
= 13¶
-
LElbow
= 6¶
-
LHip
= 11¶
-
LKnee
= 12¶
-
LShoulder
= 5¶
-
LWrist
= 7¶
-
Neck
= 1¶
-
RAnkle
= 10¶
-
RElbow
= 3¶
-
RHip
= 8¶
-
RKnee
= 9¶
-
RShoulder
= 2¶
-
RWrist
= 4¶
-
-
hyperpose.Model.pose_proposal.define.
get_coco_flip_list
()¶
-
hyperpose.Model.pose_proposal.define.
get_mpii_flip_list
()¶
-
hyperpose.Model.pose_proposal.eval.
evaluate
(model, dataset, config, vis_num=30, total_eval_num=10000, enable_multiscale_search=False)¶ evaluate pipeline of poseProposal class models
given the model and dataset, the evaluate pipeline starts automatically; it will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen evaluation dataset 3. visualize the model output during evaluation in the directory ./save_dir/model_name/eval_vis_dir 4. output the model metrics by calling dataset.official_eval()
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- arg3Int
an integer indicating how many model outputs should be visualized
- arg4Int
an integer indicating how many images should be evaluated
- Returns
- None
-
hyperpose.Model.pose_proposal.eval.
infer_one_img
(model, postprocessor, img, img_id=-1, is_visual=False, save_dir='./vis_dir/pose_proposal')¶
-
hyperpose.Model.pose_proposal.eval.
test
(model, dataset, config, vis_num=30, total_test_num=10000, enable_multiscale_search=False)¶ evaluate pipeline of poseProposal class models
given the model and dataset, the evaluate pipeline starts automatically; it will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen evaluation dataset 3. visualize the model output during evaluation in the directory ./save_dir/model_name/eval_vis_dir 4. output the model metrics by calling dataset.official_eval()
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- arg3Int
an integer indicating how many model outputs should be visualized
- arg4Int
an integer indicating how many images should be evaluated
- Returns
- None
-
hyperpose.Model.pose_proposal.eval.
visualize
(img, img_id, humans, predicts, hnei, wnei, hout, wout, limbs, save_dir)¶
-
hyperpose.Model.pose_proposal.train.
get_paramed_map_fn
(augmentor, preprocessor, data_format='channels_first')¶
-
hyperpose.Model.pose_proposal.train.
parallel_train
(train_model, dataset, config)¶ Parallel train pipeline of PoseProposal class models
given the model and dataset, the train pipeline starts automatically; it will: 1. store and restore checkpoints in the directory ./save_dir/model_name/model_dir 2. log loss information in ./save_dir/model_name/log.txt 3. periodically visualize the model output during training in ./save_dir/model_name/train_vis_dir; the newest model is at path ./save_dir/model_name/model_dir/newest_model.npz
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- Returns
- None
-
hyperpose.Model.pose_proposal.train.
regulize_loss
(target_model, weight_decay_factor)¶
-
hyperpose.Model.pose_proposal.train.
single_train
(train_model, dataset, config)¶ Single train pipeline of PoseProposal class models
given the model and dataset, the train pipeline starts automatically; it will: 1. store and restore checkpoints in the directory ./save_dir/model_name/model_dir 2. log loss information in ./save_dir/model_name/log.txt 3. periodically visualize the model output during training in ./save_dir/model_name/train_vis_dir; the newest model is at path ./save_dir/model_name/model_dir/newest_model.npz
- Parameters
- arg1tensorlayer.models.MODEL
a preset or user defined model object, obtained by Model.get_model() function
- arg2dataset
a constructed dataset object, obtained by Dataset.get_dataset() function
- Returns
- None
-
hyperpose.Model.pose_proposal.utils.
cal_iou
(bbx1, bbx2)¶
-
hyperpose.Model.pose_proposal.utils.
draw_bbx
(img, img_pc, rx, ry, rw, rh, threshold=0.7)¶
-
hyperpose.Model.pose_proposal.utils.
draw_edge
(img, img_e, rx, ry, rw, rh, hnei, wnei, hout, wout, limbs, threshold=0.7)¶
-
hyperpose.Model.pose_proposal.utils.
draw_results
(img, predicts, targets, parts, limbs, save_dir, threshold=0.3, name='', is_train=True, data_format='channels_first')¶
-
hyperpose.Model.pose_proposal.utils.
get_colors
(dataset_type)¶
-
hyperpose.Model.pose_proposal.utils.
get_flip_list
(dataset_type)¶
-
hyperpose.Model.pose_proposal.utils.
get_limbs
(dataset_type)¶
-
hyperpose.Model.pose_proposal.utils.
get_parts
(dataset_type)¶
-
hyperpose.Model.pose_proposal.utils.
get_pose_proposals
(kpts_list, bbxs, hin, win, hout, wout, hnei, wnei, parts, limbs, img_mask=None, data_format='channels_first')¶
-
hyperpose.Model.pose_proposal.utils.
non_maximium_supress
(bbxs, scores, thres)¶
-
hyperpose.Model.pose_proposal.utils.
postprocess
(predicts, parts, limbs, data_format='channels_first', colors=None)¶ postprocess function of poseproposal class models
take the model-predicted feature maps of delta, tx, ty, tw, th, te, te_mask and output parsed human objects, each of which contains all detected keypoints of one person
- Parameters
- arg1list
a list of model outputs: delta, tx, ty, tw, th, te, te_mask. delta: keypoint confidence feature map, shape [C,H,W](channels_first) or [H,W,C](channels_last); tx: keypoint bbx center x coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); ty: keypoint bbx center y coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); tw: keypoint bbx width w, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); th: keypoint bbx height h, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); te: edge confidence feature map, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last); te_mask: mask of the edge confidence feature map, used for loss calculation, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last)
- arg2: Config.DATA
an enum value of the enum class Config.DATA indicating which dataset_type the input annotation list comes from; to generate the correct conf_map and paf_map, the order of keypoints and limbs must be known.
- arg3string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- list
a list containing Human objects; see Model.Human for detailed information about the Human object
-
hyperpose.Model.pose_proposal.utils.
preprocess
(annos, bbxs, model_hin, modeL_win, model_hout, model_wout, model_hnei, model_wnei, parts, limbs, data_format='channels_first')¶ preprocess function of poseproposal class models
take keypoint annotations, bounding box annotations, the model input height and width, the model limbs neighbor area height and width, and the dataset type, and return the constructed targets of delta, tx, ty, tw, th, te, te_mask
- Parameters
- arg1list
a list of keypoint annotations; each annotation is a list of keypoints belonging to one person, each keypoint follows the format (x,y), and x<0 or y<0 if the keypoint is not visible or not annotated. the annotations must come from a known dataset_type, otherwise the keypoint and limb order will not be correct.
- arg2list
a list of bounding box annotations, each bounding box is of format [x,y,w,h]
- arg3Int
height of the model input
- arg4Int
width of the model input
- arg5Int
height of the model output
- arg6Int
width of the model output
- arg7Int
model limbs neighbor area height, which determines the neighbor area used to match limbs; see the pose proposal paper for details
- arg8Int
model limbs neighbor area width, which determines the neighbor area used to match limbs; see the pose proposal paper for details
- arg9Config.DATA
an enum value of the enum class Config.DATA indicating which dataset_type the input annotation list comes from; to generate the correct targets, the order of keypoints and limbs must be known.
- arg10string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- list
a list including 7 elements: delta: keypoint confidence feature map, shape [C,H,W](channels_first) or [H,W,C](channels_last); tx: keypoint bbx center x coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); ty: keypoint bbx center y coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); tw: keypoint bbx width w, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); th: keypoint bbx height h, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); te: edge confidence feature map, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last); te_mask: mask of the edge confidence feature map, used for loss calculation, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last)
-
hyperpose.Model.pose_proposal.utils.
restore_coor
(x, y, w, h, win, hin, wout, hout, data_format='channels_first')¶
-
hyperpose.Model.pose_proposal.utils.
visualize
(img, predicts, parts, limbs, save_name='bbxs', save_dir='./save_dir/vis_dir', data_format='channels_first', save_tofile=True)¶ visualize function of poseproposal class models
take the model-predicted feature maps of delta, tx, ty, tw, th, te, te_mask and output a visualized image. the image will be saved at ‘save_dir’/’save_name’_visualize.png
- Parameters
- arg1numpy array
image
- arg2list
a list of model outputs: delta, tx, ty, tw, th, te, te_mask. delta: keypoint confidence feature map, shape [C,H,W](channels_first) or [H,W,C](channels_last); tx: keypoint bbx center x coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); ty: keypoint bbx center y coordinates, divided by gridsize, shape [C,H,W](channels_first) or [H,W,C](channels_last); tw: keypoint bbx width w, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); th: keypoint bbx height h, divided by image width, shape [C,H,W](channels_first) or [H,W,C](channels_last); te: edge confidence feature map, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last); te_mask: mask of the edge confidence feature map, used for loss calculation, shape [C,H,W,Hnei,Wnei](channels_first) or [H,W,Hnei,Wnei,C](channels_last)
- arg3: Config.DATA
an enum value of the enum class Config.DATA indicating which dataset_type the input annotation list comes from
- arg4String
specify the output image name to distinguish it.
- arg5String
specify which directory to save the visualized image in.
- arg6string
data format specified for the channel order; available inputs: ‘channels_first’: data_shape C*H*W, ‘channels_last’: data_shape H*W*C
- Returns
- None
-
class
hyperpose.Model.common.
MPIIPart
(value)¶ Bases:
enum.Enum
An enumeration.
-
Head
= 13¶
-
LAnkle
= 5¶
-
LElbow
= 10¶
-
LHip
= 3¶
-
LKnee
= 4¶
-
LShoulder
= 9¶
-
LWrist
= 11¶
-
Neck
= 12¶
-
RAnkle
= 0¶
-
RElbow
= 7¶
-
RHip
= 2¶
-
RKnee
= 1¶
-
RShoulder
= 8¶
-
RWrist
= 6¶
-
static
from_coco
(human)¶
-
-
class
hyperpose.Model.common.
Profiler
¶ Bases:
object
Methods
__call__
(self, name, duration) Call self as a function.
report
-
report
(self)¶
-
-
hyperpose.Model.common.
draw_humans
(npimg, humans)¶
-
hyperpose.Model.common.
get_op
(graph, name)¶
-
hyperpose.Model.common.
get_optim
(optim_type)¶
-
hyperpose.Model.common.
get_sample_images
(w, h)¶
-
hyperpose.Model.common.
init_log
(config)¶
-
hyperpose.Model.common.
load_graph
(model_file)¶ Load a freezed graph from file.
-
hyperpose.Model.common.
log
(msg)¶
-
hyperpose.Model.common.
measure
(f, name=None)¶
-
hyperpose.Model.common.
pad_image
(img, stride, pad_value=0.0)¶
-
hyperpose.Model.common.
pad_image_shape
(img, shape, pad_value=0.0)¶
-
hyperpose.Model.common.
plot_humans
(image, heatMat, pafMat, humans, name)¶
-
hyperpose.Model.common.
read_imgfile
(path, width, height, data_format='channels_last')¶ Read image file and resize to network input size.
-
hyperpose.Model.common.
regulize_loss
(target_model, weight_decay_factor)¶
-
hyperpose.Model.common.
rename_tensor
(x, name)¶
-
hyperpose.Model.common.
scale_image
(image, hin, win, scale_rate=0.95)¶
-
hyperpose.Model.common.
tf_repeat
(tensor, repeats)¶ Args:
input: A Tensor. 1-D or higher. repeats: A list. Number of repeats for each dimension; the length must be the same as the number of dimensions in input
Returns:
A Tensor. Has the same type as input. Has the shape of tensor.shape * repeats
-
class
hyperpose.Model.human.
BodyPart
(parts, u_idx, part_idx, x, y, score, w=-1, h=-1)¶ Bases:
object
part_idx: part index (e.g. 0 for nose); x, y: coordinates of the body part; score: confidence score
Methods
get_part_name
get_x
get_y
-
get_part_name
(self)¶
-
get_x
(self)¶
-
get_y
(self)¶
-
-
class
hyperpose.Model.human.
Human
(parts, limbs, colors)¶ Bases:
object
body_parts: list of BodyPart
Methods
bias
draw_human
get_area
get_bbx
get_global_id
get_partnum
get_score
print
scale
-
bias
(self, bias_w, bias_h)¶
-
draw_human
(self, img)¶
-
get_area
(self)¶
-
get_bbx
(self)¶
-
get_global_id
(self)¶
-
get_partnum
(self)¶
-
get_score
(self)¶
-
print
(self)¶
-
scale
(self, scale_w, scale_h)¶
-
-
hyperpose.Model.
get_evaluate
(config)¶ get evaluate pipeline based on config object
construct an evaluate pipeline based on the chosen model_type and dataset_type; the evaluation metric follows the official metrics of the chosen dataset.
the returned evaluate pipeline can be easily used as evaluate(model, dataset), where model is obtained by the Model.get_model() function and dataset is obtained by the Dataset.get_dataset() function
the evaluate pipeline will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen evaluation dataset 3. visualize the model output during evaluation in the directory ./save_dir/model_name/eval_vis_dir 4. output the model metrics by calling dataset.official_eval()
- Parameters
- arg1config object
the config object returned by the Config.get_config() function, which includes all the configuration information.
- Returns
- function
an evaluate pipeline function which takes a model and dataset as input and outputs model metrics
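A short, hedged sketch of using the returned pipeline; the enum members and model name are assumptions for illustration rather than values documented on this page:
from hyperpose import Config, Model, Dataset

Config.set_model_name("my_model")
Config.set_model_type(Config.MODEL.Openpose)    # assumed enum member
Config.set_dataset_type(Config.DATA.MSCOCO)
config = Config.get_config()

model = Model.get_model(config)
dataset = Dataset.get_dataset(config)
evaluate = Model.get_evaluate(config)
evaluate(model, dataset)   # loads the newest checkpoint and reports the official metrics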
-
hyperpose.Model.
get_model
(config)¶ get model based on config object
construct and return a model based on the configured model_type and model_backbone. each preset model architecture has a default backbone; replacing it with one of the common model_backbones allows the user to change the model's computational complexity to adapt to the application scenario.
- Parameters
- arg1config object
the config object returned by the Config.get_config() function, which includes all the configuration information.
- Returns
- tensorlayer.models.MODEL
a model object that inherits from the tensorlayer.models.MODEL class, with the configured model architecture and chosen model backbone. a user-defined architecture can be set using the Config.set_model_architecture() function.
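A minimal sketch; the model name is a placeholder, and the model-type and backbone enum members below are assumptions for illustration:
from hyperpose import Config, Model

Config.set_model_name("my_model")
Config.set_model_type(Config.MODEL.Openpose)          # assumed preset architecture
Config.set_model_backbone(Config.BACKBONE.Vggtiny)    # assumed backbone enum member
config = Config.get_config()

model = Model.get_model(config)   # a tensorlayer.models.MODEL instance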
-
hyperpose.Model.
get_postprocessor
(model_type)¶ get a postprocessor class based on the specified model_type
get the postprocessor class of the specified kind of model to help users directly construct their own evaluate pipeline (rather than using the integrated evaluate pipeline) or infer pipeline (to check the model's utility) when needed.
the postprocessor is able to parse the model output feature maps and output parsed human objects of the Human class, which contain all detected keypoints.
- Parameters
- arg1Config.MODEL
an enum value of the enum class Config.MODEL
- Returns
- function
a postprocessor class of the specified kind of model
-
hyperpose.Model.
get_preprocessor
(model_type)¶ get a preprocessor class based on the specified model_type
get the preprocessor class of the specified kind of model to help users directly construct their own train pipeline (rather than using the integrated train pipeline) when needed.
the preprocessor class is able to construct a preprocessor object that converts the image and annotation to the model output format for training.
- Parameters
- arg1Config.MODEL
an enum value of the enum class Config.MODEL
- Returns
- class
a preprocessor class of the specified kind of model
-
hyperpose.Model.
get_pretrain
(config)¶
-
hyperpose.Model.
get_test
(config)¶ get test pipeline based on config object
construct a test pipeline based on the chosen model_type and dataset_type; the test metric follows the official metrics of the chosen dataset.
the returned test pipeline can be easily used as test(model, dataset), where model is obtained by the Model.get_model() function and dataset is obtained by the Dataset.get_dataset() function
the test pipeline will: 1. load the newest model at path ./save_dir/model_name/model_dir/newest_model.npz 2. perform inference and parsing over the chosen test dataset 3. visualize the model output during testing in the directory ./save_dir/model_name/test_vis_dir 4. output the model test result file at path ./save_dir/model_name/test_vis_dir/pd_ann.json 5. since the ground truth of the test dataset is usually withheld by the dataset creator, you may need to upload the test result file to the official server to get the model's test metrics
- Parameters
- arg1config object
the config object returned by the Config.get_config() function, which includes all the configuration information.
- Returns
- function
a test pipeline function which takes a model and dataset as input and outputs model metrics
-
hyperpose.Model.
get_train
(config)¶ get train pipeline based on config object
construct a train pipeline based on the chosen model_type and dataset_type; the default is a single train pipeline performed on a single GPU, which can be changed to a parallel train pipeline using the function Config.set_train_type()
the returned train pipeline can be easily used as train(model, dataset), where model is obtained by the Model.get_model() function and dataset is obtained by the Dataset.get_dataset() function
the train pipeline will: 1. store and restore checkpoints in the directory ./save_dir/model_name/model_dir 2. log loss information in ./save_dir/model_name/log.txt 3. periodically visualize the model output during training in ./save_dir/model_name/train_vis_dir; the newest model is at path ./save_dir/model_name/model_dir/newest_model.npz
- Parameters
- arg1config object
the config object returned by the Config.get_config() function, which includes all the configuration information.
- Returns
- function
a train pipeline function which takes a model and dataset as input; it can be either a single or a parallel train pipeline.
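Putting the pieces together, a hedged end-to-end sketch of the training entry point; all names and enum members below are placeholders for illustration:
from hyperpose import Config, Model, Dataset

Config.set_model_name("my_model")
Config.set_model_type(Config.MODEL.Openpose)    # assumed enum member
Config.set_dataset_type(Config.DATA.MSCOCO)
# Config.set_train_type(...) can switch to the parallel pipeline, as noted above.
config = Config.get_config()

model = Model.get_model(config)
dataset = Dataset.get_dataset(config)
train = Model.get_train(config)
train(model, dataset)   # checkpoints, logs and visualizations go under ./save_dir/my_model/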
-
hyperpose.Model.
get_visualize
(model_type)¶ get visualize function based model_type
get the visualize function of the specified kind of model to help users construct their own evaluate pipeline (rather than using the integrated train or evaluate pipeline directly) when needed
the visualize function is able to visualize the model’s output feature maps, which is helpful for training and evaluation analysis.
- Parameters
- arg1Config.MODEL
an enum value of the enum class Config.MODEL
- Returns
- function
a visualize function of the specified kind of model
Module contents¶
Performance and Supports¶
Supports¶
Prediction Library¶
Supported Language¶
C++
Supported DNN Engine & Model Format¶
TensorRT
ONNX
Uff
CUDA Engine Protobuf
Supported Post-Processing Methods¶
Part Association Field(PAF)
Pose Proposal Networks
Released Prediction Models¶
We released the models on Google Drive. .onnx
and .uff
files are for inference.
Performance of Prediction Library¶
Result¶
We compare the prediction performance of HyperPose with OpenPose 1.6 and TF-Pose. We implement the OpenPose algorithms with different configurations in HyperPose.
The test-bed has Ubuntu18.04, 1070Ti GPU, Intel i7 CPU (12 logic cores).
HyperPose Configuration | DNN Size | Input Size | HyperPose | Baseline |
---|---|---|---|---|
OpenPose (VGG) | 209.3MB | 656 x 368 | 27.32 FPS | 8 FPS (OpenPose) |
OpenPose (TinyVGG) | 34.7 MB | 384 x 256 | 124.925 FPS | N/A |
OpenPose (MobileNet) | 17.9 MB | 432 x 368 | 84.32 FPS | 8.5 FPS (TF-Pose) |
OpenPose (ResNet18) | 45.0 MB | 432 x 368 | 62.52 FPS | N/A |
OpenPifPaf (ResNet50) | 97.6 MB | 97 x 129 | 178.6 FPS | 35.3 FPS |
Environment: System@Ubuntu18.04, GPU@1070Ti, CPU@i7 (12 logic cores).
Tested Video Source: Crazy Updown Funk (resolution@640x360, frame_count@7458, source@YouTube)
OpenPose performance is not tested with batch processing as it seems not to be implemented. (see here)
Insights¶
Overview¶
Why HyperPose¶
HyperPose provides:
Flexible training
Well abstracted APIs (Python) to help you manage the pose estimation pipeline quickly and directly
Dataset (COCO, MPII)
Pose Estimation Methods
Backbones: ResNet, VGG (Tiny/Normal/Thin), Pose Proposal Network, OpenPifPaf.
Post-Processing: Part Association Field (PAF), Pose Proposal Networks, PifPaf.
Fast Prediction
Rich operator APIs for you to do fast DNN inference and post-processing.
2 API styles:
Operator API (Imperative): HyperPose provides basic operators to do DNN inference and post processing.
Stream API (Declarative): HyperPose provides a streaming processing runtime scheduler where users only need to specify the engine, post-processing methods and input/output streams.
Model format supports:
Uff.
ONNX.
Cuda Engine Protobuf.
Good performance. (see here)
Prediction Library Design¶
HyperPose Prediction Pipeline¶
HyperPose supports the prediction pipelines described in the image below. (Mainly for bottom-up approaches)
Operator API & Stream API¶
The Operator API provides basic operators for users to manipulate the pipeline. The Stream API is built on top of the Operator API and performs higher-level scheduling on it.
Minimum Example For Operator API¶
To apply pose estimation to a video using Operator API:
#include <hyperpose/hyperpose.hpp>
int main() {
using namespace hyperpose;
const cv::Size network_resolution{384, 256};
const dnn::uff uff_model{ "../data/models/TinyVGG-V1-HW=256x384.uff", "image", {"outputs/conf", "outputs/paf"} };
// * Input video.
auto capture = cv::VideoCapture("../data/media/video.avi");
// * Output video.
auto writer = cv::VideoWriter(
"output.avi", capture.get(cv::CAP_PROP_FOURCC), capture.get(cv::CAP_PROP_FPS), network_resolution);
// * Create TensorRT engine.
dnn::tensorrt engine(uff_model, network_resolution);
// * post-processing: Using paf.
parser::paf parser{};
while (capture.isOpened()) {
std::vector<cv::Mat> batch;
for (int i = 0; i < engine.max_batch_size(); ++i) {
cv::Mat mat;
capture >> mat;
if (mat.empty())
break;
batch.push_back(mat);
}
if (batch.empty())
break;
// * TensorRT Inference.
auto feature_map_packets = engine.inference(batch);
// * Paf.
std::vector<std::vector<human_t>> pose_vectors;
pose_vectors.reserve(feature_map_packets.size());
for (auto&& packet : feature_map_packets)
pose_vectors.push_back(parser.process(packet[0], packet[1]));
// * Visualization
for (size_t i = 0; i < batch.size(); ++i) {
cv::resize(batch[i], batch[i], network_resolution);
for (auto&& pose : pose_vectors[i])
draw_human(batch[i], pose);
writer << batch[i];
}
}
}
Minimum Example For Stream API¶
To apply pose estimation to a video using Stream API:
#include <hyperpose/hyperpose.hpp>
int main() {
using namespace hyperpose;
const cv::Size network_resolution{384, 256};
const dnn::uff uff_model{ "../data/models/TinyVGG-V1-HW=256x384.uff", "image", {"outputs/conf", "outputs/paf"} };
// * Input video.
auto capture = cv::VideoCapture("../data/media/video.avi");
// * Output video.
auto writer = cv::VideoWriter(
"output.avi", capture.get(cv::CAP_PROP_FOURCC), capture.get(cv::CAP_PROP_FPS), network_resolution);
// * Create TensorRT engine.
dnn::tensorrt engine(uff_model, network_resolution);
// * post-processing: Using paf.
parser::paf parser{};
// * Create stream
auto stream = make_stream(engine, parser);
// * Connect input stream.
stream.async() << capture;
// * Connect output stream and wait.
stream.sync() >> writer;
}
Using the Stream API, the code is shorter and runs faster!
Stream Processing in Stream API¶
Every worker is a thread.
Every worker communicates via a FIFO queue.
The DNN inference worker will do greedy batching for the inputs.
Frequently Asked Questions (FAQs)¶
Frequently Asked Questions¶
Installation¶
No C++17 Compiler (Linux)?¶
Using
apt
as your package manager? Install from
ppa
. Helpful link: LINK.
Otherwise
Build a C++17 compiler from source.
Build without examples/tests?¶
cmake .. -DBUILD_EXAMPLES=OFF -DBUILD_TESTS=OFF
Build OpenCV from source?¶
Refer to here.
Network problem when installing the test models/data from the command line?¶
Download them manually:
All prediction models are available on Google Drive.
The test data are taken from the OpenPose Project.
Training¶
Prediction¶
TensorRT Error?¶
See the
tensorrt.log
. (It contains more information about logging and is located where you execute the binary.) You may encounter
ERROR: Tensor image cannot be both input and output
when using the
TinyVGG-V1-HW=256x384.uff
model. You can simply ignore it.
Performance?¶
Usually the first execution (cold start) on a small amount of data tends to be slow. You can use a longer video or more images to test the performance (or run it more than once).
The performance is mainly related to the following factors (which you can customize):
The complexity of the model (not only FLOPS but also the number of parameters): smaller is usually better.
The model network resolution (also see here): smaller is better.
Batch size: bigger is faster (higher throughput). (For details, you can refer to Shen’s dissertation.)
The input / output size (this mainly affects the speed of
cv::resize
): smaller is better.
The upsampling factor of the feature map during post-processing: smaller is better. (By default the PAF parser upsamples the feature map by 4x. We did this following the Lightweight-OpenPose paper.)
Use better hardware (good CPUs can make the post-processing faster!).
Use SIMD instructions of your CPU. (Compile OpenCV from source and enable the instruction sets in the cmake configuration.)