Tutorial for Training Library

Up to now, Hyperpose provides:

  • 5 types of preset model architectures:

Openpose, LightweightOpenpose, Poseproposal, MobilenetThinOpenpose, Pifpaf

  • 10 types of common model backbones for backbone replacement:

MobilenetV1, MobilenetV2, Vggtiny, Vgg16, Vgg19, Resnet18, Resnet50, and Mobilenet variants (Dilated Mobilenet, MobilenetThin, MobilenetSmall; located in the preset model architectures)

  • 2 types of popular datasets:

COCO, MPII

  • extensions:

user-defined dataset, user-defined model architecture, pre-processors and post-processors

Integrated pipeline

Hyperpose groups similar models into model classes. For now, there are two classes: Openpose classes and Poseproposal classes. Every model architecture can be assigned to one of them.

For each model class, Hyperpose provides an integrated pipeline.

Integrated train pipeline

The use of Hyperpose's integrated training procedure can be divided into two parts:
setting the configuration using the APIs of the Config module, and getting the configured system from the Model and Dataset modules.

  • The setting part mainly concerns: model_name, model_type, model_backbone, dataset_type and train_type

set_model_name determines the path where the model-related files are stored.
set_model_type selects the preset model architecture (use an enum value of the enum class Config.MODEL).
set_model_backbone replaces the backbone of the chosen preset model architecture (use an enum value of the enum class Config.BACKBONE).
set_dataset_type changes the dataset used in the training pipeline (use an enum value of the enum class Config.DATA).
set_train_type chooses between a single GPU for single training and multiple GPUs for parallel training (use an enum value of the enum class Config.TRAIN).

Different combinations of model architecture and model backbone lead to large differences in the computational complexity of the constructed model.

For example, the Openpose architecture with its default Vgg19 backbone is 200MB, while MobilenetThinOpenpose with a mobilenet-variant backbone is only 18MB.

Thus the available configurations cover a wide range of hardware computation resources.

For more detailed information, please refer to the API documents.

The basic training pipeline configuration is below:

# >>> import modules of hyperpose
from hyperpose import Config, Model, Dataset
# >>> set the model name to distinguish models (necessary); args is assumed to come from the script's argument parser
Config.set_model_name(args.model_name)
# >>> set model architecture using Config.MODEL enum class (necessary)
Config.set_model_type(Config.MODEL.LightweightOpenpose)
# >>> set model backbone using Config.BACKBONE enum class (not necessary, each model has its default backbone)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
# >>> set dataset to use (necessary)
Config.set_dataset_type(Config.DATA.MSCOCO)
# >>> set training type (not necessary, default is single training, can use parallel training)
Config.set_train_type(Config.TRAIN.Single_train)
# >>> congratulations! the simplest configuration is done, it's time to assemble the model and training pipeline

To use parallel training, first set the train type and then choose the KungFu optimizer wrap function, replacing the set_train_type call as shown below. KungFu has three options: Sync_sgd, Sync_avg and Pair_avg.

Config.set_train_type(Config.TRAIN.Parallel_train)
Config.set_kungfu_option(Config.KUNGFU.Sync_sgd)

Then run your program with the following command (assuming 4 GPUs are available):

CUDA_VISIBLE_DEVICES=0,1,2,3 kungfu-run -np 4 python train.py

  • The getting part mainly concerns passing the configuration to the Model module and the Dataset module to assemble the system

Config.get_config returns a config object that contains all the configuration and is the core of the getting functions.
Model.get_model returns a configured model object that can run the forward pass and calculate the loss.
Dataset.get_dataset returns a configured dataset object that can generate the tensorflow dataset objects used for training and evaluation; it can also visualize the dataset annotation.
Model.get_train returns a training pipeline, which can start running as soon as it receives the model object and the dataset object.

The basic training pipeline assembly is below:

# >>> get config object first (necessary)
config = Config.get_config()
# >>> get model object (necessary)
model = Model.get_model(config)
# >>> get dataset object (necessary)
dataset = Dataset.get_dataset(config)
# >>> get train pipeline (necessary)
train = Model.get_train(config)
# >>> train!
train(model, dataset)
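
The setting and getting parts can be combined into a single script. The following is a minimal sketch of such a train.py. The command-line flags (--model_name, --parallel) and the function names parse_args and main are hypothetical and not part of Hyperpose; only the Config, Model and Dataset calls shown in this tutorial are used. The hyperpose import is placed inside main so that the argument parsing can be tried without the library installed.

```python
import argparse


def parse_args(argv=None):
    """Parse the (hypothetical) command-line flags of this sketch."""
    parser = argparse.ArgumentParser(description="Hyperpose training sketch")
    parser.add_argument("--model_name", default="default_name",
                        help="name used to locate the model-related files")
    parser.add_argument("--parallel", action="store_true",
                        help="use KungFu parallel training instead of single training")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported here so that parse_args can be used without hyperpose installed.
    from hyperpose import Config, Model, Dataset

    # Setting part: only calls shown in this tutorial are used.
    Config.set_model_name(args.model_name)
    Config.set_model_type(Config.MODEL.LightweightOpenpose)
    Config.set_dataset_type(Config.DATA.MSCOCO)
    if args.parallel:
        Config.set_train_type(Config.TRAIN.Parallel_train)
        Config.set_kungfu_option(Config.KUNGFU.Sync_sgd)
    else:
        Config.set_train_type(Config.TRAIN.Single_train)

    # Getting part: assemble the system and start training.
    config = Config.get_config()
    model = Model.get_model(config)
    dataset = Dataset.get_dataset(config)
    train = Model.get_train(config)
    train(model, dataset)
```

With a call to main() added at the end of the file, single training would be started with `python train.py --model_name my_model`, and parallel training with `kungfu-run -np 4 python train.py --model_name my_model --parallel`.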

Integrated evaluate pipeline

The configuration and assembly parts of the evaluation pipeline are very similar to those of the training pipeline.
The only differences are:

  • In the setting part, one doesn’t need to set the train type (and thus the KungFu option).

  • In the getting part, one should use Model.get_eval to get the evaluation pipeline (rather than the training pipeline).

Thus the evaluation code should be as follows:

# >>> import modules of hyperpose
from hyperpose import Config, Model, Dataset
# >>> set the name of the model to be evaluated
Config.set_model_name(args.model_name)
# >>> the model architecture and backbone settings should be the same as the training configuration of the model to be evaluated
Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
# >>> set dataset to use
Config.set_dataset_type(Config.DATA.MSCOCO)
# >>> configuration is done, get config object to assemble the system
config = Config.get_config()
model = Model.get_model(config)
dataset = Dataset.get_dataset(config)
# >>> named evaluate rather than eval to avoid shadowing the Python built-in
evaluate = Model.get_eval(config)
# >>> evaluate
evaluate(model, dataset)

The model will be loaded from ./save_dir/model_name/model_dir/newest_model.npz and evaluated. It should be noted that:

  • 1. The model architecture, model backbone and dataset type should be the same as in the configuration under which the model was trained.

  • 2. The evaluation metrics follow the official evaluation metrics of the dataset.
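
Because evaluation loads the newest checkpoint from a fixed location, it can help to check that the file exists before assembling the pipeline. The helper below is a small sketch: the function names newest_model_path and has_trained_model are hypothetical, and it assumes that the model_name component of the path is the name passed to set_model_name.

```python
from pathlib import Path


def newest_model_path(model_name, save_dir="./save_dir"):
    """Return the checkpoint path described in the tutorial for a model name.

    Assumption: the `model_name` path component equals the name passed to
    Config.set_model_name. This helper is illustrative, not a Hyperpose API.
    """
    return Path(save_dir) / model_name / "model_dir" / "newest_model.npz"


def has_trained_model(model_name, save_dir="./save_dir"):
    """Return True if a checkpoint file exists at the expected location."""
    return newest_model_path(model_name, save_dir).is_file()
```

Calling has_trained_model(args.model_name) before Model.get_eval gives an early, readable failure instead of an error from inside the pipeline.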

User-defined model architecture

Hyperpose lets users define their own model architecture while still using the provided integrated model pipeline. The following points should be considered:

  • 1. The model should be an object of the tensorlayer.models.Model class (or of a class that inherits from it).

  • 2. The model should have forward and cal_loss functions with exactly the same input and output formats as the preset model architectures. See the Model.LightweightOpenpose class for reference.
    To do this, the user still needs to set model_type to determine the training pipeline; here model_type should be the preset model whose data processing pipeline is similar to that of the user’s own model. The user can then pass their own model object with the set_model_arch function:

    Config.set_model_name(your_model_name)
    Config.set_model_type(similar_model_type)
    Config.set_model_arch(your_model_arch)

The other configuration steps are the same as in the integrated training pipeline.
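
Before passing a custom model to set_model_arch, a quick duck-typing check can catch a missing method early. The sketch below is only an illustration: check_model_interface and DummyModel are hypothetical names, and the check only verifies that forward and cal_loss exist and are callable. It does not verify the tensorlayer.models.Model base class or the input and output formats.

```python
REQUIRED_METHODS = ("forward", "cal_loss")


def check_model_interface(model):
    """Return the names of required methods that `model` lacks."""
    return [name for name in REQUIRED_METHODS
            if not callable(getattr(model, name, None))]


class DummyModel:
    """Stand-in for a user model; a real one would subclass tensorlayer.models.Model."""

    def forward(self, x):
        return x

    def cal_loss(self, *args):
        return 0.0


missing = check_model_interface(DummyModel())
if missing:
    raise TypeError(f"model is missing required methods: {missing}")
```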

User-defined dataset

Hyperpose allows users to integrate their own dataset with the training and evaluation pipelines, as long as it provides the following functions:

  • 1. get_train_dataset:
    returns a tensorflow dataset object where each element is an image path and a serialized dict (serialized with the _pickle library) that has at least the following three key-value pairs:
    1.1 “kpt”: a list containing the keypoints of each labeled human. For example, [[kpt1,kpt2,…,kptn],[kpt1,kpt2,…,kptn]] is a list with two labeled humans, where each kpt is an [x,y] coordinate such as [234,526].
    1.2 “bbx”: a list containing the bounding box of each labeled human. For example, [bbx1,bbx2] is a list with two labeled humans, where each bbx is an [x,y,w,h] array such as [234,526,60,80]. It is necessary for the pose proposal network and can be set to None for the others.
    1.3 “mask”: a mask (in MSCOCO polygon format) used to cover unlabeled people; it can be set to None.

  • 2. get_eval_dataset:
    returns a tensorflow dataset object where each element is an image path and its image id.

  • 3. get_input_kpt_cvter (optional):
    returns a function that changes the kpt value in your dataset dict element. It is used to bring your dataset's keypoint annotation in line with your model's keypoint setting, or to combine your dataset with other datasets that use a different keypoint annotation.

  • 4. get_output_kpt_cvter (optional):
    returns a function that changes the model's prediction result into a format that is easy to evaluate. It is used to enable your dataset to be evaluated by the formal COCO standard (using mAP) or MPII standard (using PCKh).
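
As an illustration of the annotation format expected from get_train_dataset, the sketch below builds the dict for one image with two labeled people and serializes it with _pickle. The coordinate values are made up for the example, and make_annotation is a hypothetical helper name, not a Hyperpose function. Wrapping the resulting (image path, serialized dict) pairs into a tensorflow dataset is left out.

```python
import _pickle as cPickle


def make_annotation(kpts, bbxs=None, mask=None):
    """Serialize one image's annotation dict with the three required keys."""
    if bbxs is not None and len(bbxs) != len(kpts):
        raise ValueError("need one bounding box per labeled human")
    return cPickle.dumps({"kpt": kpts, "bbx": bbxs, "mask": mask})


# Two labeled humans with two keypoints each; every keypoint is [x, y].
kpts = [[[234, 526], [240, 500]], [[100, 120], [110, 90]]]
# One [x, y, w, h] box per human (needed for pose proposal networks).
bbxs = [[234, 526, 60, 80], [100, 120, 40, 70]]

serialized = make_annotation(kpts, bbxs)
annotation = cPickle.loads(serialized)
print(sorted(annotation))  # ['bbx', 'kpt', 'mask']
```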

User-defined dataset filter

Hyperpose also lets users define their own dataset filter, set with the set_dataset_filter function, to filter the dataset as they want. To use it, a user should know the following:

  • 1. Hyperpose organizes the annotations of each image in a dataset into similar Meta classes: COCOMeta for the COCO dataset and MPIIMeta for the MPII dataset. The Meta classes share some common information, such as image_id and joint_list, and also hold dataset-specific information, such as mask, is_crowd and headbbx_list.

  • 2. The dataset filter is applied to the Meta objects of the corresponding dataset. If it returns True, the image and annotations related to the Meta object are kept; otherwise they are filtered out. Please refer to the Dataset.COCOMeta and Dataset.MPIIMeta classes for details.

    def my_dataset_filter(coco_meta):
        # keep images with fewer than 5 entries in joint_list that are not marked as crowd
        return len(coco_meta.joint_list) < 5 and not coco_meta.is_crowd
    Config.set_dataset_filter(my_dataset_filter)
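
To see how such a filter behaves, it can be tried on stand-in objects before it is plugged into the pipeline. In the sketch below, FakeMeta is a hypothetical substitute that only mimics the image_id, joint_list and is_crowd attributes mentioned in this section; real Dataset.COCOMeta objects carry more information.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMeta:
    """Hypothetical stand-in for a Meta object (not the Hyperpose class)."""
    image_id: int
    joint_list: list = field(default_factory=list)
    is_crowd: bool = False


def my_dataset_filter(meta):
    # Keep images with fewer than 5 entries in joint_list that are not crowds.
    return len(meta.joint_list) < 5 and not meta.is_crowd


metas = [
    FakeMeta(1, joint_list=[[]] * 2),                 # kept
    FakeMeta(2, joint_list=[[]] * 6),                 # too many entries
    FakeMeta(3, joint_list=[[]] * 1, is_crowd=True),  # crowd image
]
kept_ids = [m.image_id for m in metas if my_dataset_filter(m)]
print(kept_ids)  # [1]
```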

User-defined train pipeline

Hyperpose also provides three low-level functions to help users construct their own pipeline. For each class of model, preprocess, postprocess and visualize functions are provided.

  • get_preprocess receives a model_type from Config.MODEL and returns a preprocess function. The preprocess function converts the annotation into the targets used for training the model.

# >>> annos, the image and model output sizes, my_model and input_image come from the user's own pipeline; np is numpy
preprocess = Model.get_preprocess(Config.MODEL.LightweightOpenpose)
conf_map, paf_map = preprocess(annos, img_height, img_width, model_hout, model_wout, Config.DATA.MSCOCO, data_format="channels_first")
pd_conf_map, pd_paf_map = my_model.forward(input_image[np.newaxis, ...])
my_model.cal_loss(conf_map, pd_conf_map, paf_map, pd_paf_map)

  • get_postprocess receives a model_type from Config.MODEL and returns a postprocess function. The postprocess function converts the model output into parsed human objects for evaluation and visualization.

pd_conf_map, pd_paf_map = my_model.forward(input_image[np.newaxis, ...])
postprocess = Model.get_postprocess(Config.MODEL.LightweightOpenpose)
pd_humans = postprocess(pd_conf_map, pd_paf_map, dataset_type, data_format="channels_first")
for pd_human in pd_humans:
    pd_human.draw(input_image)

  • get_visualize receives a model_type from Config.MODEL and returns a visualize function. The visualize function visualizes the model’s output feature maps.

pd_conf_map, pd_paf_map = my_model.forward(input_image[np.newaxis, ...])
visualize = Model.get_visualize(Config.MODEL.LightweightOpenpose)
visualize(input_image, pd_conf_map, pd_paf_map, save_name="my_visual", save_dir="./vis_dir")