Ensemble Convolutional Neural Networks for Pose Estimation

Yuki Kawana   Norimichi Ukita   Jia-Bin Huang   Ming-Hsuan Yang


Human pose estimation is a challenging task due to significant appearance variations. An ensemble of models, each of which is optimized for a limited variety of poses, is capable of modeling a large variety of human body configurations. However, ensembling models is not a straightforward task due to the complex interdependence among noisy and ambiguous pose estimation predictions acquired by each model. We propose to capture this complex interdependence using a convolutional neural network. Our network achieves this interdependence representation using a combination of deep convolution and deconvolution layers for robust and accurate pose estimation. We evaluate the proposed ensemble model on publicly available datasets and show that our model compares favorably against baseline models and state-of-the-art methods.