Unite the People: Closing the Loop Between 3D and 2D Human Representations

Computer Vision and Pattern Recognition (CVPR), 2017

10 January 2017

Abstract

3D models provide the common ground for different representations of human bodies. In turn, robust 2D estimation has proven to be a powerful tool to obtain 3D fits "in-the-wild". However, depending on the level of detail, it can be hard or even impossible to obtain labeled representations at large scale. We propose a hybrid approach to this problem: with an extended version of the recently introduced SMPLify method, we obtain high-quality 3D body model fits for the core human pose datasets. Human annotators only sort the fits into good and bad. This enables us to efficiently build a large dataset with a rich representation. In a comprehensive set of experiments, we show how we can make use of this data to push the limits of discriminative models. With segmentation into 31 body parts and keypoint detection with 91 landmarks, we present compelling results for human analysis at an unprecedented level of detail. Using our dense landmark set, we present state-of-the-art results for 3D human pose and shape estimation, while using an order of magnitude less training data and making no assumptions about gender or pose in the fitting procedure. We show that the initial dataset can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable at large scale.
