Segmentation-Aware Convolutional Nets

Abstract

This paper proposes a new deep convolutional neural network (DCNN) architecture for learning semantic segmentation. The main idea is to train the DCNN to produce internal representations that respect object boundaries. That is, for any two pixels on the same object, the DCNN is trained to produce nearly identical internal representations; conversely, the DCNN is trained to produce dissimilar representations for pixels belonging to different objects. This strategy is complementary to many others pursued in semantic segmentation, making its integration with existing systems straightforward. Experimental results show that when this approach is combined with a pre-trained state-of-the-art segmentation system, per-pixel classification accuracy improves, and the resulting segmentations are qualitatively sharper. When combined with a dense conditional random field, this approach exceeds the prior state-of-the-art on the PASCAL VOC 2012 segmentation task. Further experiments show that the internal representations learned by the network yield state-of-the-art features for patch-based stereo correspondence and motion tracking.
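The core training signal described above — pulling same-object pixel embeddings together while pushing different-object embeddings apart — can be sketched as a pairwise hinge loss. This is a generic illustration, not the paper's exact formulation: the function name, the fixed `margin`, and the exhaustive O(N²) pair loop are simplifications introduced here for clarity.

```python
import numpy as np

def pairwise_embedding_loss(embeddings, labels, margin=1.0):
    """Illustrative pairwise loss over per-pixel embeddings.

    Same-label pairs are penalized by their distance (pulled together);
    different-label pairs are penalized only if they fall within
    `margin` of each other (pushed apart).

    embeddings: (N, D) array of per-pixel feature vectors
    labels:     (N,) array of object/segment ids
    """
    total, count = 0.0, 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            if labels[i] == labels[j]:
                total += d                       # same object: any gap is penalized
            else:
                total += max(0.0, margin - d)    # different objects: enforce the margin
            count += 1
    return total / count
```

With embeddings that already respect object boundaries (identical within an object, well separated across objects), the loss is zero; embeddings that blur a boundary incur a positive penalty.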
