Modular network for high accuracy object detection

We present a novel modular object detection convolutional neural network that significantly improves the accuracy of object detection. The network consists of two stages in a hierarchical structure. The first stage is a network that detects general classes. The second stage consists of separate networks to refine the classification and localization of each of the general classes objects. Compared to a state of the art object detection networks the classification error in the modular network is improved by approximately 3-5 times, from 12% to 2.5 %-4.5%. This network is easy to implement and has a 0.94 mAP. The network architecture can be a platform to improve the accuracy of widespread state of the art object detection networks and other kinds of deep learning networks. We show that a deep learning network initialized by transfer learning becomes more accurate as the number of classes it later trained to detect becomes smaller.
View on arXiv