How Trustworthy are the Existing Performance Evaluations for Basic Vision Tasks?

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Abstract

This paper examines performance evaluation criteria for basic vision tasks involving sets of objects, namely object detection, instance-level segmentation, and multi-object tracking. The rankings of algorithms under current criteria fluctuate with different choices of parameters, e.g., the Intersection over Union (IoU) threshold, making their evaluations unreliable. More importantly, there is no way to verify whether we can trust the evaluations of a criterion at all. This work advocates a notion of trustworthiness for criteria, which requires (i) robustness to parameters for reliability, (ii) contextual meaningfulness in sanity tests, and (iii) consistency with mathematical requirements such as the metric properties. We show that these requirements were overlooked by many widely used criteria. We also explore alternative criteria using metrics for sets of shapes, and assess them against these requirements to find trustworthy criteria.
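The parameter sensitivity the abstract points to can be seen even in a single matching decision. The following sketch (boxes and numbers are illustrative, not from the paper) shows how the same detection flips from a true positive to a miss when the IoU threshold moves from 0.5 to 0.75:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

gt  = (0, 0, 10, 10)   # hypothetical ground-truth box
det = (0, 0, 10, 7)    # a candidate detection; IoU with gt is 0.7

score = iou(gt, det)
tp_at_050 = score >= 0.50   # True: counted as a correct detection
tp_at_075 = score >= 0.75   # False: the same box now counts as a miss
```

With many such borderline matches, two algorithms can swap ranks purely because the evaluator picked a different threshold, which is the kind of unreliability the paper's trustworthiness requirements are meant to rule out.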
