The Many-to-Many Mapping Between Concordance Correlation Coefficient and Mean Square Error

14 February 2019

Björn Schuller

Abstract

While the mean square error (MSE) remains one of the most popular loss functions, the concordance correlation coefficient (CCC), introduced by Lin in 1989, is one of the most widely used reproducibility indices and performance measures. Surprisingly, no formal relationship between these two popular utility functions has yet been established, despite their ubiquitous and ever-growing simultaneous use in correlation research, e.g. interrater agreement, multivariate prediction and assay validation. Minimisation of the $L_p$ norm of the errors, or of its positive powers (e.g. MSE), is commonly used with the aim of maximising CCC; in this paper we establish the ineffectiveness of this popular strategy and give the concrete underlying reasons. To this end, for the first time, we derive and present the formulation of the many-to-many mapping that exists between the MSE and the CCC. As a consequence, we propose the loss function $\left|\frac{MSE(x,y)}{cov(x,y)}\right|$ as an effective alternative. We also establish the conditions for CCC optimisation given a fixed MSE and then, as a logical next step, given a fixed set of error coefficients. We present a few interesting, albeit only apparent, mathematical paradoxes that we discovered through this CCC optimisation endeavour. The newly discovered mapping not only reveals the counter-intuitive fact that $MSE_1 < MSE_2$ does not necessarily translate to $CCC_1 > CCC_2$, but also provides the precise range of possible CCC values for a given MSE. The study thereby aims to encourage the growing use of CCC-inspired loss functions such as $\left|\frac{MSE(x,y)}{cov(x,y)}\right|$ in place of the traditional $L_p$ error loss functions for multivariate regression in general.
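The claim that a lower MSE need not give a higher CCC can be checked numerically. The sketch below is illustrative only: the function names and data are hypothetical and are not taken from the paper. It assumes population (1/N) moments, under which $MSE = \sigma_x^2 + \sigma_y^2 - 2\,cov(x,y) + (\mu_x - \mu_y)^2$, so that Lin's definition can be rewritten as $CCC = \frac{2\,cov(x,y)}{MSE + 2\,cov(x,y)}$. This is a standard algebraic consequence of the definition; the paper's own formulation of the mapping may be stated differently.

```python
# Minimal pure-Python sketch (hypothetical helper names, illustrative data).
# All moments are population moments, i.e. normalised by N, not N - 1.

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return mean([(a - mx) * (b - my) for a, b in zip(x, y)])

def mse(x, y):
    return mean([(a - b) ** 2 for a, b in zip(x, y)])

def ccc(x, y):
    # Lin's concordance correlation coefficient.
    return 2 * cov(x, y) / (cov(x, x) + cov(y, y) + (mean(x) - mean(y)) ** 2)

def ccc_from_mse(x, y):
    # Same quantity expressed through MSE and covariance.
    c = cov(x, y)
    return 2 * c / (mse(x, y) + 2 * c)

def mse_over_cov_loss(x, y):
    # The |MSE / cov| loss named in the abstract (undefined when cov == 0).
    return abs(mse(x, y) / cov(x, y))

gold = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_1 = [2.8, 2.9, 3.0, 3.1, 3.2]   # nearly flat: small errors, weak covariance
pred_2 = [-2.0, 0.5, 3.0, 5.5, 8.0]  # over-scaled: larger errors, strong covariance

for name, pred in (("pred_1", pred_1), ("pred_2", pred_2)):
    print(name, mse(gold, pred), ccc(gold, pred), mse_over_cov_loss(gold, pred))
```

In this example `pred_1` has the lower MSE but also the lower CCC, while the $\left|\frac{MSE}{cov}\right|$ loss ranks `pred_2` as the better prediction, in agreement with CCC.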

