ResearchTrend.AI
arXiv:1812.08255
Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth

19 December 2018
Jacob Whitehill
Anand Ramakrishnan
Abstract

Automatic machine learning-based detectors of various psychological and social phenomena (e.g., emotion, stress, engagement) have great potential to advance basic science. However, when a detector d is trained to approximate an existing measurement tool (e.g., a questionnaire, observation protocol), then care must be taken when interpreting measurements collected using d, since they are one step further removed from the underlying construct. We examine how the accuracy of d, as quantified by the correlation q of d's outputs with the ground-truth construct U, impacts the estimated correlation between U (e.g., stress) and some other phenomenon V (e.g., academic performance). In particular: (1) We show that if the true correlation between U and V is r, then the expected sample correlation, over all vectors \mathcal{T}^n whose correlation with U is q, is qr. (2) We derive a formula for the probability that the sample correlation (over n subjects) using d is positive given that the true correlation is negative (and vice-versa); this probability can be substantial (around 20-30%) for values of n and q that have been used in recent affective computing studies. We also show that this probability decreases monotonically in n and in q. (3) With the goal of reducing the variance of correlations estimated by an automatic detector, we show that training multiple neural networks d^{(1)}, \ldots, d^{(m)} using different architectures and hyperparameters for the same detection task provides only limited "coverage" of \mathcal{T}^n.
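The attenuation result in point (1) can be checked with a quick Monte Carlo sketch. This is illustrative only; the jointly Gaussian data-generating process and all variable names here are assumptions for the demo, not the paper's construction. A detector whose outputs correlate q with the ground-truth construct U produces sample correlations with V that average roughly qr rather than r:

```python
import numpy as np

rng = np.random.default_rng(0)
r, q = 0.5, 0.6        # true corr(U, V); detector accuracy corr(D, U)
n, trials = 50, 20000  # subjects per simulated study; number of studies

est = np.empty(trials)
for t in range(trials):
    U = rng.standard_normal(n)                              # ground-truth construct
    V = r * U + np.sqrt(1 - r**2) * rng.standard_normal(n)  # corr(U, V) = r
    D = q * U + np.sqrt(1 - q**2) * rng.standard_normal(n)  # detector output, corr(D, U) = q
    est[t] = np.corrcoef(D, V)[0, 1]                        # sample correlation via the detector

print(np.mean(est))      # averages near q * r = 0.30, not r = 0.50
print(np.mean(est < 0))  # fraction of studies whose estimate has the wrong sign
```

With larger q or n the sign-flip fraction shrinks, consistent with the monotonicity noted in the abstract.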
