Humans, Machine Learning, and Language Models in Union: A Cognitive Study on Table Unionability

15 June 2025

Main: 7 Pages

3 Figures

9 Tables

Abstract

Data discovery, and table unionability in particular, have become key tasks in modern Data Science. However, the human perspective on these tasks remains under-explored. This research therefore investigates human behavior in determining table unionability within data discovery. We designed an experimental survey and conducted a comprehensive analysis to assess human decision-making for table unionability. We use the observations from this analysis to develop a machine learning framework that boosts the (raw) performance of humans. Furthermore, we perform a preliminary study comparing LLM performance with that of humans, which indicates that it is typically better to consider a combination of both. We believe that this work lays the foundations for developing future Human-in-the-Loop systems for efficient data discovery.

@article{marimuthu2025_2506.12990,
  title={Humans, Machine Learning, and Language Models in Union: A Cognitive Study on Table Unionability},
  author={Sreeram Marimuthu and Nina Klimenkova and Roee Shraga},
  journal={arXiv preprint arXiv:2506.12990},
  year={2025}
}
