On the Usage of Gaussian Process for Efficient Data Valuation

4 June 2025
Clément Bénesse
Patrick Mesana
Athénaïs Gautier
Sébastien Gambs
Main: 14 pages
7 figures
Bibliography: 1 page
1 table
Appendix: 5 pages
Abstract

In machine learning, knowing the impact of a given datum on model training is a fundamental task referred to as Data Valuation. Building on previous works from the literature, we have designed a novel canonical decomposition allowing practitioners to analyze any data valuation method as the combination of two parts: a utility function that captures characteristics from a given model and an aggregation procedure that merges such information. We also propose to use Gaussian Processes as a means to easily access the utility function on "sub-models", which are models trained on a subset of the training set. The strength of our approach stems from both its theoretical grounding in Bayesian theory and its practical reach, by enabling fast estimation of valuations thanks to efficient update formulae.
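
The split into a utility function and an aggregation procedure, and the idea of reaching sub-models through cheap updates, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the utility is the negative validation mean squared error of a Gaussian Process posterior mean with an RBF kernel, the aggregation is plain leave-one-out, and the function names are hypothetical. The update used here is the standard block-inverse (Schur complement) identity for deleting one row and column of a kernel matrix; the paper's own update formulae may differ.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel between two 1-D arrays of inputs.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def downdate_inverse(K_inv, i):
    # Inverse of K with row/column i removed, computed from K_inv in O(n^2)
    # instead of re-inverting the smaller matrix in O(n^3).
    keep = np.arange(K_inv.shape[0]) != i
    A = K_inv[np.ix_(keep, keep)]
    b = K_inv[keep, i]
    return A - np.outer(b, b) / K_inv[i, i]

def utility(K_inv, x_train, y_train, x_val, y_val):
    # Illustrative utility: negative validation MSE of the GP posterior mean.
    pred = rbf_kernel(x_val, x_train) @ K_inv @ y_train
    return -np.mean((pred - y_val) ** 2)

def leave_one_out_values(x_train, y_train, x_val, y_val, noise=1e-2):
    # Illustrative aggregation: value of point i is the drop in utility
    # when the sub-model is trained without it.
    n = len(x_train)
    K_inv = np.linalg.inv(rbf_kernel(x_train, x_train) + noise * np.eye(n))
    full = utility(K_inv, x_train, y_train, x_val, y_val)
    values = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K_inv_sub = downdate_inverse(K_inv, i)  # sub-model without point i
        sub = utility(K_inv_sub, x_train[keep], y_train[keep], x_val, y_val)
        values[i] = full - sub
    return values
```

In this sketch, swapping leave-one-out for another aggregation (for example a Shapley-style average over subsets) would leave `utility` unchanged, which is the kind of separation the decomposition describes.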

@article{bénesse2025_2506.04026,
  title={On the Usage of Gaussian Process for Efficient Data Valuation},
  author={Clément Bénesse and Patrick Mesana and Athénaïs Gautier and Sébastien Gambs},
  journal={arXiv preprint arXiv:2506.04026},
  year={2025}
}