arXiv: 1803.04307
The Everlasting Database: Statistical Validity at a Fair Price

12 March 2018
Blake E. Woodworth
Vitaly Feldman
Saharon Rosset
Nathan Srebro
Abstract

The problem of handling adaptivity in data analysis, intentional or not, permeates a variety of fields, including test-set overfitting in ML challenges and the accumulation of invalid scientific discoveries. We propose a mechanism for answering an arbitrarily long sequence of potentially adaptive statistical queries, by charging a price for each query and using the proceeds to collect additional samples. Crucially, we guarantee statistical validity without any assumptions on how the queries are generated. We also ensure with high probability that the cost for $M$ non-adaptive queries is $O(\log M)$, while the cost to a potentially adaptive user who makes $M$ queries that do not depend on any others is $O(\sqrt{M})$.
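
To give a feel for the gap between the two cost bounds quoted in the abstract, the following minimal sketch tabulates the growth rates $\log M$ and $\sqrt{M}$ for a few values of $M$. It is only an illustration of the asymptotic rates, not the paper's pricing mechanism: the function names are hypothetical, the hidden constants are set to 1, and the natural logarithm is used, all of which are assumptions made here for illustration.

```python
import math


def nonadaptive_cost_rate(m: int) -> float:
    """Growth rate log(M) of the O(log M) bound (constant assumed to be 1)."""
    return math.log(m)


def adaptive_cost_rate(m: int) -> float:
    """Growth rate sqrt(M) of the O(sqrt(M)) bound (constant assumed to be 1)."""
    return math.sqrt(m)


if __name__ == "__main__":
    # Print both rates side by side for increasing numbers of queries M.
    for m in (100, 10_000, 1_000_000):
        print(f"M={m:>9,}  log M={nonadaptive_cost_rate(m):6.2f}  "
              f"sqrt M={adaptive_cost_rate(m):8.2f}")
```

Under these assumptions the non-adaptive rate grows far more slowly than the adaptive one, which matches the abstract's point that non-adaptive queries are cheap while adaptive use costs more.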
