arXiv:2306.05315

Large-scale adaptive multiple testing for sequential data controlling false discovery and nondiscovery rates

8 June 2023
Rahul Roy
Shyamal K. De
S. K. Bhandari
Abstract

In modern scientific experiments, we frequently encounter high-dimensional data, and in some experiments such data arrive sequentially rather than being available all at once. We develop multiple testing procedures with simultaneous control of the false discovery and nondiscovery rates when $m$-variate data vectors $\mathbf{X}_1, \mathbf{X}_2, \dots$ are observed sequentially or in groups and each coordinate of these vectors leads to a hypothesis test. Existing multiple testing methods for sequential data use fixed stopping boundaries that do not depend on the sample size and hence are quite conservative when the number of hypotheses $m$ is large. We propose sequential tests based on adaptive stopping boundaries that ensure shrinkage of the continue-sampling region as the sample size increases. Under minimal assumptions on the data sequence, we first develop a test based on an oracle test statistic such that both the false discovery rate (FDR) and the false nondiscovery rate (FNR) are nearly equal to some prefixed levels with strong control. Under a two-group mixture model assumption, we propose a data-driven stopping and decision rule based on the local false discovery rate statistic that mimics the oracle rule and guarantees simultaneous control of FDR and FNR asymptotically as $m$ tends to infinity. Both the oracle and the data-driven stopping times are shown to be finite (i.e., proper) with probability 1 for all finite $m$ and to converge to a finite constant as $m$ grows to infinity. Further, we compare the data-driven test with the existing gap rule proposed in He and Bartroff (2021) and show that the ratio of the expected sample sizes of our method and the gap rule tends to zero as $m$ goes to infinity. Extensive analysis of simulated datasets as well as some real datasets illustrates the superiority of the proposed tests over some existing methods.
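
To make the local false discovery rate statistic mentioned above concrete, the following is a minimal sketch of its textbook form under a two-group mixture model, not the paper's stopping or decision rule. It assumes, purely for illustration, that each coordinate's observations are i.i.d. N(0, 1) under the null and N(mu, 1) under the alternative, with a known null proportion pi0; the function name `local_fdr` and the parameters `pi0` and `mu` are hypothetical and do not come from the paper.

```python
import math


def local_fdr(xs, pi0, mu):
    """Posterior probability that one coordinate is null, given its data xs.

    Illustrative assumptions (not from the paper): observations are i.i.d.
    N(0, 1) under the null and N(mu, 1) under the alternative, and the null
    proportion pi0 lies strictly between 0 and 1.
    """
    n = len(xs)
    s = sum(xs)
    # Log-likelihood ratio (alternative vs. null) for unit-variance Gaussians.
    log_lr = mu * s - n * mu * mu / 2.0
    # Posterior log-odds of the alternative: log(pi1 / pi0) + log_lr.
    z = math.log1p(-pi0) - math.log(pi0) + log_lr
    # lfdr = 1 / (1 + exp(z)), evaluated without overflow for large |z|.
    if z > 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


if __name__ == "__main__":
    # The statistic moves away from pi0 as observations accumulate.
    strong_signal = [3.0, 3.0, 3.0]
    for k in range(len(strong_signal) + 1):
        print(k, local_fdr(strong_signal[:k], pi0=0.9, mu=2.0))
```

With no data the statistic equals the prior null proportion, and it drifts toward 0 or 1 as evidence accumulates, which is the kind of behaviour a sequential rule with shrinking continuation region can exploit.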
