Kernel Two-Sample Tests for Manifold Data

7 May 2021
Xiuyuan Cheng
Yao Xie
Abstract

We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, when data densities $p$ and $q$ are supported on a $d$-dimensional sub-manifold $M$ embedded in an $m$-dimensional space and are Hölder with order $\beta$ (up to 2) on $M$, we prove a guarantee of the test power for finite sample size $n$ that exceeds a threshold depending on $d$, $\beta$, and $\Delta_2$, the squared $L^2$-divergence between $p$ and $q$ on the manifold, and with a properly chosen kernel bandwidth $\gamma$. For small density departures, we show that with large $n$ they can be detected by the kernel test when $\Delta_2$ is greater than $n^{-2\beta/(d+4\beta)}$ up to a certain constant and $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test has no curse-of-dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
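To make the setting concrete, here is a minimal Python sketch of a Gaussian-kernel two-sample test on data lying on a one-dimensional manifold (a unit circle) embedded in $m = 20$ dimensions. It is an illustration under stated assumptions, not the authors' implementation: the statistic is the standard biased (V-statistic) MMD estimate, the threshold comes from a permutation test, the bandwidth uses the abstract's scaling $\gamma \sim n^{-1/(d+4\beta)}$ with an assumed constant of 1, $d = 1$, and $\beta = 2$, and all function names (`gaussian_gram`, `mmd2`, `kernel_two_sample_test`, `sample_circle`) are hypothetical.

```python
import numpy as np


def gaussian_gram(z, gamma):
    """Gaussian kernel matrix exp(-||zi - zj||^2 / (2 gamma^2)) on the rows of z."""
    sq_norms = np.sum(z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (z @ z.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * gamma**2))


def mmd2(gram, ix, iy):
    """Biased (V-statistic) squared MMD between the samples indexed by ix and iy."""
    kxx = gram[np.ix_(ix, ix)].mean()
    kyy = gram[np.ix_(iy, iy)].mean()
    kxy = gram[np.ix_(ix, iy)].mean()
    return kxx + kyy - 2.0 * kxy


def kernel_two_sample_test(x, y, gamma, n_perm=200, seed=0):
    """Return (statistic, permutation p-value) for H0: x and y share a distribution."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    gram = gaussian_gram(pooled, gamma)  # computed once, reused for every permutation
    n, total = len(x), len(pooled)
    observed = mmd2(gram, np.arange(n), np.arange(n, total))
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)
        if mmd2(gram, perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 terms keep the p-value strictly positive and the test valid.
    return observed, (exceed + 1) / (n_perm + 1)


def sample_circle(n, kappa, m, rng):
    """Sample n points on a unit circle (d = 1) embedded in R^m.

    kappa = 0 gives the uniform density; kappa > 0 gives a von Mises density,
    i.e. a smooth departure from uniform on the same manifold.
    """
    if kappa > 0:
        theta = rng.vonmises(0.0, kappa, size=n)
    else:
        theta = rng.uniform(-np.pi, np.pi, size=n)
    points = np.zeros((n, m))
    points[:, 0] = np.cos(theta)
    points[:, 1] = np.sin(theta)
    return points


rng = np.random.default_rng(1)
n, d, beta, m = 200, 1, 2, 20
gamma = n ** (-1.0 / (d + 4 * beta))  # bandwidth scaling from the abstract, constant assumed to be 1
x = sample_circle(n, 0.0, m, rng)     # samples from p (uniform on the circle)
y = sample_circle(n, 2.0, m, rng)     # samples from q (von Mises on the circle)
stat, p_value = kernel_two_sample_test(x, y, gamma)
print(f"gamma={gamma:.3f}  MMD^2={stat:.4f}  p-value={p_value:.4f}")
```

Note that the bandwidth depends on the intrinsic dimension $d$ rather than the ambient dimension $m$, which is the practical content of the "no curse-of-dimensionality" claim; raising `m` in this sketch leaves the pairwise distances, and hence the statistic, unchanged.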
