arXiv:1907.05964

Efficient average-case population recovery in the presence of insertions and deletions

12 July 2019
F. Ban
Xi Chen
Rocco A. Servedio
S. Sinha
Abstract

Several recent works have considered the \emph{trace reconstruction problem}, in which an unknown source string $x \in \{0,1\}^n$ is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a \emph{trace} of $x$. The goal is to reconstruct the original string~$x$ from independent traces of $x$. While the best algorithms known for worst-case strings use $\exp(O(n^{1/3}))$ traces \cite{DOS17,NazarovPeres17}, highly efficient algorithms are known \cite{PZ17,HPP18} for the \emph{average-case} version, in which $x$ is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call \emph{average-case population recovery in the presence of insertions and deletions}. In this problem, there is an unknown distribution $\mathcal{D}$ over $s$ unknown source strings $x^1,\dots,x^s \in \{0,1\}^n$, and each sample is independently generated by drawing some $x^i$ from $\mathcal{D}$ and returning an independent trace of $x^i$. Building on \cite{PZ17} and \cite{HPP18}, we give an efficient algorithm for this problem. For any support size $s \leq \exp(\Theta(n^{1/3}))$, for a $1-o(1)$ fraction of all $s$-element support sets $\{x^1,\dots,x^s\} \subset \{0,1\}^n$, for every distribution $\mathcal{D}$ supported on $\{x^1,\dots,x^s\}$, our algorithm efficiently recovers $\mathcal{D}$ up to total variation distance $\epsilon$ with high probability, given access to independent traces of independent draws from $\mathcal{D}$. The algorithm runs in time $\mathrm{poly}(n,s,1/\epsilon)$ and its sample complexity is $\mathrm{poly}(s,1/\epsilon,\exp(\log^{1/3} n))$.
This polynomial dependence on the support size $s$ is in sharp contrast with the \emph{worst-case} version (when $x^1,\dots,x^s$ may be any strings in $\{0,1\}^n$), in which the sample complexity of the most efficient known algorithm \cite{BCFSS19} is doubly exponential in $s$.
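
To make the sampling model concrete, the following is a minimal Python sketch of how a single sample could be generated and how recovery error is measured. It is not the paper's algorithm. The channel details are assumptions for illustration: before each source bit, a geometric number of uniform random bits is inserted with parameter `insert_p`, and each source bit is then deleted independently with probability `delete_p`. The function names `trace`, `draw_sample`, and `total_variation` are hypothetical.

```python
import random


def trace(x, delete_p, insert_p, rng):
    """Return one trace of the bit list x under an assumed insertion-deletion channel."""
    out = []
    for bit in x:
        # Assumed insertion rule: keep inserting uniform random bits
        # while a coin with bias insert_p comes up heads (insert_p < 1).
        while rng.random() < insert_p:
            out.append(rng.randrange(2))
        # Assumed deletion rule: drop the source bit with probability delete_p.
        if rng.random() >= delete_p:
            out.append(bit)
    return out


def draw_sample(support, weights, delete_p, insert_p, rng):
    """Draw a source string x^i from D, then return an independent trace of it."""
    x = rng.choices(support, weights=weights, k=1)[0]
    return trace(x, delete_p, insert_p, rng)


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    n, s = 16, 3
    # Average-case setting: the support strings are uniformly random.
    support = [[rng.randrange(2) for _ in range(n)] for _ in range(s)]
    weights = [0.5, 0.3, 0.2]  # the unknown distribution D (illustrative values)
    for _ in range(3):
        print(draw_sample(support, weights, 0.1, 0.1, rng))
```

A recovery algorithm sees only the output of `draw_sample`, not which source string produced each trace, and must output weights whose `total_variation` from the true weights is at most $\epsilon$.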
