
Online Learning of Quantum States

Abstract

Suppose we have many copies of an unknown $n$-qubit state $\rho$. We measure some copies of $\rho$ using a known two-outcome measurement $E_{1}$, then other copies using a measurement $E_{2}$, and so on. At each stage $t$, we generate a current hypothesis $\sigma_{t}$ about the state $\rho$, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that $|\operatorname{Tr}(E_{i} \sigma_{t}) - \operatorname{Tr}(E_{i}\rho)|$, the error in our prediction for the next measurement, is at least $\varepsilon$ at most $\operatorname{O}\!\left(n / \varepsilon^2\right)$ times. Even in the "non-realizable" setting, where there could be arbitrary noise in the measurement outcomes, we show how to output hypothesis states that do significantly worse than the best possible states at most $\operatorname{O}\!\left(\sqrt{Tn}\right)$ times on the first $T$ measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states, to the online and regret-minimization settings. We give three different ways to prove our results (using convex optimization, quantum postselection, and sequential fat-shattering dimension), which have different advantages in terms of parameters and portability.
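To make the online protocol concrete, here is a minimal Python sketch of one possible learner in the spirit of the convex-optimization approach: a matrix multiplicative-weights update, i.e. regularized follow-the-leader with a von Neumann entropy regularizer. The abstract does not spell out an algorithm, so everything here is an illustrative assumption: the function names `hermitian_exp_normalized` and `online_learn`, the step size `eta`, the absolute-value loss, and the simplification that the learner is told the exact value $\operatorname{Tr}(E_t \rho)$ after each round rather than noisy measurement outcomes.

```python
import numpy as np


def hermitian_exp_normalized(H):
    """Return exp(H) / Tr(exp(H)) for a Hermitian matrix H."""
    vals, vecs = np.linalg.eigh(H)
    # Shift eigenvalues before exponentiating; the shift cancels on normalization.
    weights = np.exp(vals - vals.max())
    state = (vecs * weights) @ vecs.conj().T
    return state / np.trace(state).real


def online_learn(rho, measurements, eta=0.5):
    """Hypothetical online learner (assumed setup, not taken from the source).

    rho          -- true density matrix, used only to produce feedback
    measurements -- sequence of Hermitian matrices E_t with 0 <= E_t <= I
    eta          -- step size (assumed value, not tuned)
    Returns the final hypothesis and the per-round prediction errors.
    """
    dim = rho.shape[0]
    grad_sum = np.zeros((dim, dim), dtype=complex)
    sigma = np.eye(dim, dtype=complex) / dim  # start from the maximally mixed state
    errors = []
    for E in measurements:
        prediction = np.trace(E @ sigma).real
        feedback = np.trace(E @ rho).real  # assumption: exact feedback
        errors.append(abs(prediction - feedback))
        # Subgradient of the loss |Tr(E sigma) - feedback| with respect to sigma.
        grad_sum += np.sign(prediction - feedback) * E
        # Entropy-regularized follow-the-leader has this closed form.
        sigma = hermitian_exp_normalized(-eta * grad_sum)
    return sigma, errors


if __name__ == "__main__":
    rho = np.array([[1, 0], [0, 0]], dtype=complex)  # the state |0><0|
    E = np.array([[1, 0], [0, 0]], dtype=complex)    # projector onto |0>
    sigma, errors = online_learn(rho, [E] * 20)
    print(errors[0], errors[-1])
```

Each hypothesis produced this way is automatically a valid density matrix (Hermitian, positive semidefinite, unit trace), because it is a normalized matrix exponential. The sketch is not meant to reproduce the stated bounds; it only shows the shape of the predict, observe, update loop.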
