Near-Optimal Procedures for Model Discrimination with Non-Disclosure Properties

Abstract

Let $\theta_0,\theta_1 \in \mathbb{R}^d$ be the population risk minimizers associated to some loss $\ell:\mathbb{R}^d\times \mathcal{Z}\to\mathbb{R}$ and two distributions $\mathbb{P}_0,\mathbb{P}_1$ on $\mathcal{Z}$. The models $\theta_0,\theta_1$ are unknown, and $\mathbb{P}_0,\mathbb{P}_1$ can be accessed by drawing i.i.d. samples from them. Our work is motivated by the following model discrimination question: "What sizes of the samples from $\mathbb{P}_0$ and $\mathbb{P}_1$ allow one to distinguish between the two hypotheses $\theta^*=\theta_0$ and $\theta^*=\theta_1$ for a given $\theta^*\in\{\theta_0,\theta_1\}$?" As a first step towards answering it in full generality, we consider the case of a well-specified linear model with squared loss. Here we provide matching upper and lower bounds on the sample complexity, given by $\min\{1/\Delta^2,\sqrt{r}/\Delta\}$ up to a constant factor; here $\Delta$ is a measure of separation between $\mathbb{P}_0$ and $\mathbb{P}_1$, and $r$ is the rank of the design covariance matrix. We then extend this result in two directions: (i) to general parametric models in the asymptotic regime; (ii) to generalized linear models in small samples ($n\le r$) under weak moment assumptions. In both cases we derive sample complexity bounds of a similar form while allowing for model misspecification. In fact, our testing procedures only access $\theta^*$ via a certain functional of the empirical risk. In addition, the number of observations that allows us to reach statistical confidence does not allow one to "resolve" the two models, that is, to recover $\theta_0,\theta_1$ up to $O(\Delta)$ prediction accuracy. These two properties make it possible to use our framework in applied tasks where one would like to \textit{identify} a prediction model, which can be proprietary, while guaranteeing that the model cannot actually be \textit{inferred} by the identifying agent.
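
To make the two regimes of the rate $\min\{1/\Delta^2,\sqrt{r}/\Delta\}$ concrete, the following is a minimal Python sketch that evaluates the expression and reports which term is active. The function names `sample_complexity_rate` and `active_regime` are hypothetical and introduced only for illustration. The constant factor mentioned in the abstract is ignored, so the returned number is a rate, not an actual sample size.

```python
import math


def sample_complexity_rate(delta: float, r: int) -> float:
    """Evaluate min{1/delta^2, sqrt(r)/delta}, ignoring constant factors.

    delta: separation between the two distributions (assumed > 0).
    r: rank of the design covariance matrix (assumed >= 1).
    """
    if delta <= 0 or r < 1:
        raise ValueError("expected delta > 0 and r >= 1")
    return min(1.0 / delta**2, math.sqrt(r) / delta)


def active_regime(delta: float, r: int) -> str:
    """Report which term attains the minimum.

    1/delta^2 <= sqrt(r)/delta holds exactly when delta * sqrt(r) >= 1.
    """
    if delta <= 0 or r < 1:
        raise ValueError("expected delta > 0 and r >= 1")
    return "1/delta^2" if delta * math.sqrt(r) >= 1 else "sqrt(r)/delta"


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for delta, r in [(0.1, 4), (0.5, 16)]:
        rate = sample_complexity_rate(delta, r)
        print(f"delta={delta}, r={r}: rate={rate:.1f} ({active_regime(delta, r)})")
```

The crossover between the two terms occurs at $\Delta = 1/\sqrt{r}$: for smaller separations the $\sqrt{r}/\Delta$ term is the smaller one, and for larger separations the $1/\Delta^2$ term is.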
