Fast Distribution To Real Regression

Abstract

We study the problem of distribution to real-value regression, where one aims to regress a mapping $f$ that takes in a distribution input covariate $P \in \mathcal{I}$ (for a non-parametric family of distributions $\mathcal{I}$) and outputs a real-valued response $Y = f(P) + \epsilon$. This setting was recently studied, and a "Kernel-Kernel" estimator was introduced and shown to have a polynomial rate of convergence. However, evaluating a new prediction with the Kernel-Kernel estimator scales as $\Omega(N)$, where $N$ is the number of instances. This creates a difficult situation: a large amount of data may be necessary for a low estimation risk, but the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which looks to alleviate this big data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances $N$ when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings $f \in \mathcal{F}$.
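To make the prediction-cost contrast concrete, the following is a minimal sketch on toy data. It is not the estimator from the paper. Everything in it is an assumption made for illustration: each distribution is summarized by empirical cosine-basis coefficients of its sample, a Nadaraya-Watson smoother over those coefficients stands in for a predictor whose per-query cost grows with $N$, and ridge regression on random Fourier features stands in for a predictor whose per-query cost depends only on a fixed feature count `D`. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_coeffs(sample, n_basis=5):
    """Empirical projection of a sample on [0, 1] onto a cosine basis (assumed summary)."""
    k = np.arange(1, n_basis + 1)
    # phi_k(x) = sqrt(2) * cos(pi * k * x); each coefficient is the sample mean of phi_k.
    return np.sqrt(2.0) * np.cos(np.pi * np.outer(sample, k)).mean(axis=0)

def make_rff(dim_in, n_features, bandwidth, rng):
    """Random Fourier features approximating a Gaussian kernel (hypothetical helper)."""
    W = rng.normal(scale=1.0 / bandwidth, size=(dim_in, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return lambda A: np.sqrt(2.0 / n_features) * np.cos(A @ W + b)

# Toy training set: P_i = Beta(a_i, 2), seen only through 300 draws each;
# the response is the mean of P_i plus small noise.
N = 500
a = rng.uniform(1.0, 5.0, size=N)
A = np.stack([cosine_coeffs(rng.beta(ai, 2.0, size=300)) for ai in a])  # (N, n_basis)
y = a / (a + 2.0) + 0.01 * rng.normal(size=N)

# Fixed-size model: ridge regression on D random features.
D = 100
features = make_rff(A.shape[1], D, bandwidth=0.5, rng=rng)
Z = features(A)                                                  # (N, D)
theta = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(D), Z.T @ y)     # (D,)

def predict_fixed(sample):
    """Per-query cost depends on the sample size and D, not on N."""
    z = features(cosine_coeffs(sample)[None, :])                 # (1, D)
    return float(z[0] @ theta)

def predict_kernel_smoother(sample, bandwidth=0.5):
    """Per-query cost grows linearly with N: every training point is revisited."""
    c = cosine_coeffs(sample)
    w = np.exp(-np.sum((A - c) ** 2, axis=1) / (2.0 * bandwidth ** 2))
    return float(w @ y / w.sum())

new_sample = rng.beta(3.0, 2.0, size=300)   # true mean is 3 / (3 + 2) = 0.6
pred_fixed = predict_fixed(new_sample)
pred_smooth = predict_kernel_smoother(new_sample)
print(pred_fixed, pred_smooth)
```

After training, `predict_fixed` only touches the `D`-dimensional vector `theta`, so the training set can be discarded, whereas `predict_kernel_smoother` must keep and scan all `N` training summaries for every query.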
