
On a phase transition in general order spline regression

Abstract

In the Gaussian sequence model $Y = \theta_0 + \varepsilon$ in $\mathbb{R}^n$, we study the fundamental limit of approximating the signal $\theta_0$ by a class $\Theta(d,d_0,k)$ of (generalized) splines with free knots. Here $d$ is the degree of the spline, $d_0$ is the order of differentiability at each inner knot, and $k$ is the maximal number of pieces. We show that, given any integer $d\geq 0$ and $d_0\in\{-1,0,\ldots,d-1\}$, the minimax rate of estimation over $\Theta(d,d_0,k)$ exhibits the following phase transition:
\begin{equation*}
\inf_{\widetilde{\theta}}\sup_{\theta\in\Theta(d,d_0,k)}\mathbb{E}_\theta\|\widetilde{\theta} - \theta\|^2 \asymp_d
\begin{cases}
k\log\log(16n/k), & 2\leq k\leq k_0,\\
k\log(en/k), & k \geq k_0+1.
\end{cases}
\end{equation*}
The transition boundary $k_0$, which takes the form $\lfloor (d+1)/(d-d_0) \rfloor + 1$, demonstrates the critical role of the regularity parameter $d_0$ in the separation between a faster $\log\log(16n)$ and a slower $\log(en)$ rate. We further show that, once an additional '$d$-monotonicity' shape constraint is imposed (including monotonicity for $d=0$ and convexity for $d=1$), the above phase transition is eliminated and the faster $k\log\log(16n/k)$ rate can be achieved for all $k$. These results provide theoretical support for developing $\ell_0$-penalized (shape-constrained) spline regression procedures as useful alternatives to $\ell_1$- and $\ell_2$-penalized ones.
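To make the transition boundary concrete, the following minimal Python sketch evaluates $k_0 = \lfloor (d+1)/(d-d_0) \rfloor + 1$ and the corresponding rate order from the display above. The function names `transition_boundary` and `minimax_rate_order` are hypothetical and chosen only for illustration, the restriction $k \leq n$ is an assumption of the sketch, and the returned value ignores the $d$-dependent constants hidden in $\asymp_d$, so it indicates an order of magnitude rather than an exact risk.

```python
import math


def transition_boundary(d: int, d0: int) -> int:
    """Return k_0 = floor((d + 1) / (d - d0)) + 1 for degree d and regularity d0."""
    if d < 0 or not (-1 <= d0 <= d - 1):
        raise ValueError("need d >= 0 and d0 in {-1, 0, ..., d - 1}")
    # d - d0 >= 1 here, so integer division is the floor in the formula.
    return (d + 1) // (d - d0) + 1


def minimax_rate_order(n: int, k: int, d: int, d0: int) -> float:
    """Rate order from the abstract, up to constants depending on d.

    Assumption of this sketch: 2 <= k <= n.
    """
    if not (2 <= k <= n):
        raise ValueError("this sketch assumes 2 <= k <= n")
    k0 = transition_boundary(d, d0)
    if k <= k0:
        # Faster regime: k * log log(16 n / k)
        return k * math.log(math.log(16 * n / k))
    # Slower regime: k * log(e n / k)
    return k * math.log(math.e * n / k)


if __name__ == "__main__":
    for d, d0 in [(0, -1), (1, -1), (1, 0), (3, 2)]:
        print(f"d={d}, d0={d0}: k0={transition_boundary(d, d0)}")
```

For example, piecewise constant signals ($d=0$, $d_0=-1$) give $k_0 = 2$, continuous piecewise linear signals ($d=1$, $d_0=0$) give $k_0 = 3$, and cubic splines with two continuous derivatives ($d=3$, $d_0=2$) give $k_0 = 5$, so smoother splines keep the faster rate for more pieces.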
