Degrees of Freedom and Model Search

Abstract

Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed by a given procedure. But, despite this fundamental role in statistics, its behavior is not completely well understood, even in some fairly basic settings. For example, it may seem intuitively obvious that the best subset selection fit with subset size k has degrees of freedom larger than k, but this has not been formally verified, nor has it been precisely studied. In large part, the current paper is motivated by this particular problem, and we derive an exact expression for the degrees of freedom of best subset selection in a restricted setting (orthogonal predictor variables). Along the way, we develop a concept that we name "search degrees of freedom"; intuitively, for adaptive regression procedures that perform variable selection, this is the part of the (total) degrees of freedom that we attribute entirely to the model selection mechanism. Finally, we establish a modest extension of Stein's formula to cover discontinuous functions, and discuss its potential role in degrees of freedom and search degrees of freedom calculations.
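The intuition that best subset selection with subset size k uses more than k degrees of freedom can be checked numerically. The sketch below is an illustration only, not the exact expression derived in the paper. It assumes the simplest orthogonal design (an identity predictor matrix, so the fit keeps the k largest-magnitude responses and zeroes the rest), Gaussian noise, and the standard covariance definition df = (1/sigma^2) * sum_i Cov(fit_i, y_i). The function names `best_subset_orthogonal` and `estimate_df` are hypothetical.

```python
import random


def best_subset_orthogonal(y, k):
    """Best subset fit of size k, ASSUMING an identity design matrix:
    keep the k entries of y with largest magnitude, zero out the rest."""
    keep = sorted(range(len(y)), key=lambda i: abs(y[i]), reverse=True)[:k]
    fit = [0.0] * len(y)
    for i in keep:
        fit[i] = y[i]
    return fit


def estimate_df(mu, k, sigma=1.0, reps=20000, seed=0):
    """Monte Carlo estimate of df = (1/sigma^2) * sum_i Cov(fit_i, y_i),
    where y_i = mu_i + Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [m + rng.gauss(0.0, sigma) for m in mu]
        fit = best_subset_orthogonal(y, k)
        # Since E[y_i - mu_i] = 0, E[fit_i * (y_i - mu_i)] = Cov(fit_i, y_i).
        total += sum(f * (yi - m) for f, yi, m in zip(fit, y, mu))
    return total / (reps * sigma ** 2)


if __name__ == "__main__":
    p, k = 10, 3  # illustrative sizes, not taken from the paper
    df_hat = estimate_df([0.0] * p, k)
    print(f"estimated df for k={k}, p={p}, null signal: {df_hat:.2f}")
    print(f"excess over k (a rough proxy for the cost of searching): {df_hat - k:.2f}")
```

With an all-zero mean vector, the estimate should land strictly between k and p. When k equals p there is no selection, and the estimate should be close to p.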
