Bayesian Model Averaging with Exponentiated Least Square Loss

The model averaging problem is to combine multiple models so that the prediction accuracy, measured by mean squared error, is not much worse than that of the best single model. It is known that when the models are misspecified, model averaging is superior to model selection. Specifically, let $n$ be the sample size; then the worst-case regret of the former decays at a rate of $O(1/n)$, while the worst-case regret of the latter decays at a rate of $O(1/\sqrt{n})$. The recently proposed $Q$-aggregation algorithm \citep{DaiRigZhang12} solves the model averaging problem with the optimal regret of $O(1/n)$ both in expectation and in deviation; however, it suffers from two limitations: (1) for a continuous dictionary, the proposed greedy algorithm for solving $Q$-aggregation is not applicable; (2) the formulation of $Q$-aggregation appears ad hoc, without clear intuition. This paper examines a different approach to model averaging: a Bayes estimator for deviation-optimal model averaging based on an exponentiated least squares loss. We establish a primal-dual relationship between this estimator and that of $Q$-aggregation, and propose new greedy procedures that satisfactorily resolve the above-mentioned limitations of $Q$-aggregation.