
Joint Optimization of Neural Autoregressors via Scoring Rules

Jonas Landsgesell
Main: 13 pages
2 figures
Bibliography: 2 pages
1 table
Abstract

Non-parametric distributional regression has achieved significant milestones in recent years. In particular, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, a challenge remains in extending these grid-based approaches to a truly multivariate setting. In a naive non-parametric discretization with N bins per dimension, the size of an explicit joint grid scales exponentially in the number of target dimensions (N^d cells for d dimensions), and the parameter count of the neural network rises sharply. This scaling is particularly detrimental in low-data regimes, as the final projection layer would require many parameters, leading to severe overfitting and intractability.
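
To make the scaling concrete, here is a minimal sketch (not from the paper; the bin count, dimensionality, and hidden width below are assumed purely for illustration) that counts the cells of a naive joint grid and the weights of a dense final projection onto it:

```python
# Illustration of how a naive joint discretization scales.
# All concrete numbers (bins per dimension, target dimensionality,
# hidden width) are assumed for this example, not taken from the paper.

def joint_grid_size(bins_per_dim: int, num_dims: int) -> int:
    """Number of cells in an explicit joint grid: N^d."""
    return bins_per_dim ** num_dims

def final_layer_params(hidden_width: int, bins_per_dim: int, num_dims: int) -> int:
    """Weights in a dense projection from the hidden state to one logit per grid cell."""
    return hidden_width * joint_grid_size(bins_per_dim, num_dims)

if __name__ == "__main__":
    N, h = 100, 512  # assumed: 100 bins per dimension, hidden width 512
    for d in (1, 2, 3, 4):
        cells = joint_grid_size(N, d)
        params = final_layer_params(h, N, d)
        print(f"d={d}: {cells:,} grid cells, {params:,} final-layer weights")
```

Even at d = 3 this toy calculation already yields a million grid cells and hundreds of millions of final-layer weights, which is the overfitting and intractability issue the abstract describes.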
