Joint Optimization of Neural Autoregressors via Scoring Rules
Jonas Landsgesell
Main: 13 pages, 2 figures, 1 table; bibliography: 2 pages
Abstract
Non-parametric distributional regression has achieved significant milestones in recent years. Among these, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, a challenge remains in extending these grid-based approaches to a truly multivariate setting. In a naive non-parametric discretization with a fixed number of bins per dimension, the size of an explicit joint grid scales exponentially with the number of output dimensions, and the parameter count of the neural network rises sharply. This scaling is particularly detrimental in low-data regimes, as the final projection layer would require many parameters, leading to severe overfitting and intractability.
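The exponential growth of the joint grid described above can be made concrete with a small sketch. The bin count and hidden width below are illustrative assumptions, not values from the paper; the helper name is hypothetical.

```python
# Sketch: weight count of a final dense projection layer that emits one
# logit per joint grid cell. With B bins per dimension and d output
# dimensions, the layer needs hidden_width * B**d weights.
def joint_grid_params(bins_per_dim: int, n_dims: int, hidden_width: int) -> int:
    """Weights of a dense layer mapping hidden_width features to one
    logit per joint bin (bias terms omitted for simplicity)."""
    return hidden_width * bins_per_dim ** n_dims

# Assumed example sizes: 100 bins per dimension, hidden width 512.
# The layer size grows exponentially in the number of dimensions:
for d in (1, 2, 3):
    print(d, joint_grid_params(bins_per_dim=100, n_dims=d, hidden_width=512))
```

Already at three dimensions the projection layer alone would exceed half a billion weights, which illustrates why an explicit joint grid becomes intractable and prone to overfitting in low-data regimes.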
