
Joint Optimization of Neural Autoregressors via Scoring Rules

Jonas Landsgesell
Main: 13 pages
2 figures
Bibliography: 2 pages
1 table
Abstract

Non-parametric distributional regression has achieved significant milestones in recent years. In particular, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, a challenge remains in extending these grid-based approaches to a truly multivariate setting. In a naive non-parametric discretization with N bins per dimension, the size of an explicit joint grid scales exponentially in the number of target dimensions (N^d cells for d dimensions), and the parameter count of the neural network rises sharply. This scaling is particularly detrimental in low-data regimes, as the final projection layer would require many parameters, leading to severe overfitting and intractability.
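
To make the scaling concrete, here is a minimal sketch (not from the paper; the bin count, dimensionality, and hidden width below are assumed purely for illustration) that counts the cells of a naive joint grid and the weights of a dense final projection onto it:

```python
# Illustration of how a naive joint discretization scales.
# All concrete numbers (bins per dimension, target dimensionality,
# hidden width) are assumed for this example, not taken from the paper.

def joint_grid_size(bins_per_dim: int, num_dims: int) -> int:
    """Number of cells in an explicit joint grid: N^d."""
    return bins_per_dim ** num_dims

def final_layer_params(hidden_width: int, bins_per_dim: int, num_dims: int) -> int:
    """Weights in a dense projection from the hidden state to one logit per grid cell."""
    return hidden_width * joint_grid_size(bins_per_dim, num_dims)

if __name__ == "__main__":
    N, h = 100, 512  # assumed: 100 bins per dimension, hidden width 512
    for d in (1, 2, 3, 4):
        cells = joint_grid_size(N, d)
        params = final_layer_params(h, N, d)
        print(f"d={d}: {cells:,} grid cells, {params:,} final-layer weights")
```

Even at d = 3 this toy calculation already yields a million grid cells and hundreds of millions of final-layer weights, which is the overfitting and intractability issue the abstract describes.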
