LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

24 March 2026

Royden Wagner

Omer Sahin Tas

Jaime Villa

Felix Hauser

Yinzhe Shen

Marlon Steiner

Dominik Strutz

Carlos Fernandez

Christian Kinzig

Guillermo S. Guitierrez-Cabello

Hendrik Königshof

Fabian Immel

Richard Schwarzkopf

Nils Alexander Rack

Kevin Rösch

Kaiwen Wang

Jan-Hendrik Pauls

Martin Lauer

Igor Gilitschenski

Holger Caesar

Christoph Stiller

Main: 11 pages, Bibliography: 4 pages, Appendix: 6 pages, 7 figures, 10 tables

Abstract

In real-world domains such as self-driving, generalization to rare scenarios remains a fundamental challenge. To address this, we introduce a new dataset designed for end-to-end driving that focuses on long-tail driving events. We provide multi-view video data, trajectories, high-level instructions, and detailed reasoning traces, facilitating in-context learning and few-shot generalization. The resulting benchmark for multimodal models, such as VLMs and VLAs, goes beyond safety and comfort metrics by evaluating instruction following and semantic coherence between model outputs. The multilingual reasoning traces in English, Spanish, and Chinese are from domain experts with diverse cultural backgrounds. Thus, our dataset is a unique resource for studying how different forms of reasoning affect driving competence. Our dataset is available at: this https URL
