Corrector Sampling in Language Models

Main: 10 pages, Appendix: 5 pages, Bibliography: 3 pages, 2 figures, 2 tables
Abstract
Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. This method can be integrated into existing autoregressive models, preserving their next-token-prediction quality and speed. Fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
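The abstract describes RPT only at a high level, so the following is a minimal, hypothetical sketch of what such a corrector-style sampling loop could look like. It assumes a HuggingFace-style causal LM that returns logits; the window size, revisit schedule, and the accept/replace rule are illustrative placeholders and not the paper's actual algorithm.

import torch

def rpt_style_sample(model, prompt_ids, max_new_tokens=128, window=8,
                     revisit_every=4, temperature=1.0):
    """Autoregressive sampling with periodic resampling of a trailing window
    of previously generated tokens (illustrative sketch, not the paper's RPT)."""
    ids = list(prompt_ids)
    for step in range(max_new_tokens):
        # Standard next-token sampling from the current prefix.
        logits = model(torch.tensor([ids])).logits[0, -1] / temperature
        probs = torch.softmax(logits, dim=-1)
        ids.append(torch.multinomial(probs, 1).item())

        # Every few steps, revisit the last `window` generated tokens.
        if step % revisit_every == 0 and len(ids) > len(prompt_ids) + window:
            start = len(ids) - window
            for pos in range(start, len(ids)):
                # Re-score position `pos` under the (possibly updated) prefix.
                logits = model(torch.tensor([ids[:pos]])).logits[0, -1] / temperature
                probs = torch.softmax(logits, dim=-1)
                proposal = torch.multinomial(probs, 1).item()
                # Placeholder acceptance rule: replace the existing token if it
                # has become less likely than the fresh proposal.
                if probs[ids[pos]] < probs[proposal]:
                    ids[pos] = proposal
    return ids

Because the window has fixed length and is only revisited every few steps, the extra forward passes add a bounded overhead per generated token, which is consistent with the abstract's claim that next-token-prediction speed is preserved.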
@article{gat2025_2506.06215,
  title   = {Corrector Sampling in Language Models},
  author  = {Itai Gat and Neta Shaul and Uriel Singer and Yaron Lipman},
  journal = {arXiv preprint arXiv:2506.06215},
  year    = {2025}
}