  3. 2310.17022
140
91
v1v2v3 (latest)

Controlled Decoding from Language Models

25 October 2023
Sidharth Mudgal
Jong Lee
H. Ganapathy
Yaguang Li
Tao Wang
Yanping Huang
Zhifeng Chen
Heng-Tze Cheng
Michael Collins
Trevor Strohman
Jilin Chen
Alex Beutel
Ahmad Beirami
Abstract

We propose controlled decoding (CD), a novel off-policy reinforcement learning method to control the autoregressive generation of language models towards high-reward outcomes. CD solves an off-policy reinforcement learning problem through a value function for the reward, which we call a prefix scorer. The prefix scorer is used at inference time to steer the generation towards higher-reward outcomes. We show that the prefix scorer may be trained on (possibly) off-policy data to predict the expected reward when decoding is continued from a partially decoded response. We empirically demonstrate that CD is effective as a control mechanism on the Reddit conversations corpus. We also show that the modularity of the design of CD makes it possible to control for multiple rewards, effectively solving a multi-objective reinforcement learning problem with no additional complexity. Finally, we show that CD can be applied in a novel blockwise fashion at inference time, again without the need for any training-time changes, essentially bridging the gap between the popular best-of-K strategy and token-level reinforcement learning. This makes CD a promising approach for alignment of language models.
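
The abstract describes the mechanism only at a high level, so below is a minimal, self-contained sketch of the two decoding modes it mentions: token-level steering with a prefix scorer, and blockwise best-of-K selection. Everything here is illustrative; the toy vocabulary, the stand-in base LM, the hand-written prefix_scorer, and the additive strength-weighted scoring rule are assumptions for the sketch, not the paper's exact formulation (which trains the prefix scorer on off-policy data and may combine the scores differently).

```python
import math
import random

# Toy vocabulary and a stand-in base language model. In practice these would
# come from a real autoregressive LM; here they are hypothetical placeholders
# so the sketch runs on its own.
VOCAB = ["good", "bad", "okay", "<eos>"]

def base_logprobs(prefix):
    """Stand-in base LM: log-probabilities over VOCAB given a token prefix."""
    # A fixed, slightly non-uniform distribution, for illustration only.
    probs = [0.3, 0.3, 0.3, 0.1]
    return [math.log(p) for p in probs]

def prefix_scorer(prefix):
    """Stand-in prefix scorer V(prefix): predicted expected reward if decoding
    continues from this partial response. A trained scorer would replace this;
    here we simply reward occurrences of the token "good"."""
    return sum(1.0 for tok in prefix if tok == "good")

def cd_token_step(prefix, strength=1.0):
    """One token-level CD step: re-weight each candidate token's base
    log-probability by the prefix scorer evaluated on (prefix + token),
    then sample from the adjusted distribution."""
    logps = base_logprobs(prefix)
    adjusted = [lp + strength * prefix_scorer(prefix + [tok])
                for lp, tok in zip(logps, VOCAB)]
    # Softmax over the adjusted scores.
    m = max(adjusted)
    weights = [math.exp(a - m) for a in adjusted]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(VOCAB, weights=probs, k=1)[0]

def cd_blockwise(prefix, block_len=4, k=8):
    """Blockwise CD: sample K candidate blocks from the base model, score each
    extended prefix with the prefix scorer, and keep the best block."""
    candidates = []
    for _ in range(k):
        block = []
        for _ in range(block_len):
            logps = base_logprobs(prefix + block)
            probs = [math.exp(lp) for lp in logps]
            block.append(random.choices(VOCAB, weights=probs, k=1)[0])
        candidates.append(block)
    return max(candidates, key=lambda b: prefix_scorer(prefix + b))

if __name__ == "__main__":
    random.seed(0)
    # Token-level decoding for a few steps.
    out = []
    for _ in range(5):
        out.append(cd_token_step(out, strength=2.0))
    print("token-level CD sample:", out)
    # Blockwise decoding of a single block.
    print("blockwise CD sample:  ", cd_blockwise([], block_len=5, k=16))
```

Under this sketch, controlling for multiple rewards would amount to replacing prefix_scorer with a weighted sum of several scorers, which is consistent with the abstract's claim that the design is modular; the specific combination rule is again an assumption here rather than the paper's stated method.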
