Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.11978
Cited By
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
17 June 2024
Kenneth Li
Yiming Wang
Fernanda Viégas
Martin Wattenberg
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner"
14 / 14 papers shown
Title
Convert Language Model into a Value-based Strategic Planner
Xiaoyu Wang
Yue Zhao
Qingqing Gu
Zhonglin Jiang
X. Chen
Yong Chen
Luo Ji
LLMAG
20
0
0
11 May 2025
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents
Minzheng Wang
Y. Li
H. Wang
Xinghua Zhang
Nan Xu
Bingli Wu
Fei Huang
Haiyang Yu
Wenji Mao
LLMAG
LRM
38
1
0
04 May 2025
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Yifei Zhou
Andrea Zanette
Jiayi Pan
Sergey Levine
Aviral Kumar
65
47
0
29 Feb 2024
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
232
1,734
0
07 Apr 2023
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
Atticus Geiger
Zhengxuan Wu
Christopher Potts
Thomas F. Icard
Noah D. Goodman
CML
75
98
0
05 Mar 2023
CORL: Research-oriented Deep Offline Reinforcement Learning Library
Denis Tarasov
Alexander Nikulin
Dmitry Akimov
Vladislav Kurenkov
Sergey Kolesnikov
OffRL
48
78
0
13 Oct 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
225
443
0
23 Aug 2022
Offline RL for Natural Language Generation with Implicit Language Q Learning
Charles Burton Snell
Ilya Kostrikov
Yi Su
Mengjiao Yang
Sergey Levine
OffRL
125
101
0
05 Jun 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,915
0
04 Mar 2022
Control Prefixes for Parameter-Efficient Text Generation
Jordan Clive
Kris Cao
Marek Rei
42
32
0
15 Oct 2021
All Bark and No Bite: Rogue Dimensions in Transformer Language Models Obscure Representational Quality
William Timkey
Marten van Schijndel
213
110
0
09 Sep 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
334
1,951
0
04 May 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,587
0
18 Sep 2019
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
214
1,327
0
05 Jun 2016
1