1606.01541
Deep Reinforcement Learning for Dialogue Generation
5 June 2016
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
Papers citing
"Deep Reinforcement Learning for Dialogue Generation"
50 / 165 papers shown
Title
On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang
R. Arora
OffRL
43
1
0
10 Jan 2025
A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation
W. Zhang
Yiming Cui
Kaiyan Zhang
Yifa Wang
Qingfu Zhu
Lingzhi Li
Ting Liu
55
8
0
28 Oct 2024
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Wenhong Zhu
Zhiwei He
Xiaofeng Wang
Pengfei Liu
Rui Wang
OSLM
47
3
0
24 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
58
3
0
06 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
56
4
0
03 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
48
4
0
24 Sep 2024
Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation
Hui Ma
Bo Zhang
Bo Xu
Jian Wang
Hongfei Lin
Xiao Sun
52
1
0
06 Aug 2024
Self-Emotion Blended Dialogue Generation in Social Simulation Agents
Qiang Zhang
Jason Naradowsky
Yusuke Miyao
20
2
0
03 Aug 2024
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai
Huayang Li
Tingchen Fu
Siheng Li
Weiwen Xu
...
Leyang Cui
Yan Wang
Lemao Liu
Taro Watanabe
Shuming Shi
KELM
30
2
0
24 Jun 2024
CET2: Modelling Topic Transitions for Coherent and Engaging Knowledge-Grounded Conversations
Lin Xu
Qixian Zhou
Jinlan Fu
See-Kiong Ng
34
0
0
04 Mar 2024
Runtime Verification of Learning Properties for Reinforcement Learning Algorithms
T. Mannucci
Julio de Oliveira Filho
OffRL
6
0
0
16 Nov 2023
Iteratively Learn Diverse Strategies with State Distance Information
Wei Fu
Weihua Du
Jingwei Li
Sunli Chen
Jingzhao Zhang
Yi Wu
45
3
0
23 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
24
5
0
09 Oct 2023
Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization
Helena Bonaldi
Giuseppe Attanasio
Debora Nozza
Marco Guerini
20
6
0
05 Sep 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
46
10
0
28 Aug 2023
Prompt-Based Length Controlled Generation with Reinforcement Learning
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
17
8
0
23 Aug 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Decision-Oriented Dialogue for Human-AI Collaboration
Jessy Lin
Nicholas Tomlin
Jacob Andreas
J. Eisner
LLMAG
20
26
0
31 May 2023
A Framework for Incentivized Collaborative Learning
Xinran Wang
Qi Le
Ahmad Faraz Khan
Jie Ding
A. Anwar
FedML
37
4
0
26 May 2023
Model-Based Simulation for Optimising Smart Reply
Benjamin Towle
Ke Zhou
30
1
0
26 May 2023
Deep RL with Hierarchical Action Exploration for Dialogue Generation
Itsugun Cho
Ryota Takahashi
Yusaku Yanase
Hiroaki Saito
19
2
0
22 Mar 2023
Selective experience replay compression using coresets for lifelong deep reinforcement learning in medical imaging
Guangyao Zheng
Samson Zhou
Vladimir Braverman
M. Jacobs
V. Parekh
OffRL
CLL
11
3
0
22 Feb 2023
IC3: Image Captioning by Committee Consensus
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
John F. Canny
26
17
0
02 Feb 2023
Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction
Xuming Hu
Shiao Meng
Chenwei Zhang
Xiangli Yang
Lijie Wen
Irwin King
Philip S. Yu
44
0
0
11 Nov 2022
Syntax-Aware On-the-Fly Code Completion
Wannita Takerngsaksiri
C. Tantithamthavorn
Yuankui Li
24
17
0
09 Nov 2022
Active Countermeasures for Email Fraud
Wentao Chen
Fuzhou Wang
Matthew Edwards
20
5
0
26 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL
AI4TS
26
27
0
24 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
27
107
0
13 Oct 2022
Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence
Chris Callison-Burch
Gaurav Singh Tomar
Lara J. Martin
Daphne Ippolito
Suma Bailis
David Reitter
16
46
0
13 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
239
0
03 Oct 2022
Prompting for a conversation: How to control a dialog model?
Josef Valvoda
Yimai Fang
David Vandyke
56
5
0
22 Sep 2022
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Waiman Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
29
58
0
07 Sep 2022
CrossDial: An Entertaining Dialogue Dataset of Chinese Crosstalk
Baizhou Huang
Shikang Du
Xiao-Yi Wan
14
0
0
03 Sep 2022
Why is constrained neural language generation particularly challenging?
Cristina Garbacea
Qiaozhu Mei
59
14
0
11 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
15
49
0
01 Jun 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
24
10
0
25 May 2022
CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Bishal Santra
Ravi Ghadia
Manish Gupta
Pawan Goyal
OffRL
15
0
0
21 May 2022
DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations
Wei Chen
Cheng Zhong
J. Peng
Zhongyu Wei
MedIm
23
18
0
08 May 2022
Knowledge Infused Decoding
Ruibo Liu
Guoqing Zheng
Shashank Gupta
Radhika Gaonkar
Chongyang Gao
Soroush Vosoughi
Milad Shokouhi
Ahmed Hassan Awadallah
KELM
25
14
0
06 Apr 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu
Helena Bonaldi
Margherita Fanton
Marco Guerini
22
43
0
04 Apr 2022
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization
Zihan Zhou
Wei Fu
Bingliang Zhang
Yi Wu
15
28
0
04 Apr 2022
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi
Yitong Li
Yulong Zeng
Jingyan Zhou
Yasheng Wang
Chuanfei Xu
Lifeng Shang
Xin Jiang
Shiqi Zhao
Qun Liu
ALM
37
18
0
31 Mar 2022
A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
Shashi Narayan
Gonçalo Simões
Yao-Min Zhao
Joshua Maynez
Dipanjan Das
Michael Collins
Mirella Lapata
24
30
0
28 Mar 2022
Long Time No See! Open-Domain Conversation with Long-Term Persona Memory
Xinchao Xu
Zhibin Gou
Wenquan Wu
Zheng-Yu Niu
Hua-Hong Wu
Haifeng Wang
Shihang Wang
RALM
25
107
0
11 Mar 2022
Reinforcement Learning for Linear Quadratic Control is Vulnerable Under Cost Manipulation
Yunhan Huang
Quanyan Zhu
OffRL
AAML
34
4
0
11 Mar 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
S. Luo
Hong Zhu
R. Song
OffRL
23
30
0
22 Feb 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
13
25
0
19 Feb 2022
A Literature Survey of Recent Advances in Chatbots
Guendalina Caldarini
Sardar F. Jaf
K. McGarry
AI4CE
29
274
0
17 Jan 2022
Differentially Private Regret Minimization in Episodic Markov Decision Processes
Sayak Ray Chowdhury
Xingyu Zhou
21
21
0
20 Dec 2021
EmpBot: A T5-based Empathetic Chatbot focusing on Sentiments
Emmanouil Zaranis
Georgios Paraskevopoulos
Athanasios Katsamanis
Alexandros Potamianos
25
9
0
30 Oct 2021