2211.16773
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
30 November 2022
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
Papers citing
"KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning"
50 / 52 papers shown
Title
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
Yihao Feng
Shentao Yang
Shujian Zhang
Jianguo Zhang
Caiming Xiong
Mi Zhou
Haiquan Wang
OffRL
80
25
0
20 Feb 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
101
248
0
03 Oct 2022
Dialogue Evaluation with Offline Reinforcement Learning
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Michael Heck
Shutong Feng
Milica Gašić
OffRL
67
4
0
02 Sep 2022
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog
Baolin Peng
Michel Galley
Pengcheng He
Chris Brockett
Lars Liden
E. Nouri
Zhou Yu
Bill Dolan
Jianfeng Gao
VLM
74
75
0
22 Jun 2022
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
Kushal Arora
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
VLM
72
41
0
15 Jun 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
147
219
0
26 May 2022
BORT: Back and Denoising Reconstruction for End-to-End Task-Oriented Dialog
Haipeng Sun
Junwei Bao
Youzheng Wu
Xiaodong He
54
31
0
05 May 2022
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
Siddharth Verma
Justin Fu
Mengjiao Yang
Sergey Levine
OffRL
55
45
0
18 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
883
13,148
0
04 Mar 2022
GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection
Wanwei He
Yinpei Dai
Yinhe Zheng
Yuchuan Wu
Zhen Cao
...
Min Yang
Feiling Huang
Luo Si
Jian Sun
Yongbin Li
VLM
79
158
0
29 Nov 2021
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System
Yixuan Su
Lei Shu
Elman Mansimov
Arshit Gupta
Deng Cai
Yi-An Lai
Yi Zhang
215
192
0
29 Sep 2021
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
300
195
0
15 Sep 2021
Transferable Dialogue Systems and User Simulators
Bo-Hsiang Tseng
Yinpei Dai
Florian Kreyssig
Bill Byrne
71
54
0
25 Jul 2021
Shades of BLEU, Flavours of Success: The Case of MultiWOZ
Tomáš Nekvinda
Ondřej Dušek
65
59
0
10 Jun 2021
A Student-Teacher Architecture for Dialog Domain Adaptation under the Meta-Learning Setting
Kun Qian
Wei Wei
Zhou Yu
77
8
0
06 Apr 2021
Domain State Tracking for a Simplified Dialogue System
Hyunmin Jeon
G. G. Lee
68
19
0
11 Mar 2021
Causal-aware Safe Policy Improvement for Task-oriented dialogue
Govardana Sachithanandam Ramachandran
Kazuma Hashimoto
Caiming Xiong
OffRL
39
11
0
10 Mar 2021
UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2
Yunyi Yang
Yunhao Li
Xiaojun Quan
84
191
0
07 Dec 2020
LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization
Nurul Lubis
Christian Geishauser
Michael Heck
Hsien-Chin Lin
Marco Moresi
Carel van Niekerk
Milica Gašić
OffRL
48
40
0
18 Nov 2020
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
77
96
0
12 Oct 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
163
1,214
0
24 Sep 2020
Text Generation by Learning from Demonstrations
Richard Yuanzhe Pang
He He
OffRL
60
80
0
16 Sep 2020
MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
Xiaoxue Zang
Abhinav Rastogi
Srinivas Sunkara
Raghav Gupta
Jianguo Zhang
Jindong Chen
71
279
0
10 Jul 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
841
42,332
0
28 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
561
2,040
0
04 May 2020
A Simple Language Model for Task-Oriented Dialogue
Ehsan Hosseini-Asl
Bryan McCann
Chien-Sheng Wu
Semih Yavuz
R. Socher
90
528
0
02 May 2020
Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition
Ryuichi Takanobu
Runze Liang
Minlie Huang
LLMAG
103
55
0
08 Apr 2020
TextGAIL: Generative Adversarial Imitation Learning for Text Generation
Qingyang Wu
Lei Li
Zhou Yu
GAN
50
49
0
07 Apr 2020
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context
Yichi Zhang
Zhijian Ou
Zhou Yu
124
182
0
24 Nov 2019
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang
Siqi Sun
Michel Galley
Yen-Chun Chen
Chris Brockett
Xiang Gao
Jianfeng Gao
Jingjing Liu
W. Dolan
VLM
189
1,527
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
456
20,298
0
23 Oct 2019
Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models
Qingyang Wu
Yichi Zhang
Yu Li
Zhou Yu
VLM
67
63
0
09 Oct 2019
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
Pranav Khaitan
76
613
0
12 Sep 2019
How to Build User Simulators to Train RL-based Dialog Systems
Weiyan Shi
Kun Qian
Xuewei Wang
Zhou Yu
OffRL
45
64
0
03 Sep 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
130
343
0
30 Jun 2019
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good
Xuewei Wang
Weiyan Shi
Richard Kim
Y. Oh
Sijia Yang
Jingwen Zhang
Zhou Yu
110
286
0
16 Jun 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
343
5,860
0
21 Apr 2019
Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models
Tiancheng Zhao
Kaige Xie
M. Eskénazi
73
142
0
23 Feb 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
MultiWOZ -- A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling
Paweł Budzianowski
Tsung-Hsien Wen
Bo-Hsiang Tseng
I. Casanueva
Stefan Ultes
Osman Ramadan
Milica Gasic
184
1,323
0
29 Sep 2018
End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient
Li Zhou
Kevin Small
Oleg Rokhlenko
Charles Elkan
OffRL
56
42
0
07 Dec 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
526
19,237
0
20 Jul 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
825
11,937
0
09 Mar 2017
Conditional Generation and Snapshot Learning in Neural Dialogue Systems
Tsung-Hsien Wen
Milica Gasic
N. Mrksic
L. Rojas-Barahona
Pei-hao Su
Stefan Ultes
David Vandyke
S. Young
60
79
0
10 Jun 2016
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
285
1,338
0
05 Jun 2016
A Network-based End-to-End Trainable Task-oriented Dialogue System
Tsung-Hsien Wen
David Vandyke
N. Mrksic
Milica Gasic
L. Rojas-Barahona
Pei-hao Su
Stefan Ultes
S. Young
77
1,108
0
15 Apr 2016
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau
104
1,298
0
25 Mar 2016
Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato
S. Chopra
Michael Auli
Wojciech Zaremba
102
1,619
0
20 Nov 2015
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Samy Bengio
Oriol Vinyals
Navdeep Jaitly
Noam M. Shazeer
152
2,038
0
09 Jun 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
104
3,434
0
08 Jun 2015