2406.14457
Cited By
Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue
20 June 2024
Huifang Du
Shuqin Li
Minghao Wu
Xuejing Feng
Yuan-Fang Li
Haofen Wang
OffRL
Papers citing
"Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue"
15 / 15 papers shown
Title
Behavior Alignment via Reward Function Optimization
Dhawal Gupta
Yash Chandak
Scott M. Jordan
Philip S. Thomas
Bruno Castro da Silva
53
10
0
29 Oct 2023
Are LLMs All You Need for Task-Oriented Dialogue?
Vojtěch Hudeček
Ondřej Dušek
44
60
0
13 Apr 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
74
244
0
03 Oct 2022
SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation
Wanwei He
Yinpei Dai
Min Yang
Jian Sun
Fei Huang
Luo Si
Yongbin Li
48
62
0
14 Sep 2022
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System
Yixuan Su
Lei Shu
Elman Mansimov
Arshit Gupta
Deng Cai
Yi-An Lai
Yi Zhang
174
192
0
29 Sep 2021
UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2
Yunyi Yang
Yunhao Li
Xiaojun Quan
61
190
0
07 Dec 2020
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Zhaojiang Lin
Andrea Madotto
Genta Indra Winata
Pascale Fung
64
172
0
25 Sep 2020
GoChat: Goal-oriented Chatbots with Hierarchical Reinforcement Learning
Jianfeng Liu
Feiyang Pan
Ling Luo
OffRL
39
23
0
24 May 2020
Multi-domain Dialogue State Tracking as Dynamic Knowledge Graph Enhanced Question Answering
Li Zhou
Kevin Small
49
85
0
07 Nov 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
434
1,664
0
18 Sep 2019
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
Chien-Sheng Wu
Andrea Madotto
Ehsan Hosseini-Asl
Caiming Xiong
R. Socher
Pascale Fung
73
434
0
21 May 2019
Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning
Yuexin Wu
Xiujun Li
Jingjing Liu
Jianfeng Gao
Yiming Yang
49
43
0
19 Nov 2018
MultiWOZ -- A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling
Paweł Budzianowski
Tsung-Hsien Wen
Bo-Hsiang Tseng
I. Casanueva
Stefan Ultes
Osman Ramadan
Milica Gasic
144
1,306
0
29 Sep 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
76
672
0
21 Sep 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
288
18,685
0
20 Jul 2017