1707.07402
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
24 July 2017
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
Papers citing
"Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback"
27 / 27 papers shown
Title
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
31
1
0
08 Nov 2024
RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model
Zhuan Shi
Jing Yan
Xiaoli Tang
Lingjuan Lyu
Boi Faltings
34
1
0
29 Aug 2024
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization
Siyi Gu
Minkai Xu
Alexander Powers
Weili Nie
Tomas Geffner
Karsten Kreis
J. Leskovec
Arash Vahdat
Stefano Ermon
48
7
0
01 Jul 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
Taiming Lu
Lingfeng Shen
Xinyu Yang
Weiting Tan
Beidi Chen
Huaxiu Yao
53
2
0
12 Jun 2024
Learning Generalizable Human Motion Generator with Reinforcement Learning
Yunyao Mao
Xiaoyang Liu
Wen-gang Zhou
Zhenbo Lu
Houqiang Li
38
2
0
24 May 2024
Privately Aligning Language Models with Reinforcement Learning
Fan Wu
Huseyin A. Inan
A. Backurs
Varun Chandrasekaran
Janardhan Kulkarni
Robert Sim
29
6
0
25 Oct 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
46
10
0
28 Aug 2023
Prompt-Based Length Controlled Generation with Reinforcement Learning
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
17
8
0
23 Aug 2023
Continually Improving Extractive QA via Human Feedback
Ge Gao
Hung-Ting Chen
Yoav Artzi
Eunsol Choi
24
12
0
21 May 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
239
0
03 Oct 2022
Mapping the Design Space of Human-AI Interaction in Text Summarization
Ruijia Cheng
Alison Smith-Renner
Kecheng Zhang
Joel R. Tetreault
A. Jaimes
39
31
0
29 Jun 2022
Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning
L. Pullum
11
2
0
22 Mar 2022
Onception: Active Learning with Expert Advice for Real World Machine Translation
Vânia Mendonça
Ricardo Rei
Luísa Coheur
Alberto Sardinha
25
6
0
09 Mar 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
13
25
0
19 Feb 2022
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
25
24
0
10 Aug 2021
Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation
Samuel Kiegeland
Julia Kreutzer
AAML
31
46
0
16 Jun 2021
Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort
Vânia Mendonça
Ricardo Rei
Luísa Coheur
Alberto Sardinha
Ana Lúcia Santos
13
6
0
27 May 2021
Reliability Testing for Natural Language Processing Systems
Samson Tan
Shafiq R. Joty
K. Baxter
Araz Taeihagh
G. Bennett
Min-Yen Kan
13
38
0
06 May 2021
Interactive Learning from Activity Description
Khanh Nguyen
Dipendra Kumar Misra
Robert Schapire
Miroslav Dudík
Patrick Shafto
45
34
0
13 Feb 2021
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
14
1,978
0
02 Sep 2020
Machine Translation System Selection from Bandit Feedback
Jason Naradowsky
Xuan Zhang
Kevin Duh
OffRL
11
8
0
22 Feb 2020
A Study of Reinforcement Learning for Neural Machine Translation
Lijun Wu
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
OffRL
27
181
0
27 Aug 2018
Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning
Julia Kreutzer
Joshua Uyheng
Stefan Riezler
17
83
0
27 May 2018
Learning to Extract Coherent Summary via Deep Reinforcement Learning
Yuxiang Wu
Baotian Hu
AI4TS
17
167
0
19 Apr 2018
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
271
5,329
0
05 Nov 2016
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
214
1,326
0
05 Jun 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,923
0
17 Aug 2015