Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback

24 July 2017

Papers citing "Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback"

Title
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings Miguel Moura Ramos Tomás Almeida Daniel Vareta Filipe Azevedo Sweta Agrawal Patrick Fernandes André F. T. Martins 31 1 0 08 Nov 2024
RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model Zhuan Shi Jing Yan Xiaoli Tang Lingjuan Lyu Boi Faltings 34 1 0 29 Aug 2024
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization Siyi Gu Minkai Xu Alexander Powers Weili Nie Tomas Geffner Karsten Kreis J. Leskovec Arash Vahdat Stefano Ermon 48 7 0 01 Jul 2024
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF Taiming Lu Lingfeng Shen Xinyu Yang Weiting Tan Beidi Chen Huaxiu Yao 53 2 0 12 Jun 2024
Learning Generalizable Human Motion Generator with Reinforcement Learning Yunyao Mao Xiaoyang Liu Wen-gang Zhou Zhenbo Lu Houqiang Li 38 2 0 24 May 2024
Privately Aligning Language Models with Reinforcement Learning Fan Wu Huseyin A. Inan A. Backurs Varun Chandrasekaran Janardhan Kulkarni Robert Sim 29 6 0 25 Oct 2023
Reinforcement Learning for Generative AI: A Survey Yuanjiang Cao Quan Z. Sheng Julian McAuley Lina Yao SyDa 46 10 0 28 Aug 2023
Prompt-Based Length Controlled Generation with Reinforcement Learning Renlong Jie Xiaojun Meng Lifeng Shang Xin Jiang Qun Liu 17 8 0 23 Aug 2023
Continually Improving Extractive QA via Human Feedback Ge Gao Hung-Ting Chen Yoav Artzi Eunsol Choi 24 12 0 21 May 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization Rajkumar Ramamurthy Prithviraj Ammanabrolu Kianté Brantley Jack Hessel R. Sifa Christian Bauckhage Hannaneh Hajishirzi Yejin Choi OffRL 31 239 0 03 Oct 2022
Mapping the Design Space of Human-AI Interaction in Text Summarization Ruijia Cheng Alison Smith-Renner Kecheng Zhang Joel R. Tetreault A. Jaimes 39 31 0 29 Jun 2022
Review of Metrics to Measure the Stability, Robustness and Resilience of Reinforcement Learning L. Pullum 11 2 0 22 Mar 2022
Onception: Active Learning with Expert Advice for Real World Machine Translation Vânia Mendonça Ricardo Rei Luísa Coheur Alberto Sardinha INESC-ID Lisboa 25 6 0 09 Mar 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models Farshid Faal K. Schmitt Jia Yuan Yu 13 25 0 19 Feb 2022
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior Noriyuki Kojima Alane Suhr Yoav Artzi 25 24 0 10 Aug 2021
Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation Samuel Kiegeland Julia Kreutzer AAML 31 46 0 16 Jun 2021
Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort Vânia Mendonça Ricardo Rei Luísa Coheur Alberto Sardinha Ana Lúcia Santos INESC-ID Lisboa 13 6 0 27 May 2021
Reliability Testing for Natural Language Processing Systems Samson Tan Shafiq R. Joty K. Baxter Araz Taeihagh G. Bennett Min-Yen Kan 13 38 0 06 May 2021
Interactive Learning from Activity Description Khanh Nguyen Dipendra Kumar Misra Robert Schapire Miroslav Dudík Patrick Shafto 45 34 0 13 Feb 2021
Learning to summarize from human feedback Nisan Stiennon Long Ouyang Jeff Wu Daniel M. Ziegler Ryan J. Lowe Chelsea Voss Alec Radford Dario Amodei Paul Christiano ALM 14 1,978 0 02 Sep 2020
Machine Translation System Selection from Bandit Feedback Jason Naradowsky Xuan Zhang Kevin Duh OffRL 11 8 0 22 Feb 2020
A Study of Reinforcement Learning for Neural Machine Translation Lijun Wu Fei Tian Tao Qin Jianhuang Lai Tie-Yan Liu OffRL 27 181 0 27 Aug 2018
Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning Julia Kreutzer Joshua Uyheng Stefan Riezler 17 83 0 27 May 2018
Learning to Extract Coherent Summary via Deep Reinforcement Learning Yuxiang Wu Baotian Hu AI4TS 17 167 0 19 Apr 2018
Neural Architecture Search with Reinforcement Learning Barret Zoph Quoc V. Le 271 5,329 0 05 Nov 2016
Deep Reinforcement Learning for Dialogue Generation Jiwei Li Will Monroe Alan Ritter Michel Galley Jianfeng Gao Dan Jurafsky 214 1,326 0 05 Jun 2016
Effective Approaches to Attention-based Neural Machine Translation Thang Luong Hieu H. Pham Christopher D. Manning 218 7,923 0 17 Aug 2015