On the Weaknesses of Reinforcement Learning for Neural Machine Translation

3 July 2019
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend

Papers citing "On the Weaknesses of Reinforcement Learning for Neural Machine Translation"

39 / 39 papers shown
UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality
Zelei Cheng
Xin-Qiang Cai
Yuting Tang
Pushi Zhang
Boming Yang
Masashi Sugiyama
Xinyu Xing
130
0
0
10 Mar 2025
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Yuheng Zhang
Dian Yu
Tao Ge
Linfeng Song
Zhichen Zeng
Haitao Mi
Nan Jiang
Dong Yu
114
4
0
24 Feb 2025
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Quanquan Gu
Lin Yan
OffRL
LRM
113
12
0
11 Oct 2024
RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu
Wei Xiong
Jie Jessie Ren
Lichang Chen
Junru Wu
...
Yuan Liu
Bilal Piot
Abe Ittycheriah
Aviral Kumar
Mohammad Saleh
AAML
78
21
0
20 Sep 2024
From Lists to Emojis: How Format Bias Affects Model Alignment
Xuanchang Zhang
Wei Xiong
Lichang Chen
Dinesh Manocha
Heng Huang
Tong Zhang
ALM
87
13
0
18 Sep 2024
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
Yuzi Yan
Yibo Miao
J. Li
Yipin Zhang
Jian Xie
Zhijie Deng
Dong Yan
79
13
0
11 Jun 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Zikang Shan
Guhao Feng
Wei Xiong
Xinle Cheng
Li Zhao
Di He
Jiang Bian
Liwei Wang
119
67
0
29 Apr 2024
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
147
13
0
28 Aug 2023
Why is constrained neural language generation particularly challenging?
Cristina Garbacea
Qiaozhu Mei
98
15
0
11 Jun 2022
Language GANs Falling Short
Massimo Caccia
Lucas Caccia
W. Fedus
Hugo Larochelle
Joelle Pineau
Laurent Charlin
180
218
0
06 Nov 2018
Evaluating Text GANs as Language Models
Guy Tevet
Gavriel Habib
Vered Shwartz
Jonathan Berant
EGVM
37
31
0
30 Oct 2018
A Study of Reinforcement Learning for Neural Machine Translation
Lijun Wu
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
OffRL
50
183
0
27 Aug 2018
A Stochastic Decoder for Neural Machine Translation
P. Schulz
Wilker Aziz
Trevor Cohn
BDL
67
29
0
28 May 2018
DORA The Explorer: Directed Outreaching Reinforcement Action-Selection
Leshem Choshen
Lior Fox
Y. Loewenstein
OffRL
93
64
0
11 Apr 2018
XNMT: The eXtensible Neural Machine Translation Toolkit
Graham Neubig
Matthias Sperber
Xinyi Wang
Matthieu Felix
Austin Matthews
...
Philip Arthur
Pierre Godard
John Hewitt
Rachid Riad
Liming Wang
63
67
0
01 Mar 2018
Diversity is All You Need: Learning Skills without a Reward Function
Benjamin Eysenbach
Abhishek Gupta
Julian Ibarz
Sergey Levine
99
1,085
0
16 Feb 2018
Divide-and-Conquer Reinforcement Learning
Dibya Ghosh
Avi Singh
Aravind Rajeswaran
Vikash Kumar
Sergey Levine
OffRL
73
127
0
27 Nov 2017
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
82
186
0
14 Nov 2017
Hindsight Experience Replay
Marcin Andrychowicz
Dwight Crow
Alex Ray
Jonas Schneider
Rachel Fong
Peter Welinder
Bob McGrew
Joshua Tobin
Pieter Abbeel
Wojciech Zaremba
OffRL
248
2,328
0
05 Jul 2017
Grammatical Error Correction with Neural Reinforcement Learning
Keisuke Sakaguchi
Matt Post
Benjamin Van Durme
57
59
0
02 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
Parameter Space Noise for Exploration
Matthias Plappert
Rein Houthooft
Prafulla Dhariwal
Szymon Sidor
Richard Y. Chen
Xi Chen
Tamim Asfour
Pieter Abbeel
Marcin Andrychowicz
54
596
0
06 Jun 2017
Language Generation with Recurrent Generative Adversarial Networks without Pre-training
Ofir Press
Amir Bar
Ben Bogin
Jonathan Berant
Lior Wolf
GAN
50
104
0
05 Jun 2017
Adversarial Neural Machine Translation
Lijun Wu
Yingce Xia
Li Zhao
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
GAN
AAML
53
134
0
20 Apr 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images
Rakshith Shetty
Bernt Schiele
Mario Fritz
94
227
0
30 Mar 2017
Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
Zhen-Le Yang
Wei Chen
Feng Wang
Bo Xu
GAN
AI4CE
75
170
0
15 Mar 2017
Nematus: a Toolkit for Neural Machine Translation
Rico Sennrich
Orhan Firat
Kyunghyun Cho
Alexandra Birch
Barry Haddow
...
Marcin Junczys-Dowmunt
Samuel Läubli
Antonio Valerio Miceli Barone
Jozef Mokry
Maria Nadejde
56
406
0
13 Mar 2017
Adversarial Learning for Neural Dialogue Generation
Jiwei Li
Will Monroe
Tianlin Shi
Sébastien Jean
Alan Ritter
Dan Jurafsky
60
899
0
23 Jan 2017
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
107
1,887
0
02 Dec 2016
Lexicons and Minimum Risk Training for Neural Machine Translation: NAIST-CMU at WAT2016
Graham Neubig
56
28
0
20 Oct 2016
SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
Lantao Yu
Weinan Zhang
Jun Wang
Yong Yu
GAN
72
2,401
0
18 Sep 2016
Neural Headline Generation with Sentence-wise Optimization
Ayana
Shiqi Shen
Yu Zhao
Zhiyuan Liu
Maosong Sun
61
55
0
07 Apr 2016
Generating Visual Explanations
Lisa Anne Hendricks
Zeynep Akata
Marcus Rohrbach
Jeff Donahue
Bernt Schiele
Trevor Darrell
VLM
FAtt
84
620
0
28 Mar 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
197
8,859
0
04 Feb 2016
Minimum Risk Training for Neural Machine Translation
Shiqi Shen
Yong Cheng
Zhongjun He
W. He
Hua Wu
Maosong Sun
Yang Liu
114
469
0
08 Dec 2015
Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato
S. Chopra
Michael Auli
Wojciech Zaremba
102
1,615
0
20 Nov 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
320
13,248
0
09 Sep 2015
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
221
7,745
0
31 Aug 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,115
0
22 Dec 2014