Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)

22 July 2022

Papers citing "Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)"

Title
IQ-Learn: Inverse soft-Q Learning for Imitation Divyansh Garg Shuvam Chakraborty Chris Cundy Jiaming Song Matthieu Geist Stefano Ermon 86 188 0 23 Jun 2021
Simpson's Bias in NLP Training Fei Yuan Longtu Zhang Bojun Huang Yaobo Liang AI4CE 18 3 0 13 Mar 2021
Steady State Analysis of Episodic Reinforcement Learning Bojun Huang OffRL 39 23 0 12 Nov 2020
Off-Policy Evaluation via the Regularized Lagrangian Mengjiao Yang Ofir Nachum Bo Dai Lihong Li Dale Schuurmans OffRL 41 118 0 07 Jul 2020
Reinforcement Learning via Fenchel-Rockafellar Duality Ofir Nachum Bo Dai OffRL 148 122 0 07 Jan 2020
Faster saddle-point optimization for solving large-scale Markov decision processes Joan Bas-Serrano Gergely Neu 60 13 0 22 Sep 2019
On NMT Search Errors and Model Errors: Cat Got Your Tongue? Felix Stahlberg Bill Byrne LRM 85 154 0 27 Aug 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation Leshem Choshen Lior Fox Zohar Aizenbud Omri Abend 110 110 0 03 Jul 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections Ofir Nachum Yinlam Chow Bo Dai Lihong Li OffRL 151 338 0 10 Jun 2019
A Study of Reinforcement Learning for Neural Machine Translation Lijun Wu Fei Tian Tao Qin Jianhuang Lai Tie-Yan Liu OffRL 56 183 0 27 Aug 2018
Scalable Bilinear $π$ Learning Using State and Action Features Yichen Chen Lihong Li Mengdi Wang 67 46 0 27 Apr 2018
A Call for Clarity in Reporting BLEU Scores Matt Post 177 2,996 0 23 Apr 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Tuomas Haarnoja Aurick Zhou Pieter Abbeel Sergey Levine 317 8,406 0 04 Jan 2018
Boosting the Actor with Dual Critic Bo Dai Albert Eaton Shaw Niao He Lihong Li Le Song 64 46 0 29 Dec 2017
Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality W. Cho Mengdi Wang OffRL 35 14 0 07 Dec 2017
Classical Structured Prediction Losses for Sequence to Sequence Learning Sergey Edunov Myle Ott Michael Auli David Grangier Marc'Aurelio Ranzato AIMat 105 186 0 14 Nov 2017
Six Challenges for Neural Machine Translation Philipp Koehn Rebecca Knowles AAML AIMat 373 1,225 0 12 Jun 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 730 132,363 0 12 Jun 2017
Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning Yichen Chen Mengdi Wang 68 64 0 08 Dec 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Zhiwen Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 911 6,796 0 26 Sep 2016
Unifying task specification in reinforcement learning Martha White OffRL 55 90 0 07 Sep 2016
An Actor-Critic Algorithm for Sequence Prediction Dzmitry Bahdanau Philemon Brakel Kelvin Xu Anirudh Goyal Ryan J. Lowe Joelle Pineau Aaron Courville Yoshua Bengio 133 639 0 24 Jul 2016
Generative Adversarial Imitation Learning Jonathan Ho Stefano Ermon GAN 159 3,119 0 10 Jun 2016
Continuous Deep Q-Learning with Model-based Acceleration S. Gu Timothy Lillicrap Ilya Sutskever Sergey Levine 91 1,013 0 02 Mar 2016
Sequence Level Training with Recurrent Neural Networks Marc'Aurelio Ranzato S. Chopra Michael Auli Wojciech Zaremba 102 1,620 0 20 Nov 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning R. Sutton A. R. Mahmood Martha White 91 272 0 14 Mar 2015
Trust Region Policy Optimization John Schulman Sergey Levine Philipp Moritz Michael I. Jordan Pieter Abbeel 277 6,796 0 19 Feb 2015