Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)

22 July 2022
Bojun Huang

Papers cited by "Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)"

IQ-Learn: Inverse soft-Q Learning for Imitation
Divyansh Garg
Shuvam Chakraborty
Chris Cundy
Jiaming Song
Matthieu Geist
Stefano Ermon
86
188
0
23 Jun 2021
Simpson's Bias in NLP Training
Fei Yuan
Longtu Zhang
Bojun Huang
Yaobo Liang
AI4CE
18
3
0
13 Mar 2021
Steady State Analysis of Episodic Reinforcement Learning
Bojun Huang
OffRL
39
23
0
12 Nov 2020
Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang
Ofir Nachum
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
41
118
0
07 Jul 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
148
122
0
07 Jan 2020
Faster saddle-point optimization for solving large-scale Markov decision processes
Joan Bas-Serrano
Gergely Neu
60
13
0
22 Sep 2019
On NMT Search Errors and Model Errors: Cat Got Your Tongue?
Felix Stahlberg
Bill Byrne
LRM
85
154
0
27 Aug 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend
110
110
0
03 Jul 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
338
0
10 Jun 2019
A Study of Reinforcement Learning for Neural Machine Translation
Lijun Wu
Fei Tian
Tao Qin
Jianhuang Lai
Tie-Yan Liu
OffRL
56
183
0
27 Aug 2018
Scalable Bilinear $π$ Learning Using State and Action Features
Yichen Chen
Lihong Li
Mengdi Wang
67
46
0
27 Apr 2018
A Call for Clarity in Reporting BLEU Scores
Matt Post
177
2,996
0
23 Apr 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,406
0
04 Jan 2018
Boosting the Actor with Dual Critic
Bo Dai
Albert Eaton Shaw
Niao He
Lihong Li
Le Song
64
46
0
29 Dec 2017
Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality
W. Cho
Mengdi Wang
OffRL
35
14
0
07 Dec 2017
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
105
186
0
14 Nov 2017
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML AIMat
373
1,225
0
12 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
730
132,363
0
12 Jun 2017
Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning
Yichen Chen
Mengdi Wang
68
64
0
08 Dec 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
911
6,796
0
26 Sep 2016
Unifying task specification in reinforcement learning
Martha White
OffRL
55
90
0
07 Sep 2016
An Actor-Critic Algorithm for Sequence Prediction
Dzmitry Bahdanau
Philemon Brakel
Kelvin Xu
Anirudh Goyal
Ryan J. Lowe
Joelle Pineau
Aaron Courville
Yoshua Bengio
133
639
0
24 Jul 2016
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
159
3,119
0
10 Jun 2016
Continuous Deep Q-Learning with Model-based Acceleration
S. Gu
Timothy Lillicrap
Ilya Sutskever
Sergey Levine
91
1,013
0
02 Mar 2016
Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato
S. Chopra
Michael Auli
Wojciech Zaremba
102
1,620
0
20 Nov 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
91
272
0
14 Mar 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,796
0
19 Feb 2015