1611.09321
Improving Policy Gradient by Exploring Under-appreciated Rewards
28 November 2016
Ofir Nachum
Mohammad Norouzi
Dale Schuurmans
Papers citing
"Improving Policy Gradient by Exploring Under-appreciated Rewards"
13 / 13 papers shown
Title
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
39
3
0
11 May 2023
Network Calculus with Flow Prolongation -- A Feedforward FIFO Analysis enabled by ML
Fabien Geyer
Alexander Scheffler
Steffen Bondorf
26
6
0
07 Feb 2022
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
25
29
0
17 Jul 2021
Learning to Reach Goals via Iterated Supervised Learning
Dibya Ghosh
Abhishek Gupta
Ashwin Reddy
Justin Fu
Coline Devin
Benjamin Eysenbach
Sergey Levine
32
34
0
12 Dec 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses
Matt Grenander
Yue Dong
Jackie C.K. Cheung
Annie Louis
27
35
0
08 Sep 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
39
186
0
05 Jun 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao
Xudong Sun
Volker Tresp
29
80
0
21 May 2019
Learning to Generalize from Sparse and Underspecified Rewards
Rishabh Agarwal
Chen Liang
Dale Schuurmans
Mohammad Norouzi
OffRL
54
97
0
19 Feb 2019
Efficient Entropy for Policy Gradient with Multidimensional Action Space
Yiming Zhang
Q. Vuong
Kenny Song
Xiao-Yue Gong
Keith Ross
30
17
0
02 Jun 2018
Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang
Yiyang Zhao
Yuu Jinnai
Yuandong Tian
Rodrigo Fonseca
BDL
36
50
0
18 May 2018
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
Kelvin Guu
Panupong Pasupat
E. Liu
Percy Liang
34
190
0
25 Apr 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,505
0
25 Jan 2017
An Alternative Softmax Operator for Reinforcement Learning
Kavosh Asadi
Michael L. Littman
20
10
0
16 Dec 2016