Improving Policy Gradient by Exploring Under-appreciated Rewards

28 November 2016

Papers citing "Improving Policy Gradient by Exploring Under-appreciated Rewards"

Title | Authors | Tags | Counts | Date
Policy Gradient Algorithms Implicitly Optimize by Continuation | Adrien Bolland, Gilles Louppe, D. Ernst | - | 39 3 0 | 11 May 2023
Network Calculus with Flow Prolongation -- A Feedforward FIFO Analysis enabled by ML | Fabien Geyer, Alexander Scheffler, Steffen Bondorf | - | 26 6 0 | 07 Feb 2022
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences | Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. R. Mahmood, Martha White | - | 25 29 0 | 17 Jul 2021
Learning to Reach Goals via Iterated Supervised Learning | Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine | - | 32 34 0 | 12 Dec 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses | Matt Grenander, Yue Dong, Jackie C.K. Cheung, Annie Louis | - | 27 35 0 | 08 Sep 2019
Global Optimality Guarantees For Policy Gradient Methods | Jalaj Bhandari, Daniel Russo | - | 39 186 0 | 05 Jun 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning | Rui Zhao, Xudong Sun, Volker Tresp | - | 29 80 0 | 21 May 2019
Learning to Generalize from Sparse and Underspecified Rewards | Rishabh Agarwal, Chen Liang, Dale Schuurmans, Mohammad Norouzi | OffRL | 54 97 0 | 19 Feb 2019
Efficient Entropy for Policy Gradient with Multidimensional Action Space | Yiming Zhang, Q. Vuong, Kenny Song, Xiao-Yue Gong, Keith Ross | - | 30 17 0 | 02 Jun 2018
Neural Architecture Search using Deep Neural Networks and Monte Carlo Tree Search | Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca | BDL | 36 50 0 | 18 May 2018
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | Kelvin Guu, Panupong Pasupat, E. Liu, Percy Liang | - | 34 190 0 | 25 Apr 2017
Deep Reinforcement Learning: An Overview | Yuxi Li | OffRL, VLM | 104 1,505 0 | 25 Jan 2017
An Alternative Softmax Operator for Reinforcement Learning | Kavosh Asadi, Michael L. Littman | - | 20 10 0 | 16 Dec 2016