Reinforcement Learning When All Actions are Not Always Available



5 June 2019

Georgios Theocharous

Blossom Metevier

Philip S. Thomas


Papers citing "Reinforcement Learning When All Actions are Not Always Available"


Title | Authors | Tag | Counts | Date
Using Large Ensembles of Control Variates for Variational Inference | Tomas Geffner, Justin Domke | BDL | 78 35 0 | 30 Oct 2018
Planning and Learning with Stochastic Action Sets | Craig Boutilier, Alon Cohen, Amit Daniely, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans | OffRL | 48 21 0 | 07 May 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel | OffRL | 70 153 0 | 20 Mar 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning | George Tucker, Surya Bhupatiraju, S. Gu, Richard Turner, Zoubin Ghahramani, Sergey Levine | OffRL | 73 127 0 | 27 Feb 2018
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation | Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud | | 120 300 0 | 31 Oct 2017
Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines | Philip S. Thomas, Emma Brunskill | OffRL | 57 53 0 | 20 Jun 2017