ResearchTrend.AI

arXiv:1906.01772 (versions: v1, v2; v2 is the latest)

Reinforcement Learning When All Actions are Not Always Available

5 June 2019
Yash Chandak
Georgios Theocharous
Blossom Metevier
Philip S. Thomas

Papers citing "Reinforcement Learning When All Actions are Not Always Available"

6 / 6 papers shown
Using Large Ensembles of Control Variates for Variational Inference
Tomas Geffner
Justin Domke
BDL
78
35
0
30 Oct 2018
Planning and Learning with Stochastic Action Sets
Craig Boutilier
Alon Cohen
Amit Daniely
Avinatan Hassidim
Yishay Mansour
Ofer Meshi
Martin Mladenov
Dale Schuurmans
OffRL
48
21
0
07 May 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines
Cathy Wu
Aravind Rajeswaran
Yan Duan
Vikash Kumar
Alexandre M. Bayen
Sham Kakade
Igor Mordatch
Pieter Abbeel
OffRL
70
153
0
20 Mar 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
73
127
0
27 Feb 2018
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
120
300
0
31 Oct 2017
Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines
Philip S. Thomas
Emma Brunskill
OffRL
57
53
0
20 Jun 2017