
Reinforcement Learning When All Actions are Not Always Available
Papers citing "Reinforcement Learning When All Actions are Not Always Available"
5 / 5 papers shown
Title |
---|
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines. Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel |