
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
Papers citing "Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation"
36 / 36 papers shown
Title |
---|
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines. Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel |