CASA: Bridging the Gap between Policy Improvement and Policy Evaluation
with Conflict Averse Policy Iteration

9 May 2021

Papers citing "CASA: Bridging the Gap between Policy Improvement and Policy Evaluation with Conflict Averse Policy Iteration"

11 / 11 papers shown

Title | Authors | Tags | Counts | Date
RotoGrad: Gradient Homogenization in Multitask Learning | Adrián Javaloy, Isabel Valera | | 67 88 0 | 03 Mar 2021
Decoupling Value and Policy for Generalization in Reinforcement Learning | Roberta Raileanu, Rob Fergus | DRL, OffRL | 28 96 0 | 20 Feb 2021
Exploring Simple Siamese Representation Learning | Xinlei Chen, Kaiming He | SSL | 161 3,992 0 | 20 Nov 2020
Phasic Policy Gradient | K. Cobbe, Jacob Hilton, Oleg Klimov, John Schulman | OffRL | 29 155 0 | 09 Sep 2020
Gradient Surgery for Multi-Task Learning | Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn | | 91 1,190 0 | 19 Jan 2020
Off-Policy Actor-Critic with Shared Experience Replay | Simon Schmitt, Matteo Hessel, Karen Simonyan | OffRL | 44 68 0 | 25 Sep 2019
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures | L. Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, ..., Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu | | 140 1,584 0 | 05 Feb 2018
Safe and Efficient Off-Policy Reinforcement Learning | Rémi Munos, T. Stepleton, Anna Harutyunyan, Marc G. Bellemare | OffRL | 107 611 0 | 08 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning | Volodymyr Mnih, Adria Puigdomenech Badia, M. Berk Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu | | 157 8,805 0 | 04 Feb 2016
Prioritized Experience Replay | Tom Schaul, John Quan, Ioannis Antonoglou, David Silver | OffRL | 185 3,777 0 | 18 Nov 2015
The Arcade Learning Environment: An Evaluation Platform for General Agents | Marc G. Bellemare, Yavar Naddaf, J. Veness, Michael Bowling | | 61 2,992 0 | 19 Jul 2012