Mean Actor Critic

1 September 2017

Cameron Allen

Michael Littman

Papers citing "Mean Actor Critic"

Learning in complex action spaces without policy gradients Arash Tavakoli Sina Ghiassian Nemanja Rakićević OffRL 34 0 0 08 Oct 2024
S-REINFORCE: A Neuro-Symbolic Policy Gradient Approach for Interpretable Reinforcement Learning R. Dutta Qinchen Wang Ankur Singh Dhruv Kumarjiguda Xiaoli Li Senthilnath Jayavelu 22 2 0 12 May 2023
Taylor TD-learning Michele Garibbo Maxime Robeyns Laurence Aitchison OffRL 23 1 0 27 Feb 2023
On Many-Actions Policy Gradient Michal Nauman Marek Cygan 19 0 0 24 Oct 2022
Hindsight Learning for MDPs with Exogenous Inputs Sean R. Sinclair Felipe Vieira Frujeri Ching-An Cheng Luke Marshall Hugo Barbalho Jingling Li Jennifer Neville Ishai Menache Adith Swaminathan 18 23 0 13 Jul 2022
A unified view of likelihood ratio and reparameterization gradients Paavo Parmas Masashi Sugiyama 28 9 0 31 May 2021
Low-Variance Policy Gradient Estimation with World Models Michal Nauman Floris den Hengst OffRL 30 1 0 29 Oct 2020
A reinforcement learning approach to rare trajectory sampling Dominic C. Rose Jamie F. Mair J. P. Garrahan 26 51 0 26 May 2020
All-Action Policy Gradient Methods: A Numerical Integration Approach Benjamin Petit Loren Amdahl-Culleton Yao Liu Jimmy T.H. Smith Pierre-Luc Bacon 24 9 0 21 Oct 2019
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme Paavo Parmas Masashi Sugiyama 19 3 0 14 Oct 2019
Deep Active Inference as Variational Policy Gradients Beren Millidge BDL 32 103 0 08 Jul 2019
Combating the Compounding-Error Problem with a Multi-step Model Kavosh Asadi Dipendra Kumar Misra Seungchan Kim Michael L. Littman LRM 16 55 0 30 May 2019
Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates Dennis J. N. J. Soemers Éric Piette Matthew Stephenson C. Browne 8 8 0 14 May 2019
Total stochastic gradient algorithms and applications in reinforcement learning Paavo Parmas 33 17 0 05 Feb 2019
Towards a Simple Approach to Multi-step Model-based Reinforcement Learning Kavosh Asadi Evan Cater Dipendra Kumar Misra Michael L. Littman OffRL 29 13 0 31 Oct 2018
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments S. Srinivasan Marc Lanctot V. Zambaldi Julien Perolat K. Tuyls Rémi Munos Michael Bowling 13 148 0 21 Oct 2018
Deep Reinforcement Learning Yuxi Li VLM OffRL 28 144 0 15 Oct 2018
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation Alane Suhr Yoav Artzi 16 33 0 25 May 2018
Reward Estimation for Variance Reduction in Deep Reinforcement Learning Joshua Romoff Peter Henderson Alexandre Piché Vincent François-Lavet Joelle Pineau 11 42 0 09 May 2018
Clipped Action Policy Gradient Yasuhiro Fujita S. Maeda OffRL 34 37 0 21 Feb 2018
Expected Policy Gradients for Reinforcement Learning K. Ciosek Shimon Whiteson 50 51 0 10 Jan 2018
Action-depedent Control Variates for Policy Optimization via Stein's Identity Hao Liu Yihao Feng Yi Mao Dengyong Zhou Jian-wei Peng Qiang Liu 35 4 0 30 Oct 2017
Expected Policy Gradients K. Ciosek Shimon Whiteson 27 57 0 15 Jun 2017