ResearchTrend.AI
Mean Actor Critic

1 September 2017
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman

Papers citing "Mean Actor Critic"

23 / 23 papers shown
Learning in complex action spaces without policy gradients
Arash Tavakoli
Sina Ghiassian
Nemanja Rakićević
OffRL
34
0
0
08 Oct 2024
S-REINFORCE: A Neuro-Symbolic Policy Gradient Approach for Interpretable Reinforcement Learning
R. Dutta
Qinchen Wang
Ankur Singh
Dhruv Kumarjiguda
Xiaoli Li
Senthilnath Jayavelu
22
2
0
12 May 2023
Taylor TD-learning
Michele Garibbo
Maxime Robeyns
Laurence Aitchison
OffRL
23
1
0
27 Feb 2023
On Many-Actions Policy Gradient
Michal Nauman
Marek Cygan
19
0
0
24 Oct 2022
Hindsight Learning for MDPs with Exogenous Inputs
Sean R. Sinclair
Felipe Vieira Frujeri
Ching-An Cheng
Luke Marshall
Hugo Barbalho
Jingling Li
Jennifer Neville
Ishai Menache
Adith Swaminathan
18
23
0
13 Jul 2022
A unified view of likelihood ratio and reparameterization gradients
Paavo Parmas
Masashi Sugiyama
28
9
0
31 May 2021
Low-Variance Policy Gradient Estimation with World Models
Michal Nauman
Floris den Hengst
OffRL
30
1
0
29 Oct 2020
A reinforcement learning approach to rare trajectory sampling
Dominic C. Rose
Jamie F. Mair
J. P. Garrahan
26
51
0
26 May 2020
All-Action Policy Gradient Methods: A Numerical Integration Approach
Benjamin Petit
Loren Amdahl-Culleton
Yao Liu
Jimmy T.H. Smith
Pierre-Luc Bacon
24
9
0
21 Oct 2019
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme
Paavo Parmas
Masashi Sugiyama
19
3
0
14 Oct 2019
Deep Active Inference as Variational Policy Gradients
Beren Millidge
BDL
32
103
0
08 Jul 2019
Combating the Compounding-Error Problem with a Multi-step Model
Kavosh Asadi
Dipendra Kumar Misra
Seungchan Kim
Michael L. Littman
LRM
16
55
0
30 May 2019
Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates
Dennis J. N. J. Soemers
Éric Piette
Matthew Stephenson
C. Browne
8
8
0
14 May 2019
Total stochastic gradient algorithms and applications in reinforcement learning
Paavo Parmas
33
17
0
05 Feb 2019
Towards a Simple Approach to Multi-step Model-based Reinforcement Learning
Kavosh Asadi
Evan Cater
Dipendra Kumar Misra
Michael L. Littman
OffRL
29
13
0
31 Oct 2018
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
S. Srinivasan
Marc Lanctot
V. Zambaldi
Julien Perolat
K. Tuyls
Rémi Munos
Michael Bowling
13
148
0
21 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation
Alane Suhr
Yoav Artzi
16
33
0
25 May 2018
Reward Estimation for Variance Reduction in Deep Reinforcement Learning
Joshua Romoff
Peter Henderson
Alexandre Piché
Vincent François-Lavet
Joelle Pineau
11
42
0
09 May 2018
Clipped Action Policy Gradient
Yasuhiro Fujita
S. Maeda
OffRL
34
37
0
21 Feb 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
50
51
0
10 Jan 2018
Action-depedent Control Variates for Policy Optimization via Stein's Identity
Hao Liu
Yihao Feng
Yi Mao
Dengyong Zhou
Jian-wei Peng
Qiang Liu
35
4
0
30 Oct 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
27
57
0
15 Jun 2017