ResearchTrend.AI
Maximum a Posteriori Policy Optimisation

14 June 2018
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller

Papers citing "Maximum a Posteriori Policy Optimisation"

50 / 144 papers shown
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
65
791
0
12 May 2022
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation
Alex X. Lee
Coline Devin
Jost Tobias Springenberg
Yuxiang Zhou
Thomas Lampe
A. Abdolmaleki
Konstantinos Bousmalis
OffRL
OnRL
24
15
0
06 May 2022
Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Bobak Shahriari
A. Abdolmaleki
Arunkumar Byravan
A. Friesen
Siqi Liu
Jost Tobias Springenberg
N. Heess
Matthew W. Hoffman
Martin Riedmiller
OffRL
46
27
0
21 Apr 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
30
3
0
20 Apr 2022
Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data
Wenxuan Zhou
Steven Bohez
Jan Humplik
A. Abdolmaleki
Dushyant Rao
Markus Wulfmeier
Tuomas Haarnoja
N. Heess
OffRL
34
6
0
12 Apr 2022
Monte Carlo Tree Search based Hybrid Optimization of Variational Quantum Circuits
Jiahao Yao
Haoya Li
Marin Bukov
Lin Lin
Lexing Ying
16
15
0
30 Mar 2022
DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning
Jinxin Liu
Hongyin Zhang
Donglin Wang
OffRL
38
32
0
13 Mar 2022
Robot Learning of Mobile Manipulation with Reachability Behavior Priors
Snehal Jauhri
Jan Peters
Georgia Chalvatzaki
18
45
0
08 Mar 2022
Learning Robust Real-Time Cultural Transmission without Human Data
Cultural General Intelligence Team
Avishkar Bhoopchand
Bethanie Brownfield
Adrian Collister
Agustin Dal Lago
...
Alex Platonov
Evan Senter
Sukhdeep Singh
Alexander Zacherl
Lei M. Zhang
VLM
46
11
0
01 Mar 2022
Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics
Honghu Xue
Benedikt Hein
M. Bakr
Georg Schildbach
Bengt Abel
Elmar Rueckert
16
15
0
23 Feb 2022
Retrieval-Augmented Reinforcement Learning
Anirudh Goyal
A. Friesen
Andrea Banino
T. Weber
Nan Rosemary Ke
...
Michal Valko
Simon Osindero
Timothy Lillicrap
N. Heess
Charles Blundell
OffRL
32
53
0
17 Feb 2022
NeuPL: Neural Population Learning
Siqi Liu
Luke Marris
Daniel Hennes
J. Merel
N. Heess
T. Graepel
35
17
0
15 Feb 2022
Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning
Stephen James
Pieter Abbeel
35
9
0
08 Feb 2022
Conservative Distributional Reinforcement Learning with Safety Constraints
Hengrui Zhang
Youfang Lin
Sheng Han
Shuo Wang
Kai Lv
OffRL
21
5
0
18 Jan 2022
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies
Dushyant Rao
Fereshteh Sadeghi
Leonard Hasenclever
Markus Wulfmeier
Martina Zambelli
...
Dhruva Tirumala
Y. Aytar
J. Merel
N. Heess
R. Hadsell
18
28
0
09 Dec 2021
Towards an Understanding of Default Policies in Multitask Policy Optimization
Theodore H. Moskovitz
Michael Arbel
Jack Parker-Holder
Aldo Pacchiano
25
9
0
04 Nov 2021
Self-Consistent Models and Values
Gregory Farquhar
Kate Baumli
Zita Marinho
Angelos Filos
Matteo Hessel
Hado van Hasselt
David Silver
38
8
0
25 Oct 2021
Evaluating model-based planning and planner amortization for continuous control
Arunkumar Byravan
Leonard Hasenclever
Piotr Trochim
M. Berk Mirza
Alessandro Davide Ialongo
...
Jost Tobias Springenberg
A. Abdolmaleki
N. Heess
J. Merel
Martin Riedmiller
55
17
0
07 Oct 2021
Dropout Q-Functions for Doubly Efficient Reinforcement Learning
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
11
105
0
05 Oct 2021
Learning Dynamics Models for Model Predictive Agents
M. Lutter
Leonard Hasenclever
Arunkumar Byravan
Gabriel Dulac-Arnold
Piotr Trochim
N. Heess
J. Merel
Yuval Tassa
AI4CE
57
26
0
29 Sep 2021
Dual Behavior Regularized Reinforcement Learning
Chapman Siu
Jason M. Traish
R. Xu
OffRL
20
1
0
19 Sep 2021
Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration
Oliver Groth
Markus Wulfmeier
Giulia Vezzani
Vibhavari Dasagi
Tim Hertweck
Roland Hafner
N. Heess
Martin Riedmiller
LRM
41
20
0
17 Sep 2021
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
Tianhe Yu
Aviral Kumar
Yevgen Chebotar
Karol Hausman
Sergey Levine
Chelsea Finn
OffRL
35
77
0
16 Sep 2021
Bootstrapped Meta-Learning
Sebastian Flennerhag
Yannick Schroecker
Tom Zahavy
Hado van Hasselt
David Silver
Satinder Singh
38
59
0
09 Sep 2021
Toward a 'Standard Model' of Machine Learning
Zhiting Hu
Eric P. Xing
37
12
0
17 Aug 2021
Implicitly Regularized RL with Implicit Q-Values
Nino Vieillard
Marcin Andrychowicz
Anton Raichuk
Olivier Pietquin
M. Geist
OffRL
24
9
0
16 Aug 2021
Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Elliot Chane-Sane
Cordelia Schmid
Ivan Laptev
24
140
0
01 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience
Beren Millidge
DRL
20
7
0
30 Jun 2021
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL
Catherine Cang
Aravind Rajeswaran
Pieter Abbeel
Michael Laskin
OffRL
27
29
0
16 Jun 2021
Simplifying Deep Reinforcement Learning via Self-Supervision
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Xia Hu
SSL
35
15
0
10 Jun 2021
Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint
M. Geist
Julien Pérolat
Mathieu Laurière
Romuald Elie
Sarah Perrin
Olivier Bachem
Rémi Munos
Olivier Pietquin
37
62
0
07 Jun 2021
From Motor Control to Team Play in Simulated Humanoid Football
Siqi Liu
Guy Lever
Zhe Wang
J. Merel
S. M. Ali Eslami
...
Tuomas Haarnoja
Brendan D. Tracey
K. Tuyls
T. Graepel
N. Heess
31
129
0
25 May 2021
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation
Emilio Parisotto
Ruslan Salakhutdinov
42
44
0
04 Apr 2021
Near Optimal Policy Optimization via REPS
Aldo Pacchiano
Jonathan Lee
Peter L. Bartlett
Ofir Nachum
23
3
0
17 Mar 2021
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Benjamin Eysenbach
Sergey Levine
OOD
41
175
0
10 Mar 2021
Latent Imagination Facilitates Zero-Shot Transfer in Autonomous Racing
Axel Brunnbauer
Luigi Berducci
Andreas Brandstätter
Mathias Lechner
Ramin Hasani
Daniela Rus
Radu Grosu
LM&Ro
38
37
0
08 Mar 2021
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
Hongyao Tang
Jianye Hao
Guangyong Chen
Pengfei Chen
Chong Chen
Yaodong Yang
Lu Zhang
Wulong Liu
Zhaopeng Meng
OffRL
35
4
0
03 Mar 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
19
10
0
11 Feb 2021
Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning
Kai Cui
Heinz Koeppl
64
91
0
02 Feb 2021
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning
William F. Whitney
Michael Bloesch
Jost Tobias Springenberg
A. Abdolmaleki
Kyunghyun Cho
Martin Riedmiller
OffRL
29
13
0
23 Jan 2021
Differentiable Trust Region Layers for Deep Reinforcement Learning
Fabian Otto
P. Becker
Ngo Anh Vien
Hanna Ziesche
Gerhard Neumann
OffRL
41
19
0
22 Jan 2021
Grounding Artificial Intelligence in the Origins of Human Behavior
Eleni Nisioti
Clément Moulin-Frier
AI4CE
47
5
0
15 Dec 2020
Behavior Priors for Efficient Reinforcement Learning
Dhruva Tirumala
Alexandre Galashov
Hyeonwoo Noh
Leonard Hasenclever
Razvan Pascanu
...
Guillaume Desjardins
Wojciech M. Czarnecki
Arun Ahuja
Yee Whye Teh
N. Heess
37
39
0
27 Oct 2020
Logistic Q-Learning
Joan Bas-Serrano
Sebastian Curi
Andreas Krause
Gergely Neu
14
40
0
21 Oct 2020
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification
D. Mankowitz
D. A. Calian
Rae Jeong
Cosmin Paduraru
N. Heess
Sumanth Dathathri
Martin Riedmiller
Timothy A. Mann
24
11
0
20 Oct 2020
Learning Dexterous Manipulation from Suboptimal Experts
Rae Jeong
Jost Tobias Springenberg
Jackie Kay
Daniel Zheng
Yuxiang Zhou
Alexandre Galashov
N. Heess
F. Nori
OffRL
18
36
0
16 Oct 2020
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
37
92
0
12 Oct 2020
Data-efficient Hindsight Off-policy Option Learning
Markus Wulfmeier
Dushyant Rao
Roland Hafner
Thomas Lampe
A. Abdolmaleki
...
Michael Neunert
Dhruva Tirumala
Noah Y. Siegel
N. Heess
Martin Riedmiller
OffRL
25
47
0
30 Jul 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
27
73
0
24 Jul 2020
Control as Hybrid Inference
Alexander Tschantz
Beren Millidge
A. Seth
Christopher L. Buckley
19
9
0
11 Jul 2020