Information asymmetry in KL-regularized RL

3 May 2019

Alexandre Galashov

Siddhant M. Jayakumar

Leonard Hasenclever

Dhruva Tirumala

Jonathan Richard Schwarz

Guillaume Desjardins

Wojciech M. Czarnecki

Papers citing "Information asymmetry in KL-regularized RL"

Inverse Decision Modeling: Learning Interpretable Representations of Behavior Daniel Jarrett Alihan Huyuk M. Schaar AI4CE 22 27 0 28 Oct 2023
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks Andrew Starnes Anton Dereventsov Clayton Webster 24 0 0 09 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF Ted Moskovitz Aaditya K. Singh DJ Strouse T. Sandholm Ruslan Salakhutdinov Anca D. Dragan Stephen Marcus McAleer 50 48 0 06 Oct 2023
Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies Hanna Ziesche Leonel Rozo 26 5 0 17 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy OffRL 34 8 0 05 May 2023
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning Tuomas Haarnoja Ben Moran Guy Lever Sandy H. Huang Dhruva Tirumala ... Andrea Huber N. Hurley F. Nori R. Hadsell N. Heess 50 143 0 26 Apr 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization D. Grytskyy Jorge Ramírez-Ruiz R. Moreno-Bote 22 3 0 02 Feb 2023
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations Tim G. J. Rudner Cong Lu Michael A. Osborne Yarin Gal Yee Whye Teh OffRL 38 27 0 28 Dec 2022
SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration Giulia Vezzani Dhruva Tirumala Markus Wulfmeier Dushyant Rao A. Abdolmaleki ... Tim Hertweck Thomas Lampe Fereshteh Sadeghi N. Heess Martin Riedmiller OffRL 43 6 0 24 Nov 2022
Hierarchically Structured Task-Agnostic Continual Learning Heinke Hihn Daniel A. Braun BDL CLL 23 8 0 14 Nov 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy 31 4 0 30 Oct 2022
Towards Artificial Virtuous Agents: Games, Dilemmas and Machine Learning Ajay Vishwanath E. Bøhn Ole-Christoffer Granmo Charl Maree C. Omlin AI4CE 35 5 0 30 Aug 2022
Minimum Description Length Control Theodore H. Moskovitz Ta-Chu Kao M. Sahani M. Botvinick 28 1 0 17 Jul 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning Aniket Didolkar Kshitij Gupta Anirudh Goyal Nitesh B. Gundavarapu Alex Lamb Nan Rosemary Ke Yoshua Bengio AI4CE 121 17 0 30 May 2022
Reinforcement Learning with Intrinsic Affinity for Personalized Prosperity Management Charl Maree C. Omlin 35 1 0 20 Apr 2022
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors Steven Bohez S. Tunyasuvunakool Philemon Brakel Fereshteh Sadeghi Leonard Hasenclever ... Nathan Batchelor Federico Casarini J. Merel R. Hadsell N. Heess 38 51 0 31 Mar 2022
Robot Learning of Mobile Manipulation with Reachability Behavior Priors Snehal Jauhri Jan Peters Georgia Chalvatzaki 18 45 0 08 Mar 2022
Retrieval-Augmented Reinforcement Learning Anirudh Goyal A. Friesen Andrea Banino T. Weber Nan Rosemary Ke ... Michal Valko Simon Osindero Timothy Lillicrap N. Heess Charles Blundell OffRL 32 53 0 17 Feb 2022
Reinforcement Learning Your Way: Agent Characterization through Policy Regularization Charl Maree C. Omlin 27 8 0 21 Jan 2022
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies Dushyant Rao Fereshteh Sadeghi Leonard Hasenclever Markus Wulfmeier Martina Zambelli ... Dhruva Tirumala Y. Aytar J. Merel N. Heess R. Hadsell 26 28 0 09 Dec 2021
Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning DeepMind Interactive Agents Team Josh Abramson Arun Ahuja Arthur Brussee Federico Carnevale ... Tamara von Glehn Greg Wayne Nathaniel Wong Chen Yan Rui Zhu LM&Ro 45 46 0 07 Dec 2021
Towards an Understanding of Default Policies in Multitask Policy Optimization Theodore H. Moskovitz Michael Arbel Jack Parker-Holder Aldo Pacchiano 27 9 0 04 Nov 2021
Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning Jinxin Liu Hao Shen Donglin Wang Yachen Kang Qiangxing Tian 32 19 0 25 Oct 2021
Evaluating model-based planning and planner amortization for continuous control Arunkumar Byravan Leonard Hasenclever Piotr Trochim M. Berk Mirza Alessandro Davide Ialongo ... Jost Tobias Springenberg A. Abdolmaleki N. Heess J. Merel Martin Riedmiller 55 17 0 07 Oct 2021
Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics Krishan Rana Vibhavari Dasagi Jesse Haviland Ben Talbot Michael Milford Niko Sünderhauf BDL OffRL 27 31 0 21 Jul 2021
Goal-Conditioned Reinforcement Learning with Imagined Subgoals Elliot Chane-Sane Cordelia Schmid Ivan Laptev 30 141 0 01 Jul 2021
From Motor Control to Team Play in Simulated Humanoid Football Siqi Liu Guy Lever Zhe Wang J. Merel S. M. Ali Eslami ... Tuomas Haarnoja Brendan D. Tracey K. Tuyls T. Graepel N. Heess 31 130 0 25 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning Dilip Arumugam Peter Henderson Pierre-Luc Bacon 24 17 0 10 Mar 2021
Specialization in Hierarchical Learning Systems Heinke Hihn Daniel A. Braun 29 16 0 03 Nov 2020
Behavior Priors for Efficient Reinforcement Learning Dhruva Tirumala Alexandre Galashov Hyeonwoo Noh Leonard Hasenclever Razvan Pascanu ... Guillaume Desjardins Wojciech M. Czarnecki Arun Ahuja Yee Whye Teh N. Heess 37 39 0 27 Oct 2020
Learning Dexterous Manipulation from Suboptimal Experts Rae Jeong Jost Tobias Springenberg Jackie Kay Daniel Zheng Yuxiang Zhou Alexandre Galashov N. Heess F. Nori OffRL 18 36 0 16 Oct 2020
Data-efficient Hindsight Off-policy Option Learning Markus Wulfmeier Dushyant Rao Roland Hafner Thomas Lampe A. Abdolmaleki ... Michael Neunert Dhruva Tirumala Noah Y. Siegel N. Heess Martin Riedmiller OffRL 31 47 0 30 Jul 2020
Hierarchically Decoupled Imitation for Morphological Transfer D. Hejna Pieter Abbeel Lerrel Pinto LM&Ro 25 41 0 03 Mar 2020
Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning Noah Y. Siegel Jost Tobias Springenberg Felix Berkenkamp A. Abdolmaleki Michael Neunert Thomas Lampe Roland Hafner Nicolas Heess Martin Riedmiller OffRL 22 282 0 19 Feb 2020
Continual adaptation for efficient machine communication Robert D. Hawkins Minae Kwon Dorsa Sadigh Noah D. Goodman CLL 27 33 0 22 Nov 2019
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models Arunkumar Byravan Jost Tobias Springenberg A. Abdolmaleki Roland Hafner Michael Neunert Thomas Lampe Noah Y. Siegel N. Heess Martin Riedmiller OffRL 17 41 0 09 Oct 2019
Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses Matt Grenander Yue Dong Jackie C.K. Cheung Annie Louis 24 35 0 08 Sep 2019
Compositional Transfer in Hierarchical Reinforcement Learning Markus Wulfmeier A. Abdolmaleki Roland Hafner Jost Tobias Springenberg Michael Neunert Tim Hertweck Thomas Lampe Noah Y. Siegel N. Heess Martin Riedmiller 30 27 0 26 Jun 2019
Emergence of Locomotion Behaviours in Rich Environments N. Heess TB Dhruva S. Sriram Jay Lemmon J. Merel ... Tom Erez Ziyun Wang S. M. Ali Eslami Martin Riedmiller David Silver 143 928 0 07 Jul 2017