
Discovered Policy Optimisation

11 October 2022

Christian Schroeder de Witt

Jakob N. Foerster

OffRL


Papers citing "Discovered Policy Optimisation"

50 / 63 papers shown

Title
SOReL and TOReL: Two Methods for Fully Offline Reinforcement Learning Mattie Fellows Clarisse Wibault Uljad Berdica Johannes Forkel Jakob Foerster Michael A. Osborne OffRL OnRL 65 0 0 28 May 2025
Hadamax Encoding: Elevating Performance in Model-Free Atari Jacob E. Kooi Zhao Yang Vincent François-Lavet 81 1 0 21 May 2025
Bi-Level Policy Optimization with Nyström Hypergradients Arjun Prakash Naicheng He Denizalp Goktas Amy Greenwald 73 0 0 16 May 2025
Scalable Meta-Learning via Mixed-Mode Differentiation Iurii Kemaev Dan A. Calian Luisa M. Zintgraf Gregory Farquhar H. V. Hasselt 112 1 0 01 May 2025
A Clean Slate for Offline Reinforcement Learning Matthew Jackson Uljad Berdica Jarek Liesen Shimon Whiteson Jakob Foerster OffRL OnRL 91 1 0 15 Apr 2025
Predicting Multi-Agent Specialization via Task Parallelizability Elizabeth Mieczkowski Ruaridh Mon-Williams Neil R. Bramley Christopher G. Lucas Natalia Vélez Thomas Griffiths 98 1 0 19 Mar 2025
SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas Zihao Guo Richard Willis Tristan Tomilin Joel Z Leibo Yali Du 122 0 0 18 Mar 2025
IPCGRL: Language-Instructed Reinforcement Learning for Procedural Level Generation In-Chang Baek Sung-Hyun Kim Seo-Young Lee Dong-Hyeun Kim Kyung-Joong Kim 97 0 0 16 Mar 2025
PCGRLLM: Large Language Model-Driven Reward Design for Procedural Content Generation Reinforcement Learning In-Chang Baek Sung-Hyun Kim Sam Earle Zehua Jiang Noh Jin-Ha Julian Togelius Kyung-Joong Kim 80 2 0 15 Feb 2025
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps Benjamin Ellis Matthew Jackson Andrei Lupu Alexander David Goldie Mattie Fellows Shimon Whiteson Jakob Foerster 142 3 0 22 Dec 2024
A Method for Evaluating Hyperparameter Sensitivity in Reinforcement Learning Jacob Adkins Michael Bowling Adam White 135 5 0 10 Dec 2024
Beyond the Boundaries of Proximal Policy Optimization Charlie B. Tan Edan Toledo Benjamin Ellis Jakob Foerster Ferenc Huszár 52 0 0 01 Nov 2024
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks Michael T. Matthews Michael Beukman Chris Xiaoxuan Lu Jakob Foerster OffRL AI4CE 121 8 0 30 Oct 2024
Reinforcement Learning Controllers for Soft Robots using Learned Environments Uljad Berdica Matthew Jackson Niccolò Enrico Veronese Jakob Foerster Perla Maiolino DRL 32 1 0 24 Oct 2024
Foragax: An Agent-Based Modelling Framework Based on JAX Siddharth Chaturvedi Ahmed El-Gazzar Marcel van Gerven 72 1 0 10 Sep 2024
Real-Time Recurrent Learning using Trace Units in Reinforcement Learning Esraa Elelimy Adam White Michael Bowling Martha White OffRL 90 3 0 02 Sep 2024
JaxLife: An Open-Ended Agentic Simulator Chris Xiaoxuan Lu Michael Beukman Michael T. Matthews Jakob Foerster LM&Ro 82 3 0 01 Sep 2024
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery Alexander Rutherford Michael Beukman Timon Willi Bruno Lacerda Nick Hawes Jakob Foerster 104 9 0 27 Aug 2024
PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators Sam Earle Zehua Jiang Julian Togelius 61 3 0 22 Aug 2024
NAVIX: Scaling MiniGrid Environments with JAX Eduardo Pignatelli Jarek Liesen R. T. Lange Chris Xiaoxuan Lu Pablo Samuel Castro Laura Toni 142 4 0 28 Jul 2024
Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy Cameron Allen Aaron Kirtland Ruo Yu Tao Sam Lobel Daniel Scott Nicholas Petrocelli Omer Gottesman Ronald E. Parr M. L. Littman George Konidaris 50 2 0 10 Jul 2024
Can Learned Optimization Make Reinforcement Learning Less Difficult? Alexander David Goldie Chris Xiaoxuan Lu Matthew Jackson Shimon Whiteson Jakob N. Foerster 141 5 0 09 Jul 2024
Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents Sam Earle Julian Togelius 89 1 0 05 Jul 2024
Simplifying Deep Temporal Difference Learning Matteo Gallici Mattie Fellows Benjamin Ellis B. Pou Ivan Masmitja Jakob Foerster Mario Martin OffRL 170 26 0 05 Jul 2024
Mixture of Experts in a Mixture of RL settings Timon Willi J. Obando-Ceron Jakob Foerster Karolina Dziugaite Pablo Samuel Castro MoE 146 11 0 26 Jun 2024
On the consistency of hyper-parameter selection in value-based deep reinforcement learning J. Obando-Ceron J. G. Araújo Rameswar Panda Pablo Samuel Castro 120 9 0 25 Jun 2024
Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization Félix Chalumeau Refiloe Shabe Noah de Nicola Arnu Pretorius Thomas D. Barrett Nathan Grinsztajn 153 3 0 24 Jun 2024
Behaviour Distillation Andrei Lupu Chris Xiaoxuan Lu Jarek Liesen R. T. Lange Jakob Foerster DD 108 4 0 21 Jun 2024
Discovering Minimal Reinforcement Learning Environments Jarek Liesen Chris Xiaoxuan Lu Andrei Lupu Jakob N. Foerster Henning Sprekeler R. T. Lange OffRL 92 4 0 18 Jun 2024
EvIL: Evolution Strategies for Generalisable Imitation Learning Silvia Sapora Gokul Swamy Chris Xiaoxuan Lu Yee Whye Teh Jakob Nicolaus Foerster 81 6 0 15 Jun 2024
A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI Haruka Kita Sotetsu Koyamada Yotaro Yamaguchi Shin Ishii 73 0 0 14 Jun 2024
Discovering Preference Optimization Algorithms with and for Large Language Models Chris Xiaoxuan Lu Samuel Holt Claudio Fanconi Alex J. Chan Jakob Foerster M. Schaar R. T. Lange OffRL 114 18 0 12 Jun 2024
Speeding up Policy Simulation in Supply Chain RL Vivek Farias Joren Gijsbrechts Aryan I. Khojandi Tianyi Peng A. Zheng 101 0 0 04 Jun 2024
Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning Jonathan Cook Chris Xiaoxuan Lu Edward Hughes Joel Z Leibo Jakob N. Foerster 84 6 0 01 Jun 2024
Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning Maximilian Nägele Jan Olle Thomas Fösel Remmy Zen Florian Marquardt 135 2 0 22 May 2024
Preparing for Black Swans: The Antifragility Imperative for Machine Learning Ming Jin 114 2 0 18 May 2024
Searching Search Spaces: Meta-evolving a Geometric Encoding for Neural Networks Tarek Kunze Paul Templier Dennis G. Wilson 68 0 0 20 Mar 2024
JaxUED: A simple and useable UED library in Jax Samuel Coward Michael Beukman Jakob Foerster 75 5 0 19 Mar 2024
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation Carmelo Sferrazza Dun-Ming Huang Xingyu Lin Youngwoon Lee Pieter Abbeel 128 48 0 15 Mar 2024
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning Michael T. Matthews Michael Beukman Benjamin Ellis Mikayel Samvelyan Matthew Jackson Samuel Coward Jakob Foerster OffRL 98 31 0 26 Feb 2024
Refining Minimax Regret for Unsupervised Environment Design Michael Beukman Samuel Coward Michael T. Matthews Mattie Fellows Minqi Jiang Michael Dennis Jakob Foerster 93 8 0 19 Feb 2024
Discovering Temporally-Aware Reinforcement Learning Algorithms Matthew Jackson Chris Xiaoxuan Lu Louis Kirsch R. T. Lange Shimon Whiteson Jakob N. Foerster 110 18 0 08 Feb 2024
Learning mirror maps in policy mirror descent Carlo Alfano Sebastian Towers Silvia Sapora Chris Xiaoxuan Lu Patrick Rebeschini 63 0 0 07 Feb 2024
The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games Jake Levi Chris Xiaoxuan Lu Timon Willi Christian Schroeder de Witt Jakob N. Foerster 57 0 0 02 Feb 2024
Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms Pengyi Li Jianye Hao Hongyao Tang Xian Fu Yan Zheng Ke Tang 119 13 0 22 Jan 2024
XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX Alexander Nikulin Vladislav Kurenkov Ilya Zisman Artem Agarkov Viacheslav Sinii Sergey Kolesnikov 116 30 0 19 Dec 2023
minimax: Efficient Baselines for Autocurricula in JAX Minqi Jiang Michael Dennis Edward Grefenstette Tim Rocktaschel 76 9 0 21 Nov 2023
Deep Model Predictive Optimization Jacob Sacks Rwik Rana Kevin Huang Alex Spitzer Guanya Shi Byron Boots 91 7 0 06 Oct 2023
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design Matthew Jackson Minqi Jiang Jack Parker-Holder Risto Vuorio Chris Xiaoxuan Lu Gregory Farquhar Shimon Whiteson Jakob N. Foerster OOD 64 9 0 04 Oct 2023
JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading Sascha Frey Kang Li Peer Nagy Silvia Sapora Chris Xiaoxuan Lu S. Zohren Jakob N. Foerster Anisoara Calinescu 76 15 0 25 Aug 2023