2506.13741
Cited By
PB$^2$: Preference Space Exploration via Population-Based Methods in Preference-Based Reinforcement Learning
16 June 2025
Brahim Driss
Alex Davey
Riad Akrour
Papers citing "PB$^2$: Preference Space Exploration via Population-Based Methods in Preference-Based Reinforcement Learning"
12 / 12 papers shown
TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations
Shuaiyi Huang
Mara Levy
Anubhav Gupta
Daniel Ekpo
Ruijie Zheng
Abhinav Shrivastava
62
1
0
09 May 2025
JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading
Sascha Frey
Kang Li
Peer Nagy
Silvia Sapora
Chris Xiaoxuan Lu
S. Zohren
Jakob N. Foerster
Anisoara Calinescu
50
15
0
25 Aug 2023
Query-Policy Misalignment in Preference-Based Reinforcement Learning
Xiao Hu
Jianxiong Li
Xianyuan Zhan
Qing-Shan Jia
Ya Zhang
73
9
0
27 May 2023
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
Katherine Metcalf
Miguel Sarabia
B. Theobald
OffRL
72
5
0
12 Nov 2022
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
96
100
0
04 Nov 2021
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Kimin Lee
Laura M. Smith
Pieter Abbeel
OffRL
65
288
0
09 Jun 2021
Discovering Diverse Solutions in Deep Reinforcement Learning by Maximizing State-Action-Based Mutual Information
Takayuki Osa
Voot Tangkaratt
Masashi Sugiyama
44
33
0
12 Mar 2021
Active Preference-Based Gaussian Process Regression for Reward Learning
Erdem Biyik
Nicolas Huynh
Mykel J. Kochenderfer
Dorsa Sadigh
GP
72
109
0
06 May 2020
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
143
2,449
0
13 Dec 2018
Population Based Training of Neural Networks
Max Jaderberg
Valentin Dalibard
Simon Osindero
Wojciech M. Czarnecki
Jeff Donahue
...
Tim Green
Iain Dunning
Karen Simonyan
Chrisantha Fernando
Koray Kavukcuoglu
93
744
0
27 Nov 2017
Inverse Reward Design
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan
81
399
0
08 Nov 2017
Illuminating search spaces by mapping elites
Jean-Baptiste Mouret
Jeff Clune
89
735
0
20 Apr 2015