Robust Reinforcement Learning from Corrupted Human Feedback

21 June 2024

Alexander Bukharin

Tuo Zhao

Papers citing "Robust Reinforcement Learning from Corrupted Human Feedback"

Title | Authors | Tags | Counts | Date
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences | Jie Cheng, Gang Xiong, Xingyuan Dai, Qinghai Miao, Yisheng Lv, Fei-Yue Wang | | 49 17 0 | 27 Feb 2024
Corruption Robust Offline Reinforcement Learning with Human Feedback | Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović | OffRL | 57 11 0 | 09 Feb 2024
A General Theoretical Paradigm to Understand Learning from Human Preferences | M. G. Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos | | 83 580 0 | 18 Oct 2023
Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms | Alexander Bukharin, Yan Li, Yue Yu, Qingru Zhang, Zhehui Chen, Simiao Zuo, Chao Zhang, Songan Zhang, Tuo Zhao | OOD AAML | 37 18 0 | 16 Oct 2023
Reward Model Ensembles Help Mitigate Overoptimization | Thomas Coste, Usman Anwar, Robert Kirk, David M. Krueger | NoLa ALM | 37 126 0 | 04 Oct 2023
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback | Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto | ALM | 69 569 0 | 22 May 2023
B-Pref: Benchmarking Preference-Based Reinforcement Learning | Kimin Lee, Laura M. Smith, Anca Dragan, Pieter Abbeel | OffRL | 60 95 0 | 04 Nov 2021
A Universal Law of Robustness via Isoperimetry | Sébastien Bubeck, Mark Sellke | | 25 215 0 | 26 May 2021
Smooth Exploration for Robotic Reinforcement Learning | Antonin Raffin, Jens Kober, F. Stulp | | 50 57 0 | 12 May 2020
Proximal Policy Optimization Algorithms | John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov | OffRL | 183 18,685 0 | 20 Jul 2017
A General and Adaptive Robust Loss Function | Jonathan T. Barron | OOD DRL | 64 536 0 | 11 Jan 2017
Adversarial Machine Learning at Scale | Alexey Kurakin, Ian Goodfellow, Samy Bengio | AAML | 425 3,124 0 | 04 Nov 2016