1902.04257
Cited By
Deep Reinforcement Learning from Policy-Dependent Human Feedback
12 February 2019
Dilip Arumugam
Jun Ki Lee
S. Saskin
Michael L. Littman
Papers citing
"Deep Reinforcement Learning from Policy-Dependent Human Feedback"
26 / 26 papers shown
PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation
Yuxuan Liu
45
0
0
03 Mar 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
105
18
0
17 Jan 2025
CREW: Facilitating Human-AI Teaming Research
Lingyu Zhang
Zhengran Ji
Boyuan Chen
44
3
0
03 Jan 2025
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
Fatemeh Pesaran Zadeh
Juyeon Kim
Jin-Hwa Kim
Gunhee Kim
ALM
51
1
0
05 Oct 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
40
0
0
09 Jul 2024
Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
Md Saiful Islam
Srijita Das
S. Gottipati
William Duguay
Clodéric Mars
Jalal Arabneydi
Antoine Fagette
Matthew J. Guzdial
Matthew E. Taylor
38
1
0
23 Dec 2023
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
Rohin Shah
23
6
0
05 Dec 2023
Continual Learning for Instruction Following from Realtime Feedback
Alane Suhr
Yoav Artzi
26
17
0
19 Dec 2022
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
Katherine Metcalf
Miguel Sarabia
B. Theobald
OffRL
38
4
0
12 Nov 2022
Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars
Mingze Wang
Ziyang Zhang
Grace Hui Yang
21
1
0
21 Jun 2022
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
Jensen Gao
S. Reddy
Glen Berseth
Nicholas Hardy
N. Natraj
K. Ganguly
Anca Dragan
Sergey Levine
20
10
0
04 Mar 2022
ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
S. Chen
Jensen Gao
S. Reddy
Glen Berseth
Anca Dragan
Sergey Levine
OffRL
33
11
0
05 Feb 2022
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
40
93
0
04 Nov 2021
Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation
Eugenio Chisari
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
Abhinav Valada
19
37
0
07 Oct 2021
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
Xiaofei Wang
Kimin Lee
Kourosh Hakhamaneshi
Pieter Abbeel
Michael Laskin
17
42
0
11 Aug 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima
Alane Suhr
Yoav Artzi
25
24
0
10 Aug 2021
Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
Ruohan Zhang
F. Torabi
Garrett A. Warnell
Peter Stone
73
28
0
13 Jul 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
11
93
0
27 May 2021
An overview of 11 proposals for building safe advanced AI
Evan Hubinger
AAML
22
23
0
04 Dec 2020
Avoiding Tampering Incentives in Deep RL via Decoupled Approval
J. Uesato
Ramana Kumar
Victoria Krakovna
Tom Everitt
Richard Ngo
Shane Legg
26
14
0
17 Nov 2020
Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning
Adam Bignold
Francisco Cruz
Richard Dazeley
Peter Vamplew
Cameron Foale
22
18
0
21 Sep 2020
Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
L. Guan
Mudit Verma
Sihang Guo
Ruohan Zhang
Subbarao Kambhampati
43
42
0
26 Jun 2020
Learning to Interactively Learn and Assist
Mark P. Woodward
Chelsea Finn
Karol Hausman
13
33
0
24 Jun 2019
Robot Learning via Human Adversarial Games
Jiali Duan
Qian Wang
Lerrel Pinto
C.-C. Jay Kuo
Stefanos Nikolaidis
AAML
SSL
14
7
0
02 Mar 2019
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
34
395
0
19 Nov 2018
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012