ResearchTrend.AI
Deep Reinforcement Learning from Policy-Dependent Human Feedback
arXiv:1902.04257 · 12 February 2019
Dilip Arumugam, Jun Ki Lee, S. Saskin, Michael L. Littman

Papers citing "Deep Reinforcement Learning from Policy-Dependent Human Feedback"

26 / 26 papers shown

 1. PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation
    Yuxuan Liu (03 Mar 2025)
 2. A Comprehensive Survey of Foundation Models in Medicine
    Wasif Khan, Seowung Leem, Kyle B. See, Joshua K. Wong, Shaoting Zhang, R. Fang (17 Jan 2025) [AI4CE, LM&MA, VLM]
 3. CREW: Facilitating Human-AI Teaming Research
    Lingyu Zhang, Zhengran Ji, Boyuan Chen (03 Jan 2025)
 4. Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
    Fatemeh Pesaran Zadeh, Juyeon Kim, Jin-Hwa Kim, Gunhee Kim (05 Oct 2024) [ALM]
 5. Preference-Guided Reinforcement Learning for Efficient Exploration
    Guojian Wang, Faguo Wu, Xiao Zhang, Tianyuan Chen, Xuyang Chen, Lin Zhao (09 Jul 2024)
 6. Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
    Md Saiful Islam, Srijita Das, S. Gottipati, William Duguay, Clodéric Mars, Jalal Arabneydi, Antoine Fagette, Matthew J. Guzdial, Matthew E. Taylor (23 Dec 2023)
 7. BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
    Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah (05 Dec 2023)
 8. Continual Learning for Instruction Following from Realtime Feedback
    Alane Suhr, Yoav Artzi (19 Dec 2022)
 9. Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning
    Katherine Metcalf, Miguel Sarabia, B. Theobald (12 Nov 2022) [OffRL]
10. Incorporating Voice Instructions in Model-Based Reinforcement Learning for Self-Driving Cars
    Mingze Wang, Ziyang Zhang, Grace Hui Yang (21 Jun 2022)
11. X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
    Jensen Gao, S. Reddy, Glen Berseth, Nicholas Hardy, N. Natraj, K. Ganguly, Anca Dragan, Sergey Levine (04 Mar 2022)
12. ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
    S. Chen, Jensen Gao, S. Reddy, Glen Berseth, Anca Dragan, Sergey Levine (05 Feb 2022) [OffRL]
13. B-Pref: Benchmarking Preference-Based Reinforcement Learning
    Kimin Lee, Laura M. Smith, Anca Dragan, Pieter Abbeel (04 Nov 2021) [OffRL]
14. Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation
    Eugenio Chisari, Tim Welschehold, Joschka Boedecker, Wolfram Burgard, Abhinav Valada (07 Oct 2021)
15. Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
    Xiaofei Wang, Kimin Lee, Kourosh Hakhamaneshi, Pieter Abbeel, Michael Laskin (11 Aug 2021)
16. Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
    Noriyuki Kojima, Alane Suhr, Yoav Artzi (10 Aug 2021)
17. Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
    Ruohan Zhang, F. Torabi, Garrett A. Warnell, Peter Stone (13 Jul 2021)
18. A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
    Christian Arzate Cruz, Takeo Igarashi (27 May 2021) [OffRL]
19. An overview of 11 proposals for building safe advanced AI
    Evan Hubinger (04 Dec 2020) [AAML]
20. Avoiding Tampering Incentives in Deep RL via Decoupled Approval
    J. Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg (17 Nov 2020)
21. Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning
    Adam Bignold, Francisco Cruz, Richard Dazeley, Peter Vamplew, Cameron Foale (21 Sep 2020)
22. Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation
    L. Guan, Mudit Verma, Sihang Guo, Ruohan Zhang, Subbarao Kambhampati (26 Jun 2020)
23. Learning to Interactively Learn and Assist
    Mark P. Woodward, Chelsea Finn, Karol Hausman (24 Jun 2019)
24. Robot Learning via Human Adversarial Games
    Jiali Duan, Qian Wang, Lerrel Pinto, C.-C. Jay Kuo, Stefanos Nikolaidis (02 Mar 2019) [AAML, SSL]
25. Scalable agent alignment via reward modeling: a research direction
    Jan Leike, David M. Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg (19 Nov 2018)
26. Off-Policy Actor-Critic
    T. Degris, Martha White, R. Sutton (22 May 2012) [OffRL, CML]