2311.10284
From "Thumbs Up" to "10 out of 10": Reconsidering Scalar Feedback in Interactive Reinforcement Learning
17 November 2023
Hang Yu
Reuben M. Aronson
Katherine H. Allen
E. Short
Papers citing "From 'Thumbs Up' to '10 out of 10': Reconsidering Scalar Feedback in Interactive Reinforcement Learning"
15 / 15 papers shown
Enhancing Preference-based Linear Bandits via Human Response Time
Shen Li
Yuyang Zhang
Tongzheng Ren
Claire Liang
Na Li
J. Shah
144
1
0
03 Jan 2025
Self-Initiated Open World Learning for Autonomous AI Agents
Bing-Quan Liu
Eric Robertson
Scott Grigsby
Sahisnu Mazumder
AI4CE
72
8
0
21 Oct 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
46
96
0
27 May 2021
On Reward-Free Reinforcement Learning with Linear Function Approximation
Ruosong Wang
S. Du
Lin F. Yang
Ruslan Salakhutdinov
OffRL
73
107
0
19 Jun 2020
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
116
420
0
19 Nov 2018
Reward learning from human preferences and demonstrations in Atari
Borja Ibarz
Jan Leike
Tobias Pohlen
G. Irving
Shane Legg
Dario Amodei
101
397
0
15 Nov 2018
DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback
Riku Arakawa
Sosuke Kobayashi
Y. Unno
Yuta Tsuboi
S. Maeda
51
75
0
28 Oct 2018
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,377
0
12 Jun 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
125
2,451
0
15 May 2017
Representation Learning and Pairwise Ranking for Implicit Feedback in Recommendation Systems
Sumit Sidana
Mikhail Trofimov
Oleh Horodnytskyi
Charlotte Laclau
Yury Maximov
Massih-Reza Amini
FedML
92
25
0
29 Apr 2017
Interactive Learning from Policy-Dependent Human Feedback
J. MacGlashan
Mark K. Ho
R. Loftin
Bei Peng
Guan Wang
David L. Roberts
Matthew E. Taylor
Michael L. Littman
87
306
0
21 Jan 2017
Variational Intrinsic Control
Karol Gregor
Danilo Jimenez Rezende
Daan Wierstra
DRL
OffRL
88
430
0
22 Nov 2016
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
159
3,125
0
10 Jun 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
179
1,484
0
06 Jun 2016
On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests
Aaditya Ramdas
Nicolas García Trillos
Marco Cuturi
71
487
0
08 Sep 2015