From "Thumbs Up" to "10 out of 10": Reconsidering Scalar Feedback in Interactive Reinforcement Learning

17 November 2023
Hang Yu
Reuben M. Aronson
Katherine H. Allen
E. Short

Papers citing "From "Thumbs Up" to "10 out of 10": Reconsidering Scalar Feedback in Interactive Reinforcement Learning"

15 / 15 papers shown
Enhancing Preference-based Linear Bandits via Human Response Time
Shen Li
Yuyang Zhang
Tongzheng Ren
Claire Liang
Na Li
J. Shah
144
1
0
03 Jan 2025
Self-Initiated Open World Learning for Autonomous AI Agents
Bing-Quan Liu
Eric Robertson
Scott Grigsby
Sahisnu Mazumder
AI4CE
72
8
0
21 Oct 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
46
96
0
27 May 2021
On Reward-Free Reinforcement Learning with Linear Function Approximation
Ruosong Wang
S. Du
Lin F. Yang
Ruslan Salakhutdinov
OffRL
73
107
0
19 Jun 2020
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
116
420
0
19 Nov 2018
Reward learning from human preferences and demonstrations in Atari
Borja Ibarz
Jan Leike
Tobias Pohlen
G. Irving
Shane Legg
Dario Amodei
101
397
0
15 Nov 2018
DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback
Riku Arakawa
Sosuke Kobayashi
Y. Unno
Yuta Tsuboi
S. Maeda
51
75
0
28 Oct 2018
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,377
0
12 Jun 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM SSL
125
2,451
0
15 May 2017
Representation Learning and Pairwise Ranking for Implicit Feedback in Recommendation Systems
Sumit Sidana
Mikhail Trofimov
Oleh Horodnytskyi
Charlotte Laclau
Yury Maximov
Massih-Reza Amini
FedML
92
25
0
29 Apr 2017
Interactive Learning from Policy-Dependent Human Feedback
J. MacGlashan
Mark K. Ho
R. Loftin
Bei Peng
Guan Wang
David L. Roberts
Matthew E. Taylor
Michael L. Littman
87
306
0
21 Jan 2017
Variational Intrinsic Control
Karol Gregor
Danilo Jimenez Rezende
Daan Wierstra
DRL OffRL
88
430
0
22 Nov 2016
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
159
3,125
0
10 Jun 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
179
1,484
0
06 Jun 2016
On Wasserstein Two Sample Testing and Related Families of Nonparametric Tests
Aaditya Ramdas
Nicolas García Trillos
Marco Cuturi
71
487
0
08 Sep 2015