1706.03741
Deep reinforcement learning from human preferences
12 June 2017
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
Papers citing "Deep reinforcement learning from human preferences"
16 / 216 papers shown
Title
Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences
Erdem Biyik
Dylan P. Losey
Malayandi Palan
Nicholas C. Landolfi
Gleb Shevchuk
Dorsa Sadigh
57
118
0
24 Jun 2020
Deep Q-learning from Demonstrations
Todd Hester
Matej Vecerík
Olivier Pietquin
Marc Lanctot
Tom Schaul
...
Gabriel Dulac-Arnold
Ian Osband
J. Agapiou
Joel Z Leibo
A. Gruslys
OffRL
54
155
0
12 Apr 2017
Third-Person Imitation Learning
Bradly C. Stadie
Pieter Abbeel
Ilya Sutskever
62
235
0
06 Mar 2017
Interactive Learning from Policy-Dependent Human Feedback
J. MacGlashan
Mark K. Ho
R. Loftin
Bei Peng
Guan Wang
David L. Roberts
Matthew E. Taylor
Michael L. Littman
74
305
0
21 Jan 2017
Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn
Tianhe Yu
Justin Fu
Pieter Abbeel
Sergey Levine
OffRL
SSL
84
69
0
01 Dec 2016
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
222
2,389
0
21 Jun 2016
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
140
3,115
0
10 Jun 2016
Cooperative Inverse Reinforcement Learning
Dylan Hadfield-Menell
Anca Dragan
Pieter Abbeel
Stuart J. Russell
69
644
0
09 Jun 2016
Learning Language Games through Interaction
Sida I. Wang
Percy Liang
Christopher D. Manning
56
190
0
08 Jun 2016
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi
Ashish Agarwal
P. Barham
E. Brevdo
Zhiwen Chen
...
Pete Warden
Martin Wattenberg
Martin Wicke
Yuan Yu
Xiaoqiang Zheng
274
11,151
0
14 Mar 2016
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Chelsea Finn
Sergey Levine
Pieter Abbeel
108
949
0
01 Mar 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
197
8,859
0
04 Feb 2016
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
127
12,231
0
19 Dec 2013
APRIL: Active Preference-learning based Reinforcement Learning
R. Akrour
Marc Schoenauer
Michèle Sebag
OffRL
74
128
0
05 Aug 2012
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
117
3,006
0
19 Jul 2012