ResearchTrend.AI
KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

18 February 2020
Peng Zhang
Jianye Hao
Weixun Wang
Hongyao Tang
Yi Ma
Yihai Duan
Yan Zheng
OffRL OnRL

Papers citing "KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge"

10 / 10 papers shown
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
61
630
0
17 Aug 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
544
19,296
0
20 Jul 2017
Deep Q-learning from Demonstrations
Todd Hester
Matej Vecerík
Olivier Pietquin
Marc Lanctot
Tom Schaul
...
Gabriel Dulac-Arnold
Ian Osband
J. Agapiou
Joel Z Leibo
A. Gruslys
OffRL
70
156
0
12 Apr 2017
HyperNetworks
David R Ha
Andrew M. Dai
Quoc V. Le
170
1,633
0
27 Sep 2016
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
159
3,125
0
10 Jun 2016
Harnessing Deep Neural Networks with Logic Rules
Zhiting Hu
Xuezhe Ma
Zhengzhong Liu
Eduard H. Hovy
Eric Xing
AI4CE NAI
90
614
0
21 Mar 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
129
3,439
0
08 Jun 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
279
6,801
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,364
0
22 Dec 2014
Natural Language Processing (almost) from Scratch
R. Collobert
Jason Weston
Léon Bottou
Michael Karlen
Koray Kavukcuoglu
Pavel P. Kuksa
203
7,729
0
02 Mar 2011