Deep Reinforcement Learning

15 October 2018

Papers citing "Deep Reinforcement Learning"

21 / 521 papers shown

Title | Authors | Tags | Counts | Date
Auto-Encoding Variational Bayes | Diederik P. Kingma, Max Welling | BDL | 455 16,922 0 | 20 Dec 2013
Playing Atari with Deep Reinforcement Learning | Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller | | 129 12,269 0 | 19 Dec 2013
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition | Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell | VLM ObjD | 191 4,951 0 | 06 Oct 2013
Speech Recognition with Deep Recurrent Neural Networks | Alex Graves, Abdel-rahman Mohamed, Geoffrey E. Hinton | | 230 8,526 0 | 22 Mar 2013
Efficient Estimation of Word Representations in Vector Space | Tomas Mikolov, Kai Chen, G. Corrado, J. Dean | 3DV | 691 31,553 0 | 16 Jan 2013
PEGASUS: A Policy Search Method for Large MDPs and POMDPs | A. Ng, Michael I. Jordan | | 113 496 0 | 16 Jan 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents | Marc G. Bellemare, Yavar Naddaf, J. Veness, Michael Bowling | | 120 3,021 0 | 19 Jul 2012
Predictive State Representations: A New Theory for Modeling Dynamical Systems | Satinder Singh, Michael R. James, Matthew R. Rudary | AI4TS AI4CE | 91 289 0 | 11 Jul 2012
Representation Learning: A Review and New Perspectives | Yoshua Bengio, Aaron Courville, Pascal Vincent | OOD SSL | 278 12,460 0 | 24 Jun 2012
Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting | Ning Xie, Hirotaka Hachiya, Masashi Sugiyama | | 83 83 0 | 18 Jun 2012
Q-learning with censored data | Y. Goldberg, Michael R. Kosorok | OffRL | 127 139 0 | 30 May 2012
Off-Policy Actor-Critic | T. Degris, Martha White, R. Sutton | OffRL CML | 230 220 0 | 22 May 2012
Parametric Return Density Estimation for Reinforcement Learning | Tetsuro Morimura, Masashi Sugiyama, H. Kashima, Hirotaka Hachiya, Toshiyuki Tanaka | | 83 112 0 | 15 Mar 2012
Building high-level features using large scale unsupervised learning | Quoc V. Le, Marc'Aurelio Ranzato, R. Monga, M. Devin, Kai Chen, G. Corrado, J. Dean, A. Ng | SSL OffRL CVBM | 122 2,272 0 | 29 Dec 2011
Optimal and Approximate Q-value Functions for Decentralized POMDPs | F. Oliehoek, M. Spaan, N. Vlassis | OffRL | 116 503 0 | 31 Oct 2011
GIB: Imperfect Information in a Computationally Challenging Game | Matthew L. Ginsberg | | 72 148 0 | 03 Jun 2011
From Machine Learning to Machine Reasoning | Léon Bottou | LRM ReLM NAI | 146 285 0 | 09 Feb 2011
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning | Stéphane Ross, Geoffrey J. Gordon, J. Andrew Bagnell | OffRL | 254 3,238 0 | 02 Nov 2010
A Comprehensive Survey of Data Mining-based Fraud Detection Research | Shing-Han Li, D. Yen, Wen-Hui Lu, Chi-Jer Wang | | 97 445 0 | 30 Sep 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation | Lihong Li, Wei Chu, John Langford, Robert Schapire | | 473 2,957 0 | 28 Feb 2010
Search-based Structured Prediction | Hal Daumé, John Langford, Daniel Marcu | GNN | 142 586 0 | 04 Jul 2009