2205.10816
Chain of Thought Imitation with Procedure Cloning
22 May 2022
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
Papers citing "Chain of Thought Imitation with Procedure Cloning"
12 / 62 papers shown
Title
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
Joshua Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel
259 | 2,972 | 0 | 20 Mar 2017

Minimax Regret Bounds for Reinforcement Learning
M. G. Azar, Ian Osband, Rémi Munos
92 | 778 | 0 | 16 Mar 2017

Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg, Volodymyr Mnih, Wojciech M. Czarnecki, Tom Schaul, Joel Z Leibo, David Silver, Koray Kavukcuoglu
SSL
111 | 1,229 | 0 | 16 Nov 2016

Learning to Navigate in Complex Environments
Piotr Wojciech Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andy Ballard, ..., Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, D. Kumaran, R. Hadsell
107 | 880 | 0 | 11 Nov 2016

Playing FPS Games with Deep Reinforcement Learning
Guillaume Lample, Devendra Singh Chaplot
OffRL, EgoV
89 | 587 | 0 | 18 Sep 2016

Value Iteration Networks
Aviv Tamar, Yi Wu, G. Thomas, Sergey Levine, Pieter Abbeel
79 | 654 | 0 | 09 Feb 2016

Neural Programmer-Interpreters
Scott E. Reed, Nando de Freitas
101 | 410 | 0 | 19 Nov 2015

Recurrent Reinforcement Learning: A Hybrid Approach
Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He
OffRL
64 | 77 | 0 | 10 Sep 2015

Massively Parallel Methods for Deep Reinforcement Learning
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, ..., Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver
OffRL, AI4CE, GNN
102 | 504 | 0 | 15 Jul 2015

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
Alekh Agarwal, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, Robert Schapire
OffRL
410 | 510 | 0 | 04 Feb 2014

The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare, Yavar Naddaf, J. Veness, Michael Bowling
120 | 3,021 | 0 | 19 Jul 2012

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Stéphane Ross, Geoffrey J. Gordon, J. Andrew Bagnell
OffRL
244 | 3,233 | 0 | 02 Nov 2010