Chain of Thought Imitation with Procedure Cloning


22 May 2022
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
    OffRL

Papers citing "Chain of Thought Imitation with Procedure Cloning"

50 / 62 papers shown
RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs
Soumya Rani Samineni
Durgesh Kalwar
Karthik Valmeekam
Kaya Stechly
Subbarao Kambhampati
OffRL
97
1
0
19 May 2025
Teaching Large Language Models to Reason through Learning and Forgetting
Tianwei Ni
Allen Nie
Sapana Chaudhary
Yao Liu
Huzefa Rangwala
Rasool Fakoor
ReLM CLL LRM
461
0
0
15 Apr 2025
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM LRM
168
2
0
14 Oct 2024
HoneyGPT: Breaking the Trilemma in Terminal Honeypots with Large Language Model
Ziyang Wang
Jianzhou You
Haining Wang
Tianwei Yuan
Shichao Lv
Yang Wang
Limin Sun
81
2
0
04 Jun 2024
STaR: Bootstrapping Reasoning With Reasoning
E. Zelikman
Yuhuai Wu
Jesse Mu
Noah D. Goodman
ReLM LRM
144
508
0
28 Mar 2022
Implicit Kinematic Policies: Unifying Joint and Cartesian Action Spaces in End-to-End Robot Learning
Aditya Ganapathi
Peter R. Florence
Jacob Varley
Kaylee Burns
Ken Goldberg
Andy Zeng
185
17
0
03 Mar 2022
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
76
209
0
11 Feb 2022
Can Wikipedia Help Offline Reinforcement Learning?
Machel Reid
Yutaro Yamada
S. Gu
3DV RALM OffRL
215
96
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro LRM AI4CE ReLM
843
9,644
0
28 Jan 2022
Tell me why! Explanations support learning relational and causal structure
Andrew Kyle Lampinen
Nicholas A. Roy
Ishita Dasgupta
Stephanie C. Y. Chan
Allison C. Tam
...
Chen Yan
Adam Santoro
Neil C. Rabinowitz
Jane X. Wang
Felix Hill
100
46
0
07 Dec 2021
Show Your Work: Scratchpads for Intermediate Computation with Language Models
Maxwell Nye
Anders Andreassen
Guy Gur-Ari
Henryk Michalewski
Jacob Austin
...
Aitor Lewkowycz
Maarten Bosma
D. Luan
Charles Sutton
Augustus Odena
ReLM LRM
183
753
0
30 Nov 2021
Generalized Decision Transformer for Offline Hindsight Information Matching
Hiroki Furuta
Y. Matsuo
S. Gu
OffRL
72
103
0
19 Nov 2021
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data
Mengjiao Yang
Sergey Levine
Ofir Nachum
OffRL
82
42
0
27 Oct 2021
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM OffRL LRM
342
4,569
0
27 Oct 2021
RuleBert: Teaching Soft Rules to Pre-trained Language Models
Mohammed Saeed
N. Ahmadi
Preslav Nakov
Paolo Papotti
LRM
312
33
0
24 Sep 2021
Multi-Task Learning with Sequence-Conditioned Transporter Networks
M. H. Lim
Andy Zeng
Brian Ichter
Maryam Bandari
Erwin Coumans
Claire Tomlin
S. Schaal
Aleksandra Faust
54
15
0
15 Sep 2021
Implicit Behavioral Cloning
Peter R. Florence
Corey Lynch
Andy Zeng
Oscar Ramirez
Ayzaan Wahid
Laura Downs
Adrian S. Wong
Johnny Lee
Igor Mordatch
Jonathan Tompson
OffRL
119
390
0
01 Sep 2021
Hierarchical Few-Shot Imitation with Skill Transition Models
Kourosh Hakhamaneshi
Ruihan Zhao
Albert Zhan
Pieter Abbeel
Michael Laskin
OffRL
77
42
0
19 Jul 2021
Visual Adversarial Imitation Learning using Variational Models
Rafael Rafailov
Tianhe Yu
Aravind Rajeswaran
Chelsea Finn
SSL
81
50
0
16 Jul 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Michael Janner
Qiyang Li
Sergey Levine
OffRL
158
685
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
136
1,658
0
02 Jun 2021
Provable Representation Learning for Imitation with Contrastive Fourier Features
Ofir Nachum
Mengjiao Yang
SSL OffRL
86
39
0
26 May 2021
Representation Matters: Offline Pretraining for Sequential Decision Making
Mengjiao Yang
Ofir Nachum
SSL OffRL
78
119
0
11 Feb 2021
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
Anurag Ajay
Aviral Kumar
Pulkit Agrawal
Sergey Levine
Ofir Nachum
OffRL OnRL
87
159
0
26 Oct 2020
Learning Quadrupedal Locomotion over Challenging Terrain
Joonho Lee
Jemin Hwangbo
Lorenz Wellhausen
V. Koltun
Marco Hutter
142
1,176
0
21 Oct 2020
Learning Invariant Representations for Reinforcement Learning without Reconstruction
Amy Zhang
R. McAllister
Roberto Calandra
Y. Gal
Sergey Levine
OOD SSL
114
478
0
18 Jun 2020
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM LRM
80
38
0
11 Jun 2020
Acme: A Research Framework for Distributed Reinforcement Learning
Matthew W. Hoffman
Bobak Shahriari
John Aslanides
Gabriel Barth-Maron
Nikola Momchev
...
Srivatsan Srinivasan
A. Cowie
Ziyun Wang
Bilal Piot
Nando de Freitas
120
226
0
01 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
873
42,379
0
28 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP OffRL
229
1,381
0
15 Apr 2020
Provable Representation Learning for Imitation Learning via Bi-level Optimization
Sanjeev Arora
S. Du
Sham Kakade
Yuping Luo
Nikunj Saunshi
72
61
0
24 Feb 2020
Transformers as Soft Reasoners over Language
Peter Clark
Oyvind Tafjord
Kyle Richardson
ReLM OffRL LRM
106
360
0
14 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
611
4,905
0
23 Jan 2020
Learning by Cheating
Dian Chen
Brady Zhou
V. Koltun
Philipp Krahenbuhl
SSL
112
517
0
27 Dec 2019
Leveraging Procedural Generation to Benchmark Reinforcement Learning
K. Cobbe
Christopher Hesse
Jacob Hilton
John Schulman
79
557
0
03 Dec 2019
IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data
Ajay Mandlekar
Fabio Ramos
Byron Boots
Silvio Savarese
Li Fei-Fei
Animesh Garg
Dieter Fox
OffRL
91
119
0
13 Nov 2019
Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto
H. F. Song
Jack W. Rae
Razvan Pascanu
Çağlar Gülçehre
...
Aidan Clark
Seb Noury
M. Botvinick
N. Heess
R. Hadsell
OffRL
91
366
0
13 Oct 2019
Goal-conditioned Imitation Learning
Yiming Ding
Carlos Florensa
Mariano Phielipp
Pieter Abbeel
67
227
0
13 Jun 2019
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Nazneen Rajani
Bryan McCann
Caiming Xiong
R. Socher
ReLM LRM
84
566
0
06 Jun 2019
Quantifying Generalization in Reinforcement Learning
K. Cobbe
Oleg Klimov
Christopher Hesse
Taehoon Kim
John Schulman
OffRL
109
674
0
06 Dec 2018
Generalization and Regularization in DQN
Jesse Farebrother
Marlos C. Machado
Michael Bowling
96
207
0
29 Sep 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL OnRL
59
391
0
18 Apr 2018
Hierarchical Imitation and Reinforcement Learning
Hoang Minh Le
Nan Jiang
Alekh Agarwal
Miroslav Dudík
Yisong Yue
Hal Daumé
59
192
0
01 Mar 2018
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
Yuke Zhu
Ziyun Wang
J. Merel
Andrei A. Rusu
Tom Erez
...
S. Tunyasuvunakool
János Kramár
R. Hadsell
Nando de Freitas
N. Heess
SSL
96
320
0
26 Feb 2018
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
...
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
153
1,782
0
05 Dec 2017
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
Danfei Xu
Suraj Nair
Yuke Zhu
J. Gao
Animesh Garg
Li Fei-Fei
Silvio Savarese
75
197
0
04 Oct 2017
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado
Marc G. Bellemare
Erik Talvitie
J. Veness
Matthew J. Hausknecht
Michael Bowling
94
557
0
18 Sep 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
778
132,363
0
12 Jun 2017
Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems
Wang Ling
Dani Yogatama
Chris Dyer
Phil Blunsom
AIMat
106
735
0
11 May 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
78
311
0
22 Mar 2017