Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1806.05635
Cited By
Self-Imitation Learning
14 June 2018
Junhyuk Oh
Yijie Guo
Satinder Singh
Honglak Lee
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Self-Imitation Learning"
46 / 46 papers shown
Title
DSADF: Thinking Fast and Slow for Decision Making
Alex Zhihao Dou
Dongfei Cui
Jun Yan
Wei Wang
Benteng Chen
Haoming Wang
Zeke Xie
Shufei Zhang
OffRL
51
1
0
13 May 2025
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
Yu Yue
Yufeng Yuan
Qiying Yu
Xiaochen Zuo
Ruofei Zhu
...
Ru Zhang
Xin Liu
Mingxuan Wang
Yonghui Wu
Lin Yan
OffRL
LRM
56
13
0
07 Apr 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
49
16
0
28 Jan 2025
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Online Control-Informed Learning
Zihao Liang
Tianyu Zhou
Zehui Lu
Shaoshuai Mou
38
1
0
04 Oct 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
45
0
0
09 Jul 2024
Interpretable Robotic Manipulation from Language
Boyuan Zheng
Jianlong Zhou
Fang Chen
LM&Ro
48
0
0
27 May 2024
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng
Tianhao Hu
Han Xu
Zhisong Zhang
Yong Dai
Lei Han
Nan Du
Nan Du
Xiaolong Li
SyDa
LRM
ReLM
98
30
0
16 Apr 2024
Adaptive trajectory-constrained exploration strategy for deep reinforcement learning
Guojian Wang
Faguo Wu
Xiao Zhang
Ning Guo
Zhiming Zheng
41
3
0
27 Dec 2023
Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards
Alain Andres
Daochen Zha
Javier Del Ser
39
0
0
01 Nov 2023
Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
Jinyi Liu
Yi Ma
Jianye Hao
Yujing Hu
Yan Zheng
Tangjie Lv
Changjie Fan
OffRL
52
2
0
27 Jun 2023
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
28
34
0
19 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
32
0
0
07 Dec 2022
E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance
C. Chang
Ni Mu
Jiajun Wu
Ling Pan
Huazhe Xu
50
7
0
05 Dec 2022
Boosting Offline Reinforcement Learning via Data Rebalancing
Yang Yue
Bingyi Kang
Xiao Ma
Zhongwen Xu
Gao Huang
Shuicheng Yan
OffRL
26
22
0
17 Oct 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
69
354
0
17 Jun 2022
Exploration in Deep Reinforcement Learning: A Survey
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
31
324
0
02 May 2022
Learning Generalizable Dexterous Manipulation from Human Grasp Affordance
Yueh-hua Wu
Jiashun Wang
Xiaolong Wang
34
55
0
05 Apr 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
M. Geist
32
9
0
16 Mar 2022
Learning Category-Level Generalizable Object Manipulation Policy via Generative Adversarial Self-Imitation Learning from Demonstrations
Hao Shen
Weikang Wan
He Wang
SSL
33
24
0
04 Mar 2022
Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL
Rui Yang
Yiming Lu
Wenzhe Li
Hao Sun
Meng Fang
Yali Du
Xiu Li
Lei Han
Chongjie Zhang
OffRL
51
67
0
09 Feb 2022
JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning
Zichuan Lin
Junyou Li
Jianing Shi
Deheng Ye
Qiang Fu
Wei Yang
BDL
45
34
0
07 Dec 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
34
5
0
10 Oct 2021
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
Soojung Yang
Doyeong Hwang
Seul Lee
Seongok Ryu
Sung Ju Hwang
39
67
0
04 Oct 2021
Learning to Superoptimize Real-world Programs
Alex Shypula
Pengcheng Yin
Jeremy Lacomis
Claire Le Goues
Edward N. Schwartz
Graham Neubig
NAI
121
10
0
28 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
41
93
0
14 Sep 2021
Simplifying Deep Reinforcement Learning via Self-Supervision
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Xia Hu
SSL
49
15
0
10 Jun 2021
Hierarchical Representation Learning for Markov Decision Processes
Lorenzo Steccanella
Simone Totaro
Anders Jonsson
28
4
0
03 Jun 2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
Yevgen Chebotar
Karol Hausman
Yao Lu
Ted Xiao
Dmitry Kalashnikov
...
A. Irpan
Benjamin Eysenbach
Ryan Julian
Chelsea Finn
Sergey Levine
SSL
OffRL
37
146
0
15 Apr 2021
Self-Imitation Learning by Planning
Junhyuk Oh
Yijie Guo
Satinder Singh
SSL
35
85
0
25 Mar 2021
An advantage actor-critic algorithm for robotic motion planning in dense and dynamic scenarios
Chengmin Zhou
Bingding Huang
Pasi Fränti
22
1
0
05 Feb 2021
Asymmetric self-play for automatic goal discovery in robotic manipulation
OpenAI OpenAI
Matthias Plappert
Raul Sampedro
Tao Xu
Ilge Akkaya
...
Hyeonwoo Noh
Lilian Weng
Qiming Yuan
Casey Chu
Wojciech Zaremba
SSL
82
76
0
13 Jan 2021
BeBold: Exploration Beyond the Boundary of Explored Regions
Tianjun Zhang
Huazhe Xu
Xiaolong Wang
Yi Wu
Kurt Keutzer
Joseph E. Gonzalez
Yuandong Tian
36
40
0
15 Dec 2020
Hierarchical reinforcement learning for efficient exploration and transfer
Lorenzo Steccanella
Simone Totaro
Damien Allonsius
Anders Jonsson
BDL
30
8
0
12 Nov 2020
Learning Abstract Models for Strategic Exploration and Fast Reward Transfer
E. Liu
Ramtin Keramati
Sudarshan Seshadri
Kelvin Guu
Panupong Pasupat
Emma Brunskill
Percy Liang
OffRL
27
5
0
12 Jul 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
47
352
0
27 Apr 2020
Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
Christian Scheller
Yanick Schraner
Manfred Vogel
29
27
0
12 Mar 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Benjamin Eysenbach
Xinyang Geng
Sergey Levine
Ruslan Salakhutdinov
OffRL
18
86
0
25 Feb 2020
Population-Guided Parallel Policy Search for Reinforcement Learning
Whiyoung Jung
Giseung Park
Y. Sung
OffRL
24
38
0
09 Jan 2020
Learning to Reach Goals via Iterated Supervised Learning
Dibya Ghosh
Abhishek Gupta
Ashwin Reddy
Justin Fu
Coline Devin
Benjamin Eysenbach
Sergey Levine
32
34
0
12 Dec 2019
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
29
51
0
05 May 2019
ConvLab: Multi-Domain End-to-End Dialog System Platform
Sungjin Lee
Qi Zhu
Ryuichi Takanobu
Xiang Li
Yaoqin Zhang
...
Jinchao Li
Baolin Peng
Xiujun Li
Minlie Huang
Jianfeng Gao
VLM
27
110
0
18 Apr 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
17
15
0
23 Jan 2019
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
29
58
0
03 Dec 2018
Learning Self-Imitating Diverse Policies
Tanmay Gangwani
Qiang Liu
Jian Peng
29
65
0
25 May 2018
1