Self-Imitation Learning

14 June 2018
Junhyuk Oh
Yijie Guo
Satinder Singh
Honglak Lee
    SSL
arXiv: 1806.05635

Papers citing "Self-Imitation Learning"

46 / 46 papers shown
DSADF: Thinking Fast and Slow for Decision Making
Alex Zhihao Dou
Dongfei Cui
Jun Yan
Wei Wang
Benteng Chen
Haoming Wang
Zeke Xie
Shufei Zhang
OffRL
51
1
0
13 May 2025
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
Yu Yue
Yufeng Yuan
Qiying Yu
Xiaochen Zuo
Ruofei Zhu
...
Ru Zhang
Xin Liu
Mingxuan Wang
Yonghui Wu
Lin Yan
OffRL
LRM
56
13
0
07 Apr 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
49
16
0
28 Jan 2025
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL
OnRL
67
0
0
19 Oct 2024
Online Control-Informed Learning
Zihao Liang
Tianyu Zhou
Zehui Lu
Shaoshuai Mou
38
1
0
04 Oct 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
45
0
0
09 Jul 2024
Interpretable Robotic Manipulation from Language
Boyuan Zheng
Jianlong Zhou
Fang Chen
LM&Ro
48
0
0
27 May 2024
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng
Tianhao Hu
Han Xu
Zhisong Zhang
Yong Dai
Lei Han
Nan Du
Xiaolong Li
SyDa
LRM
ReLM
98
30
0
16 Apr 2024
Adaptive trajectory-constrained exploration strategy for deep reinforcement learning
Guojian Wang
Faguo Wu
Xiao Zhang
Ning Guo
Zhiming Zheng
41
3
0
27 Dec 2023
Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards
Alain Andres
Daochen Zha
Javier Del Ser
39
0
0
01 Nov 2023
Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
Jinyi Liu
Yi Ma
Jianye Hao
Yujing Hu
Yan Zheng
Tangjie Lv
Changjie Fan
OffRL
52
2
0
27 Jun 2023
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
28
34
0
19 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
32
0
0
07 Dec 2022
E-MAPP: Efficient Multi-Agent Reinforcement Learning with Parallel Program Guidance
C. Chang
Ni Mu
Jiajun Wu
Ling Pan
Huazhe Xu
50
7
0
05 Dec 2022
Boosting Offline Reinforcement Learning via Data Rebalancing
Yang Yue
Bingyi Kang
Xiao Ma
Zhongwen Xu
Gao Huang
Shuicheng Yan
OffRL
26
22
0
17 Oct 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
69
354
0
17 Jun 2022
Exploration in Deep Reinforcement Learning: A Survey
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
31
324
0
02 May 2022
Learning Generalizable Dexterous Manipulation from Human Grasp Affordance
Yueh-hua Wu
Jiashun Wang
Xiaolong Wang
34
55
0
05 Apr 2022
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act
Alexis Jacq
Johan Ferret
Olivier Pietquin
M. Geist
32
9
0
16 Mar 2022
Learning Category-Level Generalizable Object Manipulation Policy via Generative Adversarial Self-Imitation Learning from Demonstrations
Hao Shen
Weikang Wan
He Wang
SSL
33
24
0
04 Mar 2022
Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL
Rui Yang
Yiming Lu
Wenzhe Li
Hao Sun
Meng Fang
Yali Du
Xiu Li
Lei Han
Chongjie Zhang
OffRL
51
67
0
09 Feb 2022
JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning
Zichuan Lin
Junyou Li
Jianing Shi
Deheng Ye
Qiang Fu
Wei Yang
BDL
45
34
0
07 Dec 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
34
5
0
10 Oct 2021
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
Soojung Yang
Doyeong Hwang
Seul Lee
Seongok Ryu
Sung Ju Hwang
39
67
0
04 Oct 2021
Learning to Superoptimize Real-world Programs
Alex Shypula
Pengcheng Yin
Jeremy Lacomis
Claire Le Goues
Edward N. Schwartz
Graham Neubig
NAI
121
10
0
28 Sep 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
41
93
0
14 Sep 2021
Simplifying Deep Reinforcement Learning via Self-Supervision
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Xia Hu
SSL
49
15
0
10 Jun 2021
Hierarchical Representation Learning for Markov Decision Processes
Lorenzo Steccanella
Simone Totaro
Anders Jonsson
28
4
0
03 Jun 2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
Yevgen Chebotar
Karol Hausman
Yao Lu
Ted Xiao
Dmitry Kalashnikov
...
A. Irpan
Benjamin Eysenbach
Ryan Julian
Chelsea Finn
Sergey Levine
SSL
OffRL
37
146
0
15 Apr 2021
Self-Imitation Learning by Planning
Sha Luo
Hamidreza Kasaei
Lambert Schomaker
SSL
35
85
0
25 Mar 2021
An advantage actor-critic algorithm for robotic motion planning in dense and dynamic scenarios
Chengmin Zhou
Bingding Huang
Pasi Fränti
22
1
0
05 Feb 2021
Asymmetric self-play for automatic goal discovery in robotic manipulation
OpenAI
Matthias Plappert
Raul Sampedro
Tao Xu
Ilge Akkaya
...
Hyeonwoo Noh
Lilian Weng
Qiming Yuan
Casey Chu
Wojciech Zaremba
SSL
82
76
0
13 Jan 2021
BeBold: Exploration Beyond the Boundary of Explored Regions
Tianjun Zhang
Huazhe Xu
Xiaolong Wang
Yi Wu
Kurt Keutzer
Joseph E. Gonzalez
Yuandong Tian
36
40
0
15 Dec 2020
Hierarchical reinforcement learning for efficient exploration and transfer
Lorenzo Steccanella
Simone Totaro
Damien Allonsius
Anders Jonsson
BDL
30
8
0
12 Nov 2020
Learning Abstract Models for Strategic Exploration and Fast Reward Transfer
E. Liu
Ramtin Keramati
Sudarshan Seshadri
Kelvin Guu
Panupong Pasupat
Emma Brunskill
Percy Liang
OffRL
27
5
0
12 Jul 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
SSL
33
24
0
12 Jun 2020
First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
47
352
0
27 Apr 2020
Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
Christian Scheller
Yanick Schraner
Manfred Vogel
29
27
0
12 Mar 2020
Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Benjamin Eysenbach
Xinyang Geng
Sergey Levine
Ruslan Salakhutdinov
OffRL
18
86
0
25 Feb 2020
Population-Guided Parallel Policy Search for Reinforcement Learning
Whiyoung Jung
Giseung Park
Y. Sung
OffRL
24
38
0
09 Jan 2020
Learning to Reach Goals via Iterated Supervised Learning
Dibya Ghosh
Abhishek Gupta
Ashwin Reddy
Justin Fu
Coline Devin
Benjamin Eysenbach
Sergey Levine
32
34
0
12 Dec 2019
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
29
51
0
05 May 2019
ConvLab: Multi-Domain End-to-End Dialog System Platform
Sungjin Lee
Qi Zhu
Ryuichi Takanobu
Xiang Li
Yaoqin Zhang
...
Jinchao Li
Baolin Peng
Xiujun Li
Minlie Huang
Jianfeng Gao
VLM
27
110
0
18 Apr 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Zhijian Zhang
Haozheng Li
Lu Zhang
Tianyin Zheng
Ting Zhang
Xiong Hao
Xiaoxin Chen
Min Chen
Fangxu Xiao
Wei Zhou
17
15
0
23 Jan 2019
Generative Adversarial Self-Imitation Learning
Yijie Guo
Junhyuk Oh
Satinder Singh
Honglak Lee
GAN
29
58
0
03 Dec 2018
Learning Self-Imitating Diverse Policies
Tanmay Gangwani
Qiang Liu
Jian Peng
29
65
0
25 May 2018