arXiv:1707.01495
Hindsight Experience Replay
5 July 2017
Marcin Andrychowicz
Dwight Crow
Alex Ray
Jonas Schneider
Rachel Fong
Peter Welinder
Bob McGrew
Joshua Tobin
Pieter Abbeel
Wojciech Zaremba
    OffRL

Papers citing "Hindsight Experience Replay"

50 / 1,267 papers shown
Neural Lyapunov Function Approximation with Self-Supervised Reinforcement Learning
Luc McCutcheon
Bahman Gharesifard
Saber Fallah
90
0
0
19 Mar 2025
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
Kevin Wang
Ishaan Javali
Michał Bortkiewicz
Tomasz Trzciński
Benjamin Eysenbach
SSL, OffRL
124
2
0
19 Mar 2025
Hierarchical Reinforcement Learning for Safe Mapless Navigation with Congestion Estimation
Jianqi Gao
Xizheng Pang
Qi Liu
Yanjie Li
101
0
0
15 Mar 2025
LUMOS: Language-Conditioned Imitation Learning with World Models
Iman Nematollahi
Branton DeMoss
Akshay L Chandra
Nick Hawes
Wolfram Burgard
Ingmar Posner
OffRL
69
1
0
13 Mar 2025
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
Ruizhe Chen
Wenhao Chai
Zhifei Yang
Xiaotian Zhang
Qiufeng Wang
Tony Q.S. Quek
Soujanya Poria
Zuozhu Liu
133
1
0
06 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
177
1
0
05 Mar 2025
Causality-Based Reinforcement Learning Method for Multi-Stage Robotic Tasks
Jiechao Deng
Ning Tan
90
0
0
05 Mar 2025
ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment
Shaofei Cai
Zhancun Mu
Hoang Trung-Dung
Yitao Liang
87
6
0
04 Mar 2025
M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
Ziyan Wang
Zhicheng Zhang
Fei Fang
Yali Du
117
3
0
03 Mar 2025
Sentence-level Reward Model can Generalize Better for Aligning LLM from Human Preference
Wenjie Qiu
Yi-Chen Li
Xuqin Zhang
Tianyi Zhang
Yiming Zhang
Zongzhang Zhang
Yang Yu
ALM
109
1
0
01 Mar 2025
A Simulation Pipeline to Facilitate Real-World Robotic Reinforcement Learning Applications
Jefferson Silveira
Joshua A. Marshall
Sidney N. Givigi Jr
122
0
0
24 Feb 2025
Training a Generally Curious Agent
Fahim Tajwar
Yiding Jiang
Abitha Thankaraj
Sumaita Sadia Rahman
J. Zico Kolter
Jeff Schneider
Ruslan Salakhutdinov
237
3
0
24 Feb 2025
Theoretical Barriers in Bellman-Based Reinforcement Learning
Brieuc Pinon
Raphaël Jungers
Jean-Charles Delvenne
63
0
0
17 Feb 2025
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
Vivek Myers
Bill Chunyuan Zheng
Anca Dragan
Kuan Fang
Sergey Levine
200
1
0
08 Feb 2025
Search-Based Adversarial Estimates for Improving Sample Efficiency in Off-Policy Reinforcement Learning
Federico Malato
Ville Hautamaki
72
1
0
03 Feb 2025
Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning
Kaixi Bao
Chenhao Li
Yarden As
Andreas Krause
Marco Hutter
OffRL, CLL
259
1
0
03 Feb 2025
Upside Down Reinforcement Learning with Policy Generators
Jacopo Di Ventura
Dylan R. Ashley
Vincent Herrmann
Francesco Faccio
Jürgen Schmidhuber
106
0
0
27 Jan 2025
Adaptive Data Exploitation in Deep Reinforcement Learning
Mingqi Yuan
Bo Li
Xin Jin
Wenjun Zeng
OffRL
457
0
0
22 Jan 2025
Pareto Set Learning for Multi-Objective Reinforcement Learning
Erlong Liu
Yu-Chang Wu
Xiaobin Huang
Chengrui Gao
Ren-Jian Wang
Ke Xue
Chao Qian
OffRL
235
2
0
12 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
119
2
0
07 Jan 2025
Attribute-Based Robotic Grasping with Data-Efficient Adaptation
Yang Yang
Houjian Yu
Xibai Lou
Yuanhao Liu
Changhyun Choi
111
9
0
04 Jan 2025
DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning
Utsav Singh
Souradip Chakraborty
Wesley A Suttle
Brian M. Sadler
Vinay P. Namboodiri
Amrit Singh Bedi
OffRL
110
0
0
03 Jan 2025
Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning
Anthony Kobanda
Rémy Portelas
Odalric-Ambrym Maillard
Ludovic Denoyer
OffRL, CLL
176
1
0
19 Dec 2024
Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps
Linfeng Zhao
Lawson L. S. Wong
134
1
0
16 Dec 2024
Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations
Cevahir Köprülü
Po-han Li
Tianyu Qiu
Ruihan Zhao
T. Westenbroek
David Fridovich-Keil
Sandeep Chinchali
Ufuk Topcu
OffRL
124
0
0
02 Dec 2024
Umbrella Reinforcement Learning -- computationally efficient tool for hard non-linear problems
Egor E. Nuzhin
Nikolai V. Brilliantov
98
1
0
21 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
68
2
0
05 Nov 2024
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong
Arvind V. Mahankali
Tengyu Ma
ReLM, LRM
102
7
0
04 Nov 2024
Learning World Models for Unconstrained Goal Navigation
Yuanlin Duan
Wensen Mao
He Zhu
60
1
0
03 Nov 2024
Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
Yuanlin Duan
Guofeng Cui
He Zhu
OffRL
120
0
0
03 Nov 2024
Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction
Utsav Singh
Souradip Chakraborty
Wesley A Suttle
Brian M. Sadler
Anit Kumar Sahu
Mubarak Shah
Vinay P. Namboodiri
Amrit Singh Bedi
131
1
0
01 Nov 2024
Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning
Beyazit Yalcinkaya
Niklas Lauffer
Marcell Vazquez-Chanlatte
Sanjit A. Seshia
AI4CE
141
6
0
31 Oct 2024
Maximum Entropy Hindsight Experience Replay
Douglas C. Crowder
Matthew L. Trappett
Darrien M. McKenzie
Frances S. Chance
61
0
0
31 Oct 2024
Efficient Diversity-based Experience Replay for Deep Reinforcement Learning
Kaiyan Zhao
Yiming Wang
Yuyang Chen
Yan Li
Leong Hou U
Xiaoguang Niu
119
1
0
27 Oct 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
148
29
0
26 Oct 2024
SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions
Zizhao Wang
Jiaheng Hu
Caleb Chuck
Stephen Chen
Roberto Martín-Martín
Amy Zhang
S. Niekum
Peter Stone
OffRL
90
1
0
24 Oct 2024
Safe Load Balancing in Software-Defined-Networking
L. Dinh
Pham Tran Anh Quang
Jérémie Leguay
62
0
0
22 Oct 2024
Interpretable end-to-end Neurosymbolic Reinforcement Learning agents
Nils Grandien
Quentin Delfosse
Kristian Kersting
OffRL
75
2
0
18 Oct 2024
Novelty-based Sample Reuse for Continuous Robotics Control
Ke Duan
Kai Yang
Houde Liu
Xueqian Wang
77
0
0
17 Oct 2024
SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling
Loris Gaven
Clément Romac
Thomas Carta
Sylvain Lamprier
Olivier Sigaud
Pierre-Yves Oudeyer
LLMAG, OffRL
47
3
0
16 Oct 2024
Potential-Based Intrinsic Motivation: Preserving Optimality With Complex, Non-Markovian Shaping Rewards
Grant C. Forbes
Leonardo Villalobos-Arias
Jianxun Wang
Arnav Jhala
David L. Roberts
77
1
0
16 Oct 2024
The State of Robot Motion Generation
Kostas E. Bekris
Joe H. Doerr
Patrick Meng
Sumanth Tangirala
3DV
89
3
0
16 Oct 2024
Zero-Shot Offline Imitation Learning via Optimal Transport
Thomas Rupf
Marco Bagatella
Nico Gürtler
Jonas Frey
Georg Martius
OffRL
441
0
0
11 Oct 2024
Effective Exploration Based on the Structural Information Principles
Xianghua Zeng
Hao Peng
Angsheng Li
62
2
0
09 Oct 2024
Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation
Paul Jansonnie
Bingbing Wu
Julien Perez
Jan Peters
SSL
51
1
0
07 Oct 2024
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control
Ehsan Futuhi
Shayan Karimi
Chao Gao
Martin Müller
104
1
0
07 Oct 2024
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
Yun Qu
Boyuan Wang
Yuhang Jiang
Jianzhun Shao
Yixiu Mao
Cheems Wang
Chang Liu
Xiangyang Ji
131
5
0
03 Oct 2024
Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning
Alicia Li
Nishanth Kumar
Tomás Lozano-Pérez
Leslie Kaelbling
OffRL
91
0
0
28 Sep 2024
Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale
Tianyue Ou
Frank F. Xu
Aman Madaan
J. Liu
Robert Lo
Abishek Sridhar
Sudipta Sengupta
Dan Roth
Graham Neubig
Shuyan Zhou
OffRL
89
15
0
24 Sep 2024
Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC
Aleksi Mäki-Penttilä
Naeim Ebrahimi Toulkani
Reza Ghabcheloo
88
0
0
24 Sep 2024