Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning

25 February 2021
Rui Yang
Jiafei Lyu
Yu Yang
Jiangpeng Yan
Feng Luo
Dijun Luo
Lanqing Li
Xiu Li
Abstract

Multi-goal reinforcement learning is widely applied in planning and robot manipulation. Two main challenges in multi-goal reinforcement learning are sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) aims to tackle both challenges via goal relabeling. However, HER-related methods still require millions of samples and substantial computation. In this paper, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeled returns based on n-step relabeling to improve sample efficiency. Despite the advantages of n-step relabeling, we show theoretically and experimentally that the off-policy n-step bias introduced by n-step relabeling may lead to poor performance in many environments. To address this issue, two bias-reduced MHER algorithms, MHER(λ) and Model-based MHER (MMHER), are presented. MHER(λ) exploits the λ-return, while MMHER benefits from model-based value expansions. Experimental results on numerous multi-goal robotic tasks show that our solutions successfully alleviate off-policy n-step bias and achieve significantly higher sample efficiency than HER and Curriculum-guided HER, with little additional computation beyond HER.
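As a rough illustration of the multi-step targets the abstract describes, the sketch below computes an n-step bootstrapped return along a (relabeled) trajectory and an exponentially weighted λ-style combination of 1..n-step returns. Function names, the discount/λ values, and the weight normalization are assumptions for illustration; the paper's exact MHER(λ) target may differ in details such as normalization and how Q-values are obtained.

```python
import numpy as np

def n_step_relabeled_return(rewards, q_values, n, gamma=0.98):
    """n-step return for the transition at t=0 of a relabeled trajectory:
    sum_{i=0}^{n-1} gamma^i * r_i  +  gamma^n * Q(s_n, a_n, g').
    `rewards` are recomputed w.r.t. the relabeled goal g';
    `q_values[k]` is the critic's estimate at step k under g'."""
    assert len(rewards) >= n and len(q_values) > n
    ret = sum(gamma**i * rewards[i] for i in range(n))
    return ret + gamma**n * q_values[n]

def lambda_style_return(rewards, q_values, n_max, gamma=0.98, lam=0.7):
    """Normalized exponentially weighted average of the 1..n_max step
    returns, mixing short (low-bias) and long (fast-propagating) targets."""
    weights = np.array([lam ** (n - 1) for n in range(1, n_max + 1)])
    weights /= weights.sum()  # normalize so the weights sum to 1
    returns = [n_step_relabeled_return(rewards, q_values, n, gamma)
               for n in range(1, n_max + 1)]
    return float(np.dot(weights, returns))
```

With λ near 0 the target collapses toward the one-step (HER-like) return, which carries little off-policy bias; larger λ puts more weight on longer returns, propagating reward information faster at the cost of more n-step bias, which is the trade-off MHER(λ) tunes.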
