ResearchTrend.AI
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

3 June 2022
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron C. Courville
Marc G. Bellemare
    OffRL
    OnRL

Papers citing "Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress"

50 / 51 papers shown
Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation
Donghoon Lee
Tung M. Luu
Younghwan Lee
Chang D. Yoo
OffRL
VLM
11
0
0
16 May 2025
Refined Policy Distillation: From VLA Generalists to RL Experts
Tobias Jülg
Wolfram Burgard
Florian Walter
OffRL
39
1
0
06 Mar 2025
Skill Expansion and Composition in Parameter Space
Tenglong Liu
J. Li
Yinan Zheng
Haoyi Niu
Yixing Lan
Xin Xu
Xianyuan Zhan
58
4
0
09 Feb 2025
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy
Yuhui Chen
Shuai Tian
Shugao Liu
Yingting Zhou
Haoran Li
Dongbin Zhao
OffRL
106
1
0
08 Feb 2025
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen
Guangtao Zeng
Zhenting Qi
Zhang-Wei Hong
Zhenfang Chen
Wei Lu
G. Wornell
Subhro Das
David D. Cox
Chuang Gan
LLMAG
LRM
171
6
0
04 Feb 2025
Search-Based Adversarial Estimates for Improving Sample Efficiency in Off-Policy Reinforcement Learning
Federico Malato
Ville Hautamaki
39
0
0
03 Feb 2025
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
Fei Zhao
Xueliang Zhang
36
0
0
25 Dec 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
78
0
0
12 Nov 2024
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson
Qiyang Li
Kevin Frans
Sergey Levine
SSL
OffRL
OnRL
57
0
0
23 Oct 2024
Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach
Henrique Donâncio
Antoine Barrier
Leah F. South
Florence Forbes
25
0
0
16 Oct 2024
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
C. Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
39
1
0
11 Oct 2024
QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
Haozhe Chen
Ang Li
Ethan Che
Tianyi Peng
Jing Dong
Hongseok Namkoong
OffRL
27
0
0
08 Oct 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
40
7
0
25 Sep 2024
Explaining an Agent's Future Beliefs through Temporally Decomposing Future Reward Estimators
Mark Towers
Yali Du
Christopher T. Freeman
Timothy J. Norman
34
0
0
15 Aug 2024
Boosting Soft Q-Learning by Bounding
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
Rahul V. Kulkarni
OffRL
56
2
0
26 Jun 2024
Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences
Takuya Hiraoka
Guanquan Wang
Takashi Onishi
Yoshimasa Tsuruoka
45
0
0
23 May 2024
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
Zelei Cheng
Xian Wu
Jiahao Yu
Sabrina Yang
Gang Wang
Xinyu Xing
OffRL
26
2
0
05 May 2024
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
Yanxiao Zhao
Yangge Qian
Tianyi Wang
Jingyang Shan
Xiaolin Qin
21
0
0
01 Mar 2024
RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences
Jie Cheng
Gang Xiong
Xingyuan Dai
Q. Miao
Yisheng Lv
Fei-Yue Wang
33
15
0
27 Feb 2024
Improving a Proportional Integral Controller with Reinforcement Learning on a Throttle Valve Benchmark
Paul Daoudi
B. Mavkov
Bogdan Robu
Christophe Prieur
Emmanuel Witrant
M. Barlier
Ludovic Dos Santos
28
2
0
21 Feb 2024
In value-based deep reinforcement learning, a pruned network is a good network
J. Obando-Ceron
Aaron C. Courville
Pablo Samuel Castro
OffRL
38
18
0
19 Feb 2024
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning
Shengyi Huang
Quentin Gallouedec
Florian Felten
Antonin Raffin
Rousslan Fernand Julien Dossa
...
Alexander Nikulin
Xiao Hu
Tianlin Liu
Jongwook Choi
Brent Yi
OffRL
29
7
0
05 Feb 2024
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wolczyk
Bartłomiej Cupiał
M. Ostaszewski
Michal Bortkiewicz
Michał Zając
Razvan Pascanu
Łukasz Kuciński
Piotr Miłoś
CLL
48
13
0
05 Feb 2024
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang
Jie Liu
Chuming Li
Yazhe Niu
Yaodong Yang
Yu Liu
Wanli Ouyang
OffRL
OnRL
46
11
0
12 Dec 2023
Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models
Xingyuan Zhang
Philip Becker-Ehmck
Patrick van der Smagt
Maximilian Karl
39
5
0
04 Dec 2023
Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents
Zihao Zhou
Bin-Bin Hu
Chenyang Zhao
Pu Zhang
Bin Liu
LLMAG
29
9
0
22 Nov 2023
Accelerating Exploration with Unlabeled Prior Data
Qiyang Li
Jason Zhang
Dibya Ghosh
Amy Zhang
Sergey Levine
OffRL
OnRL
31
9
0
09 Nov 2023
Fair collaborative vehicle routing: A deep multi-agent reinforcement learning approach
Stephen Mak
Liming Xu
Tim Pearce
Michael Ostroumov
Alexandra Brintrup
29
11
0
26 Oct 2023
Diverse Conventions for Human-AI Collaboration
Bidipta Sarkar
Andy Shih
Dorsa Sadigh
23
5
0
24 Oct 2023
TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models
Zuxin Liu
Jesse Zhang
Kavosh Asadi
Yao Liu
Ding Zhao
Shoham Sabach
Rasool Fakoor
ALM
AI4CE
23
25
0
09 Oct 2023
TGRL: An Algorithm for Teacher Guided Reinforcement Learning
Idan Shenfeld
Zhang-Wei Hong
Aviv Tamar
Pulkit Agrawal
24
12
0
06 Jul 2023
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
Rishabh Agarwal
Nino Vieillard
Yongchao Zhou
Piotr Stańczyk
Sabela Ramos
Matthieu Geist
Olivier Bachem
37
84
0
23 Jun 2023
Genes in Intelligent Agents
Fu Feng
Jing Wang
Xu Yang
Xin Geng
AI4CE
24
6
0
17 Jun 2023
Bigger, Better, Faster: Human-level Atari with human-level efficiency
Max Schwarzer
J. Obando-Ceron
Aaron C. Courville
Marc G. Bellemare
Rishabh Agarwal
Pablo Samuel Castro
OffRL
48
82
0
30 May 2023
On the Value of Myopic Behavior in Policy Reuse
Kang Xu
Chenjia Bai
Shuang Qiu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
29
1
0
28 May 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya-Qin Zhang
OffRL
OnRL
36
19
0
25 May 2023
Deep Reinforcement Learning with Plasticity Injection
Evgenii Nikishin
Junhyuk Oh
Georg Ostrovski
Clare Lyle
Razvan Pascanu
Will Dabney
André Barreto
OffRL
23
49
0
24 May 2023
Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
P. Emedom-Nnamdi
A. Friesen
Bobak Shahriari
Nando de Freitas
Matthew W. Hoffman
OffRL
23
0
0
05 May 2023
Reduce, Reuse, Recycle: Selective Reincarnation in Multi-Agent Reinforcement Learning
Claude Formanek
C. Tilbury
Jonathan P. Shock
Kale-ab Tessera
Arnu Pretorius
29
3
0
31 Mar 2023
Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning
Hassam Sheikh
Mariano Phielipp
OffRL
16
6
0
02 Feb 2023
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Kolby Nottingham
Prithviraj Ammanabrolu
Alane Suhr
Yejin Choi
Hannaneh Hajishirzi
Sameer Singh
Roy Fox
LLMAG
LM&Ro
44
77
0
28 Jan 2023
Policy Adaptation from Foundation Model Feedback
Yuying Ge
Annabella Macaluso
Erran L. Li
Ping Luo
Xiaolong Wang
LM&Ro
27
12
0
14 Dec 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
44
48
0
28 Nov 2022
Pretraining in Deep Reinforcement Learning: A Survey
Zhihui Xie
Zichuan Lin
Junyou Li
Shuai Li
Deheng Ye
OffRL
OnRL
AI4CE
26
23
0
08 Nov 2022
Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning
Zih-Yun Chiu
Yi-Lin Tuan
William Yang Wang
Michael C. Yip
OffRL
25
3
0
07 Oct 2022
Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning
Remo Sasso
M. Sabatelli
M. Wiering
49
9
0
28 May 2022
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery
Michael Laskin
Hao Liu
Xue Bin Peng
Denis Yarats
Aravind Rajeswaran
Pieter Abbeel
SSL
74
65
0
01 Feb 2022
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale
Yao Lu
Karol Hausman
Yevgen Chebotar
Mengyuan Yan
Eric Jang
...
Ted Xiao
A. Irpan
Mohi Khansari
Dmitry Kalashnikov
Sergey Levine
OffRL
89
59
0
09 Nov 2021
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
214
843
0
12 Oct 2021
MLGO: a Machine Learning Guided Compiler Optimizations Framework
Mircea Trofin
Yundi Qian
E. Brevdo
Zinan Lin
K. Choromanski
D. Li
36
62
0
13 Jan 2021