ResearchTrend.AI
Multi-Fidelity Policy Gradient Algorithms
arXiv:2503.05696, v2 (latest)

7 March 2025
Xinjie Liu
Cyrus Neary
Kushagra Gupta
Christian Ellis
Ufuk Topcu
David Fridovich-Keil
    OffRL

Papers citing "Multi-Fidelity Policy Gradient Algorithms"

33 / 33 papers shown
Title
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-An Cheng
Andrey Kolobov
Abhishek Gupta
179
4
0
04 Feb 2025
Adaptive Learning of Design Strategies over Non-Hierarchical Multi-Fidelity Models via Policy Alignment
Akash Agrawal
Christopher McComb
36
1
0
16 Nov 2024
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning
Jiafei Lyu
Kang Xu
Jiacheng Xu
Mengbei Yan
Jingwen Yang
Zongzhang Zhang
Chenjia Bai
Zongqing Lu
Xiaochen Li
OffRL
39
5
0
28 Oct 2024
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models
Jacob Levy
T. Westenbroek
David Fridovich-Keil
102
8
0
11 Oct 2024
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Mark Towers
Ariel Kwiatkowski
Jordan Terry
John U. Balis
Gianluca De Cola
...
Andrea Pierré
Sander Schulhoff
Jun Jet Tai
Hannah Tan
Omar G. Younis
AuLLM OffRL
94
215
0
24 Jul 2024
Multi-Fidelity Reinforcement Learning for Time-Optimal Quadrotor Re-planning
Gilhyun Ryou
Geoffrey Wang
S. Karaman
116
3
0
13 Mar 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM LRM
212
1,289
0
05 Feb 2024
Searching for High-Value Molecules Using Reinforcement Learning and Transformers
Raj Ghugare
Santiago Miret
Adriana Hugessen
Mariano Phielipp
Glen Berseth
99
17
0
04 Oct 2023
Multi-fidelity reinforcement learning framework for shape optimization
Sahil Bhola
Suraj Pawar
Prasanna Balaprakash
R. Maulik
AI4CE
71
24
0
22 Feb 2022
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World
Laura M. Smith
J. Kew
Xue Bin Peng
Sehoon Ha
Jie Tan
Sergey Levine
118
104
0
11 Oct 2021
A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning
Iris A. M. Huijben
W. Kool
Max B. Paulus
Ruud J. G. van Sloun
113
99
0
04 Oct 2021
Coordinate-wise Control Variates for Deep Policy Gradients
Yuanyi Zhong
Yuanshuo Zhou
Jian-wei Peng
BDL
88
1
0
11 Jul 2021
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
Emmanuel Bengio
Moksh Jain
Maksym Korablyov
Doina Precup
Yoshua Bengio
105
340
0
08 Jun 2021
Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey
Wenshuai Zhao
Jorge Peña Queralta
Tomi Westerlund
OffRL
253
743
0
24 Sep 2020
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
810
10,591
0
17 Feb 2020
From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang
Nan Jiang
OffRL
82
24
0
20 Oct 2019
Meta Reinforcement Learning for Sim-to-real Domain Adaptation
Karol Arndt
Murtaza Hazara
Ali Ghadirzadeh
Ville Kyrki
173
106
0
16 Sep 2019
Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods
Ching-An Cheng
Xinyan Yan
Byron Boots
71
22
0
08 Aug 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
129
965
0
19 Jun 2019
BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators
F. Ramos
Rafael Possas
Dieter Fox
62
158
0
04 Jun 2019
Reward-estimation variance elimination in sequential decision processes
S. Pankov
44
5
0
15 Nov 2018
Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience
Yevgen Chebotar
Ankur Handa
Viktor Makoviychuk
Miles Macklin
J. Issac
Nathan D. Ratliff
Dieter Fox
173
508
0
12 Oct 2018
Survey of multifidelity methods in uncertainty propagation, inference, and optimization
Benjamin Peherstorfer
Karen E. Willcox
M. Gunzburger
AI4CE
90
761
0
28 Jun 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines
Cathy Wu
Aravind Rajeswaran
Yan Duan
Vikash Kumar
Alexandre M. Bayen
Sham Kakade
Igor Mordatch
Pieter Abbeel
OffRL
92
153
0
20 Mar 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
112
127
0
27 Feb 2018
Multi-Fidelity Reinforcement Learning with Gaussian Processes
Varun Suryan
Nahush Gondhalekar
Pratap Tokekar
52
3
0
18 Dec 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
163
300
0
31 Oct 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
703
19,377
0
20 Jul 2017
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization
A. Marco
Felix Berkenkamp
Philipp Hennig
Angela P. Schoellig
Andreas Krause
S. Schaal
Sebastian Trimpe
109
128
0
03 Mar 2017
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL BDL
115
345
0
07 Nov 2016
Sim-to-Real Robot Learning from Pixels with Progressive Nets
Andrei A. Rusu
Matej Vecerík
Thomas Rothörl
N. Heess
Razvan Pascanu
R. Hadsell
128
535
0
13 Oct 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
169
3,453
0
08 Jun 2015
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning
Lex Weaver
Nigel Tao
128
249
0
10 Jan 2013