Multi-Fidelity Policy Gradient Algorithms

7 March 2025

Papers citing "Multi-Fidelity Policy Gradient Algorithms"

Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning Patrick Yin Tyler Westenbroek Simran Bagaria Kevin Huang Ching-An Cheng Andrey Kolobov Abhishek Gupta 179 4 0 04 Feb 2025
Adaptive Learning of Design Strategies over Non-Hierarchical Multi-Fidelity Models via Policy Alignment Akash Agrawal Christopher McComb 36 1 0 16 Nov 2024
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning Jiafei Lyu Kang Xu Jiacheng Xu Mengbei Yan Jingwen Yang Zongzhang Zhang Chenjia Bai Zongqing Lu Xiaochen Li OffRL 39 5 0 28 Oct 2024
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models Jacob Levy T. Westenbroek David Fridovich-Keil 102 8 0 11 Oct 2024
Gymnasium: A Standard Interface for Reinforcement Learning Environments Mark Towers Ariel Kwiatkowski Jordan Terry John U. Balis Gianluca De Cola ... Andrea Pierré Sander Schulhoff Jun Jet Tai Hannah Tan Omar G. Younis AuLLM OffRL 94 215 0 24 Jul 2024
Multi-Fidelity Reinforcement Learning for Time-Optimal Quadrotor Re-planning Gilhyun Ryou Geoffrey Wang S. Karaman 116 3 0 13 Mar 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Zhihong Shao Peiyi Wang Qihao Zhu Runxin Xu Jun-Mei Song ... Haowei Zhang Mingchuan Zhang Yiming Li Yu-Huan Wu Daya Guo ReLM LRM 212 1,289 0 05 Feb 2024
Searching for High-Value Molecules Using Reinforcement Learning and Transformers Raj Ghugare Santiago Miret Adriana Hugessen Mariano Phielipp Glen Berseth 99 17 0 04 Oct 2023
Multi-fidelity reinforcement learning framework for shape optimization Sahil Bhola Suraj Pawar Prasanna Balaprakash R. Maulik AI4CE 71 24 0 22 Feb 2022
Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World Laura M. Smith J. Kew Xue Bin Peng Sehoon Ha Jie Tan Sergey Levine 118 104 0 11 Oct 2021
A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning Iris A. M. Huijben W. Kool Max B. Paulus Ruud J. G. van Sloun 113 99 0 04 Oct 2021
Coordinate-wise Control Variates for Deep Policy Gradients Yuanyi Zhong Yuanshuo Zhou Jian-wei Peng BDL 88 1 0 11 Jul 2021
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation Emmanuel Bengio Moksh Jain Maksym Korablyov Doina Precup Yoshua Bengio 105 340 0 08 Jun 2021
Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey Wenshuai Zhao Jorge Peña Queralta Tomi Westerlund OffRL 253 743 0 24 Sep 2020
Decision-Making with Auto-Encoding Variational Bayes Romain Lopez Pierre Boyeau Nir Yosef Michael I. Jordan Jeffrey Regier BDL 810 10,591 0 17 Feb 2020
From Importance Sampling to Doubly Robust Policy Gradient Jiawei Huang Nan Jiang OffRL 82 24 0 20 Oct 2019
Meta Reinforcement Learning for Sim-to-real Domain Adaptation Karol Arndt Murtaza Hazara Ali Ghadirzadeh Ville Kyrki 173 106 0 16 Sep 2019
Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods Ching-An Cheng Xinyan Yan Byron Boots 71 22 0 08 Aug 2019
When to Trust Your Model: Model-Based Policy Optimization Michael Janner Justin Fu Marvin Zhang Sergey Levine OffRL 129 965 0 19 Jun 2019
BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators F. Ramos Rafael Possas Dieter Fox 62 158 0 04 Jun 2019
Reward-estimation variance elimination in sequential decision processes S. Pankov 44 5 0 15 Nov 2018
Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience Yevgen Chebotar Ankur Handa Viktor Makoviychuk Miles Macklin J. Issac Nathan D. Ratliff Dieter Fox 173 508 0 12 Oct 2018
Survey of multifidelity methods in uncertainty propagation, inference, and optimization Benjamin Peherstorfer Karen E. Willcox M. Gunzburger AI4CE 90 761 0 28 Jun 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines Cathy Wu Aravind Rajeswaran Yan Duan Vikash Kumar Alexandre M. Bayen Sham Kakade Igor Mordatch Pieter Abbeel OffRL 92 153 0 20 Mar 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning George Tucker Surya Bhupatiraju S. Gu Richard Turner Zoubin Ghahramani Sergey Levine OffRL 112 127 0 27 Feb 2018
Multi-Fidelity Reinforcement Learning with Gaussian Processes Varun Suryan Nahush Gondhalekar Pratap Tokekar 52 3 0 18 Dec 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation Will Grathwohl Dami Choi Yuhuai Wu Geoffrey Roeder David Duvenaud 163 300 0 31 Oct 2017
Proximal Policy Optimization Algorithms John Schulman Filip Wolski Prafulla Dhariwal Alec Radford Oleg Klimov OffRL 703 19,377 0 20 Jul 2017
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization A. Marco Felix Berkenkamp Philipp Hennig Angela P. Schoellig Andreas Krause S. Schaal Sebastian Trimpe 109 128 0 03 Mar 2017
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic S. Gu Timothy Lillicrap Zoubin Ghahramani Richard Turner Sergey Levine OffRL BDL 115 345 0 07 Nov 2016
Sim-to-Real Robot Learning from Pixels with Progressive Nets Andrei A. Rusu Matej Vecerík Thomas Rothörl N. Heess Razvan Pascanu R. Hadsell 128 535 0 13 Oct 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation John Schulman Philipp Moritz Sergey Levine Michael I. Jordan Pieter Abbeel OffRL 169 3,453 0 08 Jun 2015
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning Lex Weaver Nigel Tao 128 249 0 10 Jan 2013