Model-Ensemble Trust-Region Policy Optimization

28 February 2018
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel

Papers citing "Model-Ensemble Trust-Region Policy Optimization"

27 / 27 papers shown
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
402
5
0
10 Mar 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
271
2
0
18 Feb 2025
On Reward Transferability in Adversarial Inverse Reinforcement Learning: Insights from Random Matrix Theory
Yangchun Zhang
Wang Zhou
Yirui Zhou
75
0
0
31 Dec 2024
Overcoming Model Bias for Robust Offline Deep Reinforcement Learning
Phillip Swazinna
Steffen Udluft
Thomas Runkler
OffRL
50
83
0
12 Aug 2020
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Aditya Modi
Nan Jiang
Ambuj Tewari
Satinder Singh
61
131
0
23 Oct 2019
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
91
973
0
08 Aug 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
444
18,931
0
20 Jul 2017
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
79
893
0
30 Jun 2017
Parameter Space Noise for Exploration
Matthias Plappert
Rein Houthooft
Prafulla Dhariwal
Szymon Sidor
Richard Y. Chen
Xi Chen
Tamim Asfour
Pieter Abbeel
Marcin Andrychowicz
52
595
0
06 Jun 2017
Prediction and Control with Temporal Segment Models
Nikhil Mishra
Pieter Abbeel
Igor Mordatch
BDL
51
64
0
12 Mar 2017
Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation
Ashvin Nair
Dian Chen
Pulkit Agrawal
Phillip Isola
Pieter Abbeel
Jitendra Malik
Sergey Levine
SSL
50
311
0
06 Mar 2017
Deep Visual Foresight for Planning Robot Motion
Chelsea Finn
Sergey Levine
111
783
0
03 Oct 2016
Learning to Poke by Poking: Experiential Learning of Intuitive Physics
Pulkit Agrawal
Ashvin Nair
Pieter Abbeel
Jitendra Malik
Sergey Levine
SSL
62
563
0
23 Jun 2016
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks
Stefan Depeweg
José Miguel Hernández-Lobato
Finale Doshi-Velez
Steffen Udluft
BDL
52
159
0
23 May 2016
Benchmarking Deep Reinforcement Learning for Continuous Control
Yan Duan
Xi Chen
Rein Houthooft
John Schulman
Pieter Abbeel
OffRL
76
1,693
0
22 Apr 2016
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection
Sergey Levine
P. Pastor
A. Krizhevsky
Deirdre Quillen
160
2,071
0
07 Mar 2016
Learning Continuous Control Policies by Stochastic Value Gradients
N. Heess
Greg Wayne
David Silver
Timothy Lillicrap
Yuval Tassa
Tom Erez
95
560
0
30 Oct 2015
One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors
Justin Fu
Sergey Levine
Pieter Abbeel
OffRL
59
159
0
23 Sep 2015
Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours
Lerrel Pinto
Abhinav Gupta
SSL
94
1,152
0
23 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
310
13,214
0
09 Sep 2015
Action-Conditional Video Prediction using Deep Networks in Atari Games
Junhyuk Oh
Xiaoxiao Guo
Honglak Lee
Richard L. Lewis
Satinder Singh
103
852
0
31 Jul 2015
Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
Manuel Watter
Jost Tobias Springenberg
Joschka Boedecker
Martin Riedmiller
BDL
63
844
0
24 Jun 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
82
3,399
0
08 Jun 2015
End-to-End Training of Deep Visuomotor Policies
Sergey Levine
Chelsea Finn
Trevor Darrell
Pieter Abbeel
BDL
288
3,431
0
02 Apr 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
274
6,755
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
149,842
0
22 Dec 2014
On the difficulty of training Recurrent Neural Networks
Razvan Pascanu
Tomas Mikolov
Yoshua Bengio
ODL
182
5,334
0
21 Nov 2012