ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.04349
  4. Cited By
Learning to Score Behaviors for Guided Policy Optimization

Learning to Score Behaviors for Guided Policy Optimization

11 June 2019
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
A. Choromańska
K. Choromanski
Michael I. Jordan
ArXivPDFHTML

Papers citing "Learning to Score Behaviors for Guided Policy Optimization"

15 / 15 papers shown
Title
Iteratively Learn Diverse Strategies with State Distance Information
Iteratively Learn Diverse Strategies with State Distance Information
Wei Fu
Weihua Du
Jingwei Li
Sunli Chen
Jingzhao Zhang
Yi Wu
51
3
0
23 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
36
47
0
06 Oct 2023
Wasserstein Diversity-Enriched Regularizer for Hierarchical
  Reinforcement Learning
Wasserstein Diversity-Enriched Regularizer for Hierarchical Reinforcement Learning
Haorui Li
Jiaqi Liang
Linjing Li
D. Zeng
11
0
0
02 Aug 2023
Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies
Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies
Hanna Ziesche
Leonel Rozo
26
5
0
17 May 2023
On Pathologies in KL-Regularized Reinforcement Learning from Expert
  Demonstrations
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
Tim G. J. Rudner
Cong Lu
Michael A. Osborne
Yarin Gal
Yee Whye Teh
OffRL
27
27
0
28 Dec 2022
Transfer RL via the Undo Maps Formalism
Transfer RL via the Undo Maps Formalism
Abhi Gupta
Theodore H. Moskovitz
David Alvarez-Melis
Aldo Pacchiano
OffRL
33
0
0
26 Nov 2022
Learning General World Models in a Handful of Reward-Free Deployments
Learning General World Models in a Handful of Reward-Free Deployments
Yingchen Xu
Jack Parker-Holder
Aldo Pacchiano
Philip J. Ball
Oleh Rybkin
Stephen J. Roberts
Tim Rocktaschel
Edward Grefenstette
OffRL
55
8
0
23 Oct 2022
Towards A Unified Policy Abstraction Theory and Representation Learning
  Approach in Markov Decision Processes
Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes
Mengdi Zhang
Hongyao Tang
Jianye Hao
Yan Zheng
OffRL
28
0
0
16 Sep 2022
Minimum Description Length Control
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
26
1
0
17 Jul 2022
Agent Spaces
Agent Spaces
John C. Raisbeck
M. W. Allen
Hakho Lee
27
1
0
11 Nov 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
33
81
0
08 Nov 2021
Towards an Understanding of Default Policies in Multitask Policy
  Optimization
Towards an Understanding of Default Policies in Multitask Policy Optimization
Theodore H. Moskovitz
Michael Arbel
Jack Parker-Holder
Aldo Pacchiano
25
9
0
04 Nov 2021
MICo: Improved representations via sampling-based state similarity for
  Markov decision processes
MICo: Improved representations via sampling-based state similarity for Markov decision processes
Pablo Samuel Castro
Tyler Kastner
Prakash Panangaden
Mark Rowland
43
35
0
03 Jun 2021
Differentiable Trust Region Layers for Deep Reinforcement Learning
Differentiable Trust Region Layers for Deep Reinforcement Learning
Fabian Otto
P. Becker
Ngo Anh Vien
Hanna Ziesche
Gerhard Neumann
OffRL
41
19
0
22 Jan 2021
Primal Wasserstein Imitation Learning
Primal Wasserstein Imitation Learning
Robert Dadashi
Léonard Hussenot
M. Geist
Olivier Pietquin
23
124
0
08 Jun 2020
1