ResearchTrend.AI

Distributionally-Constrained Policy Optimization via Unbalanced Optimal Transport

15 February 2021
A. Givchi
Pei Wang
Junqi Wang
Patrick Shafto
    OT
    OffRL

Papers citing "Distributionally-Constrained Policy Optimization via Unbalanced Optimal Transport"

20 / 20 papers shown
Imitation Learning with Sinkhorn Distances
Georgios Papagiannis
Yunpeng Li
OT
31
27
0
20 Aug 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
64
140
0
04 Jul 2020
Primal Wasserstein Imitation Learning
Robert Dadashi
Léonard Hussenot
Matthieu Geist
Olivier Pietquin
64
129
0
08 Jun 2020
Cautious Reinforcement Learning via Distributional Risk in the Dual Domain
Junyu Zhang
Amrit Singh Bedi
Mengdi Wang
Alec Koppel
51
28
0
27 Feb 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
146
122
0
07 Jan 2020
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
156
243
0
04 Dec 2019
Wasserstein Adversarial Imitation Learning
Huang Xiao
Michael Herman
Joerg Wagner
Sebastian Ziesche
Jalal Etesami
T. H. Linh
41
72
0
19 Jun 2019
Learning to Score Behaviors for Guided Policy Optimization
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
A. Choromańska
K. Choromanski
Michael I. Jordan
57
39
0
11 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
337
0
10 Jun 2019
Understanding the impact of entropy on policy optimization
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
73
233
0
27 Nov 2018
Policy Optimization as Wasserstein Gradient Flows
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Lawrence Carin
65
68
0
09 Aug 2018
Computational Optimal Transport
Gabriel Peyré
Marco Cuturi
OT
217
2,148
0
01 Mar 2018
Large-Scale Optimal Transport and Mapping Estimation
Vivien Seguy
B. Damodaran
Rémi Flamary
Nicolas Courty
Antoine Rolet
Mathieu Blondel
OT
68
248
0
07 Nov 2017
A unified view of entropy-regularized Markov decision processes
Gergely Neu
Anders Jonsson
Vicenç Gómez
95
262
0
22 May 2017
Equivalence Between Policy Gradients and Soft Q-Learning
John Schulman
Xi Chen
Pieter Abbeel
OffRL
83
346
0
21 Apr 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
103
1,340
0
27 Feb 2017
Stochastic Optimization for Large-scale Optimal Transport
Aude Genevay
Marco Cuturi
Gabriel Peyré
Francis R. Bach
OT
73
467
0
27 May 2016
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
88
271
0
14 Mar 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances
Marco Cuturi
OT
215
4,262
0
04 Jun 2013