arXiv:1802.03501
Path Consistency Learning in Tsallis Entropy Regularized MDPs
10 February 2018
Ofir Nachum
Yinlam Chow
Mohammad Ghavamzadeh
Papers citing "Path Consistency Learning in Tsallis Entropy Regularized MDPs"
12 / 12 papers shown
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
q-exponential family for policy optimization
Lingwei Zhu
Haseeb Shah
Han Wang
Yukie Nagai
Martha White
OffRL
78
0
0
14 Aug 2024
Learning accurate and interpretable tree-based models
Maria-Florina Balcan
Dravyansh Sharma
47
7
0
24 May 2024
Bridging the Gap between Newton-Raphson Method and Regularized Policy Iteration
Zeyang Li
Chuxiong Hu
Yunan Wang
Guojian Zhan
Jie Li
Shengbo Eben Li
32
0
0
11 Oct 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
36
73
0
28 Mar 2023
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning
Wenzhuo Zhou
Ruoqing Zhu
Annie Qu
40
22
0
20 Oct 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
21
76
0
24 May 2021
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Kaiqing Zhang
Sham Kakade
Tamer Başar
Lin F. Yang
47
120
0
15 Jul 2020
Variational Model-based Policy Optimization
Yinlam Chow
Brandon Cui
Moonkyung Ryu
Mohammad Ghavamzadeh
OffRL
13
12
0
09 Jun 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
25
83
0
20 May 2020
Multi-Path Policy Optimization
L. Pan
Qingpeng Cai
Longbo Huang
18
2
0
11 Nov 2019
Maximum Causal Tsallis Entropy Imitation Learning
Kyungjae Lee
Sungjoon Choi
Songhwai Oh
OOD
29
20
0
22 May 2018