ResearchTrend.AI
A maximum-entropy approach to off-policy evaluation in average-reward MDPs

17 June 2020
N. Lazić
Dong Yin
Mehrdad Farajtabar
Nir Levine
Dilan Görür
Chris Harris
Dale Schuurmans
    OffRL

Papers citing "A maximum-entropy approach to off-policy evaluation in average-reward MDPs"

9 / 9 papers shown
Imitation Learning in Discounted Linear MDPs without exploration assumptions
Luca Viano
Stratis Skoulakis
Volkan Cevher
55
5
0
03 May 2024
What can online reinforcement learning with function approximation benefit from general coverage conditions?
Fanghui Liu
Luca Viano
Volkan Cevher
OffRL
57
3
0
25 Apr 2023
Proximal Point Imitation Learning
Luca Viano
Angeliki Kamoutsi
Gergely Neu
Igor Krawczuk
Volkan Cevher
105
16
0
22 Sep 2022
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML FAtt OffRL
65
2
0
06 Oct 2021
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm
Lin Chen
B. Scherrer
Peter L. Bartlett
OffRL
213
16
0
17 Mar 2021
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
Yihao Feng
Ziyang Tang
Na Zhang
Qiang Liu
OffRL
73
14
0
09 Mar 2021
Average-Reward Off-Policy Policy Evaluation with Function Approximation
Shangtong Zhang
Yi Wan
R. Sutton
Shimon Whiteson
OffRL
73
31
0
08 Jan 2021
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
Botao Hao
Yaqi Duan
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
142
27
0
08 Nov 2020
Online Sparse Reinforcement Learning
Botao Hao
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
137
29
0
08 Nov 2020