Actor-critic is implicitly biased towards high entropy optimal policies

21 October 2021

Papers citing "Actor-critic is implicitly biased towards high entropy optimal policies"

Optimal Rates of Convergence for Entropy Regularization in Discounted Markov Decision Processes
Johannes Muller, Semih Cayci. 06 Jun 2024.

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation
Chengqian Gao, Kelvin Xu, Liu Liu, Deheng Ye, P. Zhao, Zhiqiang Xu. [OffRL] 19 Oct 2022.