arXiv:2407.16602
Functional Acceleration for Policy Mirror Descent
23 July 2024
Veronica Chelu
Doina Precup
Papers citing "Functional Acceleration for Policy Mirror Descent"
30 / 30 papers shown
Title
Policy Mirror Descent with Lookahead
Kimon Protopapas
Anas Barakat
49
2
0
21 Mar 2024
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
55
15
0
17 Aug 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
65
4
0
24 May 2023
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
45
11
0
22 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
95
15
0
30 Jan 2023
No-Regret Dynamics in the Fenchel Game: A Unified Framework for Algorithmic Convex Optimization
Jun-Kun Wang
Jacob D. Abernethy
Kfir Y. Levy
58
22
0
22 Nov 2021
Understanding the Effect of Stochasticity in Policy Optimization
Jincheng Mei
Bo Dai
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
48
17
0
29 Oct 2021
Approximation Benefits of Policy Gradient Methods with Aggregated States
Daniel Russo
80
7
0
22 Jul 2020
On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari
Daniel Russo
74
60
0
21 Jul 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
86
85
0
20 May 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods
Jincheng Mei
Chenjun Xiao
Csaba Szepesvári
Dale Schuurmans
111
287
0
13 May 2020
Momentum in Reinforcement Learning
Nino Vieillard
B. Scherrer
Olivier Pietquin
Matthieu Geist
35
35
0
21 Oct 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
56
320
0
01 Aug 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
65
188
0
05 Jun 2019
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
52
90
0
23 May 2019
The Value Function Polytope in Reinforcement Learning
Robert Dadashi
Adrien Ali Taïga
Nicolas Le Roux
Dale Schuurmans
Marc G. Bellemare
33
46
0
31 Jan 2019
Predictor-Corrector Policy Optimization
Ching-An Cheng
Xinyan Yan
Nathan D. Ratliff
Byron Boots
OnRL
43
23
0
15 Oct 2018
Acceleration through Optimistic No-Regret Dynamics
Jun-Kun Wang
Jacob D. Abernethy
78
44
0
27 Jul 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
69
471
0
14 Jun 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
236
8,236
0
04 Jan 2018
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel
Joseph Modayil
H. V. Hasselt
Tom Schaul
Georg Ostrovski
Will Dabney
Dan Horgan
Bilal Piot
M. G. Azar
David Silver
OffRL
99
2,255
0
06 Oct 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
82
1,497
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
259
18,685
0
20 Jul 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
168
8,805
0
04 Feb 2016
Accelerating Optimization via Adaptive Prediction
M. Mohri
Scott Yang
AI4CE
33
8
0
18 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
254
6,722
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.1K
149,474
0
22 Dec 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
107
12,163
0
19 Dec 2013
Online Learning with Predictable Sequences
Alexander Rakhlin
Karthik Sridharan
135
355
0
18 Aug 2012
Solving variational inequalities with Stochastic Mirror-Prox algorithm
A. Juditsky
A. Nemirovskii
Claire Tauvel
117
441
0
04 Sep 2008