ResearchTrend.AI
Policy Mirror Descent with Lookahead

21 March 2024
Kimon Protopapas
Anas Barakat
arXiv: 2403.14156

Papers citing "Policy Mirror Descent with Lookahead"

10 / 10 papers shown
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei
Bo Dai
Alekh Agarwal
Mohammad Ghavamzadeh
Csaba Szepesvári
Dale Schuurmans
122
4
0
02 Apr 2025
Functional Acceleration for Policy Mirror Descent
Veronica Chelu
Doina Precup
62
0
0
23 Jul 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
806
12,893
0
04 Mar 2022
Planning and Learning with Adaptive Lookahead
Aviv A. Rosenberg
Assaf Hallak
Shie Mannor
Gal Chechik
Gal Dalal
58
8
0
28 Jan 2022
The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning with Linear Value Function Approximation
Anna Winnicki
Joseph Lubars
Michael Livesay
R. Srikant
45
3
0
28 Sep 2021
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
74
75
0
24 Jul 2020
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
72
193
0
05 Jun 2019
How to Combine Tree-Search Methods in Reinforcement Learning
Yonathan Efroni
Gal Dalal
B. Scherrer
Shie Mannor
51
31
0
06 Sep 2018
Beyond the One Step Greedy Approach in Reinforcement Learning
Yonathan Efroni
Gal Dalal
B. Scherrer
Shie Mannor
OffRL
80
50
0
10 Feb 2018
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration
B. Scherrer
80
76
0
03 Jun 2013