ResearchTrend.AI
arXiv: 2103.11559
Provably Correct Optimization and Exploration with Non-linear Policies

22 March 2021
Fei Feng
W. Yin
Alekh Agarwal
Lin F. Yang

Papers citing "Provably Correct Optimization and Exploration with Non-linear Policies"

12 / 12 papers shown
The Central Role of the Loss Function in Reinforcement Learning
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL
268
10
0
19 Sep 2024
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
173
227
0
18 Jun 2020
Optimistic Policy Optimization with Bandit Feedback
Yonathan Efroni
Lior Shani
Aviv A. Rosenberg
Shie Mannor
63
90
0
19 Feb 2020
Provably Efficient Exploration in Policy Optimization
Qi Cai
Zhuoran Yang
Chi Jin
Zhaoran Wang
85
283
0
12 Dec 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
83
151
0
13 Nov 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
109
560
0
11 Jul 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
93
193
0
05 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL, GP
79
288
0
24 May 2019
A Theory of Regularized Markov Decision Processes
Matthieu Geist
B. Scherrer
Olivier Pietquin
137
333
0
31 Jan 2019
Practical Contextual Bandits with Regression Oracles
Dylan J. Foster
Alekh Agarwal
Miroslav Dudík
Haipeng Luo
Robert Schapire
395
127
0
03 Mar 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,420
0
04 Jan 2018
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
210
8,881
0
04 Feb 2016