ResearchTrend.AI
Curious Explorer: a provable exploration strategy in Policy Learning

29 June 2021
M. Miani
Maurizio Parton
M. Romito

Papers citing "Curious Explorer: a provable exploration strategy in Policy Learning"

11 / 11 papers shown
OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments
Jinyi Liu
Zhi Wang
Yan Zheng
Jianye Hao
Chenjia Bai
Junjie Ye
Zhen Wang
Haiyin Piao
Yang Sun
91
7
0
19 Dec 2023
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
74
101
0
14 Sep 2021
First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
82
363
0
27 Apr 2020
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
79
151
0
13 Nov 2019
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
136
108
0
15 Oct 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
72
321
0
01 Aug 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
82
194
0
05 Jun 2019
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
R. Ortner
69
46
0
06 Aug 2018
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM SSL
122
2,451
0
15 May 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
83
311
0
22 Mar 2017
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL ODL
223
5,086
0
05 Jun 2016