ResearchTrend.AI
arXiv: 2405.19807
MetaCURL: Non-stationary Concave Utility Reinforcement Learning

30 May 2024
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
    OffRL

Papers citing "MetaCURL: Non-stationary Concave Utility Reinforcement Learning"

30 / 30 papers shown
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei
Christoph Dann
Julian Zimmert
109
45
0
31 Dec 2024
High-Probability Risk Bounds via Sequential Predictors
Dirk van der Hoeven
Nikita Zhivotovskiy
Nicolò Cesa-Bianchi
OffRL
55
2
0
15 Aug 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
Anas Barakat
Ilyas Fatkhullin
Niao He
51
11
0
02 Jun 2023
No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions
Tiancheng Jin
Junyan Liu
Chloé Rouyer
William Chang
Chen-Yu Wei
Haipeng Luo
AAML
30
8
0
27 May 2023
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark Schmidt
OffRL
50
6
0
29 Jul 2022
Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint
Matthieu Geist
Julien Pérolat
Mathieu Laurière
Romuald Elie
Sarah Perrin
Olivier Bachem
Rémi Munos
Olivier Pietquin
71
64
0
07 Jun 2021
Reward is enough for convex MDPs
Tom Zahavy
Brendan O'Donoghue
Guillaume Desjardins
Satinder Singh
89
74
0
01 Jun 2021
Minimax Regret for Stochastic Shortest Path
Alon Cohen
Yonathan Efroni
Yishay Mansour
Aviv A. Rosenberg
53
28
0
24 Mar 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
78
68
0
17 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin Jamieson
50
23
0
13 Feb 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
82
38
0
11 Feb 2021
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach
Chen-Yu Wei
Haipeng Luo
OffRL
124
105
0
10 Feb 2021
Non-stationary Online Regression
Anant Raj
Pierre Gaillard
Christophe Saad
AI4TS
46
7
0
13 Nov 2020
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
O. D. Domingues
Pierre Ménard
Matteo Pirotta
E. Kaufmann
Michal Valko
66
40
0
09 Jul 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
57
139
0
04 Jul 2020
Dynamic Regret of Policy Optimization in Non-stationary Environments
Yingjie Fei
Zhuoran Yang
Zhaoran Wang
Qiaomin Xie
65
54
0
30 Jun 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
55
95
0
24 Jun 2020
Stochastic Shortest Path with Adversarially Changing Costs
Aviv A. Rosenberg
Yishay Mansour
AAML
66
33
0
20 Jun 2020
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition
Tiancheng Jin
Haipeng Luo
48
57
0
10 Jun 2020
Optimistic Policy Optimization with Bandit Feedback
Yonathan Efroni
Lior Shani
Aviv A. Rosenberg
Shie Mannor
46
90
0
19 Feb 2020
Learning Adversarial MDPs with Bandit Feedback and Unknown Transition
Chi Jin
Tiancheng Jin
Haipeng Luo
S. Sra
Tiancheng Yu
59
104
0
03 Dec 2019
A Divergence Minimization Perspective on Imitation Learning Methods
Seyed Kamyar Seyed Ghasemipour
R. Zemel
S. Gu
67
249
0
06 Nov 2019
Apprenticeship Learning via Frank-Wolfe
Tom Zahavy
Alon Cohen
Haim Kaplan
Yishay Mansour
63
18
0
05 Nov 2019
Online Convex Optimization in Adversarial Markov Decision Processes
Aviv A. Rosenberg
Yishay Mansour
52
138
0
19 May 2019
Variational Regret Bounds for Reinforcement Learning
Pratik Gajane
R. Ortner
P. Auer
83
61
0
14 May 2019
Provably Efficient Maximum Entropy Exploration
Elad Hazan
Sham Kakade
Karan Singh
A. V. Soest
65
297
0
06 Dec 2018
A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions
Pratik Gajane
R. Ortner
P. Auer
48
64
0
25 May 2018
Improved Strongly Adaptive Online Learning using Coin Betting
Kwang-Sung Jun
Francesco Orabona
Rebecca Willett
S. Wright
141
82
0
14 Oct 2016
Strongly Adaptive Online Learning
Amit Daniely
Alon Gonen
Shai Shalev-Shwartz
ODL
148
178
0
25 Feb 2015
Efficient Tracking of Large Classes of Experts
András Gyorgy
Tamás Linder
Gábor Lugosi
102
75
0
12 Oct 2011