RL for Latent MDPs: Regret Guarantees and a Lower Bound

9 February 2021
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
arXiv: 2102.04939

Papers citing "RL for Latent MDPs: Regret Guarantees and a Lower Bound"

50 / 57 papers shown
Model-based controller assisted domain randomization in deep reinforcement learning: application to nonlinear powertrain control
Heisei Yonezawa
Ansei Yonezawa
Itsuro Kajiwara
36
0
0
28 Apr 2025
Improving Controller Generalization with Dimensionless Markov Decision Processes
V. Charvet
Sebastian Stein
R. Murray-Smith
34
0
0
14 Apr 2025
A Classification View on Meta Learning Bandits
Mirco Mutti
Jeongyeol Kwon
Shie Mannor
Aviv Tamar
23
0
0
06 Apr 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
Yuheng Zhang
Nan Jiang
OffRL
61
0
0
03 Mar 2025
Personalized and Sequential Text-to-Image Generation
Ofir Nabati
Guy Tennenholtz
Chih-Wei Hsu
Moonkyung Ryu
Deepak Ramachandran
Yinlam Chow
Xiang Li
Craig Boutilier
MLLM
77
0
0
10 Dec 2024
Learning to Cooperate with Humans using Generative Agents
Yancheng Liang
Daphne Chen
Abhishek Gupta
S. Du
Natasha Jaques
SyDa
77
4
0
21 Nov 2024
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL
OnRL
36
0
0
06 Nov 2024
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
Thanh Nguyen-Tang
Raman Arora
74
1
0
01 Nov 2024
Test-Time Regret Minimization in Meta Reinforcement Learning
Mirco Mutti
Aviv Tamar
23
4
0
04 Jun 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
Jeongyeol Kwon
Shie Mannor
C. Caramanis
Yonathan Efroni
OffRL
37
2
0
03 Jun 2024
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Vanshaj Khattar
Yuhao Ding
Bilgehan Sel
Javad Lavaei
Ming Jin
OffRL
32
12
0
26 May 2024
Pausing Policy Learning in Non-stationary Reinforcement Learning
Hyunin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
OffRL
34
2
0
25 May 2024
Preparing for Black Swans: The Antifragility Imperative for Machine Learning
Ming Jin
36
2
0
18 May 2024
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
Anthony Liang
Guy Tennenholtz
Chih-Wei Hsu
Yinlam Chow
Erdem Biyik
Craig Boutilier
OffRL
38
1
0
25 Feb 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
27
4
0
22 Feb 2024
Weakly Coupled Deep Q-Networks
Ibrahim El Shar
Daniel R. Jiang
19
2
0
28 Oct 2023
Prospective Side Information for Latent MDPs
Jeongyeol Kwon
Yonathan Efroni
Shie Mannor
C. Caramanis
23
5
0
11 Oct 2023
Tempo Adaptation in Non-stationary Reinforcement Learning
Hyunin Lee
Yuhao Ding
Jongmin Lee
Ming Jin
Javad Lavaei
Somayeh Sojoudi
9
3
0
26 Sep 2023
JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning
Kaiwen Wang
Junxiong Wang
Yueying Li
Nathan Kallus
Immanuel Trummer
Wen Sun
GP
44
2
0
21 Jul 2023
Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight
Jiacheng Guo
Minshuo Chen
Haiquan Wang
Caiming Xiong
Mengdi Wang
Yu Bai
19
5
0
06 Jul 2023
Provably Efficient UCB-type Algorithms For Learning Predictive State Representations
Ruiquan Huang
Yitao Liang
J. Yang
OffRL
24
5
0
01 Jul 2023
Context-lumpable stochastic bandits
Chung-Wei Lee
Qinghua Liu
Yasin Abbasi-Yadkori
Chi Jin
Tor Lattimore
Csaba Szepesvári
OffRL
100
2
0
22 Jun 2023
Theoretical Hardness and Tractability of POMDPs in RL with Partial Online State Information
Ming Shi
Yingbin Liang
Ness B. Shroff
29
2
0
14 Jun 2023
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
Chengshuai Shi
Wei Xiong
Cong Shen
Jing Yang
OffRL
30
3
0
14 Jun 2023
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
Yash Chandak
S. Thakoor
Z. Guo
Yunhao Tang
Rémi Munos
Will Dabney
Diana Borsa
13
2
0
01 May 2023
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games
Dylan J. Foster
Noah Golowich
Sham Kakade
20
10
0
22 Mar 2023
POPGym: Benchmarking Partially Observable Reinforcement Learning
Steven D. Morad
Ryan Kortvelesy
Matteo Bettini
Stephan Liwicki
Amanda Prorok
OffRL
14
37
0
03 Mar 2023
Reinforcement Learning with History-Dependent Dynamic Contexts
Guy Tennenholtz
Nadav Merlis
Lior Shani
Martin Mladenov
Craig Boutilier
AI4CE
11
6
0
04 Feb 2023
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Jonathan Lee
Alekh Agarwal
Christoph Dann
Tong Zhang
26
19
0
31 Jan 2023
Adversarial Online Multi-Task Reinforcement Learning
Quan Nguyen
Nishant A. Mehta
14
1
0
11 Jan 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Yang Xu
Jin Zhu
C. Shi
S. Luo
R. Song
OffRL
21
12
0
29 Dec 2022
Offline Policy Evaluation and Optimization under Confounding
Chinmaya Kausik
Yangyi Lu
Kevin Tan
Maggie Makar
Yixin Wang
Ambuj Tewari
OffRL
18
8
0
29 Nov 2022
Learning Mixtures of Markov Chains and MDPs
Chinmaya Kausik
Kevin Tan
Ambuj Tewari
13
11
0
17 Nov 2022
Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables
Mengdi Xu
Peide Huang
Yaru Niu
Visak C. V. Kumar
Jielin Qiu
...
Kuan-Hui Lee
Xuewei Qi
H. Lam
Bo-wen Li
Ding Zhao
OOD
54
9
0
21 Oct 2022
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
Runlong Zhou
Ruosong Wang
S. Du
31
3
0
20 Oct 2022
Tractable Optimality in Episodic Latent MABs
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
50
3
0
05 Oct 2022
Reward-Mixing MDPs with a Few Latent Contexts are Learnable
Jeongyeol Kwon
Yonathan Efroni
C. Caramanis
Shie Mannor
29
5
0
05 Oct 2022
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
Fan Chen
Yu Bai
Song Mei
53
22
0
29 Sep 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
29
16
0
26 Jul 2022
PAC Reinforcement Learning for Predictive State Representations
Wenhao Zhan
Masatoshi Uehara
Wen Sun
Jason D. Lee
31
38
0
12 Jul 2022
On the Complexity of Adversarial Decision Making
Dylan J. Foster
Alexander Rakhlin
Ayush Sekhari
Karthik Sridharan
AAML
21
28
0
27 Jun 2022
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
58
6
0
24 Jun 2022
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
OffRL
49
31
0
24 Jun 2022
Learning in Observable POMDPs, without Computationally Intractable Oracles
Noah Golowich
Ankur Moitra
Dhruv Rohatgi
24
26
0
07 Jun 2022
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
Qi Cai
Zhuoran Yang
Zhaoran Wang
30
13
0
20 Apr 2022
When Is Partially Observable Reinforcement Learning Not Scary?
Qinghua Liu
Alan Chung
Csaba Szepesvári
Chi Jin
14
92
0
19 Apr 2022
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
CML
ELM
OffRL
22
12
0
02 Apr 2022
Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits
Qinghua Liu
Yuanhao Wang
Chi Jin
AAML
24
15
0
14 Mar 2022
Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization
Runlong Zhou
Zelin He
Yuandong Tian
Yi Wu
S. Du
OffRL
18
3
0
11 Feb 2022
The Importance of Non-Markovianity in Maximum State Entropy Exploration
Mirco Mutti
Ric De Santi
Marcello Restelli
30
31
0
07 Feb 2022