1905.04337
Cited By
Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management
10 May 2019
Shipra Agrawal
Randy Jia
Papers citing
"Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management"
7 / 7 papers shown
Title
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
63
4
0
18 Jul 2024
Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes
Réda Alami
Mohammed Mahfoud
Eric Moulines
24
2
0
01 Apr 2023
Hindsight Learning for MDPs with Exogenous Inputs
Sean R. Sinclair
Felipe Vieira Frujeri
Ching-An Cheng
Luke Marshall
Hugo Barbalho
Jingling Li
Jennifer Neville
Ishai Menache
Adith Swaminathan
18
23
0
13 Jul 2022
Learning to Order for Inventory Systems with Lost Sales and Uncertain Supplies
Boxiao Chen
Jiashuo Jiang
Jiawei Zhang
Zhengyuan Zhou
26
10
0
10 Jul 2022
Learning a Discrete Set of Optimal Allocation Rules in a Queueing System with Unknown Service Rate
Saghar Adler
Mehrdad Moharrami
V. Subramanian
44
1
0
04 Feb 2022
Learning and Information in Stochastic Networks and Queues
N. Walton
Kuang Xu
32
20
0
18 May 2021
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
22
93
0
24 Jun 2020