NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL

5 October 2021

Ping-Chun Hsieh

Papers citing "NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL"

Title | Authors | Tag | Counts | Date
The Bandit Whisperer: Communication Learning for Restless Bandits | Yunfan Zhao, Tonghan Wang, Dheeraj M. Nagaraj, Aparna Taneja, Milind Tambe | - | 73 5 0 | 11 Aug 2024
Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare | Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe | - | 28 41 0 | 17 May 2021
MOPO: Model-based Offline Policy Optimization | Tianhe Yu, G. Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma | OffRL | 58 759 0 | 27 May 2020
MOReL: Model-Based Offline Reinforcement Learning | Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims | OffRL | 61 662 0 | 12 May 2020
Whittle index based Q-learning for restless bandits with average reward | Konstantin Avrachenkov, Vivek Borkar | - | 26 69 0 | 29 Apr 2020
Q-Learning in enormous action spaces via amortized approximate maximization | T. Wiele, David Warde-Farley, A. Mnih, Volodymyr Mnih | - | 36 60 0 | 22 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library | Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala | ODL | 181 42,038 0 | 03 Dec 2019
Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs | Andrea Zanette, Emma Brunskill | - | 76 15 0 | 03 Nov 2019
Recovering Bandits | Ciara Pike-Burke, Steffen Grunewalder | - | 108 40 0 | 31 Oct 2019
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | C. Riquelme, George Tucker, Jasper Snoek | BDL | 55 365 0 | 26 Feb 2018
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning | Christoph Dann, Tor Lattimore, Emma Brunskill | - | 52 307 0 | 22 Mar 2017
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable | Nan Jiang, A. Krishnamurthy, Alekh Agarwal, John Langford, Robert Schapire | - | 71 417 0 | 29 Oct 2016
Deep Reinforcement Learning in Large Discrete Action Spaces | Gabriel Dulac-Arnold, Richard Evans, H. V. Hasselt, P. Sunehag, Timothy Lillicrap, Jonathan J. Hunt, Timothy A. Mann, T. Weber, T. Degris, Ben Coppin | OffRL | 52 572 0 | 24 Dec 2015
When are Kalman-filter restless bandits indexable? | C. Dance, T. Silander | - | 17 12 0 | 15 Sep 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL | 626 149,474 0 | 22 Dec 2014