2110.04127
Cited By
Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection
8 October 2021
Michael Rawson
Jade Freeman
Papers citing "Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection"
2 / 2 papers shown
Title
Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions
Kai Xu
Farid Tajaddodianfar
Ben Allison
21
0
0
16 Jun 2024
Convergence Guarantees for Deep Epsilon Greedy Policy Learning
Michael Rawson
R. Balan
32
8
0
02 Dec 2021