ResearchTrend.AI

Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection

8 October 2021
Michael Rawson
Jade Freeman

Papers citing "Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection"

2 / 2 papers shown
Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions
Kai Xu
Farid Tajaddodianfar
Ben Allison
21
0
0
16 Jun 2024
Convergence Guarantees for Deep Epsilon Greedy Policy Learning
Michael Rawson
R. Balan
32
8
0
02 Dec 2021