Title |
---|
Worst-case Performance of Greedy Policies in Bandits with Imperfect Context Observations. Hongju Park, Mohamad Kazem Shirani Faradonbeh |
Efficient Algorithms for Learning to Control Bandits with Unobserved Contexts. Hongju Park, Mohamad Kazem Shirani Faradonbeh |
Joint Learning-Based Stabilization of Multiple Unknown Linear Systems. Mohamad Kazem Shirani Faradonbeh, Aditya Modi |
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits. Yash J. Patel, Mohamad Kazem Shirani Faradonbeh |