BanditWare: A Contextual Bandit-based Framework for Hardware Prediction

Main: 15 Pages
13 Figures
Bibliography: 3 Pages
1 Table
Abstract

Distributed computing systems are essential for meeting the demands of modern applications, yet transitioning from single-system to distributed environments presents significant challenges. Misallocating resources in shared systems can lead to resource contention, system instability, degraded performance, priority inversion, inefficient utilization, increased latency, and environmental impact. We present BanditWare, an online recommendation system that dynamically selects the most suitable hardware for applications using a contextual multi-armed bandit algorithm. BanditWare balances exploration and exploitation, gradually refining its hardware recommendations based on observed application performance while continuing to explore potentially better options. Unlike traditional statistical and machine learning approaches that rely heavily on large historical datasets, BanditWare operates online, learning and adapting in real time as new workloads arrive. We evaluated BanditWare on three workflow applications: Cycles (a scientific workflow for agricultural science), BurnPro3D (a web-based platform for fire science), and a matrix multiplication application. Designed for seamless integration with the National Data Platform (NDP), BanditWare enables users of all experience levels to optimize resource allocation efficiently.
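
To make the exploration/exploitation idea in the abstract concrete, here is a minimal sketch of a contextual bandit that recommends hardware. It is not BanditWare's actual algorithm, which the abstract does not specify. The epsilon-greedy policy, the per-arm linear runtime model, the single scalar context feature (for example, input size), the hardware names "small" and "large", and the synthetic runtime functions are all assumptions made purely for illustration.

```python
import random


class ArmModel:
    """Running simple linear regression of runtime on one context feature."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, runtime):
        self.n += 1
        self.sx += x
        self.sy += runtime
        self.sxx += x * x
        self.sxy += x * runtime

    def predict(self, x):
        if self.n == 0:
            return 0.0  # optimistic guess so an untried arm gets selected
        denom = self.n * self.sxx - self.sx ** 2
        if abs(denom) < 1e-12:
            return self.sy / self.n  # no spread in x yet: fall back to the mean
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept + slope * x


class EpsilonGreedyRecommender:
    """Hypothetical recommender: one runtime model per hardware option (arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.models = {arm: ArmModel() for arm in arms}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def recommend(self, x):
        arms = list(self.models)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(arms)  # explore: try a random option
        # exploit: pick the option with the lowest predicted runtime
        return min(arms, key=lambda a: self.models[a].predict(x))

    def observe(self, arm, x, runtime):
        self.models[arm].update(x, runtime)


# Synthetic, noise-free runtimes (assumed for illustration only):
# "small" is faster on small inputs, "large" on big inputs.
TRUE_RUNTIME = {
    "small": lambda x: 1.0 + 2.0 * x,
    "large": lambda x: 5.0 + 0.5 * x,
}


def simulate(rounds=500, seed=0):
    workload_rng = random.Random(seed + 1)
    rec = EpsilonGreedyRecommender(TRUE_RUNTIME.keys(), epsilon=0.1, seed=seed)
    for _ in range(rounds):
        x = workload_rng.uniform(0.0, 10.0)  # context of the arriving workload
        arm = rec.recommend(x)
        rec.observe(arm, x, TRUE_RUNTIME[arm](x))  # learn from the observed run
    return rec
```

Each arriving workload triggers one recommendation and one model update, so no historical dataset is needed up front. Setting `epsilon` to 0 after `simulate()` returns turns the recommender into a pure exploiter of what it has learned so far.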

@article{coleman2025_2506.13730,
  title={BanditWare: A Contextual Bandit-based Framework for Hardware Prediction},
  author={Tainã Coleman and Hena Ahmed and Ravi Shende and Ismael Perez and Ïlkay Altintaş},
  journal={arXiv preprint arXiv:2506.13730},
  year={2025}
}