Hyperband-based Bayesian Optimization for Black-box Prompt Selection

Optimal prompt selection is crucial for maximizing large language model (LLM) performance on downstream tasks, especially in black-box settings where models are only accessible via APIs. Black-box prompt selection is challenging due to potentially large, combinatorial search spaces, the absence of gradient information, and the high cost of evaluating prompts on a validation set. We propose HbBoPs, a novel method that combines a structure-aware deep kernel Gaussian Process surrogate with Hyperband as a multi-fidelity scheduler to select prompts efficiently. HbBoPs embeds instructions and few-shot exemplars, treating them as modular components within prompts, which enhances the surrogate model's ability to predict which prompt to evaluate next in a sample-efficient manner. Hyperband improves query efficiency by adaptively allocating resources across fidelity levels, reducing the number of validation instances required to evaluate each prompt. Extensive experiments across ten diverse benchmarks and three LLMs demonstrate that HbBoPs outperforms state-of-the-art methods in both performance and efficiency.
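
To make the surrogate concrete, the following is a minimal deep kernel GP sketch in GPyTorch, not the authors' implementation. It assumes each prompt is represented by concatenating a fixed instruction embedding and a fixed exemplar embedding (the paper treats these as separate structural components within the prompt); the names PromptFeatureExtractor, embed_dim, and latent_dim are illustrative.

```python
import torch
import gpytorch

class PromptFeatureExtractor(torch.nn.Module):
    """Maps concatenated instruction and exemplar embeddings to a small latent space."""
    def __init__(self, embed_dim, latent_dim=8):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * embed_dim, 32),
            torch.nn.ReLU(),
            torch.nn.Linear(32, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class DeepKernelGP(gpytorch.models.ExactGP):
    """GP whose kernel operates on learned features of prompt embeddings."""
    def __init__(self, train_x, train_y, likelihood, embed_dim):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = PromptFeatureExtractor(embed_dim)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        z = self.feature_extractor(x)
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z))

# Illustrative usage: x has shape (n_prompts, 2 * embed_dim), y holds validation scores.
# likelihood = gpytorch.likelihoods.GaussianLikelihood()
# model = DeepKernelGP(x, y, likelihood, embed_dim)
```

The multi-fidelity scheduling idea can likewise be sketched as a plain Hyperband loop in which the fidelity is the number of validation instances a prompt is scored on. The helper evaluate_prompt(prompt, instances), which would query the black-box LLM and return a validation score, is hypothetical, and candidates here are drawn uniformly at random rather than proposed by a surrogate.

```python
import math
import random

def hyperband_prompt_selection(prompts, validation_set, evaluate_prompt, eta=3):
    """Hyperband-style multi-fidelity search over candidate prompts.

    Weak prompts are discarded after being scored on only a few validation
    instances, so that only promising prompts see the full validation set.
    """
    max_budget = len(validation_set)
    s_max = int(math.log(max_budget, eta))
    best_prompt, best_score = None, -math.inf

    for s in reversed(range(s_max + 1)):
        # Number of starting prompts and initial per-prompt budget for this bracket.
        n = int(math.ceil((s_max + 1) / (s + 1) * eta ** s))
        r = max_budget * eta ** (-s)
        candidates = random.sample(prompts, min(n, len(prompts)))

        # Successive halving: grow the budget by eta, keep the top 1/eta each round.
        for i in range(s + 1):
            budget = min(max_budget, int(round(r * eta ** i)))
            subset = validation_set[:budget]
            scores = [evaluate_prompt(p, subset) for p in candidates]
            ranked = sorted(zip(scores, candidates), key=lambda sc: -sc[0])
            if ranked[0][0] > best_score:
                best_score, best_prompt = ranked[0]
            keep = max(1, len(candidates) // eta)
            candidates = [p for _, p in ranked[:keep]]

    return best_prompt, best_score
```

In the authors' method, the surrogate guides which prompts are proposed for evaluation while Hyperband decides how many validation instances each receives; the uniform sampling above is only a placeholder for that model-based proposal step.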
@article{schneider2025_2412.07820,
  title   = {Hyperband-based Bayesian Optimization for Black-box Prompt Selection},
  author  = {Lennart Schneider and Martin Wistuba and Aaron Klein and Jacek Golebiowski and Giovanni Zappella and Felice Antonio Merra},
  journal = {arXiv preprint arXiv:2412.07820},
  year    = {2025}
}