Aligning Web Query Generation with Ranking Objectives via Direct Preference Optimization

Neural retrieval models excel in Web search, but their training requires substantial amounts of labeled query-document pairs, which are costly to obtain. With the widespread availability of Web document collections such as ClueWeb22, synthetic queries generated by large language models offer a scalable alternative. However, synthetic training queries often vary in quality, leading to suboptimal downstream retrieval performance. Existing methods typically filter out noisy query-document pairs based on signals from an external re-ranker. In contrast, we propose a framework that leverages Direct Preference Optimization (DPO) to integrate ranking signals into the query generation process, directly optimizing the model towards generating high-quality queries that maximize downstream retrieval effectiveness. Experiments show higher ranker-assessed relevance for generated query-document pairs after DPO, leading to stronger downstream performance on the MS MARCO benchmark compared to baseline models trained with synthetic data.
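The core idea can be illustrated with a small sketch: for each document, several candidate queries are sampled and scored by a re-ranker; the best- and worst-scored queries form a (chosen, rejected) preference pair, and the generator is then optimized with the standard DPO objective. The function names, the `min_gap` threshold, and the toy scores below are illustrative assumptions, not the paper's implementation.

```python
import math

def build_preference_pairs(doc_queries, min_gap=0.1):
    """Turn ranker-scored synthetic queries into DPO preference pairs.

    doc_queries: {doc_id: [(query_text, ranker_score), ...]}
    Returns [(doc_id, chosen_query, rejected_query)] where the score gap
    between chosen and rejected is at least `min_gap` (assumed threshold).
    """
    pairs = []
    for doc_id, scored in doc_queries.items():
        ranked = sorted(scored, key=lambda q: q[1], reverse=True)
        (chosen, s_best), (rejected, s_worst) = ranked[0], ranked[-1]
        if s_best - s_worst >= min_gap:
            pairs.append((doc_id, chosen, rejected))
    return pairs

def dpo_loss(lp_chosen, lp_rejected, ref_lp_chosen, ref_lp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * ((policy margin) - (reference margin))),
    where each margin is log p(chosen) - log p(rejected)."""
    margin = (lp_chosen - ref_lp_chosen) - (lp_rejected - ref_lp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy example: one document with three ranker-scored candidate queries.
pairs = build_preference_pairs(
    {"d1": [("cheap flights to lisbon", 0.9),
            ("lisbon", 0.5),
            ("random string", 0.2)]}
)
```

In practice the two log-probability margins come from the fine-tuned query generator and a frozen reference copy of it; the loss pushes the generator to prefer queries the re-ranker judged more relevant to their source document.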
@article{coelho2025_2505.19307,
  title   = {Aligning Web Query Generation with Ranking Objectives via Direct Preference Optimization},
  author  = {João Coelho and Bruno Martins and João Magalhães and Chenyan Xiong},
  journal = {arXiv preprint arXiv:2505.19307},
  year    = {2025}
}