Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$

Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, how much external context to retrieve remains an open problem: fixing the retrieval size risks either wasting tokens or omitting key evidence. Existing adaptive methods like Self-RAG and Self-Route rely on iterative LLM prompting and perform well on factoid QA, but struggle with aggregation QA, where the optimal context size is both unknown and variable. We present Adaptive-$k$ retrieval, a simple and effective single-pass method that adaptively selects the number of passages based on the distribution of the similarity scores between the query and the candidate passages. It does not require model fine-tuning, extra LLM inferences, or changes to existing retriever-reader pipelines. On both factoid and aggregation QA benchmarks, Adaptive-$k$ matches or outperforms fixed-$k$ baselines while using up to 10x fewer tokens than full-context input, yet still retrieves 70% of relevant passages. It improves accuracy across five LCLMs and two embedding models, highlighting that dynamically adjusting context size leads to more efficient and accurate QA.
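The abstract does not spell out the selection rule beyond "the distribution of the similarity scores," so the sketch below assumes a largest-gap cutoff over the sorted scores as one plausible instantiation of a single-pass, tuning-free selection step; the function name adaptive_k_select, the max_k cap, and the toy embeddings are hypothetical, not the paper's implementation.

import numpy as np

def adaptive_k_select(query_emb, passage_embs, max_k=None):
    """Select a variable number of passages for one query.

    Illustrative heuristic: rank passages by cosine similarity to the query
    and cut the ranking at the largest drop between consecutive scores,
    on the idea that relevant passages cluster above a gap in the score
    distribution. No fine-tuning and no extra LLM calls are involved.
    """
    # Cosine similarity between the query and every candidate passage.
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    scores = p @ q

    # Sort passages from most to least similar, optionally capped at max_k.
    order = np.argsort(-scores)
    if max_k is not None:
        order = order[:max_k]
    sorted_scores = scores[order]

    # Cut at the largest gap between consecutive sorted scores.
    gaps = sorted_scores[:-1] - sorted_scores[1:]
    k = int(np.argmax(gaps)) + 1 if len(gaps) > 0 else len(order)
    return order[:k].tolist()

# Usage with random toy vectors standing in for a real embedding model.
rng = np.random.default_rng(0)
query = rng.normal(size=384)
passages = rng.normal(size=(100, 384))
selected = adaptive_k_select(query, passages, max_k=20)
print(f"selected {len(selected)} passages: {selected}")

Because the cutoff is computed directly from the retriever's similarity scores, a step like this slots into an existing retriever-reader pipeline without changing either component, which is the property the abstract emphasizes.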
@article{taguchi2025_2506.08479,
  title   = {Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$},
  author  = {Chihiro Taguchi and Seiji Maekawa and Nikita Bhutani},
  journal = {arXiv preprint arXiv:2506.08479},
  year    = {2025}
}