XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples

Recent studies have shown that leveraging off-the-shelf or fine-tuned retrievers capable of retrieving high-quality in-context examples significantly improves in-context learning in English. However, adapting these methods to other languages, especially low-resource ones, is challenging due to the scarcity of cross-lingual retrievers and annotated data. In this paper, we introduce XAMPLER: Cross-Lingual Example Retrieval, a method that tackles cross-lingual in-context learning using only annotated English data. XAMPLER first trains a retriever on positive/negative English pairs, which are constructed from the predictions of a multilingual large language model during in-context learning. The trained retriever is then directly employed to retrieve English examples as few-shot demonstrations for in-context learning in target languages. Experiments on SIB200, a massively multilingual text classification benchmark covering 176 languages, demonstrate that XAMPLER substantially improves in-context learning performance across languages. Our code is available at https://github.com/cisnlp/XAMPLER.
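The two-stage pipeline described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy bag-of-words `embed` stands in for a fine-tuned neural retriever encoder, and `llm_predict` is a hypothetical stand-in for querying the multilingual LLM.

```python
# Sketch of an XAMPLER-style pipeline. Hypothetical helper names;
# the real method fine-tunes a neural retriever on these labels.
from collections import Counter
import math


def embed(text):
    """Toy bag-of-words embedding, standing in for a trained retriever encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_training_pairs(english_pool, llm_predict):
    """Stage 1: label each (query, candidate) pair as positive/negative
    depending on whether the multilingual LLM answers the query correctly
    when the candidate serves as the in-context example."""
    pairs = []
    for query, gold in english_pool:
        for cand_text, cand_label in english_pool:
            pred = llm_predict(query, (cand_text, cand_label))
            pairs.append((query, cand_text, pred == gold))
    return pairs


def retrieve(query, english_pool, k=2):
    """Stage 2: use the retriever's similarity to fetch top-k English
    examples as few-shot demonstrations for a (possibly non-English) query."""
    q = embed(query)
    ranked = sorted(english_pool,
                    key=lambda ex: cosine(q, embed(ex[0])),
                    reverse=True)
    return ranked[:k]
```

In the actual method, the pairs from stage 1 supervise retriever fine-tuning, so that retrieval similarity reflects how useful an example is to the LLM rather than raw textual overlap.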
@article{lin2025_2405.05116,
  title={XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples},
  author={Peiqin Lin and André F. T. Martins and Hinrich Schütze},
  journal={arXiv preprint arXiv:2405.05116},
  year={2025}
}