ResearchTrend.AI
Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples

19 June 2025
Soumya Suvra Ghosal
Vaibhav Singh
Akash Ghosh
Soumyabrata Pal
Subhadip Baidya
Sriparna Saha
Dinesh Manocha
Main: 8 pages · 5 figures · 5 tables · Bibliography: 4 pages · Appendix: 4 pages
Abstract

Reward models are essential for aligning large language models (LLMs) with human preferences. However, most open-source multilingual reward models are primarily trained on preference datasets in high-resource languages, resulting in unreliable reward signals for low-resource Indic languages. Collecting large-scale, high-quality preference data for these languages is prohibitively expensive, making preference-based training approaches impractical. To address this challenge, we propose RELIC, a novel in-context learning framework for reward modeling in low-resource Indic languages. RELIC trains a retriever with a pairwise ranking objective to select in-context examples from auxiliary high-resource languages that most effectively highlight the distinction between preferred and less-preferred responses. Extensive experiments on three preference datasets (PKU-SafeRLHF, WebGPT, and HH-RLHF) using state-of-the-art open-source reward models demonstrate that RELIC significantly improves reward model accuracy for low-resource Indic languages, consistently outperforming existing example selection methods. For example, on Bodo, a low-resource Indic language, using a LLaMA-3.2-3B reward model, RELIC achieves accuracy improvements of 12.81% and 10.13% over zero-shot prompting and the state-of-the-art example selection method, respectively.
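The abstract describes training a retriever with a pairwise ranking objective to pick in-context examples that separate preferred from less-preferred responses. As a minimal sketch of that idea (not the paper's actual implementation), the snippet below pairs a hinge-style pairwise ranking loss, which penalizes the retriever when a less-helpful candidate scores within a margin of a more-helpful one, with a cosine-similarity scorer for selecting the top-k candidate examples. All function names and the use of raw cosine similarity are illustrative assumptions; RELIC's retriever is learned.

```python
import numpy as np

def pairwise_ranking_loss(score_pos, score_neg, margin=1.0):
    # Hinge-style pairwise ranking loss: zero when the more-helpful
    # candidate outscores the less-helpful one by at least `margin`,
    # positive otherwise. (Illustrative form, not RELIC's exact objective.)
    return np.maximum(0.0, margin - (score_pos - score_neg))

def select_top_k(query_emb, candidate_embs, k=2):
    # Score each candidate in-context example by cosine similarity to the
    # query and return the indices of the k highest-scoring candidates.
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims)[:k]

# Usage: a candidate that clearly outscores a weaker one incurs no loss;
# a near-tie incurs a positive penalty that training would push down.
print(pairwise_ranking_loss(2.0, 0.5))   # well-separated pair -> 0.0
print(pairwise_ranking_loss(0.5, 0.4))   # near-tie -> 0.9
print(select_top_k(np.array([1.0, 0.0]),
                   np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])))
```

In a learned retriever, the embeddings themselves would be produced by a trainable encoder and updated by backpropagating this ranking loss; here they are fixed vectors purely to show the scoring and selection mechanics.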

@article{ghosal2025_2506.16502,
  title={Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples},
  author={Soumya Suvra Ghosal and Vaibhav Singh and Akash Ghosh and Soumyabrata Pal and Subhadip Baidya and Sriparna Saha and Dinesh Manocha},
  journal={arXiv preprint arXiv:2506.16502},
  year={2025}
}