Reinforcement Learning with Human Feedback (RLHF) is considered a standard approach to fine-tuning Large Language Models (LLMs). However, such methods face limitations such as unsound black-box reward models, difficulties in collecting human preference data, and a reliance on sparse scalar rewards. As a result, they often fall short on tasks that require complex, domain-specific understanding.
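To make the "sparse scalar rewards" limitation concrete, here is a minimal, hedged sketch (not from the paper) contrasting the single scalar a black-box reward model returns in standard RLHF with the richer, structured signal a symbolic tool (e.g., a compiler or solver) could provide. All function and field names below are hypothetical placeholders, not the authors' API.

```python
# Sketch only: illustrates scalar vs. structured feedback. Names are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass
class SymbolicFeedback:
    """Structured feedback: per-span signals plus diagnostics, not one number."""
    passed: bool
    per_token_signal: List[float]  # e.g., derived from compiler/solver output
    diagnostics: List[str]         # human-readable reasons (error messages, etc.)


def scalar_rlhf_reward(completion: str) -> float:
    """Standard RLHF: a learned reward model maps the whole completion to one scalar."""
    return 0.37  # placeholder for a reward-model call


def symbolic_feedback(completion: str) -> SymbolicFeedback:
    """Idea behind symbolic feedback: keep the tool's structured verdict instead of
    collapsing it to a scalar. A real system would run a checker and parse its output."""
    tokens = completion.split()
    return SymbolicFeedback(
        passed=False,
        per_token_signal=[0.0] * len(tokens),
        diagnostics=["example diagnostic: assertion not provable"],
    )
```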
```bibtex
@article{jha2025_2405.16661,
  title   = {RLSF: Fine-tuning LLMs via Symbolic Feedback},
  author  = {Piyush Jha and Prithwish Jana and Pranavkrishna Suresh and Arnav Arora and Vijay Ganesh},
  journal = {arXiv preprint arXiv:2405.16661},
  year    = {2025}
}
```