Reinforcement learning (RL) agents typically optimize their policies by performing expensive backward passes to update their network parameters. However, some agents can solve new tasks without updating any parameters by simply conditioning on additional context such as their action-observation histories. This paper surveys work on such behavior, known as in-context reinforcement learning.
@article{moeini2025_2502.07978,
  title={A Survey of In-Context Reinforcement Learning},
  author={Amir Moeini and Jiuqi Wang and Jacob Beck and Ethan Blaser and Shimon Whiteson and Rohan Chandra and Shangtong Zhang},
  journal={arXiv preprint arXiv:2502.07978},
  year={2025}
}