Title |
---|
Towards Rationality in Language and Multimodal Agents: A Survey Bowen Jiang Yangxinyu Xie Xiaomeng Wang Yuan Yuan Camillo J. Taylor Tanwi Mallick Weijie J. Su |
Transfer Q Star: Principled Decoding for LLM Alignment Souradip Chakraborty Soumya Suvra Ghosal Ming Yin Dinesh Manocha Mengdi Wang Amrit Singh Bedi Furong Huang |
Multi-turn Reinforcement Learning from Preference Human Feedback Lior Shani Aviv Rosenberg Asaf B. Cassel Oran Lang Daniele Calandriello ... Bilal Piot Idan Szpektor Avinatan Hassidim Yossi Matias Rémi Munos |