Title |
---|
Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning Jared Joselowitz Arjun Jagota Satyapriya Krishna Sonali Parbhoo Nyal Patel |
Demonstration Based Explainable AI for Learning from Demonstration Methods Morris Gu Elizabeth Croft Dana Kulic |
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification Tao Meng Ninareh Mehrabi Palash Goyal Anil Ramakrishna Aram Galstyan Richard Zemel Kai-Wei Chang Rahul Gupta Charith Peris |