ADAPT: Actively Discovering and Adapting to Preferences for any Task

5 April 2025
Maithili Patel
Xavier Puig
Ruta Desai
Roozbeh Mottaghi
Sonia Chernova
Joanne Truong
Akshara Rai
Abstract

Assistive agents should be able to perform under-specified long-horizon tasks while respecting user preferences. We introduce Actively Discovering and Adapting to Preferences for any Task (ADAPT) -- a benchmark designed to evaluate agents' ability to adhere to user preferences across various household tasks through active questioning. Next, we propose Reflection-DPO, a novel training approach for adapting large language models (LLMs) to the task of active questioning. Reflection-DPO finetunes a 'student' LLM to follow the actions of a privileged 'teacher' LLM, and optionally ask a question to gather necessary information to better predict the teacher action. We find that prior approaches that use state-of-the-art LLMs fail to sufficiently follow user preferences in ADAPT due to insufficient questioning and poor adherence to elicited preferences. In contrast, Reflection-DPO achieves a higher rate of satisfying user preferences, outperforming a zero-shot chain-of-thought baseline by 6.1% on unseen users.
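The abstract identifies Reflection-DPO as a DPO-style finetuning method, but does not spell out its objective. As background, the sketch below shows the standard pairwise DPO loss that such a method builds on; the function name, the toy log-probabilities, and the mapping of "chosen" to the teacher-matching action (or a clarifying question) and "rejected" to a mismatched action are illustrative assumptions, not the paper's exact formulation.

```python
import math

def dpo_loss(logp_student_chosen, logp_student_rejected,
             logp_ref_chosen, logp_ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative).

    In a Reflection-DPO-like setup, the 'chosen' completion could be
    the teacher-aligned action or an information-gathering question,
    and the 'rejected' completion an action that ignores preferences.
    """
    # Implicit reward margin: how much more the student (vs. a frozen
    # reference model) prefers the chosen completion over the rejected one.
    margin = beta * ((logp_student_chosen - logp_ref_chosen)
                     - (logp_student_rejected - logp_ref_rejected))
    # -log sigmoid(margin): shrinks as the student's preference for the
    # chosen completion grows relative to the reference model.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy log-probabilities: the student already slightly prefers the
# chosen completion, so the loss is below -log(0.5).
loss = dpo_loss(-2.0, -3.0, -2.5, -2.5, beta=0.1)
```

With a positive margin the loss falls below `math.log(2)` (the value at a zero margin), which is the direction gradient descent pushes during finetuning.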

@article{patel2025_2504.04040,
  title={ADAPT: Actively Discovering and Adapting to Preferences for any Task},
  author={Maithili Patel and Xavier Puig and Ruta Desai and Roozbeh Mottaghi and Sonia Chernova and Joanne Truong and Akshara Rai},
  journal={arXiv preprint arXiv:2504.04040},
  year={2025}
}