Cracking the Code of Action: a Generative Approach to Affordances for Reinforcement Learning

24 April 2025
Lynn Cherif
Flemming Kondrup
David Venuto
Ankit Anand
Doina Precup
Khimya Khetarpal
Abstract

Agents that can autonomously navigate the web through a graphical user interface (GUI) using a unified action space (e.g., mouse and keyboard actions) can require very large amounts of domain-specific expert demonstrations to achieve good performance. Low sample efficiency is often exacerbated in sparse-reward and large-action-space environments, such as a web GUI, where only a few actions are relevant in any given situation. In this work, we consider the low-data regime, with limited or no access to expert behavior. To enable sample-efficient learning, we explore the effect of constraining the action space through intent-based affordances -- i.e., considering in any situation only the subset of actions that achieve a desired outcome. We propose Code as Generative Affordances (CoGA), a method that leverages pre-trained vision-language models (VLMs) to generate code that determines affordable actions through implicit intent-completion functions, using a fully automated program generation and verification pipeline. These programs are then used in the loop of a reinforcement learning agent to return a set of affordances given a pixel observation. By greatly reducing the number of actions that an agent must consider, we demonstrate on a wide range of tasks in the MiniWob++ benchmark that: 1) CoGA is orders of magnitude more sample efficient than its RL agent, 2) CoGA's programs can generalize within a family of tasks, and 3) CoGA performs better or on par compared with behavior cloning when a small number of expert demonstrations is available.
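To make the in-the-loop mechanism concrete, the sketch below shows one way a generated affordance program could restrict an agent's action choice to the afforded subset. This is a minimal illustration under stated assumptions, not the authors' pipeline: the names `affordance_program`, `QAgent`, `select_action`, and the fixed action subset are all hypothetical placeholders for what a VLM-generated, verified program would compute from the pixel observation.

```python
# Minimal sketch (not the CoGA implementation) of constraining an RL agent's
# action selection with a generated affordance program. All names are hypothetical.
import random
from typing import Callable, List

import numpy as np

ACTION_SPACE: List[int] = list(range(100))  # e.g., discretized click/key actions


def affordance_program(observation: np.ndarray) -> List[int]:
    """Stand-in for a VLM-generated program: inspects the pixel observation
    and returns only the actions judged relevant to the task's intent."""
    # A real generated program would parse GUI elements from pixels;
    # here we return a small fixed subset purely for illustration.
    return [3, 7, 42]


class QAgent:
    """Toy agent that scores every action; placeholder for the RL learner."""

    def q_values(self, observation: np.ndarray) -> np.ndarray:
        return np.random.rand(len(ACTION_SPACE))


def select_action(agent: QAgent, obs: np.ndarray,
                  affordances: Callable[[np.ndarray], List[int]],
                  epsilon: float = 0.1) -> int:
    """Epsilon-greedy selection restricted to the afforded subset of actions."""
    allowed = affordances(obs) or ACTION_SPACE  # fall back if nothing is afforded
    if random.random() < epsilon:
        return random.choice(allowed)
    q = agent.q_values(obs)
    return max(allowed, key=lambda a: q[a])


if __name__ == "__main__":
    obs = np.zeros((64, 64, 3))  # dummy pixel observation
    print("chosen action:", select_action(QAgent(), obs))
```

Restricting exploration and value estimation to the afforded actions is what drives the sample-efficiency gains the abstract describes: the agent never wastes experience on actions the affordance program deems irrelevant to the current intent.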

@article{cherif2025_2504.17282,
  title={Cracking the Code of Action: a Generative Approach to Affordances for Reinforcement Learning},
  author={Lynn Cherif and Flemming Kondrup and David Venuto and Ankit Anand and Doina Precup and Khimya Khetarpal},
  journal={arXiv preprint arXiv:2504.17282},
  year={2025}
}