ResearchTrend.AI
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

23 May 2024
Christopher Rawles
Sarah Clinckemaillie
Yifan Chang
Jonathan Waltz
Gabrielle Lau
Marybeth Fair
Alice Li
Will Bishop
Wei Li
Folawiyo Campbell-Ajala
Daniel Toyama
Robert Berry
Divya Tyamagundlu
Timothy Lillicrap
Oriana Riva
    LLMAG

Papers citing "AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents"

50 / 50 papers shown
EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation
Biao Yi
Xavier Hu
Y. Chen
Shengyu Zhang
Hongxia Yang
Fan Wu
Fei Wu
LLMAG
161
0
0
08 May 2025
ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang
Zhixiong Zeng
WenKang Han
Yufeng Zhong
Liming Zheng
Shuai Fu
Jingyuan Chen
Lin Ma
126
0
0
01 May 2025
AndroidGen: Building an Android Language Agent under Data Scarcity
Hanyu Lai
Junjie Gao
Xiao-Yang Liu
Y. Xu
S. Zhang
Yuxiao Dong
Jie Tang
LLMAG
74
0
0
27 Apr 2025
Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation
Zhiyuan Hu
Shiyun Xiong
Yifan Zhang
See-Kiong Ng
Anh Tuan Luu
Bo An
Shuicheng Yan
Bryan Hooi
41
0
0
22 Apr 2025
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners
Yuhang Liu
Pengxiang Li
C. Xie
Xavier Hu
Xiaotian Han
Shengyu Zhang
Hongxia Yang
Fei Wu
LLMAG
LM&Ro
LRM
AI4CE
72
1
0
19 Apr 2025
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
Bofei Zhang
Zirui Shang
Zhi Gao
Wang Zhang
Rui Xie
Xiaojian Ma
Tao Yuan
Xinxiao Wu
Song-Chun Zhu
Qing Li
LLMAG
35
1
0
17 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agent
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
VGen
51
0
0
15 Apr 2025
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang
Zichen Ding
Chang Ma
Zijie Chen
Qiushi Sun
Zhenzhong Lan
Junxian He
127
0
0
14 Apr 2025
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents
Saaket Agashe
Kyle Wong
Vincent Tu
Jiachen Yang
Ang Li
Xin Eric Wang
LLMAG
68
1
0
01 Apr 2025
Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study
Li Lyna Zhang
Longxi Gao
Mengwei Xu
LRM
37
0
0
21 Mar 2025
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment
Gaole Dai
Shiqi Jiang
Ting Cao
Yuanchun Li
Y. Yang
Rui Tan
Mo Li
Lili Qiu
46
0
0
20 Mar 2025
DeskVision: Large Scale Desktop Region Captioning for Advanced GUI Agents
Yibin Xu
Liang Yang
Hao Chen
Hua Wang
Zhi Chen
Yaohua Tang
3DV
58
0
0
14 Mar 2025
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
Yu Zhang
Shutong Qiao
Jiaqi Zhang
Tzu-Heng Lin
Chen Gao
Y. Li
LM&Ro
LM&MA
87
1
0
07 Mar 2025
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
Wenjia Jiang
Yangyang Zhuang
Chenxi Song
Xu Yang
Chi Zhang
Chi Zhang
LLMAG
96
1
0
04 Mar 2025
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
Taiyi Wang
Zhihao Wu
Jianheng Liu
Jianye Hao
J. Wang
Kun Shao
OffRL
36
13
0
24 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
80
14
0
17 Feb 2025
AppVLM: A Lightweight Vision Language Model for Online App Control
Georgios Papoudakis
Thomas Coste
Zhihao Wu
Jianye Hao
J. Wang
Kun Shao
49
1
0
10 Feb 2025
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Zhenhailong Wang
Haiyang Xu
Junyang Wang
Xi Zhang
Ming Yan
J. Zhang
Fei Huang
Heng Ji
43
9
0
20 Jan 2025
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
Y. Liu
Pengxiang Li
Zishu Wei
C. Xie
Xueyu Hu
Xinchen Xu
Shengyu Zhang
Xiaotian Han
Hongxia Yang
Fei Wu
LLMAG
LRM
53
11
0
08 Jan 2025
Aria-UI: Visual Grounding for GUI Instructions
Yuhao Yang
Yue Wang
Dongxu Li
Ziyang Luo
Bei Chen
C. Huang
Junnan Li
LM&Ro
LLMAG
106
14
0
20 Dec 2024
Falcon-UI: Understanding GUI Before Following User Instructions
Huawen Shen
Chang-Shu Liu
Gengluo Li
Xinlong Wang
Yu Zhou
Can Ma
Xiangyang Ji
LLMAG
83
4
0
12 Dec 2024
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
W. Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
68
14
0
07 Nov 2024
Foundations and Recent Trends in Multimodal Mobile Agents: A Survey
Biao Wu
Yanda Li
Meng Fang
Zirui Song
Zhiwei Zhang
Yunchao Wei
L. Chen
LM&Ro
LLMAG
OffRL
AI4TS
41
4
0
04 Nov 2024
AutoGLM: Autonomous Foundation Agents for GUIs
Xiao Liu
Bo Qin
Dongzhu Liang
Guang Dong
Hanyu Lai
...
Yujia Wang
Y. Xu
Zehan Qi
Yuxiao Dong
Jie Tang
LLMAG
59
11
0
28 Oct 2024
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning
Xiaoqiang Wang
Bang Liu
LLMAG
LM&Ro
LRM
31
6
0
24 Oct 2024
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
Zhangheng Li
Keen You
H. Zhang
Di Feng
Harsh Agrawal
Xiujun Li
Mohana Prasad Sathya Moorthy
Jeff Nichols
Y. Yang
Zhe Gan
MLLM
57
18
0
24 Oct 2024
MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
Juyong Lee
Dongyoon Hahm
June Suk Choi
W. Bradley Knox
Kimin Lee
LLMAG
ELM
AAML
LM&Ro
43
2
0
23 Oct 2024
Lightweight Neural App Control
Filippos Christianos
Georgios Papoudakis
Thomas Coste
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
52
4
0
23 Oct 2024
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
Jingxuan Chen
Derek Yuen
Bin Xie
Y. Yang
Gongwei Chen
...
Liqiang Nie
Yasheng Wang
Jianye Hao
Jun Wang
Kun Shao
LLMAG
45
5
0
19 Oct 2024
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe
Jiuzhou Han
Shuyu Gan
Jiachen Yang
Ang Li
Xin Eric Wang
LLMAG
LM&Ro
AIFin
39
20
0
10 Oct 2024
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro
LLMAG
76
49
0
07 Oct 2024
Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents
Junting Lu
Zhiyang Zhang
Fangkai Yang
Jue Zhang
Lu Wang
Chao Du
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
LLMAG
28
1
0
25 Sep 2024
MobileViews: A Large-Scale Mobile GUI Dataset
Longxi Gao
Li Zhang
Shihe Wang
Shangguang Wang
Yuanchun Li
Mengwei Xu
28
5
0
22 Sep 2024
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Rogerio Bonatti
Dan Zhao
Francesco Bonacci
Dillon Dupont
Sara Abdali
...
Justin Wagle
K. Koishida
A. Bucker
Lawrence Jang
Zack Hui
LLMAG
43
26
0
12 Sep 2024
Agent Workflow Memory
Zora Zhiruo Wang
Jiayuan Mao
Daniel Fried
Graham Neubig
LLMAG
38
21
0
11 Sep 2024
TinyAgent: Function Calling at the Edge
Lutfi Eren Erdogan
Nicholas Lee
Siddharth Jha
Sehoon Kim
Ryan Tabrizi
Suhong Moon
Coleman Hooper
Gopala Anumanchipalli
Kurt Keutzer
Amir Gholami
LLMAG
39
11
0
01 Sep 2024
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Ori Yoran
S. Amouyal
Chaitanya Malaviya
Ben Bogin
Ofir Press
Jonathan Berant
LLMAG
39
31
0
22 Jul 2024
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin
Maxime Gasse
Massimo Caccia
I. Laradji
Manuel Del Verme
...
Megh Thakkar
Quentin Cappart
David Vazquez
Nicolas Chapados
Alexandre Lacoste
LLMAG
51
53
0
12 Mar 2024
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor
Y. Butala
M. Russak
Jing Yu Koh
Kiran Kamble
Waseem Alshikh
Ruslan Salakhutdinov
LLMAG
51
44
0
27 Feb 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
123
83
0
12 Feb 2024
UFO: A UI-Focused Agent for Windows OS Interaction
Chaoyun Zhang
Liqun Li
Shilin He
Xu Zhang
Bo Qiao
...
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
LLMAG
64
67
0
08 Feb 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Kanzhi Cheng
Qiushi Sun
Yougang Chu
Fangzhi Xu
Yantao Li
Jianbing Zhang
Zhiyong Wu
LLMAG
172
138
0
17 Jan 2024
CogAgent: A Visual Language Model for GUI Agents
Wenyi Hong
Weihan Wang
Qingsong Lv
Jiazheng Xu
Wenmeng Yu
...
Juanzi Li
Bin Xu
Yuxiao Dong
Ming Ding
Jie Tang
MLLM
142
319
0
14 Dec 2023
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Tao Li
Gang Li
Zhiwei Deng
Bryan Wang
Yang Li
LM&Ro
LLMAG
57
23
0
12 Oct 2023
Mobile-Env: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction
Danyang Zhang
Zhennan Shen
Rui Xie
Situo Zhang
Tianbao Xie
...
Siyuan Chen
Lu Chen
Hongshen Xu
Ruisheng Cao
Kai Yu
ELM
LLMAG
32
3
0
14 May 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
89
77
0
13 Mar 2023
Understanding HTML with Large Language Models
Izzeddin Gur
Ofir Nachum
Yingjie Miao
Mustafa Safdari
Austin Huang
Aakanksha Chowdhery
Sharan Narang
Noah Fiedel
Aleksandra Faust
AI4CE
136
70
0
08 Oct 2022
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
240
2,479
0
06 Oct 2022
Enabling Conversational Interaction with Mobile UI using Large Language Models
Bryan Wang
Gang Li
Yang Li
175
132
0
18 Sep 2022
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
279
1,121
0
18 Apr 2021