MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions

24 February 2025
Yuxuan Liu
Hongda Sun
Wei Liu
Jian Luan
Bo Du
Rui Yan

Papers citing "MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions"

42 / 42 papers shown
Title
MAPLE: A Mobile Agent with Persistent Finite State Machines for Structured Task Reasoning
Linqiang Guo
Wei Liu
Yi Wen Heng
Tse-Hsun (Peter) Chen
Yang Wang
LLMAG
23
0
0
29 May 2025
The Truth Becomes Clearer Through Debate! Multi-Agent Systems with Large Language Models Unmask Fake News
Yuhan Liu
Yang Liu
Xiaoqing Zhang
Xiuying Chen
Rui Yan
LLMAG
107
2
0
13 May 2025
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking
Yuxuan Liu
Hongda Sun
Wenya Guo
Xinyan Xiao
Cunli Mao
Zhengtao Yu
Rui Yan
95
3
0
22 Feb 2025
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding
Qinzhuo Wu
Weikai Xu
Wei Liu
Tao Tan
Jianfeng Liu
Ang Li
Jian Luan
Bin Wang
Shuo Shang
VLM
69
14
0
23 Sep 2024
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
Shihan Deng
Weikai Xu
Hongda Sun
Wei Liu
Tao Tan
...
Ang Li
Jian Luan
Bin Wang
Rui Yan
Shuo Shang
LLMAG
64
13
0
01 Jul 2024
MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents
Luyuan Wang
Yongyu Deng
Yiwei Zha
Guodong Mao
Qinmin Wang
Tianchen Min
Wei Chen
Shoufa Chen
LLMAG
60
18
0
12 Jun 2024
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Junyang Wang
Haiyang Xu
Haitao Jia
Xi Zhang
Ming Yan
Weizhou Shen
Ji Zhang
Fei Huang
Jitao Sang
LM&Ro
LLMAG
66
67
0
03 Jun 2024
Facilitating Multi-Role and Multi-Behavior Collaboration of Large Language Models for Online Job Seeking and Recruiting
Hongda Sun
Hongzhan Lin
Haiyu Yan
Chen Zhu
Yang Song
Xin Gao
Shuo Shang
Rui Yan
43
8
0
28 May 2024
LLMs with Personalities in Multi-issue Negotiation Games
Sean Noh
Ho-Chun Herbert Chang
LLMAG
77
13
0
08 May 2024
Training a Vision Language Model as Smartphone Assistant
Nicolai Dorka
Janusz Marecki
Ammar Anwar
43
3
0
12 Apr 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Jen-tse Huang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM
LLMAG
122
37
0
18 Mar 2024
From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News
Yuhan Liu
Xiuying Chen
Xiaoqing Zhang
Xing Gao
Ji Zhang
Rui Yan
AI4CE
49
32
0
14 Mar 2024
Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering
Hongda Sun
Yuxuan Liu
Chengwei Wu
Haiyu Yan
Cheng Tai
Xin Gao
Shuo Shang
Rui Yan
56
10
0
08 Mar 2024
Android in the Zoo: Chain-of-Action-Thought for GUI Agents
Jiwen Zhang
Jihao Wu
Yihua Teng
Minghui Liao
Nuo Xu
Xiao Xiao
Zhongyu Wei
Duyu Tang
LLMAG
LM&Ro
76
72
0
05 Mar 2024
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Gilles Baechler
Srinivas Sunkara
Maria Wang
Fedir Zubach
Hassan Mansoor
Vincent Etter
Victor Carbune
Jason Lin
Jindong Chen
Abhanshu Sharma
142
54
0
07 Feb 2024
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Junyang Wang
Haiyang Xu
Jiabo Ye
Mingshi Yan
Weizhou Shen
Ji Zhang
Fei Huang
Jitao Sang
88
124
0
29 Jan 2024
Large Language Model based Multi-Agents: A Survey of Progress and Challenges
Taicheng Guo
Xiuying Chen
Yaqi Wang
Ruidi Chang
Shichao Pei
Nitesh Chawla
Olaf Wiest
Xiangliang Zhang
LLMAG
LM&Ro
AI4CE
LRM
118
301
0
21 Jan 2024
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Boyuan Zheng
Boyu Gou
Jihyung Kil
Huan Sun
Yu-Chuan Su
MLLM
VLM
LLMAG
85
252
0
03 Jan 2024
Intelligent Virtual Assistants with LLM-based Process Automation
Yanchu Guan
Dong Wang
Zhixuan Chu
Shiyu Wang
Feiyue Ni
Ruihua Song
Longfei Li
Jinjie Gu
Chenyi Zhuang
58
21
0
04 Dec 2023
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
An Yan
Zhengyuan Yang
Wanrong Zhu
Kevin Qinghong Lin
Linjie Li
...
Yiwu Zhong
Julian McAuley
Jianfeng Gao
Zicheng Liu
Lijuan Wang
LLMAG
LM&Ro
114
110
0
13 Nov 2023
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy
Hongda Sun
Weikai Xu
Wei Liu
Jian Luan
Bin Wang
Shuo Shang
Ji-Rong Wen
Rui Yan
LRM
96
26
0
28 Oct 2023
On Generative Agents in Recommendation
An Zhang
Yuxin Chen
Leheng Sheng
Xiang Wang
Tat-Seng Chua
78
55
0
16 Oct 2023
Lyfe Agents: Generative agents for low-cost real-time social interactions
Zhao Kaiya
Michelangelo Naim
J. Kondic
Manuel Cortes
Jiaxin Ge
Shuying Luo
Guangyu Robert Yang
Andrew Ahn
VLM
66
32
0
03 Oct 2023
You Only Look at Screens: Multimodal Chain-of-Action Agents
Zhuosheng Zhang
Aston Zhang
LLMAG
LM&Ro
53
114
0
20 Sep 2023
AutoDroid: LLM-powered Task Automation in Android
Hao Wen
Yuanchun Li
Guohong Liu
Shanhui Zhao
Tao Yu
Toby Jia-Jun Li
Shiqi Jiang
Yunhao Liu
Yaqin Zhang
Yunxin Liu
75
92
0
29 Aug 2023
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan
Weize Chen
Yusheng Su
Jianxuan Yu
Wei Xue
Shan Zhang
Jie Fu
Zhiyuan Liu
ELM
LLMAG
ALM
79
489
0
14 Aug 2023
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
Sirui Hong
Mingchen Zhuge
Jonathan Chen
Xiawu Zheng
Yuheng Cheng
...
Liyang Zhou
Chenyu Ran
Lingfeng Xiao
Chenglin Wu
Jürgen Schmidhuber
LLMAG
AIFin
76
548
0
01 Aug 2023
Android in the Wild: A Large-Scale Dataset for Android Device Control
Christopher Rawles
Alice Li
Daniel Rodriguez
Oriana Riva
Timothy Lillicrap
LM&Ro
77
159
0
19 Jul 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
137
709
0
23 May 2023
DroidBot-GPT: GPT-powered UI Automation for Android
Hao Wen
Hongmin Wang
Jiaxuan Liu
Yan Liang
LM&Ro
LM&MA
42
43
0
14 Apr 2023
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
374
1,907
0
07 Apr 2023
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Ge Li
Hasan Hammoud
Hani Itani
Dmitrii Khizbullin
Guohao Li
SyDa
ALM
111
488
0
31 Mar 2023
UGIF: UI Grounded Instruction Following
S. Venkatesh
Partha P. Talukdar
S. Narayanan
88
11
0
14 Nov 2022
Enabling Conversational Interaction with Mobile UI using Large Language Models
Bryan Wang
Gang Li
Yang Li
203
136
0
18 Sep 2022
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
Liangtai Sun
Xingyu Chen
Lu Chen
Tianle Dai
Zichen Zhu
Kai Yu
LLMAG
53
59
0
23 May 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
73
65
0
04 Feb 2022
Learning UI Navigation through Demonstrations composed of Macro Actions
Wei Li
LLMAG
44
9
0
16 Oct 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
192
156
0
07 Aug 2021
UIBert: Learning Generic Multimodal Representations for UI Understanding
Chongyang Bai
Xiaoxue Zang
Ying Xu
Srinivas Sunkara
Abhinav Rastogi
Jindong Chen
Blaise Agüera y Arcas
58
94
0
29 Jul 2021
Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels
Xiaoyi Zhang
Lilian de Greef
Amanda Swearngin
Samuel White
Kyle I. Murray
...
Jeffrey Nichols
Jason Wu
Chris Fleizach
Aaron Everitt
Jeffrey P. Bigham
306
170
0
13 Jan 2021
Screen2Vec: Semantic Embedding of GUI Screens and GUI Components
Toby Jia-Jun Li
Lindsay Popowski
Tom Michael Mitchell
Brad A. Myers
53
104
0
11 Jan 2021
Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements
Yongqian Li
Gang Li
Luheng He
Jingjie Zheng
Hong Li
Zhiwei Guan
49
108
0
08 Oct 2020