2205.11029
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
23 May 2022
Liangtai Sun
Xingyu Chen
Lu Chen
Tianle Dai
Zichen Zhu
Kai Yu
LLMAG
Papers citing "META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI"
15 / 15 papers shown
XBOUND: Exploring the Capability Boundaries of Device-Control Agents through Trajectory Tree Exploration
Shaoqing Zhang
Kehai Chen
Zhuosheng Zhang
Rumei Li
Rongxiang Weng
Yang Xiang
Liqiang Nie
Min Zhang
5
0
0
27 May 2025
X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System
Peng Wang
Ruihan Tao
Qiguang Chen
Mengkang Hu
Libo Qin
LLMAG
33
0
0
21 May 2025
MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions
Yuxuan Liu
Hongda Sun
Wei Liu
Jian Luan
Bo Du
Rui Yan
69
3
0
24 Feb 2025
Large Language Models Empowered Personalized Web Agents
Hongru Cai
Yongqi Li
Wenjie Wang
Fengbin Zhu
Xiaoyu Shen
Wenjie Li
Tat-Seng Chua
LLMAG
71
12
0
22 Oct 2024
Benchmarking Mobile Device Control Agents across Diverse Configurations
Juyong Lee
Taywon Min
Minyong An
Changyeon Kim
Kimin Lee
49
10
0
25 Apr 2024
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Kevin Xu
Yeganeh Kordi
Kate Sanders
Yizhong Wang
Adam Byerly
Jingyu Zhang
Benjamin Van Durme
Daniel Khashabi
LLMAG
75
6
0
18 Mar 2024
Android in the Zoo: Chain-of-Action-Thought for GUI Agents
Jiwen Zhang
Jihao Wu
Yihua Teng
Minghui Liao
Nuo Xu
Xiao Xiao
Zhongyu Wei
Duyu Tang
LLMAG
LM&Ro
55
58
0
05 Mar 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
39
63
0
08 Feb 2024
Dual-View Visual Contextualization for Web Navigation
Jihyung Kil
Chan Hee Song
Boyuan Zheng
Xiang Deng
Yu-Chuan Su
Wei-Lun Chao
EgoV
29
14
0
06 Feb 2024
Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?
Sang-Woo Lee
Sungdong Kim
Donghyeon Ko
Dong-hyun Ham
Youngki Hong
...
Wangkyo Jung
Kyunghyun Cho
Donghyun Kwak
H. Noh
W. Park
58
2
0
20 Dec 2022
DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning
Zhi Chen
Jijia Bao
Lu Chen
Yuncong Liu
Da Ma
...
Xinhsuai Dong
Fujiang Ge
Qingliang Miao
Jian-Guang Lou
Kai Yu
ALM
AI4CE
58
3
0
25 May 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
161
506
0
29 Dec 2020
FLIN: A Flexible Natural Language Interface for Web Navigation
Sahisnu Mazumder
Oriana Riva
LRM
62
23
0
24 Oct 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
Guosheng Lin
73
30
0
20 Oct 2020
Task-Oriented Dialogue as Dataflow Synthesis
Semantic Machines
Jacob Andreas
J. Bufe
David Burkett
Charles C. Chen
...
Izabela Witoszko
Jason Wolfe
A. Wray
Yuchen Zhang
Alexander Zotov
AIFin
200
154
0
24 Sep 2020