ResearchTrend.AI
UFO2: The Desktop AgentOS

20 April 2025
Chaoyun Zhang
He Huang
Chiming Ni
J. Mu
Si Qin
Shilin He
Lu Wang
Fangkai Yang
Pu Zhao
Chao Du
Liqun Li
Yu Kang
Zhao Jiang
Suzhen Zheng
Rujia Wang
Jiaxu Qian
Minghua Ma
Jian-Guang Lou
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
    LLMAG

Papers citing "UFO2: The Desktop AgentOS"

38 / 38 papers shown
BIMgent: Towards Autonomous Building Modeling via Computer-use Agents
Zihan Deng
Changyu Du
Stavros Nousias
A. Borrmann
LM&Ro AI4CE
15
0
0
08 Jun 2025
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Qianhui Wu
Kanzhi Cheng
Rui Yang
Chaoyun Zhang
Jianwei Yang
...
Huan Zhang
Tong Zhang
Jianbing Zhang
Dongmei Zhang
J. Gao
LM&Ro
57
0
0
03 Jun 2025
Text2Grad: Reinforcement Learning from Natural Language Feedback
Hanyang Wang
Lu Wang
Chaoyun Zhang
Tianjun Mao
Si Qin
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
74
0
0
28 May 2025
ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World
Runliang Niu
Jinglong Ji
Yi Chang
Qi Wang
50
0
0
25 May 2025
LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS
Kai Mei
Xi Zhu
Hang Gao
Shuhang Lin
Yongfeng Zhang
198
0
0
24 May 2025
API Agents vs. GUI Agents: Divergence and Convergence
Chaoyun Zhang
Shilin He
Liqun Li
Si Qin
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
LLMAG
139
3
0
14 Mar 2025
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Yujia Qin
Yining Ye
Junjie Fang
Han Wang
Shihao Liang
...
Haifeng Liu
F. Lin
Tao Peng
Xin Liu
Guang Shi
LLMAG LM&Ro
104
69
0
21 Jan 2025
Beyond Browsing: API-Based Web Agents
Yueqi Song
Frank F. Xu
Shuyan Zhou
Graham Neubig
133
23
0
21 Oct 2024
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe
Jiuzhou Han
Shuyu Gan
Jiachen Yang
Ang Li
Xin Eric Wang
LLMAG LM&Ro AIFin
103
38
0
10 Oct 2024
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents
Ido Levy
Ben Wiesel
Sami Marreed
Alon Oved
Avi Yaeli
Segev Shlomov
LLMAG
131
23
0
09 Oct 2024
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro LLMAG
245
96
0
07 Oct 2024
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Rogerio Bonatti
Dan Zhao
Francesco Bonacci
Dillon Dupont
Sara Abdali
...
Justin Wagle
K. Koishida
A. Bucker
Lawrence Jang
Zack Hui
LLMAG
118
45
0
12 Sep 2024
OmniParser for Pure Vision Based GUI Agent
Yadong Lu
Jianwei Yang
Yelong Shen
Ahmed Hassan Awadallah
MLLM
91
53
0
01 Aug 2024
Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection
Jun Liu
Chaoyun Zhang
Jiaxu Qian
Ming-Jie Ma
Si Qin
Chetan Bansal
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
AI4TS
80
11
0
24 May 2024
A Survey on the Memory Mechanism of Large Language Model based Agents
Zeyu Zhang
Xiaohe Bo
Chen Ma
Rui Li
Xu Chen
Quanyu Dai
Jieming Zhu
Zhenhua Dong
Ji-Rong Wen
LLMAG KELM
95
143
0
21 Apr 2024
AIOS: LLM Agent Operating System
Kai Mei
Zelong Li
Wujiang Xu
Wenyue Hua
Mingyu Jin
Shuyuan Xu
Ruosong Ye
Yingqiang Ge
Yongfeng Zhang
LLMAG
145
25
0
25 Mar 2024
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation
Siyuan Wang
Zhuohan Long
Zhihao Fan
Zhongyu Wei
Xuanjing Huang
LLMAG
94
38
0
18 Feb 2024
ScreenAgent: A Vision Language Model-driven Computer Control Agent
Runliang Niu
Jindong Li
Shiqi Wang
Yali Fu
Xiyu Hu
Xueyuan Leng
He Kong
Yi Chang
Qi Wang
LLMAG MLLM LM&Ro
122
47
0
09 Feb 2024
UFO: A UI-Focused Agent for Windows OS Interaction
Chaoyun Zhang
Liqun Li
Shilin He
Xu Zhang
Bo Qiao
...
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
LLMAG
166
83
0
08 Feb 2024
LLM Multi-Agent Systems: Challenges and Open Problems
Shanshan Han
Qifan Zhang
Yuhang Yao
Weizhao Jin
Zhaozhuo Xu
LLMAG
91
47
0
05 Feb 2024
MM-LLMs: Recent Advances in MultiModal Large Language Models
Duzhen Zhang
Yahan Yu
Jiahua Dong
Chenxing Li
Dan Su
Chenhui Chu
Dong Yu
OffRL LRM
164
216
0
24 Jan 2024
In-context Learning with Retrieved Demonstrations for Language Models: A Survey
Man Luo
Xin Xu
Yue Liu
Panupong Pasupat
Mehran Kazemi
RALM
136
70
0
21 Jan 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Kanzhi Cheng
Qiushi Sun
Yougang Chu
Fangzhi Xu
Yantao Li
Jianbing Zhang
Zhiyong Wu
LLMAG
275
189
0
17 Jan 2024
Xpert: Empowering Incident Management with Query Recommendations via Large Language Models
Yuxuan Jiang
Chaoyun Zhang
Shilin He
Zhihao Yang
Ming-Jie Ma
...
Yu Kang
Yingnong Dang
Saravan Rajmohan
Qingwei Lin
Dongmei Zhang
90
21
0
19 Dec 2023
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV RALM
277
1,831
1
18 Dec 2023
LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem
Yingqiang Ge
Yujie Ren
Wenyue Hua
Shuyuan Xu
Juntao Tan
Yongfeng Zhang
LLMAG
68
30
0
06 Dec 2023
TaskWeaver: A Code-First Agent Framework
Bo Qiao
Liqun Li
Xu Zhang
Shilin He
Yu Kang
...
Chao Du
Yong Xu
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
LLMAG
92
42
0
29 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
109
174
0
10 Nov 2023
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Ruomeng Ding
Chaoyun Zhang
Lu Wang
Yong Xu
Ming-Jie Ma
Wei Zhang
Si Qin
Saravan Rajmohan
Qingwei Lin
Dongmei Zhang
LRM
107
68
0
07 Nov 2023
CogVLM: Visual Expert for Pretrained Language Models
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM MLLM
153
517
0
06 Nov 2023
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Jianwei Yang
Hao Zhang
Feng Li
Xueyan Zou
Chun-yue Li
Jianfeng Gao
MLLM VLM
118
188
0
17 Oct 2023
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
Zhengyuan Yang
Linjie Li
Kevin Qinghong Lin
Jianfeng Wang
Chung-Ching Lin
Nasim Shakouri Mahmoudabadi
Lijuan Wang
LM&MA
92
646
0
29 Sep 2023
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG AI4CE LM&Ro
124
1,321
0
22 Aug 2023
Real-Time Flying Object Detection with YOLOv8
Dillon Reis
Jordan Kupec
Jacqueline Hong
Ahmad Daoudi
ObjD
86
462
0
17 May 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG ReLM LRM
460
2,990
0
06 Oct 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG LRM
193
1,502
0
25 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro LRM AI4CE ReLM
897
9,752
0
28 Jan 2022
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,343
0
27 Aug 2019