2310.04869
ILuvUI: Instruction-tuned LangUage-Vision modeling of UIs from Machine Conversations
7 October 2023
Yue Jiang
E. Schoop
Amanda Swearngin
Jeffrey Nichols
MLLM
Papers citing
"ILuvUI: Instruction-tuned LangUage-Vision modeling of UIs from Machine Conversations"
16 / 16 papers shown
A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models
Liangbo Ning
Ziran Liang
Zhuohang Jiang
Haohao Qu
Yujuan Ding
...
Xiao Wei
Shanru Lin
Hui Liu
Philip S. Yu
Qing Li
LLMAG
LM&Ro
91
6
0
30 Mar 2025
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
Yunxing Liu
Pengxiang Li
Zishu Wei
C. Xie
Xueyu Hu
Xinchen Xu
Shengyu Zhang
Xiaotian Han
Hongxia Yang
Fei Wu
LLMAG
LRM
55
11
0
08 Jan 2025
GUI Agents with Foundation Models: A Comprehensive Survey
Shuai Wang
Wei Liu
Jingxuan Chen
Weinan Gan
Xingshan Zeng
...
Bin Wang
Chuhan Wu
Yasheng Wang
Ruiming Tang
Jianye Hao
LLMAG
76
15
0
07 Nov 2024
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng
Faria Huq
Yue Jiang
Jason Wu
Amanda Li
Jeffrey P. Bigham
Amy Pavel
DiffM
35
4
0
30 Sep 2024
Inferring Alt-text For UI Icons With Large Language Models During App Development
Sabrina Haque
Christoph Csallner
VLM
36
0
0
26 Sep 2024
Graph4GUI: Graph Neural Networks for Representing Graphical User Interfaces
Yue Jiang
Changkong Zhou
Vikas Garg
Antti Oulasvirta
41
7
0
21 Apr 2024
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Keen You
Haotian Zhang
E. Schoop
Floris Weers
Amanda Swearngin
Jeffrey Nichols
Yinfei Yang
Zhe Gan
MLLM
47
84
0
08 Apr 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM
VLM
44
102
0
20 Feb 2024
Visual Instruction Tuning towards General-Purpose Multimodal Model: A Survey
Jiaxing Huang
Jingyi Zhang
Kai Jiang
Han Qiu
Shijian Lu
44
22
0
27 Dec 2023
AXNav: Replaying Accessibility Tests from Natural Language
Maryam Taeb
Amanda Swearngin
E. Schoop
Ruijia Cheng
Yue Jiang
Jeffrey Nichols
34
37
0
03 Oct 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
165
579
0
06 Apr 2023
Enabling Conversational Interaction with Mobile UI using Large Language Models
Bryan Wang
Gang Li
Yang Li
181
132
0
18 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
366
12,003
0
04 Mar 2022
ReverseORC: Reverse Engineering of Resizable User Interface Layouts with OR-Constraints
Yue Jiang
W. Stuerzlinger
C. Lutteroth
23
18
0
23 Feb 2022
Screen Parsing: Towards Reverse Engineering of UI Models from Screenshots
Jason Wu
Xiaoyi Zhang
Jeffrey Nichols
Jeffrey P. Bigham
3DV
163
71
0
17 Sep 2021
Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels
Xiaoyi Zhang
Lilian de Greef
Amanda Swearngin
Samuel White
Kyle I. Murray
...
Jeffrey Nichols
Jason Wu
Chris Fleizach
Aaron Everitt
Jeffrey P. Bigham
220
167
0
13 Jan 2021