ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.17399
  4. Cited By
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow

FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow

23 May 2025
Haoyu Sun
Huichen Will Wang
Jiawei Gu
Linjie Li
Yu Cheng
    VLM
ArXivPDFHTML

Papers citing "FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow"

26 / 26 papers shown
Title
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
102
56
1
14 Apr 2025
Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping
Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping
Jingyu Xiao
Yuxuan Wan
Yintong Huo
Zihan Wang
Xinyi Xu
Wenxuan Wang
Zhiyao Xu
Yansen Wang
Michael R. Lyu
139
1
0
21 Feb 2025
Qwen2.5-VL Technical Report
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
144
430
0
20 Feb 2025
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Zhenhailong Wang
Haiyang Xu
Junyang Wang
Xi Zhang
Ming Yan
Junxuan Zhang
Fei Huang
Heng Ji
95
18
0
20 Jan 2025
WebWalker: Benchmarking LLMs in Web Traversal
WebWalker: Benchmarking LLMs in Web Traversal
Jialong Wu
Wenbiao Yin
Yong Jiang
Zhenglin Wang
Zekun Xi
...
Linhai Zhang
Yulan He
Deyu Zhou
Pengjun Xie
Fei Huang
78
12
0
13 Jan 2025
Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark
Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark
Yunzhuo Hao
Jiawei Gu
Huichen Will Wang
Linjie Li
Zhiyong Yang
Lijuan Wang
Yu Cheng
LRM
57
30
0
10 Jan 2025
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro
LLMAG
93
72
0
07 Oct 2024
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web
Hongcheng Guo
Wei Zhang
Junhao Chen
Yaonan Gu
Jian Yang
...
Binyuan Hui
Tianyu Liu
Jianxin Ma
Chang Zhou
Zhoujun Li
35
3
0
14 Sep 2024
WebQuest: A Benchmark for Multimodal QA on Web Page Sequences
WebQuest: A Benchmark for Multimodal QA on Web Page Sequences
Maria Wang
Srinivas Sunkara
Gilles Baechler
Jason Lin
Yun Zhu
Fedir Zubach
Lei Shu
Jindong Chen
LRM
LLMAG
48
2
0
06 Sep 2024
LLaVA-OneVision: Easy Visual Task Transfer
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li
Yuanhan Zhang
Dong Guo
Renrui Zhang
Feng Li
Hao Zhang
Kaichen Zhang
Yanwei Li
Ziwei Liu
Chunyuan Li
MLLM
SyDa
VLM
81
721
0
06 Aug 2024
OmniParser for Pure Vision Based GUI Agent
OmniParser for Pure Vision Based GUI Agent
Yadong Lu
Jianwei Yang
Yelong Shen
Ahmed Hassan Awadallah
MLLM
55
40
0
01 Aug 2024
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Ori Yoran
S. Amouyal
Chaitanya Malaviya
Ben Bogin
Ofir Press
Jonathan Berant
LLMAG
62
35
0
22 Jul 2024
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework
  for Multimodal LLMs
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun
Haokun Lin
Rusiru Thushara
Mohammad Qazim Bhat
Yongxin Wang
...
Timothy Baldwin
Zhengzhong Liu
Eric P. Xing
Xiaodan Liang
Zhiqiang Shen
66
12
0
28 Jun 2024
GUICourse: From General Vision Language Models to Versatile GUI Agents
GUICourse: From General Vision Language Models to Versatile GUI Agents
Wentong Chen
Junbo Cui
Jinyi Hu
Yujia Qin
Junjie Fang
...
Yupeng Huo
Yuan Yao
Yankai Lin
Zhiyuan Liu
Maosong Sun
LLMAG
44
37
0
17 Jun 2024
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page
  Understanding and Grounding?
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
Junpeng Liu
Yifan Song
Bill Yuchen Lin
Wai Lam
Graham Neubig
Yuanzhi Li
Xiang Yue
VLM
84
41
0
09 Apr 2024
Unlocking the conversion of Web Screenshots into HTML Code with the
  WebSight Dataset
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
Hugo Laurençon
Léo Tronchon
Victor Sanh
VLM
62
37
0
14 Mar 2024
Cradle: Empowering Foundation Agents Towards General Computer Control
Cradle: Empowering Foundation Agents Towards General Computer Control
Weihao Tan
Wentao Zhang
Xinrun Xu
Haochong Xia
Ziluo Ding
...
Xinrun Wang
Börje F. Karlsson
Bo An
Shuicheng Yan
Zongqing Lu
48
25
0
05 Mar 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
144
92
0
12 Feb 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Kanzhi Cheng
Qiushi Sun
Yougang Chu
Fangzhi Xu
Yantao Li
Jianbing Zhang
Zhiyong Wu
LLMAG
192
163
0
17 Jan 2024
GPT-4V(ision) is a Generalist Web Agent, if Grounded
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Boyuan Zheng
Boyu Gou
Jihyung Kil
Huan Sun
Yu-Chuan Su
MLLM
VLM
LLMAG
54
243
0
03 Jan 2024
Mind2Web: Towards a Generalist Agent for the Web
Mind2Web: Towards a Generalist Agent for the Web
Xiang Deng
Yu Gu
Boyuan Zheng
Shijie Chen
Samuel Stevens
Boshi Wang
Huan Sun
Yu-Chuan Su
LLMAG
58
448
0
09 Jun 2023
WebQA: Multihop and Multimodal QA
WebQA: Multihop and Multimodal QA
Yingshan Chang
M. Narang
Hisami Suzuki
Guihong Cao
Jianfeng Gao
Yonatan Bisk
LRM
32
81
0
01 Sep 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
188
156
0
07 Aug 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
694
28,659
0
26 Feb 2021
WebSRC: A Dataset for Web-Based Structural Reading Comprehension
WebSRC: A Dataset for Web-Based Structural Reading Comprehension
Xingyu Chen
Zihan Zhao
Lu Chen
Danyang Zhang
Jiabao Ji
Ao Luo
Yuxuan Xiong
Kai Yu
RALM
50
91
0
23 Jan 2021
pix2code: Generating Code from a Graphical User Interface Screenshot
pix2code: Generating Code from a Graphical User Interface Screenshot
Tony Beltramelli
52
269
0
22 May 2017
1