arXiv:2407.10956
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
15 July 2024
Ruisheng Cao
Fangyu Lei
Haoyuan Wu
Jixuan Chen
Yeqiao Fu
Hongcheng Gao
Xinzhuang Xiong
Hanchong Zhang
Yuchen Mao
Wenjing Hu
Tianbao Xie
Hongshen Xu
Danyang Zhang
Sida Wang
Ruoxi Sun
Pengcheng Yin
Caiming Xiong
Ansong Ni
Qian Liu
Victor Zhong
Lu Chen
Kai Yu
Tao Yu
Papers citing "Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?"
11 / 11 papers shown
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo
Lajanugen Logeswaran
Justin Johnson
Honglak Lee
51
0
0
01 May 2025
Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs
Paiheng Xu
Gang Wu
Xiang Chen
Tong Yu
Chang Xiao
Franck Dernoncourt
Dinesh Manocha
Wei Ai
Viswanathan Swaminathan
OffRL
52
1
0
29 Apr 2025
ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines
Tengjun Jin
Yuxuan Zhu
Daniel Kang
LMTD
ELM
47
0
0
07 Apr 2025
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Fangyu Lei
Jixuan Chen
Yuxiao Ye
Ruisheng Cao
Dongchan Shin
...
Caiming Xiong
Ruoxi Sun
Qian Liu
Sida I. Wang
Tao Yu
LMTD
79
21
0
12 Nov 2024
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
Boyu Gou
Ruohan Wang
Boyuan Zheng
Yanan Xie
Cheng Chang
Yiheng Shu
Huan Sun
Yu Su
LM&Ro
LLMAG
78
49
0
07 Oct 2024
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Teng Wang
Zhenqi He
Wing-Yin Yu
Xiaojin Fu
Xiongwei Han
LRM
50
5
0
17 Sep 2024
Do Multimodal Foundation Models Understand Enterprise Workflows? A Benchmark for Business Process Management Tasks
Michael Wornow
A. Narayan
Ben T Viggiano
Ishan S. Khare
Tathagat Verma
...
Joshua Martinez
Vardhan Agrawal
Althea Hudson
N. Shah
Christopher Ré
37
4
0
19 Jun 2024
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
John Yang
Carlos E. Jimenez
Alexander Wettig
K. Lieret
Shunyu Yao
Karthik Narasimhan
Ofir Press
LLMAG
103
194
0
06 May 2024
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin
Maxime Gasse
Massimo Caccia
I. Laradji
Manuel Del Verme
...
Megh Thakkar
Quentin Cappart
David Vazquez
Nicolas Chapados
Alexandre Lacoste
LLMAG
51
53
0
12 Mar 2024
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web
Raghav Kapoor
Y. Butala
M. Russak
Jing Yu Koh
Kiran Kamble
Waseem Alshikh
Ruslan Salakhutdinov
LLMAG
51
44
0
27 Feb 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
132
83
0
12 Feb 2024