2307.16125
Cited By
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension
30 July 2023
Bohao Li
Rui Wang
Guangzhi Wang
Yuying Ge
Yixiao Ge
Ying Shan
MLLM
ELM
Papers citing
"SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension"
16 / 66 papers shown
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs
Rohit K Bharadwaj
Hanan Gani
Muzammal Naseer
Fahad Shahbaz Khan
Salman Khan
94
4
0
14 Jun 2024
Parrot: Multilingual Visual Instruction Tuning
Hai-Long Sun
Da-Wei Zhou
Yangfu Li
Shiyin Lu
Chao Yi
...
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
MLLM
73
10
0
04 Jun 2024
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
Zejun Li
Ruipu Luo
Jiwen Zhang
Minghui Qiu
Zhongyu Wei
LRM
MLLM
108
14
0
27 May 2024
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
Hongyu Wang
Jiayu Xu
Senwei Xie
Ruiping Wang
Jialin Li
Zhaojie Xie
Bin Zhang
Chuyan Xiong
Xilin Chen
ELM
VLM
LRM
112
6
0
24 May 2024
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
Xiyao Wang
Jiuhai Chen
Zhaoyang Wang
Yuhang Zhou
Yiyang Zhou
...
Dinesh Manocha
Tom Goldstein
Parminder Bhatia
Furong Huang
Cao Xiao
106
35
0
24 May 2024
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
Prannay Kaul
Zhizhong Li
Hao Yang
Yonatan Dukler
Ashwin Swaminathan
C. Taylor
Stefano Soatto
HILM
106
16
0
08 May 2024
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
An Yan
Zhengyuan Yang
Junda Wu
Wanrong Zhu
Jianwei Yang
...
Kevin Qinghong Lin
Jianfeng Wang
Julian McAuley
Jianfeng Gao
Lijuan Wang
LRM
54
12
0
25 Apr 2024
Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
Yifan Li
Hangyu Guo
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
78
48
0
14 Mar 2024
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models
Xiujie Song
Mengyue Wu
Ke Zhu
Chunhao Zhang
Yanyi Chen
LRM
ELM
65
3
0
28 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
155
112
0
08 Feb 2024
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
74
28
0
03 Jan 2024
What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning
Yifan Du
Hangyu Guo
Kun Zhou
Wayne Xin Zhao
Jinpeng Wang
Chuyuan Wang
Mingchen Cai
Ruihua Song
Ji-Rong Wen
VLM
MLLM
LRM
116
23
0
02 Nov 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
93
195
0
12 Jun 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
251
931
0
27 Apr 2023
Tag2Text: Guiding Vision-Language Model via Image Tagging
Xinyu Huang
Youcai Zhang
Jinyu Ma
Weiwei Tian
Rui Feng
Yuejie Zhang
Yaqian Li
Yandong Guo
Lei Zhang
CLIP
MLLM
VLM
3DV
80
74
0
10 Mar 2023
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
304
3,187
0
02 Dec 2016