
A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering

13 November 2023
Yunxin Li
Longyue Wang
Baotian Hu
Xinyu Chen
Wanqi Zhong
Chenyang Lyu
Wei Wang
Min Zhang
    ELM

Papers citing "A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering"

16 / 16 papers shown
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
Kai Zhang
Pengzhen Ren
Bingqian Lin
Junfan Lin
Shikui Ma
Hang Xu
Xiaodan Liang
61
2
0
14 Oct 2024
Explore the Hallucination on Low-level Perception for MLLMs
Yinan Sun
Zicheng Zhang
H. Wu
Xiaohong Liu
Weisi Lin
Guangtao Zhai
Xiongkuo Min
82
2
0
15 Sep 2024
VideoVista: A Versatile Benchmark for Video Understanding and Reasoning
Yunxin Li
Xinyu Chen
Baotian Hu
Longyue Wang
Haoyuan Shi
Min Zhang
MLLM
LRM
167
38
0
17 Jun 2024
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance
Chenwei Lin
Hanjia Lyu
Xian Xu
Jiebo Luo
69
2
0
13 Jun 2024
An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging
Sulaiman Khan
Md. Rafiul Biswas
Alina Murad
Hazrat Ali
Zubair Shah
91
4
0
02 Jun 2024
Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs
Jialiang Xu
Michael Moor
J. Leskovec
66
3
0
29 May 2024
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li
Shenyuan Jiang
Baotian Hu
Longyue Wang
Wanqi Zhong
Wenhan Luo
Lin Ma
Min Zhang
MoE
111
42
0
18 May 2024
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li
Baotian Hu
Haoyuan Shi
Wei Wang
Longyue Wang
Min Zhang
LRM
68
16
0
08 May 2024
Comp4D: LLM-Guided Compositional 4D Scene Generation
Dejia Xu
Hanwen Liang
N. Bhatt
Hezhen Hu
Hanxue Liang
Konstantinos N. Plataniotis
Zhangyang Wang
87
27
0
25 Mar 2024
Benchmarking LLMs via Uncertainty Quantification
Fanghua Ye
Mingming Yang
Jianhui Pang
Longyue Wang
Derek F. Wong
Emine Yilmaz
Shuming Shi
Zhaopeng Tu
ELM
247
59
0
23 Jan 2024
DrugAssist: A Large Language Model for Molecule Optimization
Geyan Ye
Xibao Cai
Houtim Lai
Xing Wang
Junhong Huang
Longyue Wang
Wei Liu
Xian Zeng
123
33
0
28 Dec 2023
An Evaluation of GPT-4V and Gemini in Online VQA
Mengchen Liu
Chongyan Chen
Danna Gurari
MLLM
123
7
0
17 Dec 2023
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
96
8
0
04 Dec 2023
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
Yunxin Li
Baotian Hu
Wei Wang
Xiaochun Cao
Min Zhang
74
5
0
27 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM
LM&MA
90
39
0
31 Oct 2023
LMEye: An Interactive Perception Network for Large Language Models
Yunxin Li
Baotian Hu
Xinyu Chen
Lin Ma
Yong-mei Xu
Hao Fei
MLLM
VLM
93
28
0
05 May 2023