arXiv: 2310.00653
Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
1 October 2023
Tianyu Yu, Jinyi Hu, Yuan Yao, Haoye Zhang, Yue Zhao, Chongyi Wang, Shan Wang, Yinxu Pan, Jiao Xue, Dahai Li, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun
Tags: VLM, MLLM
Links: arXiv (abs) · PDF · HTML · GitHub (63★)
Papers citing "Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants" (6 of 6 papers shown):
A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision
Alexey Magay, Dhurba Tripathi, Yu Hao, Yi Fang
16 May 2025
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization
Kesen Zhao, B. Zhu, Qianru Sun, Hanwang Zhang
Tags: MLLM, LRM
25 Apr 2025
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing, Yuping Wang, Peiran Li, Ruizheng Bai, Yansen Wang, Chan-wei Hu, Chengxuan Qian, Huaxiu Yao, Zhengzhong Tu
18 Feb 2025
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou
Tags: VLM, LRM
29 Apr 2024
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Hao Jiang, Linchao Zhu
Tags: MLLM
22 Apr 2024
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, ..., Xiawu Zheng, Ke Li, Xing Sun, Zhenyu Qiu, Rongrong Ji
Tags: ELM, MLLM
23 Jun 2023