A Review of Multi-Modal Large Language and Vision Models

28 March 2024

Alan F. Smeaton

Papers citing "A Review of Multi-Modal Large Language and Vision Models"

Title
HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs Zijian Zhang Xuecheng Wu Danlei Huang Siyu Yan Chong Peng Xuezhi Cao VLM 73 0 0 16 Jun 2025
Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models Ruiyang Zhang Hu Zhang Hao Fei Zhedong Zheng UQCV 23 0 0 09 Jun 2025
StressTest: Can YOUR Speech LM Handle the Stress? Iddo Yosha Gallil Maimon Yossi Adi 42 0 0 28 May 2025
ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations Xuecheng Wu Jiaxing Liu Danlei Huang Xiaoyu Li Yifan Wang Chen Chen Liya Ma Xuezhi Cao Junxiao Xue LRM 106 0 0 20 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Wei Wei Jintao Guo Shanshan Zhao Minghao Fu Lunhao Duan ... Guo-Hua Wang Qing-Guo Chen Zhao Xu Weihua Luo Kaifu Zhang DiffM 295 1 0 05 May 2025
Building Trustworthy Multimodal AI: A Review of Fairness, Transparency, and Ethics in Vision-Language Tasks Mohammad Saleh Azadeh Tabatabaei 146 0 0 14 Apr 2025
Towards Understanding the Use of MLLM-Enabled Applications for Visual Interpretation by Blind and Low Vision People Ricardo E Gonzalez Penuela Ruiying Hu Sharon Lin Tanisha Shende Shiri Azenkot 79 2 0 07 Mar 2025
TOKON: TOKenization-Optimized Normalization for time series analysis with a large language model Janghoon Yang AI4TS 219 0 0 08 Feb 2025
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs Hong Li Nanxi Li Yuanjie Chen Jianbin Zhu Qinlu Guo Cewu Lu Yong-Lu Li MLLM 111 1 0 02 Oct 2024
Surveying the MLLM Landscape: A Meta-Review of Current Surveys Ming Li Keyu Chen Ziqian Bi Ming Liu Benji Peng ... Jinlang Wang Sen Zhang X. Pan Jiawei Xu Pohsun Feng OffRL 115 2 0 17 Sep 2024
From Text to Multimodality: Exploring the Evolution and Impact of Large Language Models in Medical Practice Qian Niu Keyu Chen Ming Li Pohsun Feng Ziqian Bi ... Junyu Liu Benji Peng Tianyang Wang Yunze Wang Silin Chen LM&MA 102 7 0 14 Sep 2024
Understanding Foundation Models: Are We Back in 1924? Alan F. Smeaton AI4CE 70 3 0 11 Sep 2024
Can Transformers Do Enumerative Geometry? Baran Hashemi Roderic G. Corominas Alessandro Giacchetto 536 5 0 27 Aug 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Zhen Qin Daoyuan Chen Wenhao Zhang Liuyi Yao Yilun Huang Bolin Ding Yaliang Li Shuiguang Deng 142 7 0 11 Jul 2024
Towards a Science Exocortex Kevin G. Yager 110 2 0 24 Jun 2024
Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition Zeyu Li Suncheng Xiang Tong Yu Jingsheng Gao Jiacheng Ruan Yanping Hu Ting Liu Yuzhuo Fu 42 0 0 04 Jan 2024