ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.18930
  4. Cited By
Hallucination of Multimodal Large Language Models: A Survey
v1v2 (latest)

Hallucination of Multimodal Large Language Models: A Survey

29 April 2024
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
    VLMLRM
ArXiv (abs)PDFHTML

Papers citing "Hallucination of Multimodal Large Language Models: A Survey"

50 / 261 papers shown
Title
Reformulating Vision-Language Foundation Models and Datasets Towards
  Universal Multimodal Assistants
Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
Tianyu Yu
Jinyi Hu
Yuan Yao
Haoye Zhang
Yue Zhao
...
Jiao Xue
Dahai Li
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
VLMMLLM
45
20
0
01 Oct 2023
Aligning Large Multimodal Models with Factually Augmented RLHF
Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun
Sheng Shen
Shengcao Cao
Haotian Liu
Chunyuan Li
...
Liangyan Gui
Yu-Xiong Wang
Yiming Yang
Kurt Keutzer
Trevor Darrell
VLM
143
396
0
25 Sep 2023
Language Modeling Is Compression
Language Modeling Is Compression
Grégoire Delétang
Anian Ruoss
Paul-Ambroise Duquenne
Elliot Catt
Tim Genewein
...
Wenliang Kevin Li
Matthew Aitchison
Laurent Orseau
Marcus Hutter
J. Veness
AI4CE
121
146
0
19 Sep 2023
Unsupervised Open-Vocabulary Object Localization in Videos
Unsupervised Open-Vocabulary Object Localization in Videos
Ke Fan
Zechen Bai
Tianjun Xiao
Dominik Zietlow
Max Horn
...
Bernt Schiele
Thomas Brox
Zheng Zhang
Yanwei Fu
Tong He
114
10
0
18 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context
  Learning
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLMVLM
130
143
0
14 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Peng Gao
Peng Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Hongsheng Li
Yu Qiao
MLLM
94
125
0
07 Sep 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large
  Language Models
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Yung-Sung Chuang
Yujia Xie
Hongyin Luo
Yoon Kim
James R. Glass
Pengcheng He
HILM
79
167
0
07 Sep 2023
CIEM: Contrastive Instruction Evaluation Method for Better Instruction
  Tuning
CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning
Hongyu Hu
Jiyuan Zhang
Minyi Zhao
Zhenbang Sun
MLLM
82
49
0
05 Sep 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models
Evaluation and Analysis of Hallucination in Large Vision-Language Models
Junyan Wang
Yi Zhou
Guohai Xu
Pengcheng Shi
Chenlin Zhao
...
Mingshi Yan
Ji Zhang
Jihua Zhu
Jitao Sang
Haoyu Tang
MLLM
116
70
0
29 Aug 2023
VIGC: Visual Instruction Generation and Correction
VIGC: Visual Instruction Generation and Correction
Bin Wang
Fan Wu
Xiao Han
Jiahui Peng
Huaping Zhong
...
Xiao-wen Dong
Weijia Li
Wei Li
Jiaqi Wang
Conghui He
MLLM
109
69
0
24 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Leilei Gan
Guoyin Wang
LM&MA
116
610
0
21 Aug 2023
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual
  Questions
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu
Y. Xu
Yuante Li
W. Li
Zhe Chen
Zhuowen Tu
MLLMVLM
109
133
0
19 Aug 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative
  Instructions
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
121
73
0
08 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
539
12,132
0
18 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
MMBench: Is Your Multi-modal Model an All-around Player?
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Jiaqi Wang
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
189
1,059
0
12 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
MLLMVLM
168
238
0
07 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal
  Inputs?
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
109
79
0
05 Jul 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic
Ke Chen
Zhao Zhang
Weili Zeng
Richong Zhang
Feng Zhu
Rui Zhao
ObjD
138
652
0
27 Jun 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust
  Instruction Tuning
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLMMLLM
179
287
0
26 Jun 2023
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language
  Models
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu
Peixian Chen
Yunhang Shen
Yulei Qin
Mengdan Zhang
...
Xiawu Zheng
Ke Li
Xing Sun
Zhenyu Qiu
Rongrong Ji
ELMMLLM
156
860
0
23 Jun 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora
  with Web Data, and Web Data Only
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
182
778
0
01 Jun 2023
PandaGPT: One Model To Instruction-Follow Them All
PandaGPT: One Model To Instruction-Follow Them All
Yixuan Su
Tian Lan
Huayang Li
Jialu Xu
Yan Wang
Deng Cai
MLLM
99
295
0
25 May 2023
LIMA: Less Is More for Alignment
LIMA: Less Is More for Alignment
Chunting Zhou
Pengfei Liu
Puxin Xu
Srini Iyer
Jiao Sun
...
Susan Zhang
Gargi Ghosh
M. Lewis
Luke Zettlemoyer
Omer Levy
ALM
145
855
0
18 May 2023
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for
  Foundation Models
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
Yuzhen Huang
Yuzhuo Bai
Zhihao Zhu
Junlei Zhang
Jinghan Zhang
...
Yikai Zhang
Jiayi Lei
Yao Fu
Maosong Sun
Junxian He
ELMLRM
148
552
0
15 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with
  Instruction Tuning
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLMVLM
247
2,102
0
11 May 2023
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans
T. Gong
Chengqi Lyu
Shilong Zhang
Yudong Wang
Miao Zheng
Qianmengke Zhao
Kuikun Liu
Wenwei Zhang
Ping Luo
Kai-xiang Chen
MLLM
110
273
0
08 May 2023
Otter: A Multi-Modal Model with In-Context Instruction Tuning
Otter: A Multi-Modal Model with In-Context Instruction Tuning
Yue Liu
Yuanhan Zhang
Liangyu Chen
Jinghao Wang
Jingkang Yang
Ziwei Liu
MLLM
85
522
0
05 May 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs
VPGTrans: Transfer Visual Prompt Generator across LLMs
Ao Zhang
Hao Fei
Yuan Yao
Wei Ji
Li Li
Zhiyuan Liu
Tat-Seng Chua
MLLMVLM
89
89
0
02 May 2023
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Peng Gao
Jiaming Han
Renrui Zhang
Ziyi Lin
Shijie Geng
...
Pan Lu
Conghui He
Xiangyu Yue
Hongsheng Li
Yu Qiao
MLLM
118
588
0
28 Apr 2023
mPLUG-Owl: Modularization Empowers Large Language Models with
  Multimodality
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLMMLLM
311
957
0
27 Apr 2023
Instruction Tuning with GPT-4
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDaALMLM&MA
240
625
0
06 Apr 2023
Segment Anything
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLMVLM
479
7,478
0
05 Apr 2023
Benchmarking Large Language Models for News Summarization
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
135
535
0
31 Jan 2023
Reasoning with Language Model Prompting: A Survey
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLMELMLRM
232
327
0
19 Dec 2022
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma
Jerry Hong
Mustafa Omer Gul
Mona Gandhi
Irena Gao
Ranjay Krishna
CoGe
94
143
0
13 Dec 2022
Contrastive Decoding: Open-ended Text Generation as Optimization
Contrastive Decoding: Open-ended Text Generation as Optimization
Xiang Lisa Li
Ari Holtzman
Daniel Fried
Percy Liang
Jason Eisner
Tatsunori Hashimoto
Luke Zettlemoyer
M. Lewis
156
375
0
27 Oct 2022
When and why vision-language models behave like bags-of-words, and what
  to do about it?
When and why vision-language models behave like bags-of-words, and what to do about it?
Mert Yuksekgonul
Federico Bianchi
Pratyusha Kalluri
Dan Jurafsky
James Zou
VLMCoGe
162
394
0
04 Oct 2022
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang
Dongkeun Yoon
Sohee Yang
Sungmin Cha
Moontae Lee
Lajanugen Logeswaran
Minjoon Seo
KELMPILMMU
226
239
0
04 Oct 2022
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLMLRM
606
4,545
0
24 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLMCLIPOffRL
308
1,314
0
04 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLMDiffM
514
6,944
0
13 Apr 2022
PaLM: Scaling Language Modeling with Pathways
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILMLRM
588
6,322
0
05 Apr 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
936
13,285
0
04 Mar 2022
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRLAI4TSAI4CEALMAIMat
771
10,647
0
17 Jun 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
973
6,153
0
29 Apr 2021
GLM: General Language Model Pretraining with Autoregressive Blank
  Infilling
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDLAI4CE
162
1,566
0
18 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
1.1K
30,096
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLMCLIP
545
3,914
0
11 Feb 2021
Measuring Massive Multitask Language Understanding
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELMRALM
390
4,585
0
07 Sep 2020
Learning to summarize from human feedback
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
304
2,195
0
02 Sep 2020
Previous
123456
Next