Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.01003
Cited By
Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models
2 August 2024
Afia Anjum
Xiang Liu
Zhaoxiang Liu
Ning Wang
Shiguo Lian
VLM
MLLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models"
17 / 17 papers shown
Title
CogVLM: Visual Expert for Pretrained Language Models
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM
MLLM
111
508
0
06 Nov 2023
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Shukang Yin
Chaoyou Fu
Sirui Zhao
Tong Xu
Hao Wang
Dianbo Sui
Yunhang Shen
Ke Li
Xingguo Sun
Enhong Chen
VLM
MLLM
87
132
0
24 Oct 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
Anh Tuan Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
106
577
0
03 Sep 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM
MLLM
113
286
0
26 Jun 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
286
955
0
27 Apr 2023
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.5K
14,699
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
429
4,641
0
30 Jan 2023
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Dan Su
Pascale Fung
MLLM
VLM
82
67
0
14 Oct 2022
PP-YOLOE: An evolved version of YOLO
Shangliang Xu
Xinxin Wang
Wenyu Lv
Qinyao Chang
Cheng Cui
...
Guanzhong Wang
Qingqing Dang
Shengyun Wei
Yuning Du
Baohua Lai
ObjD
97
274
0
30 Mar 2022
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
967
29,810
0
26 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
418
4,996
0
24 Feb 2021
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
267
18,259
0
02 Jun 2016
Deeply-Fused Nets
Jingdong Wang
Zhen Wei
Ting Zhang
Wenjun Zeng
FedML
173
98
0
25 May 2016
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer
Jonathan Long
Trevor Darrell
VOS
SSeg
747
37,890
0
20 May 2016
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
716
37,020
0
08 Jun 2015
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
424
43,814
0
01 May 2014
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
289
26,211
0
11 Nov 2013
1