Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2006.09073
Cited By
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
16 June 2020
Zihao Zhu
J. Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering"
16 / 16 papers shown
Title
Fine-Grained Retrieval-Augmented Generation for Visual Question Answering
Zhengxuan Zhang
Yin Wu
Yuyu Luo
Nan Tang
35
0
0
28 Feb 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
74
1
0
25 Feb 2025
Disentangling Knowledge-based and Visual Reasoning by Question Decomposition in KB-VQA
Elham J. Barezi
Parisa Kordjamshidi
CoGe
30
0
0
27 Jun 2024
Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts
Yunshi Lan
Xiang Li
Xin Liu
Yang Li
Wei Qin
Weining Qian
LRM
ReLM
30
24
0
15 Nov 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
40
13
0
10 May 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
94
11
0
03 Mar 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Yikang Shen
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
31
35
0
12 Jan 2023
PromptCap: Prompt-Guided Task-Aware Image Captioning
Yushi Hu
Hang Hua
Zhengyuan Yang
Weijia Shi
Noah A. Smith
Jiebo Luo
42
101
0
15 Nov 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Yudong Han
Liqiang Nie
Jianhua Yin
Jianlong Wu
Yan Yan
24
12
0
24 Jul 2022
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Yuanze Lin
Yujia Xie
Dongdong Chen
Yichong Xu
Chenguang Zhu
Lu Yuan
42
71
0
02 Jun 2022
DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation
Jingnong Qu
Liunian Harold Li
Jieyu Zhao
Sunipa Dev
Kai-Wei Chang
21
12
0
25 May 2022
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
Yang Ding
Jing Yu
Bangchang Liu
Yue Hu
Mingxin Cui
Qi Wu
11
62
0
17 Mar 2022
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Zhecan Wang
Haoxuan You
Liunian Harold Li
Alireza Zareian
Suji Park
Yiqing Liang
Kai-Wei Chang
Shih-Fu Chang
ReLM
LRM
15
30
0
16 Dec 2021
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
30
59
0
15 Sep 2021
Zero-shot Visual Question Answering using Knowledge Graph
Zhuo Chen
Jiaoyan Chen
Yuxia Geng
Jeff Z. Pan
Zonggang Yuan
Huajun Chen
15
70
0
12 Jul 2021
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
J. Yu
Yuan Chai
Yujing Wang
Yue Hu
Qi Wu
CML
27
111
0
16 Sep 2020
1