arXiv:2006.09073
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
16 June 2020
Zihao Zhu
J. Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
Papers citing
"Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering"
46 / 46 papers shown
Title
Fine-Grained Retrieval-Augmented Generation for Visual Question Answering
Zhengxuan Zhang
Yin Wu
Yuyu Luo
Nan Tang
38
0
0
28 Feb 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
80
1
0
25 Feb 2025
Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering
Zhongjian Hu
Peng Yang
Bing Li
Zhenqi Wang
42
0
0
24 Dec 2024
Prompting Large Language Models with Rationale Heuristics for Knowledge-based Visual Question Answering
Zhongjian Hu
Peng Yang
Bing Li
Fengyuan Liu
LRM
119
58
0
22 Dec 2024
IIU: Independent Inference Units for Knowledge-based Visual Question Answering
Yili Li
Jing Yu
Keke Gai
Gang Xiong
26
0
0
15 Aug 2024
Deep Learning based Visually Rich Document Content Understanding: A Survey
Muhammad Ali
Jean Lee
Salman Khan
41
6
0
02 Aug 2024
EchoSight: Advancing Visual-Language Models with Wiki Knowledge
Yibin Yan
Weidi Xie
RALM
37
9
0
17 Jul 2024
Multimodal Reranking for Knowledge-Intensive Visual Question Answering
Haoyang Wen
Honglei Zhuang
Hamed Zamani
Alexander Hauptmann
Michael Bendersky
34
0
0
17 Jul 2024
Disentangling Knowledge-based and Visual Reasoning by Question Decomposition in KB-VQA
Elham J. Barezi
Parisa Kordjamshidi
CoGe
35
0
0
27 Jun 2024
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibi Wang
Weifeng Ge
LRM
26
4
0
19 Jan 2024
Object Attribute Matters in Visual Question Answering
Peize Li
Q. Si
Peng Fu
Zheng Lin
Yan Wang
35
0
0
20 Dec 2023
Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering
Chengxiang Yin
Zhengping Che
Kun Wu
Zhiyuan Xu
Jian Tang
33
0
0
20 Dec 2023
Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts
Yunshi Lan
Xiang Li
Xin Liu
Yang Li
Wei Qin
Weining Qian
LRM
ReLM
38
24
0
15 Nov 2023
Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA
Sheng Zhou
Dan Guo
Jia Li
Xun Yang
Hao Wu
21
5
0
13 Oct 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
29
4
0
05 Sep 2023
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
Zijie Song
Zhenzhen Hu
Yuanen Zhou
Ye Zhao
Richang Hong
Meng Wang
21
2
0
19 Jul 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
43
13
0
10 May 2023
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
Weizhe Lin
Zhilin Wang
Bill Byrne
AAML
14
4
0
19 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
33
13
0
07 Mar 2023
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
94
11
0
03 Mar 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Yikang Shen
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
33
35
0
12 Jan 2023
VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges
R. Zakari
Jim Wilson Owusu
Hailin Wang
Ke Qin
Zaharaddeen Karami Lawal
Yue-hong Dong
LRM
33
16
0
26 Dec 2022
A survey on knowledge-enhanced multimodal learning
Maria Lymperaiou
Giorgos Stamou
38
13
0
19 Nov 2022
PromptCap: Prompt-Guided Task-Aware Image Captioning
Yushi Hu
Hang Hua
Zhengyuan Yang
Weijia Shi
Noah A. Smith
Jiebo Luo
45
101
0
15 Nov 2022
Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering
Jialin Wu
Raymond J. Mooney
RALM
11
9
0
18 Oct 2022
Retrieval Augmented Visual Question Answering with Outside Knowledge
Weizhe Lin
Bill Byrne
RALM
74
69
0
07 Oct 2022
LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
Zhuo Chen
Yufen Huang
Jiaoyan Chen
Yuxia Geng
Yin Fang
Jeff Z. Pan
Ningyu Zhang
Wen Zhang
26
35
0
26 Jul 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Yudong Han
Liqiang Nie
Jianhua Yin
Jianlong Wu
Yan Yan
24
12
0
24 Jul 2022
Semantic-aware Modular Capsule Routing for Visual Question Answering
Yudong Han
Jianhua Yin
Jianlong Wu
Yin-wei Wei
Liqiang Nie
29
7
0
21 Jul 2022
From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering
Zihao Zhu
NAI
ReLM
GNN
30
3
0
25 Jun 2022
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Yuanze Lin
Yujia Xie
Dongdong Chen
Yichong Xu
Chenguang Zhu
Lu Yuan
44
71
0
02 Jun 2022
DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation
Jingnong Qu
Liunian Harold Li
Jieyu Zhao
Sunipa Dev
Kai-Wei Chang
21
12
0
25 May 2022
Hypergraph Transformer: Weakly-supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering
Y. Heo
Eun-Sol Kim
Woo Suk Choi
Byoung-Tak Zhang
19
27
0
22 Apr 2022
Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog
Shunyu Zhang
X. Jiang
Zequn Yang
T. Wan
Zengchang Qin
27
12
0
10 Apr 2022
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
Yang Ding
Jing Yu
Bangchang Liu
Yue Hu
Mingxin Cui
Qi Wu
13
62
0
17 Mar 2022
Dynamic Key-value Memory Enhanced Multi-step Graph Reasoning for Knowledge-based Visual Question Answering
Mingxiao Li
Marie-Francine Moens
9
12
0
06 Mar 2022
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Zhecan Wang
Haoxuan You
Liunian Harold Li
Alireza Zareian
Suji Park
Yiqing Liang
Kai-Wei Chang
Shih-Fu Chang
ReLM
LRM
15
30
0
16 Dec 2021
LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering
Jingjing Jiang
Zi-yi Liu
N. Zheng
22
13
0
29 Nov 2021
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
30
59
0
15 Sep 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
174
402
0
10 Sep 2021
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering
Man Luo
Yankai Zeng
Pratyay Banerjee
Chitta Baral
RALM
13
64
0
09 Sep 2021
EKTVQA: Generalized use of External Knowledge to empower Scene Text in Text-VQA
Arka Ujjal Dey
Ernest Valveny
Gaurav Harit
6
3
0
22 Aug 2021
X-GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering
Jingjing Jiang
Zi-yi Liu
Yifan Liu
Zhixiong Nan
N. Zheng
OOD
28
19
0
24 Jul 2021
Zero-shot Visual Question Answering using Knowledge Graph
Zhuo Chen
Jiaoyan Chen
Yuxia Geng
Jeff Z. Pan
Zonggang Yuan
Huajun Chen
15
70
0
12 Jul 2021
Multi-Modal Answer Validation for Knowledge-Based VQA
Jialin Wu
Jiasen Lu
Ashish Sabharwal
Roozbeh Mottaghi
20
140
0
23 Mar 2021
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
J. Yu
Yuan Chai
Yujing Wang
Yue Hu
Qi Wu
CML
27
111
0
16 Sep 2020