Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2201.05299
Cited By
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering
14 January 2022
Feng Gao
Q. Ping
Govind Thattai
Aishwarya N. Reganti
Yingting Wu
Premkumar Natarajan
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering"
36 / 36 papers shown
Title
Fine-Grained Retrieval-Augmented Generation for Visual Question Answering
Zhengxuan Zhang
Yin Wu
Yuyu Luo
Nan Tang
91
0
0
28 Feb 2025
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
82
61
0
15 Sep 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
256
421
0
10 Sep 2021
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering
Man Luo
Yankai Zeng
Pratyay Banerjee
Chitta Baral
RALM
117
66
0
09 Sep 2021
Multi-Modal Answer Validation for Knowledge-Based VQA
Jialin Wu
Jiasen Lu
Ashish Sabharwal
Roozbeh Mottaghi
143
144
0
23 Mar 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
325
157
0
02 Jan 2021
UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua Wu
Haifeng Wang
107
379
0
31 Dec 2020
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
Kenneth Marino
Xinlei Chen
Devi Parikh
Abhinav Gupta
Marcus Rohrbach
108
185
0
20 Dec 2020
Distilling Knowledge from Reader to Retriever for Question Answering
Gautier Izacard
Edouard Grave
RALM
248
266
0
08 Dec 2020
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Gautier Izacard
Edouard Grave
RALM
141
1,181
0
02 Jul 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
79
498
0
11 Jun 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
135
1,944
0
13 Apr 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
60
320
0
10 Jan 2020
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
112
447
0
25 Sep 2019
Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering
Zhiguo Wang
Patrick Ng
Xiaofei Ma
Ramesh Nallapati
Bing Xiang
OOD
70
244
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
250
2,488
0
20 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
243
3,695
0
06 Aug 2019
Bilinear Graph Networks for Visual Question Answering
Dalu Guo
Chang Xu
Dacheng Tao
GNN
64
52
0
23 Jul 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
89
808
0
25 Jun 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
111
1,253
0
18 Apr 2019
End-to-End Open-Domain Question Answering with BERTserini
Wei Yang
Yuqing Xie
Aileen Lin
Xingyu Li
Luchen Tan
Kun Xiong
Ming Li
Jimmy J. Lin
RALM
124
495
0
05 Feb 2019
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
84
611
0
04 Oct 2018
Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text
Haitian Sun
Bhuwan Dhingra
Manzil Zaheer
Kathryn Mazaitis
Ruslan Salakhutdinov
William W. Cohen
84
417
0
04 Sep 2018
Know What You Don't Know: Unanswerable Questions for SQuAD
Pranav Rajpurkar
Robin Jia
Percy Liang
RALM
ELM
292
2,853
0
11 Jun 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
56
79
0
24 May 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
123
4,221
0
25 Jul 2017
Learning to Reason: End-to-End Module Networks for Visual Question Answering
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
KELM
GNN
ReLM
LRM
129
579
0
18 Apr 2017
Reading Wikipedia to Answer Open-Domain Questions
Danqi Chen
Adam Fisch
Jason Weston
Antoine Bordes
RALM
121
2,019
0
31 Mar 2017
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
257
3,737
0
28 Feb 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
313
2,387
0
20 Dec 2016
FVQA: Fact-based Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
87
462
0
17 Jun 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
316
8,169
0
16 Jun 2016
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
Abhishek Das
Harsh Agrawal
C. L. Zitnick
Devi Parikh
Dhruv Batra
102
466
0
11 Jun 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
118
752
0
09 Dec 2015
Explicit Knowledge-based Reasoning for Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
A. Dick
87
261
0
09 Nov 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
226
5,503
0
03 May 2015
1