2109.02370
Improved RAMEN: Towards Domain Generalization for Visual Question Answering
6 September 2021
Bhanuka Gamage
Lim Chern Hong
Papers citing "Improved RAMEN: Towards Domain Generalization for Visual Question Answering"
Title
From Pixels to Objects: Cubic Visual Attention for Visual Question Answering
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Heng Tao Shen
45
62
0
04 Jun 2022
Spatially Aware Multimodal Transformers for TextVQA
Yash Kant
Dhruv Batra
Peter Anderson
Alex Schwing
Devi Parikh
Jiasen Lu
Harsh Agrawal
74
86
0
23 Jul 2020
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
153
1,663
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
237
2,479
0
20 Aug 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
87
805
0
25 Jun 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
52
82
0
01 Mar 2019
Pythia v0.1: the Winning Entry to the VQA Challenge 2018
Yu Jiang
Vivek Natarajan
Xinlei Chen
Marcus Rohrbach
Dhruv Batra
Devi Parikh
VLM
56
203
0
26 Jul 2018
Attention on Attention: Architectures for Visual Question Answering (VQA)
Jasdeep Singh
Vincent Ying
Alex Nutkiewicz
52
26
0
21 Mar 2018
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
Aniruddha Kembhavi
OOD
146
585
0
01 Dec 2017
Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering
Pan Lu
Hongsheng Li
Wei Zhang
Jianyong Wang
Xiaogang Wang
61
80
0
18 Nov 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
121
4,215
0
25 Jul 2017
Inferring and Executing Programs for Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Judy Hoffman
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
NAI
80
545
0
10 May 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
295
2,375
0
20 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
333
3,238
0
02 Dec 2016
Adversarial Feature Learning
Jeff Donahue
Philipp Krähenbühl
Trevor Darrell
GAN
109
1,609
0
31 May 2016
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
107
1,882
0
07 Nov 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
80
262
0
01 Jun 2015
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
80
715
0
08 May 2015
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Junyoung Chung
Çağlar Gülçehre
Kyunghyun Cho
Yoshua Bengio
581
12,704
0
11 Dec 2014