Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2101.06399
Cited By
v1
v2 (latest)
Latent Variable Models for Visual Question Answering
16 January 2021
Zixu Wang
Yishu Miao
Lucia Specia
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Latent Variable Models for Visual Question Answering"
34 / 34 papers shown
Title
MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
OOD
46
142
0
18 Sep 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
135
1,944
0
13 Apr 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD
AAML
208
292
0
14 Mar 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
34
18
0
23 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
60
320
0
10 Jan 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
355
942
0
24 Sep 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
169
1,666
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
250
2,488
0
20 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
Peng Gao
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Hongsheng Li
64
82
0
10 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
153
1,963
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
243
3,695
0
06 Aug 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
87
808
0
25 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Joey Tianyi Zhou
CoGe
42
20
0
14 Jun 2019
Generating Question Relevant Captions to Aid Visual Question Answering
Jialin Wu
Zeyuan Hu
Raymond J. Mooney
88
43
0
03 Jun 2019
Information Maximizing Visual Question Generation
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
102
95
0
27 Mar 2019
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Ramakrishna Vedantam
Karan Desai
Stefan Lee
Marcus Rohrbach
Dhruv Batra
Devi Parikh
NAI
BDL
61
87
0
21 Feb 2019
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
80
365
0
13 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,175
0
11 Oct 2018
Bilinear Attention Networks
Jin-Hwa Kim
Jaehyun Jun
Byoung-Tak Zhang
AIMat
90
877
0
21 May 2018
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Damien Teney
Peter Anderson
Xiaodong He
Anton Van Den Hengel
101
383
0
09 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
123
4,221
0
25 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
776
132,363
0
12 Jun 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
79
184
0
11 Apr 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
352
3,270
0
02 Dec 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
911
6,796
0
26 Sep 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
225
5,762
0
23 Feb 2016
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
76
460
0
23 Nov 2015
Neural Variational Inference for Text Processing
Yishu Miao
Lei Yu
Phil Blunsom
VLM
DRL
132
622
0
19 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
114
1,884
0
07 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
528
62,377
0
04 Jun 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
226
5,503
0
03 May 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.0K
150,312
0
22 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
144
5,591
0
07 Dec 2014
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
424
43,814
0
01 May 2014
1