Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1505.00468
Cited By
v1
v2
v3
v4
v5
v6
v7 (latest)
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"VQA: Visual Question Answering"
50 / 2,957 papers shown
Title
Not All Relations are Equal: Mining Informative Labels for Scene Graph Generation
A. Goel
Basura Fernando
Frank Keller
Hakan Bilen
108
33
0
26 Nov 2021
Confounder Identification-free Causal Visual Feature Learning
Xin Li
Zhizheng Zhang
Guoqiang Wei
Cuiling Lan
Wenjun Zeng
Xin Jin
Zhibo Chen
CML
OOD
106
14
0
26 Nov 2021
Two-stage Rule-induction Visual Reasoning on RPMs with an Application to Video Prediction
Wentao He
Jianfeng Ren
Ruibin Bai
Xudong Jiang
LRM
70
5
0
24 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
146
117
0
23 Nov 2021
RedCaps: web-curated image-text data created by the people, for the people
Karan Desai
Gaurav Kaul
Zubin Aysola
Justin Johnson
137
169
0
22 Nov 2021
DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
Arthur Douillard
Alexandre Ramé
Guillaume Couairon
Matthieu Cord
CLL
149
314
0
22 Nov 2021
TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning
Keng Ji Chow
Samson Tan
MingSung Kan
LRM
65
4
0
21 Nov 2021
Medical Visual Question Answering: A Survey
Zhihong Lin
Donghao Zhang
Qingyi Tao
Danli Shi
Gholamreza Haffari
Qi Wu
M. He
Z. Ge
114
122
0
19 Nov 2021
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfen Yuan
Bing Li
Weiming Hu
Zhengjun Zha
82
33
0
17 Nov 2021
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
63
13
0
17 Nov 2021
Language bias in Visual Question Answering: A Survey and Taxonomy
Desen Yuan
103
13
0
16 Nov 2021
Sentiment Analysis of Fashion Related Posts in Social Media
Yifei Yuan
W. Lam
70
8
0
15 Nov 2021
Visual Intelligence through Human Interaction
Ranjay Krishna
Mitchell L. Gordon
Fei-Fei Li
Michael S. Bernstein
74
8
0
12 Nov 2021
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions
Huan Ma
Zongbo Han
Changqing Zhang
Huazhu Fu
Qiufeng Wang
Q. Hu
EDL
UQCV
138
44
0
11 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
203
356
0
11 Nov 2021
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI
Jiangchao Yao
Shengyu Zhang
Yang Yao
Feng Wang
Jianxin Ma
...
Kun Kuang
Chao-Xiang Wu
Leilei Gan
Jingren Zhou
Hongxia Yang
167
103
0
11 Nov 2021
ICDAR 2021 Competition on Document VisualQuestion Answering
Rubèn Pérez Tito
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
86
23
0
10 Nov 2021
Visual Question Answering based on Formal Logic
Muralikrishnna G. Sethuraman
Ali Payani
Faramarz Fekri
J. C. Kerce
NAI
28
3
0
08 Nov 2021
NarrationBot and InfoBot: A Hybrid System for Automated Video Description
Shasta Ihorn
Y. Siu
Aditya Bodi
Lothar D Narins
Jose M. Castanon
Yash Kant
Abhishek Das
Ilmi Yoon
Pooyan Fazli
46
3
0
07 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
292
405
0
06 Nov 2021
Medicines Question Answering System, MeQA
Jesús Santamaría
8
0
0
04 Nov 2021
An Empirical Study of Training End-to-End Vision-and-Language Transformers
Zi-Yi Dou
Yichong Xu
Zhe Gan
Jianfeng Wang
Shuohang Wang
...
Pengchuan Zhang
Lu Yuan
Nanyun Peng
Zicheng Liu
Michael Zeng
VLM
106
381
0
03 Nov 2021
Revisiting spatio-temporal layouts for compositional action recognition
Gorjan Radevski
Marie-Francine Moens
Tinne Tuytelaars
104
26
0
02 Nov 2021
Introspective Distillation for Robust Question Answering
Yulei Niu
Hanwang Zhang
94
60
0
01 Nov 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
103
75
0
28 Oct 2021
Perceptual Score: What Data Modalities Does Your Model Perceive?
Itai Gat
Idan Schwartz
Alex Schwing
99
32
0
27 Oct 2021
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu
Liang Qiu
Jiaqi Chen
Tony Xia
Yizhou Zhao
Wei Zhang
Zhou Yu
Xiaodan Liang
Song-Chun Zhu
AIMat
173
206
0
25 Oct 2021
Instance-Conditional Knowledge Distillation for Object Detection
Zijian Kang
Peizhen Zhang
Xinming Zhang
Jian Sun
N. Zheng
100
79
0
25 Oct 2021
Challenges in Procedural Multimodal Machine Comprehension:A Novel Way To Benchmark
Pritish Sahu
Karan Sikka
Ajay Divakaran
47
1
0
22 Oct 2021
Simple Dialogue System with AUDITED
Eugenio Clerico
Piotr Koniusz
75
2
0
22 Oct 2021
Single-Modal Entropy based Active Learning for Visual Question Answering
Dong-Jin Kim
Jae-Won Cho
Jinsoo Choi
Yunjae Jung
In So Kweon
63
12
0
21 Oct 2021
Evaluating and Improving Interactions with Hazy Oracles
Stephan J. Lemmer
Jason J. Corso
41
2
0
19 Oct 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
81
59
0
19 Oct 2021
Towards Language-guided Visual Recognition via Dynamic Convolutions
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yongjian Wu
Yue Gao
Rongrong Ji
ObjD
98
19
0
17 Oct 2021
Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation
Yang Wu
Shirui Feng
Guanbin Li
Liang Lin
23
0
0
16 Oct 2021
Guiding Visual Question Generation
Nihir Vedd
Zixu Wang
Marek Rei
Yishu Miao
Lucia Specia
140
22
0
15 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
434
1,115
0
13 Oct 2021
Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Ankit Parag Shah
Shijie Geng
Peng Gao
A. Cherian
Takaaki Hori
Tim K. Marks
Jonathan Le Roux
Chiori Hori
68
24
0
13 Oct 2021
Improving Users' Mental Model with Attention-directed Counterfactual Edits
Kamran Alipour
Arijit Ray
Xiaoyu Lin
Michael Cogswell
J. Schulze
Yi Yao
Giedrius Burachas
OOD
61
9
0
13 Oct 2021
Topic Scene Graph Generation by Attention Distillation from Caption
Wenbin Wang
R. Wang
X. Chen
DiffM
94
14
0
12 Oct 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM
CLIP
167
458
0
11 Oct 2021
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking
Dirk Vath
Pascal Tilli
Ngoc Thang Vu
82
4
0
11 Oct 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360
∘
^\circ
∘
Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
100
86
0
11 Oct 2021
Calling to CNN-LSTM for Rumor Detection: A Deep Multi-channel Model for Message Veracity Classification in Microblogs
Abderrazek Azri
Cécile Favre
Nouria Harbi
Jérôme Darmont
C. Noûs
57
11
0
11 Oct 2021
Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization
S. Gu
Manfred Diaz
Daniel Freeman
Hiroki Furuta
Seyed Kamyar Seyed Ghasemipour
Anton Raichuk
Byron David
Erik Frey
Erwin Coumans
Olivier Bachem
82
14
0
10 Oct 2021
Efficient Multi-Modal Embeddings from Structured Data
A. Vero
Ann A. Copestake
35
4
0
06 Oct 2021
Coarse-to-Fine Reasoning for Visual Question Answering
Binh X. Nguyen
Tuong Khanh Long Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
A. Nguyen
NAI
138
40
0
06 Oct 2021
Word Acquisition in Neural Language Models
Tyler A. Chang
Benjamin Bergen
90
40
0
05 Oct 2021
Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering
Long Chen
Yuhang Zheng
Yulei Niu
Hanwang Zhang
Jun Xiao
AAML
OOD
119
37
0
03 Oct 2021
ProTo: Program-Guided Transformer for Program-Guided Tasks
Zelin Zhao
Karan Samel
Binghong Chen
Le Song
ViT
LM&Ro
98
30
0
02 Oct 2021
Previous
1
2
3
...
32
33
34
...
58
59
60
Next