ResearchTrend.AI
VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
78
63
0
11 Mar 2020
MQA: Answering the Question via Robotic Manipulation
Yuhong Deng
Di Guo
F. Sun
Naifu Zhang
Huaping Liu
Chen Pang
76
22
0
10 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
79
127
0
09 Mar 2020
Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach
Yu-Siang Wang
Yen-Ling Kuo
Boris Katz
57
3
0
08 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
77
246
0
07 Mar 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani
Rami Ben-Ari
Daniel Rotman
A. Bronstein
134
126
0
06 Mar 2020
XGPT: Cross-modal Generative Pre-Training for Image Captioning
Qiaolin Xia
Haoyang Huang
Nan Duan
Dongdong Zhang
Lei Ji
Zhifang Sui
Edward Cui
Taroon Bharti
Xin Liu
Ming Zhou
MLLM VLM
103
76
0
03 Mar 2020
Natural Language Processing Advancements By Deep Learning: A Survey
A. Torfi
Rouzbeh A. Shirvani
Yaser Keneshloo
Nader Tavvaf
Edward A. Fox
AI4CE VLM
151
222
0
02 Mar 2020
A Study on Multimodal and Interactive Explanations for Visual Question Answering
Kamran Alipour
J. Schulze
Yi Yao
Avi Ziskind
Giedrius Burachas
64
27
0
01 Mar 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL ObjD CML
86
252
0
27 Feb 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
77
78
0
27 Feb 2020
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
59
6
0
25 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
101
97
0
24 Feb 2020
Shallow2Deep: Indoor Scene Modeling by Single Image Understanding
Y. Nie
Shihui Guo
Jian Chang
Xiaoguang Han
Jiahui Huang
Shimin Hu
Jianjun Zhang
3DPC 3DV
154
16
0
22 Feb 2020
Interactive Natural Language-based Person Search
Vikram Shree
Wei-Lun Chao
M. Campbell
28
4
0
19 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
71
75
0
19 Feb 2020
CQ-VQA: Visual Question Answering on Categorized Questions
Aakansha Mishra
A. Anand
Prithwijit Guha
145
6
0
17 Feb 2020
Looking Enhances Listening: Recovering Missing Speech Using Images
Tejas Srinivasan
Ramon Sanabria
Florian Metze
59
15
0
13 Feb 2020
Component Analysis for Visual Question Answering Architectures
Camila Kolling
Jonatas Wehrmann
Rodrigo C. Barros
CoGe
36
2
0
12 Feb 2020
Object Detection as a Positive-Unlabeled Problem
Yuewei Yang
Kevin J. Liang
Lawrence Carin
82
39
0
11 Feb 2020
Solving Raven's Progressive Matrices with Neural Networks
Tao Zhuo
Mohan S. Kankanhalli
105
26
0
05 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
143
37
0
01 Feb 2020
Break It Down: A Question Understanding Benchmark
Tomer Wolfson
Mor Geva
Ankit Gupta
Matt Gardner
Yoav Goldberg
Daniel Deutch
Jonathan Berant
91
188
0
31 Jan 2020
Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach
Mehrdad Alizadeh
Barbara Di Eugenio
23
3
0
31 Jan 2020
Dual Convolutional LSTM Network for Referring Image Segmentation
Linwei Ye
Zhi Liu
Yang Wang
76
46
0
30 Jan 2020
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas Guibas
3DPC
254
253
0
29 Jan 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
36
18
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD AAML
71
19
0
23 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
88
60
0
22 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
134
263
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
79
18
0
20 Jan 2020
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions
Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Tulio Ribeiro
Besmira Nushi
Ece Kamar
LRM
57
14
0
20 Jan 2020
Modality-Balanced Models for Visual Dialogue
Hyounghun Kim
Hao Tan
Joey Tianyi Zhou
61
27
0
17 Jan 2020
A "Network Pruning Network" Approach to Deep Model Compression
Vinay Kumar Verma
Pravendra Singh
Vinay P. Namboodiri
Piyush Rai
3DPC VLM
58
8
0
15 Jan 2020
CheXplain: Enabling Physicians to Explore and Understand Data-Driven, AI-Enabled Medical Imaging Analysis
Yao Xie
Melody Chen
David Kao
Ge Gao
Xiang 'Anthony' Chen
141
131
0
15 Jan 2020
Joint Reasoning for Multi-Faceted Commonsense Knowledge
Yohan Chalier
Simon Razniewski
Gerhard Weikum
LRM
134
25
0
13 Jan 2020
Detecting depression in dyadic conversations with multimodal narratives and visualizations
Joshua Y. Kim
Greyson Y. Kim
K. Yacef
45
7
0
13 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD ObjD
88
320
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
52
22
0
10 Jan 2020
Think Locally, Act Globally: Federated Learning with Local and Global Representations
Paul Pu Liang
Terrance Liu
Liu Ziyin
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
FedML
124
570
0
06 Jan 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Peng Gao
Sen Su
85
11
0
03 Jan 2020
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
A. Magassouba
K. Sugiura
Hisashi Kawai
81
9
0
23 Dec 2019
Exploring Context, Attention and Audio Features for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
18
1
0
20 Dec 2019
Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
37
7
0
20 Dec 2019
Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation
Yang He
Shadi Rahimian
Bernt Schiele
Mario Fritz
MIACV
92
52
0
20 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
31
4
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
38
9
0
19 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML AAML
93
159
0
16 Dec 2019
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Dzmitry Bahdanau
H. D. Vries
Timothy J. O'Donnell
Shikhar Murty
Philippe Beaudoin
Yoshua Bengio
Aaron Courville
NAI
80
90
0
12 Dec 2019
Multimodal Self-Supervised Learning for Medical Image Analysis
Aiham Taleb
C. Lippert
T. Klein
Moin Nabi
SSL
91
98
0
11 Dec 2019