VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Title
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering
Medhini Narasimhan
Svetlana Lazebnik
Alex Schwing
NAI, GNN, ReLM
69
11
0
01 Nov 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
120
610
0
01 Nov 2018
How2: A Large-scale Dataset for Multimodal Language Understanding
Ramon Sanabria
Ozan Caglayan
Shruti Palaskar
Desmond Elliott
Loïc Barrault
Lucia Specia
Florian Metze
VGen, MLLM
107
292
0
01 Nov 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
71
125
0
29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human?
Arjun Chandrasekaran
Viraj Prabhu
Deshraj Yadav
Prithvijit Chattopadhyay
Devi Parikh
FAtt
150
97
0
29 Oct 2018
Middle-Out Decoding
Shikib Mehri
Leonid Sigal
68
22
0
28 Oct 2018
Fabrik: An Online Collaborative Neural Network Editor
Utsav Garg
Viraj Prabhu
Deshraj Yadav
Ram Ramrakhya
Harsh Agrawal
Dhruv Batra
GNN
65
4
0
27 Oct 2018
Engaging Image Captioning Via Personality
Kurt Shuster
Samuel Humeau
Hexiang Hu
Antoine Bordes
Jason Weston
87
152
0
25 Oct 2018
Understand, Compose and Respond - Answering Visual Questions by a Composition of Abstract Procedures
B. Vatashsky
S. Ullman
CoGe
72
1
0
25 Oct 2018
Improving Context Modelling in Multimodal Dialogue Generation
Shubham Agarwal
Ondrej Dusek
Ioannis Konstas
Verena Rieser
71
19
0
20 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Shubham Agarwal
Ondrej Dusek
Ioannis Konstas
Verena Rieser
76
22
0
20 Oct 2018
Cross-Modal and Hierarchical Modeling of Video and Text
Bowen Zhang
Hexiang Hu
Fei Sha
BDL, AI4TS
84
191
0
16 Oct 2018
Learning to Globally Edit Images with Textual Description
Hai Wang
Jason D. Williams
Sin-Han Kang
DiffM
75
18
0
13 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
72
239
0
08 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
121
614
0
04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
86
17
0
03 Oct 2018
Image as Data: Automated Visual Content Analysis for Political Science
Jungseock Joo
Zachary C. Steinert-Threlkeld
48
42
0
03 Oct 2018
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition
Jianwei Yang
Jiasen Lu
Stefan Lee
Dhruv Batra
Devi Parikh
103
42
0
01 Oct 2018
Learning Robust, Transferable Sentence Representations for Text Classification
Wasi Uddin Ahmad
Xueying Bai
Nanyun Peng
Kai-Wei Chang
AI4TS, OOD
61
5
0
28 Sep 2018
A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC
Mark Yatskar
93
97
0
27 Sep 2018
Textually Enriched Neural Module Networks for Visual Question Answering
Khyathi Chandu
Mary Arpita Pyreddy
Matthieu Felix
N. Joshi
56
6
0
23 Sep 2018
Multimodal Dual Attention Memory for Video Story Question Answering
Kyung-Min Kim
Seongho Choi
Jin-Hwa Kim
Byoung-Tak Zhang
77
77
0
21 Sep 2018
Lessons learned in multilingual grounded language learning
Ákos Kádár
Desmond Elliott
Marc-Alexandre Côté
Grzegorz Chrupała
Afra Alishahi
VLM
112
24
0
20 Sep 2018
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description
Oliver A. Nina
Washington Garcia
Scott Clouse
Alper Yilmaz
30
4
0
19 Sep 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Shuming Ma
Lei Cui
Damai Dai
Furu Wei
Xu Sun
VGen
72
63
0
13 Sep 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA
Shailza Jolly
Sandro Pezzelle
T. Klein
Andreas Dengel
Moin Nabi
39
2
0
12 Sep 2018
Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances
Thao Le Minh
N. Shimizu
Takashi Miyazaki
Koichi Shinoda
32
13
0
12 Sep 2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
M. Wagner
H. Basevi
Rakshith Shetty
Wenbin Li
Mateusz Malinowski
M. Fritz
A. Leonardis
68
29
0
11 Sep 2018
The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR
Mateusz Malinowski
Carl Doersch
ReLM
65
12
0
11 Sep 2018
Context-Dependent Diffusion Network for Visual Relationship Detection
Zhen Cui
Chunyan Xu
Wenming Zheng
Jian Yang
GNN
79
50
0
11 Sep 2018
How clever is the FiLM model, and how clever can it be?
A. Kuhnle
Huiyuan Xie
Ann A. Copestake
68
6
0
09 Sep 2018
Faithful Multimodal Explanation for Visual Question Answering
Jialin Wu
Raymond J. Mooney
85
91
0
08 Sep 2018
Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge
Steven Derby
Paul Miller
B. Murphy
Barry Devereux
36
15
0
07 Sep 2018
Cascaded Mutual Modulation for Visual Reasoning
Yiqun Yao
Jiaming Xu
Feng Wang
Bo Xu
LRM
60
14
0
06 Sep 2018
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
Satwik Kottur
José M. F. Moura
Devi Parikh
Dhruv Batra
Marcus Rohrbach
77
165
0
06 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
72
56
0
06 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
116
643
0
05 Sep 2018
Retinal Vessel Segmentation under Extreme Low Annotation: A Generative Adversarial Network Approach
A. Lahiri
V. Jain
Arnab Kumar Mondal
P. Biswas
GAN, MedIm
73
12
0
05 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Medhini Narasimhan
Alex Schwing
79
105
0
04 Sep 2018
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
Semih Yagcioglu
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
CoGe
64
173
0
04 Sep 2018
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee
Alex Schwing
75
67
0
03 Sep 2018
Learning to Describe Differences Between Pairs of Similar Images
Harsh Jhamtani
Taylor Berg-Kirkpatrick
90
155
0
31 Aug 2018
Towards a Better Metric for Evaluating Question Generation Systems
Preksha Nema
Mitesh M. Khapra
95
108
0
30 Aug 2018
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms
Avikalp Srivastava
Hsin Wen Liu
Sumio Fujita
30
3
0
29 Aug 2018
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering
Jianmo Ni
Chenguang Zhu
Weizhu Chen
Julian McAuley
RALM
89
38
0
28 Aug 2018
Convolutional Neural Networks for Aerial Vehicle Detection and Recognition
Amir Soleimani
Nasser M. Nasrabadi
E. Griffith
J. Ralph
Simon Maskell
28
10
0
26 Aug 2018
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers
Dongxiang Zhang
Lei Wang
Nuo Xu
B. Dai
Heng Tao Shen
ReLM, AIMat
98
127
0
22 Aug 2018
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM, HAI
158
1,213
0
21 Aug 2018
Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model
Chao-Han Huck Yang
Fangyu Liu
Jia-Hong Huang
Meng Tian
Hiromasa Morikawa
I-Hung Lin
Yi-Chieh Liu
Hao-Hsiang Yang
Jesper N. Tegnér
81
18
0
16 Aug 2018
Context-Aware Visual Policy Network for Sequence-Level Image Captioning
Daqing Liu
Zhengjun Zha
Hanwang Zhang
Yongdong Zhang
Feng Wu
CLIP
103
104
0
16 Aug 2018