ResearchTrend.AI
VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Crowdsourcing Question-Answer Meaning Representations
Julian Michael
Gabriel Stanovsky
Luheng He
Ido Dagan
Luke Zettlemoyer
134
78
0
16 Nov 2017
Supervised and Unsupervised Transfer Learning for Question Answering
Yu-An Chung
Hung-yi Lee
James R. Glass
104
83
0
14 Nov 2017
Weakly-supervised Semantic Parsing with Abstract Examples
Omer Goldman
Veronica Latcinnik
Udi Naveh
Amir Globerson
Jonathan Berant
NAI
150
70
0
14 Nov 2017
High-Order Attention Models for Visual Question Answering
Idan Schwartz
Alex Schwing
Tamir Hazan
80
103
0
12 Nov 2017
Object Referring in Visual Scene with Spoken Language
A. Vasudevan
Dengxin Dai
Luc Van Gool
98
19
0
10 Nov 2017
Active Learning for Visual Question Answering: An Empirical Study
Xiaoyu Lin
Devi Parikh
102
32
0
06 Nov 2017
Whodunnit? Crime Drama as a Case for Natural Language Understanding
Lea Frermann
Shay B. Cohen
Mirella Lapata
67
26
0
31 Oct 2017
Discovery Radiomics with CLEAR-DR: Interpretable Computer Aided Diagnosis of Diabetic Retinopathy
Devinder Kumar
Graham W. Taylor
Alexander Wong
MedIm
48
36
0
29 Oct 2017
Understanding Early Word Learning in Situated Artificial Agents
Felix Hill
S. Clark
Karl Moritz Hermann
Phil Blunsom
LM&Ro
93
32
0
26 Oct 2017
InterpNET: Neural Introspection for Interpretable Deep Learning
Shane T. Barratt
65
20
0
26 Oct 2017
FigureQA: An Annotated Figure Dataset for Visual Reasoning
Samira Ebrahimi Kahou
Vincent Michalski
Adam Atkinson
Ákos Kádár
Adam Trischler
Yoshua Bengio
ReLM AIMat
91
331
0
19 Oct 2017
Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance
Aditya Mogadala
Umanga Bista
Lexing Xie
Achim Rettinger
58
7
0
17 Oct 2017
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
78
161
0
17 Oct 2017
Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Hyeonwoo Noh
Tackgeun You
Jonghwan Mun
Bohyung Han
NoLa
73
201
0
14 Oct 2017
iVQA: Inverse Visual Question Answering
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
66
47
0
10 Oct 2017
Person Re-Identification with Vision and Language
F. Yan
K. Mikolajczyk
J. Kittler
VLM
34
11
0
03 Oct 2017
Visual Reasoning with Natural Language
Stephanie Zhou
Alane Suhr
Yoav Artzi
45
4
0
02 Oct 2017
Vision-based deep execution monitoring
Francesco Puja
S. Grazioso
A. Tammaro
Valsamis Ntouskos
Marta Sanzari
F. Pirri
26
1
0
29 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Xiaojun Xu
Xinyun Chen
Chang-rui Liu
Anna Rohrbach
Trevor Darrell
Basel Alomair
AAML
99
41
0
25 Sep 2017
Survey of Recent Advances in Visual Question Answering
Supriya Pandhre
Shagun Sodhani
30
14
0
24 Sep 2017
Visual Reference Resolution using Attention Memory for Visual Dialog
Paul Hongsuck Seo
Andreas M. Lehrmann
Bohyung Han
Leonid Sigal
107
123
0
23 Sep 2017
Modeling Image Virality with Pairwise Spatial Transformer Networks
Abhimanyu Dubey
Sumeet Agarwal
GNN
30
11
0
22 Sep 2017
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt AIMat OffRL AI4CE
386
2,251
0
22 Sep 2017
Visual Question Generation as Dual Task of Visual Question Answering
Yikang Li
Nan Duan
Bolei Zhou
Xiao Chu
Wanli Ouyang
Xiaogang Wang
98
166
0
21 Sep 2017
Exploring Human-like Attention Supervision in Visual Question Answering
Tingting Qiao
Jianfeng Dong
Duanqing Xu
63
105
0
19 Sep 2017
Scene-centric Joint Parsing of Cross-view Videos
Qi
Yuanlu Xu
Tao Yuan
Tianfu Wu
Song-Chun Zhu
11
0
0
16 Sep 2017
Learning Functional Causal Models with Generative Neural Networks
Hugo Jair Escalante
Sergio Escalera
Xavier Baro
Isabelle M Guyon
Umut Güçlü
Marcel van Gerven
CML BDL
107
108
0
15 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
59
23
0
15 Sep 2017
Robustness Analysis of Visual QA Models by Basic Questions
Jia-Hong Huang
Cuong Duc Dao
Modar Alfadly
C. Huck Yang
Guohao Li
OOD
65
24
0
14 Sep 2017
Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision
Mohamed Elhoseiny
Yizhe Zhu
Han Zhang
Ahmed Elgammal
VLM
93
134
0
04 Sep 2017
Reasoning about Fine-grained Attribute Phrases using Reference Games
Jong-Chyi Su
Chenyun Wu
Huaizu Jiang
Subhransu Maji
92
16
0
29 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
106
127
0
15 Aug 2017
Situation Recognition with Graph Neural Networks
Ruiyu Li
Makarand Tapaswi
Renjie Liao
Jiaya Jia
R. Urtasun
Sanja Fidler
GNN
70
132
0
14 Aug 2017
Mining Deep And-Or Object Structures via Cost-Sensitive Question-Answer-Based Active Annotations
Quanshi Zhang
Ying Nian Wu
Hao Zhang
Song-Chun Zhu
49
5
0
13 Aug 2017
Going Deeper with Semantics: Video Activity Interpretation using Semantic Contextualization
Sathyanarayanan N. Aakur
F. Souza
Sudeep Sarkar
49
10
0
11 Aug 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
101
462
0
10 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
65
22
0
09 Aug 2017
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Damien Teney
Peter Anderson
Xiaodong He
Anton Van Den Hengel
132
383
0
09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval
Yuming Shen
Li Liu
Ling Shao
Jingkuan Song
65
49
0
08 Aug 2017
Weakly Supervised Image Annotation and Segmentation with Objects and Attributes
Zhiyuan Shi
Yongxin Yang
Timothy M. Hospedales
Tao Xiang
62
46
0
08 Aug 2017
Structured Attentions for Visual Question Answering
Chen Zhu
Yanpeng Zhao
Shuaiyi Huang
Kewei Tu
Yi-An Ma
FAtt
87
107
0
07 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?
Marc Tanti
Albert Gatt
K. Camilleri
48
56
0
07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Shuang Li
Tong Xiao
Hongsheng Li
Wei Yang
Xiaogang Wang
103
230
0
07 Aug 2017
Query-guided Regression Network with Context Policy for Phrase Grounding
Kan Chen
Rama Kovvuri
Ram Nevatia
88
142
0
04 Aug 2017
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu
Jun-chen Yu
Jianping Fan
Dacheng Tao
82
669
0
04 Aug 2017
MemexQA: Visual Memex Question Answering
Lu Jiang
Junwei Liang
Liangliang Cao
Yannis Kalantidis
S. Farfade
Alexander G. Hauptmann
46
28
0
04 Aug 2017
Scene Graph Generation from Objects, Phrases and Region Captions
Yikang Li
Wanli Ouyang
Bolei Zhou
Kun Wang
Xiaogang Wang
116
505
0
31 Jul 2017
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Jieyu Zhao
Tianlu Wang
Mark Yatskar
Vicente Ordonez
Kai-Wei Chang
FaML
142
975
0
29 Jul 2017
Video Highlight Prediction Using Audience Chat Reactions
Cheng-Yang Fu
Joon Lee
Joey Tianyi Zhou
Alexander C. Berg
62
37
0
26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
230
4,231
0
25 Jul 2017