VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,890 papers shown
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,868
0
26 May 2017
Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
Q. Sun
Stefan Lee
Dhruv Batra
BDL
33
43
0
24 May 2017
Learning Convolutional Text Representations for Visual Question Answering
Zhengyang Wang
Shuiwang Ji
FAtt
19
15
0
18 May 2017
MUTAN: Multimodal Tucker Fusion for Visual Question Answering
H. Ben-younes
Rémi Cadène
Matthieu Cord
Nicolas Thome
67
578
0
18 May 2017
ParlAI: A Dialog Research Software Platform
Alexander H. Miller
Will Feng
Adam Fisch
Jiasen Lu
Dhruv Batra
Antoine Bordes
Devi Parikh
Jason Weston
40
373
0
18 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN
Syed Ashar Javed
A. Nelakanti
VLM
32
10
0
11 May 2017
Survey of Visual Question Answering: Datasets and Techniques
A. Gupta
21
38
0
10 May 2017
Inferring and Executing Programs for Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Judy Hoffman
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
NAI
35
541
0
10 May 2017
Combating Human Trafficking with Deep Multimodal Models
Edmund Tong
Amir Zadeh
Cara Jones
Louis-Philippe Morency
24
51
0
08 May 2017
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
Alexis Conneau
Douwe Kiela
Holger Schwenk
Loïc Barrault
Antoine Bordes
AI4TS
SSL
70
2,097
0
05 May 2017
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
M. Bagheri
Ronald M. Summers
LM&MA
72
2,474
0
05 May 2017
Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
Fanyi Xiao
Leonid Sigal
Yong Jae Lee
35
139
0
03 May 2017
FOIL it! Find One mismatch between Image and Language caption
Ravi Shekhar
Sandro Pezzelle
Yauhen Klimovich
Aurélie Herbelot
Moin Nabi
E. Sangineto
Raffaella Bernardi
25
137
0
03 May 2017
The Forgettable-Watcher Model for Video Question Answering
Hongyang Xue
Zhou Zhao
Deng Cai
21
9
0
03 May 2017
Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner
Tseng-Hung Chen
Yuan-Hong Liao
Ching-Yao Chuang
W. Hsu
Jianlong Fu
Min Sun
31
141
0
02 May 2017
The Promise of Premise: Harnessing Question Premises in Visual Question Answering
Aroma Mahendru
Viraj Prabhu
Akrit Mohapatra
Dhruv Batra
Stefan Lee
NAI
37
38
0
01 May 2017
Speech-Based Visual Question Answering
Ted Zhang
Dengxin Dai
Tinne Tuytelaars
Marie-Francine Moens
Luc Van Gool
40
24
0
01 May 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Dipendra Kumar Misra
John Langford
Yoav Artzi
21
247
0
28 Apr 2017
C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset
Aishwarya Agrawal
Aniruddha Kembhavi
Dhruv Batra
Devi Parikh
CoGe
26
80
0
26 Apr 2017
Paying Attention to Descriptions Generated by Image Captioning Models
Hamed R. Tavakoli
Rakshith Shetty
Ali Borji
Jorma T. Laaksonen
29
79
0
24 Apr 2017
Towards Instance Segmentation with Object Priority: Prominent Object Detection and Recognition
Hamed R. Tavakoli
Jorma T. Laaksonen
16
1
0
24 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks
Spandana Gella
Frank Keller
ObjD
24
11
0
24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
22
37
0
24 Apr 2017
Learning to Reason: End-to-End Module Networks for Visual Question Answering
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
KELM
GNN
ReLM
LRM
42
574
0
18 Apr 2017
AnchorNet: A Weakly Supervised Network to Learn Geometry-sensitive Features For Semantic Matching
David Novotny
Diane Larlus
Andrea Vedaldi
3DPC
33
65
0
16 Apr 2017
Video Fill In the Blank using LR/RL LSTMs with Spatial-Temporal Attentions
Amir Mazaheri
Dong-Ming Zhang
M. Shah
17
12
0
15 Apr 2017
ShapeWorld - A new test methodology for multimodal language understanding
A. Kuhnle
Ann A. Copestake
36
66
0
14 Apr 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
34
547
0
14 Apr 2017
Spatial Memory for Context Reasoning in Object Detection
Xinlei Chen
Abhinav Gupta
ObjD
25
164
0
13 Apr 2017
Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks
Devinder Kumar
Alexander Wong
Graham W. Taylor
31
59
0
13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
28
57
0
12 Apr 2017
What's in a Question: Using Visual Questions as a Form of Supervision
Siddha Ganju
Olga Russakovsky
Abhinav Gupta
19
16
0
12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders
Unnat Jain
Ziyu Zhang
Alex Schwing
25
152
0
11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
28
184
0
11 Apr 2017
Pay Attention to Those Sets! Learning Quantification from Images
Ionut-Teodor Sorodoc
Sandro Pezzelle
Aurélie Herbelot
Mariella Dimiccoli
Raffaella Bernardi
6
0
0
10 Apr 2017
An Empirical Evaluation of Visual Question Answering for Novel Objects
Santhosh Kumar Ramakrishnan
Ambar Pal
Gaurav Sharma
Anurag Mittal
OOD
24
32
0
08 Apr 2017
It Takes Two to Tango: Towards Theory of AI's Mind
Arjun Chandrasekaran
Deshraj Yadav
Prithvijit Chattopadhyay
Viraj Prabhu
Devi Parikh
41
54
0
03 Apr 2017
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
Tanmay Gupta
Kevin J. Shih
Saurabh Singh
Derek Hoiem
37
26
0
02 Apr 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Amrita Saha
Mitesh Khapra
Karthik Sankaranarayanan
29
8
0
01 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
27
810
0
29 Mar 2017
A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment
Haonan Yu
Haichao Zhang
Wenyuan Xu
LM&Ro
11
25
0
28 Mar 2017
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
30
231
0
28 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe-nan Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
36
234
0
23 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
Abhishek Das
Satwik Kottur
J. M. F. Moura
Stefan Lee
Dhruv Batra
OffRL
31
424
0
20 Mar 2017
VQABQ: Visual Question Answering by Basic Questions
Jia-Hong Huang
Modar Alfadly
Guohao Li
27
24
0
19 Mar 2017
Recurrent Models for Situation Recognition
Arun Mallya
Svetlana Lazebnik
20
30
0
18 Mar 2017
End-to-end optimization of goal-driven and visually grounded dialogue systems
Florian Strub
H. D. Vries
Jérémie Mary
Bilal Piot
Aaron Courville
Olivier Pietquin
OffRL
30
138
0
15 Mar 2017
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
Xiaodan Liang
Lisa Lee
Eric Xing
29
250
0
08 Mar 2017
Asymmetric Tri-training for Unsupervised Domain Adaptation
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
49
583
0
27 Feb 2017