VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Title
Recent advances in deep learning applied to skin cancer detection
André G. C. Pacheco
R. Krohling
MedIm
57
38
0
06 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
143
252
0
06 Dec 2019
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
60
15
0
06 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
111
117
0
05 Dec 2019
Knowledge-Enriched Visual Storytelling
Chao-Chun Hsu
Zi-Yuan Chen
Chi-Yang Hsu
Chih-Chia Li
Tzu-Yuan Lin
Ting-Hao 'Kenneth' Huang
Lun-Wei Ku
DiffM
90
47
0
03 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL, OOD, UQCV
93
14
0
02 Dec 2019
TutorialVQA: Question Answering Dataset for Tutorial Videos
Anthony Colas
Seokhwan Kim
Franck Dernoncourt
Siddhesh Gupte
D. Wang
Doo Soon Kim
76
31
0
02 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Guohao Li
Marcel Worring
AAML, OOD
100
24
0
30 Nov 2019
A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop
Jihyeon Janel Lee
S. Arora
18
1
0
30 Nov 2019
Learning Perceptual Inference by Contrasting
Chi Zhang
Baoxiong Jia
Feng Gao
Yixin Zhu
Hongjing Lu
Song-Chun Zhu
LRM
84
109
0
29 Nov 2019
Multimodal Machine Translation through Visuals and Speech
U. Sulubacak
Ozan Caglayan
Stig-Arne Gronroos
Aku Rouhe
Desmond Elliott
Lucia Specia
Jörg Tiedemann
101
77
0
28 Nov 2019
Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Federico Landi
Lorenzo Baraldi
Marcella Cornia
M. Corsini
Rita Cucchiara
LM&Ro
87
29
0
27 Nov 2019
Transfer Learning in Visual and Relational Reasoning
T. S. Jayram
Vincent Marois
Tomasz Kornuta
V. Albouy
Emre Sevgen
A. Ozcan
NAI, OOD, LRM
37
2
0
27 Nov 2019
Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
107
7
0
26 Nov 2019
Learning to Learn Words from Visual Scenes
Dídac Surís
Dave Epstein
Heng Ji
Shih-Fu Chang
Carl Vondrick
VLM, CLIP, SSL, OffRL
70
4
0
25 Nov 2019
Two Causal Principles for Improving Visual Dialog
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
110
149
0
24 Nov 2019
Unsupervised Keyword Extraction for Full-sentence VQA
Kohei Uehara
Tatsuya Harada
32
1
0
23 Nov 2019
Temporal Reasoning via Audio Question Answering
Haytham M. Fayek
Justin Johnson
65
54
0
21 Nov 2019
ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks
Monika Sharma
Shikha Gupta
Arindam Chowdhury
Lovekesh Vig
40
9
0
21 Nov 2019
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
Badri N. Patro
Anupriy
Vinay P. Namboodiri
AAML, FAtt
85
26
0
19 Nov 2019
Modal-aware Features for Multimodal Hashing
Haien Zeng
Hanjiang Lai
Hanlu Chu
Yong Tang
Jian Yin
48
0
0
19 Nov 2019
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
115
244
0
18 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
90
70
0
17 Nov 2019
The Eighth Dialog System Technology Challenge
Seokhwan Kim
Michel Galley
Chulaka Gunasekara
Sungjin Lee
Adam Atkinson
...
Tim K. Marks
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
VLM
71
65
0
14 Nov 2019
Question-Conditioned Counterfactual Image Generation for VQA
Jingjing Pan
Yash Goyal
Stefan Lee
EgoV, OOD
90
19
0
14 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
96
197
0
14 Nov 2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication
Ruize Wang
Zhongyu Wei
Ying Cheng
Piji Li
Haijun Shan
Ji Zhang
Qi Zhang
Xuanjing Huang
VGen, DiffM
86
13
0
11 Nov 2019
Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation
Yiming Xu
Lin Chen
Zhongwei Cheng
Lixin Duan
Jiebo Luo
OOD
86
24
0
11 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI, AI4TS
122
338
0
10 Nov 2019
On Architectures for Including Visual Information in Neural Language Models for Image Description
Marc Tanti
Albert Gatt
K. Camilleri
VLM
48
2
0
09 Nov 2019
Are we asking the right questions in MovieQA?
Bhavan A. Jasani
Rohit Girdhar
Deva Ramanan
72
16
0
08 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM, ReLM
102
9
0
31 Oct 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Joey Tianyi Zhou
Jason Weston
Douwe Kiela
210
1,014
0
31 Oct 2019
Heterogeneous Graph Learning for Visual Commonsense Reasoning
Weijiang Yu
Jingwen Zhou
Weihao Yu
Xiaodan Liang
Nong Xiao
LRM
79
47
0
25 Oct 2019
Cross-Lingual Vision-Language Navigation
An Yan
Xinze Wang
Jiangtao Feng
Lei Li
William Yang Wang
LM&Ro
74
16
0
24 Oct 2019
Assisting human experts in the interpretation of their visual process: A case study on assessing copper surface adhesive potency
T. Hascoet
Xuejiao Deng
Daniela Mihai
Mari Sugiyama
Yuji Adachi
Sachiko Nakamura
Jonathon S. Hare
Tomoko Hayashi
T. Takiguchi
15
1
0
24 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
152
80
0
23 Oct 2019
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
Alejandro Barredo Arrieta
Natalia Díaz Rodríguez
Javier Del Ser
Adrien Bennetot
Siham Tabik
...
S. Gil-Lopez
Daniel Molina
Richard Benjamins
Raja Chatila
Francisco Herrera
XAI
311
6,387
0
22 Oct 2019
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning
Jiaying Lu
Xin Ye
Yi Ren
Yezhou Yang
78
10
0
21 Oct 2019
Enforcing Reasoning in Visual Commonsense Reasoning
Hammad A. Ayyubi
Md. Mehrab Tanjim
D. Kriegman
ReLM, OOD
57
2
0
21 Oct 2019
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
Tomasz Kornuta
29
2
0
18 Oct 2019
ALOHA: Artificial Learning of Human Attributes for Dialogue Agents
Aaron W. Li
Veronica Jiang
Steven Y. Feng
Julia Sprague
Wei Zhou
Jesse Hoey
63
28
0
18 Oct 2019
Dynamic Attention Networks for Task Oriented Grounding
S. Dasgupta
Badri N. Patro
Vinay P. Namboodiri
73
1
0
14 Oct 2019
Granular Multimodal Attention Networks for Visual Dialog
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
114
1
0
13 Oct 2019
Modulated Self-attention Convolutional Network for VQA
Jean-Benoit Delbrouck
Antoine Maiorca
Nathan Hubens
Stéphane Dupont
25
1
0
08 Oct 2019
Meta Module Network for Compositional Visual Reasoning
Wenhu Chen
Zhe Gan
Linjie Li
Yu Cheng
Wenjie Wang
Jingjing Liu
LRM
93
71
0
08 Oct 2019
Compositional Generalization for Primitive Substitutions
Yuanpeng Li
Liang Zhao
Jianyu Wang
Joel Hestness
72
87
0
07 Oct 2019
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes
Kushal Kafle
Robik Shrestha
Manoj Acharya
Christopher Kanan
CLL
155
303
0
06 Oct 2019
Which Ads to Show? Advertisement Image Assessment with Auxiliary Information via Multi-step Modality Fusion
Kyung-Wha Park
Junghoon Lee
Sunyoung Kwon
Jung-Woo Ha
Kyung-Min Kim
Byoung-Tak Zhang
32
2
0
06 Oct 2019
Talk2Nav: Long-Range Vision-and-Language Navigation with Dual Attention and Spatial Memory
A. Vasudevan
Ahmed K. Farahat
Chetan Gupta
LM&Ro
67
2
0
04 Oct 2019