VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Efficient Large-Scale Multi-Modal Classification
D. Kiela
Edouard Grave
Armand Joulin
Tomas Mikolov
97
151
0
06 Feb 2018
Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning
Minghai Chen
Sen Wang
Paul Pu Liang
T. Baltrušaitis
Amir Zadeh
Louis-Philippe Morency
74
281
0
03 Feb 2018
Dual Recurrent Attention Units for Visual Question Answering
Ahmed Osman
Wojciech Samek
51
32
0
01 Feb 2018
Interactive Grounded Language Acquisition and Generalization in a 2D World
Haonan Yu
Haichao Zhang
Wenyuan Xu
LLMAG
LM&Ro
194
79
0
31 Jan 2018
Object-based reasoning in VQA
Mikyas T. Desta
Larry Chen
Tomasz Kornuta
67
33
0
29 Jan 2018
DeepSIC: Deep Semantic Image Compression
Sihui Luo
Yezhou Yang
Xiuming Zhang
69
46
0
29 Jan 2018
Game of Sketches: Deep Recurrent Models of Pictionary-style Word Guessing
Ravi Kiran Sarvadevabhatla
Shiv Surya
Trisha Mittal
Venkatesh Babu Radhakrishnan
70
14
0
29 Jan 2018
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Qing Li
Jianlong Fu
D. Yu
Tao Mei
Jiebo Luo
FAtt
XAI
CoGe
97
60
0
27 Jan 2018
DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle
Brian L. Price
Scott D. Cohen
Christopher Kanan
AIMat
116
397
0
24 Jan 2018
Structured Triplet Learning with POS-tag Guided Attention for Visual Question Answering
Zhe Wang
Xiaoyi Liu
Liangjian Chen
Limin Wang
Yu Qiao
Xiaohui Xie
Charless C. Fowlkes
58
14
0
24 Jan 2018
Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers
Fred Hohman
Minsuk Kahng
Robert S. Pienta
Duen Horng Chau
OOD
HAI
103
541
0
21 Jan 2018
Benchmark Visual Question Answer Models by using Focus Map
Wenda Qiu
Yueyang Xianzang
Zhekai Zhang
3DV
22
0
0
13 Jan 2018
Visual Text Correction
Amir Mazaheri
M. Shah
103
11
0
06 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
104
76
0
04 Jan 2018
Interpretable Counting for Visual Question Answering
Alexander R. Trott
Caiming Xiong
R. Socher
106
71
0
23 Dec 2017
DeepFuse: A Deep Unsupervised Approach for Exposure Fusion with Extreme Exposure Image Pairs
K. Prabhakar
V. S. Srikar
R. Venkatesh Babu
51
548
0
20 Dec 2017
Visual Explanations from Hadamard Product in Multimodal Deep Networks
Jin-Hwa Kim
Byoung-Tak Zhang
FAtt
32
4
0
18 Dec 2017
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication
Jin-Hwa Kim
Nikita Kitaev
Xinlei Chen
Marcus Rohrbach
Byoung-Tak Zhang
Yuandong Tian
Dhruv Batra
Devi Parikh
DiffM
VGen
91
25
0
15 Dec 2017
Learning Interpretable Spatial Operations in a Rich 3D Blocks World
Yonatan Bisk
Kevin J. Shih
Yejin Choi
D. Marcu
65
63
0
10 Dec 2017
IQA: Visual Question Answering in Interactive Environments
Daniel Gordon
Aniruddha Kembhavi
Mohammad Rastegari
Joseph Redmon
Dieter Fox
Ali Farhadi
LM&Ro
129
391
0
09 Dec 2017
Grounding Referring Expressions in Images by Variational Context
Hanwang Zhang
Yulei Niu
Shih-Fu Chang
BDL
ObjD
84
223
0
05 Dec 2017
Examining Cooperation in Visual Dialog Models
Mircea Mironenco
D. Kianfar
Ke M. Tran
Evangelos Kanoulas
E. Gavves
52
4
0
04 Dec 2017
Learning by Asking Questions
Ishan Misra
Ross B. Girshick
Rob Fergus
M. Hebert
Abhinav Gupta
Laurens van der Maaten
74
84
0
04 Dec 2017
Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks
Guohao Li
Hang Su
Wenwu Zhu
100
46
0
03 Dec 2017
Recurrent Neural Networks for Semantic Instance Segmentation
Amaia Salvador
Míriam Bellver
Victor Campos
Manel Baradad
F. Marqués
Jordi Torres
Xavier Giró-i-Nieto
SSeg
69
62
0
02 Dec 2017
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
Aniruddha Kembhavi
OOD
182
587
0
01 Dec 2017
Relation Networks for Object Detection
Han Hu
Jiayuan Gu
Zheng Zhang
Jifeng Dai
Yichen Wei
ObjD
150
1,230
0
30 Nov 2017
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
113
652
0
30 Nov 2017
Multimodal Attribute Extraction
Robert L Logan IV
Samuel Humeau
Sameer Singh
64
27
0
29 Nov 2017
HoME: a Household Multimodal Environment
Simon Brodeur
Ethan Perez
Ankesh Anand
Florian Golemo
Luca Herranz-Celotti
Florian Strub
Jean Rouat
Hugo Larochelle
Aaron Courville
LM&Ro
120
103
0
29 Nov 2017
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
175
1,725
0
28 Nov 2017
Hyper-dimensional computing for a visual question-answering system that is trainable end-to-end
Guglielmo Montone
J. O'Regan
A. Terekhov
45
13
0
28 Nov 2017
Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
Daesik Kim
Y. Yoo
Jeesoo Kim
Sangkuk Lee
Nojun Kwak
61
25
0
27 Nov 2017
Convolutional Image Captioning
J. Aneja
Aditya Deshpande
Alex Schwing
VLM
137
361
0
24 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
100
118
0
22 Nov 2017
Visual Question Answering as a Meta Learning Task
Damien Teney
Anton Van Den Hengel
OOD
78
42
0
22 Nov 2017
Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons
Edward J. Kim
Darryl Hannan
Garrett Kenyon
85
25
0
21 Nov 2017
Functional Map of the World
Gordon A. Christie
Neil Fendley
James Wilson
R. Mukherjee
VGen
87
399
0
21 Nov 2017
Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards
Junjie Zhang
Qi Wu
Chunhua Shen
Jian Zhang
Jianfeng Lu
Anton Van Den Hengel
LRM
69
29
0
21 Nov 2017
Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
Qi Wu
Peng Wang
Chunhua Shen
Ian Reid
Anton Van Den Hengel
GAN
82
129
0
21 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
182
1,325
0
20 Nov 2017
Adversarial Attacks Beyond the Image Space
Fangyin Wei
Chenxi Liu
Yu-Siang Wang
Weichao Qiu
Lingxi Xie
Yu-Wing Tai
Chi-Keung Tang
Alan Yuille
AAML
126
150
0
20 Nov 2017
Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering
Pan Lu
Hongsheng Li
Wei Zhang
Jianyong Wang
Xiaogang Wang
94
80
0
18 Nov 2017
ADVISE: Symbolism and External Knowledge for Decoding Advertisements
Keren Ye
Adriana Kovashka
79
51
0
17 Nov 2017
Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract)
Dong Huk Park
Lisa Anne Hendricks
Zeynep Akata
Anna Rohrbach
Bernt Schiele
Trevor Darrell
Marcus Rohrbach
46
4
0
17 Nov 2017
Neural Motifs: Scene Graph Parsing with Global Context
Rowan Zellers
Mark Yatskar
Sam Thomson
Yejin Choi
GNN
135
1,003
0
17 Nov 2017
Language-Based Image Editing with Recurrent Attentive Models
Jianbo Chen
Yelong Shen
Jianfeng Gao
Jingjing Liu
Xiaodong Liu
99
122
0
16 Nov 2017
A Novel Framework for Robustness Analysis of Visual QA Models
Jia-Hong Huang
Cuong Duc Dao
Modar Alfadly
Guohao Li
AAML
OOD
82
34
0
16 Nov 2017
Natural Language Guided Visual Relationship Detection
Wentong Liao
Shuai Lin
Bodo Rosenhahn
M. Yang
92
63
0
16 Nov 2017
Priming Neural Networks
Amir Rosenfeld
Mahdi Biparva
John K. Tsotsos
ObjD
69
11
0
16 Nov 2017