ResearchTrend.AI
Visual Question Answering: A Survey of Methods and Datasets


20 July 2016
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel

Papers citing "Visual Question Answering: A Survey of Methods and Datasets"

50 / 67 papers shown
Title
Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video Captioning
Caihua Liu
Xu Li
Wenjing Xue
Wei Tang
Xia Feng
56
0
0
20 Feb 2025
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering
Qian Tao
Xiaoyang Fan
Yong Xu
Xingquan Zhu
Yufei Tang
52
0
0
22 Jan 2025
A Survey of Language-Based Communication in Robotics
William Hunt
Sarvapali D. Ramchurn
Mohammad D. Soorati
LM&Ro
67
12
0
06 Jun 2024
IntentTuner: An Interactive Framework for Integrating Human Intents in Fine-tuning Text-to-Image Generative Models
Xingchen Zeng
Ziyao Gao
Yilin Ye
Wei Zeng
22
12
0
28 Jan 2024
Multimodality of AI for Education: Towards Artificial General Intelligence
Gyeong-Geon Lee
Lehong Shi
Ehsan Latif
Yizhu Gao
Arne Bewersdorff
...
Zheng Liu
Hui Wang
Gengchen Mai
Tianming Liu
Xiaoming Zhai
35
38
0
10 Dec 2023
Learning Differentiable Logic Programs for Abstract Visual Reasoning
Hikaru Shindo
Viktor Pfanschilling
Devendra Singh Dhami
Kristian Kersting
NAI
34
6
0
03 Jul 2023
A Unified Framework for Slot based Response Generation in a Multimodal Dialogue System
Mauajama Firdaus
Avinash Madasu
Asif Ekbal
47
7
0
27 May 2023
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning
Xinyue Hu
Lin Gu
Kazuma Kobayashi
Qi A. An
Qingyu Chen
Zhiyong Lu
Chang Su
Tatsuya Harada
Yingying Zhu
GNN
34
9
0
19 Feb 2023
On The Coherence of Quantitative Evaluation of Visual Explanations
Benjamin Vandersmissen
José Oramas
XAI
FAtt
36
3
0
14 Feb 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
15
1
0
28 Jan 2023
Neuro-Symbolic Spatio-Temporal Reasoning
Pascal Hitzler
Michael Sioutis
Md Kamruzzaman Sarker
Marjan Alirezaie
Aaron Eberhart
Stefan Wermter
NAI
28
0
0
28 Nov 2022
MapQA: A Dataset for Question Answering on Choropleth Maps
Shuaichen Chang
David Palzer
Jialin Li
Eric Fosler-Lussier
N. Xiao
19
40
0
15 Nov 2022
Watching the News: Towards VideoQA Models that can Read
Soumya Jahagirdar
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
32
18
0
10 Nov 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
39
21
0
21 Sep 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
EBMs vs. CL: Exploring Self-Supervised Visual Pretraining for Visual Question Answering
Violetta Shevchenko
Ehsan Abbasnejad
A. Dick
Anton Van Den Hengel
Damien Teney
49
0
0
29 Jun 2022
What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning
Jae Hee Lee
Matthias Kerzel
Kyra Ahrens
C. Weber
S. Wermter
40
9
0
05 May 2022
Attention Mechanism based Cognition-level Scene Understanding
Xuejiao Tang
Tai Le Quy
LRM
32
0
0
17 Apr 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
Recognition-free Question Answering on Handwritten Document Collections
Oliver Tüselmann
Friedrich Müller
Fabian Wolf
G. Fink
RALM
21
4
0
12 Feb 2022
Can Open Domain Question Answering Systems Answer Visual Knowledge Questions?
Jiawen Zhang
Abhijit Mishra
Avinesh P.V.S
Siddharth Patwardhan
Sachin Agarwal
24
0
0
09 Feb 2022
Grounding Answers for Visual Questions Asked by Visually Impaired People
Chongyan Chen
Samreen Anjum
Danna Gurari
27
50
0
04 Feb 2022
SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering
Peixi Xiong
Quanzeng You
Pei Yu
Zicheng Liu
Ying Wu
24
5
0
25 Jan 2022
Change Detection Meets Visual Question Answering
Zhenghang Yuan
Lichao Mou
Zhitong Xiong
Xiaoxiang Zhu
21
43
0
12 Dec 2021
Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices
Hariom A. Pandya
Brijesh S. Bhatt
42
27
0
07 Dec 2021
Visual Question Answering based on Formal Logic
Muralikrishnna G. Sethuraman
Ali Payani
Faramarz Fekri
J. C. Kerce
NAI
21
3
0
08 Nov 2021
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering
K. Gouthaman
Anurag Mittal
CML
45
0
0
28 Aug 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
20
55
0
24 May 2021
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
29
140
0
17 May 2021
RotLSTM: Rotating Memories in Recurrent Neural Networks
Vlad Velici
Adam Prugel-Bennett
RALM
VLM
22
1
0
01 May 2021
Biomedical Question Answering: A Survey of Approaches and Challenges
Qiao Jin
Zheng Yuan
Guangzhi Xiong
Qian Yu
Huaiyuan Ying
Chuanqi Tan
Mosha Chen
Songfang Huang
Xiaozhong Liu
Sheng Yu
29
96
0
10 Feb 2021
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach
Yebin Liu
Yangyang Guo
Jianhua Yin
Xuemeng Song
Weifeng Liu
Liqiang Nie
29
28
0
03 Feb 2021
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
118
31
0
16 Oct 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
50
93
0
19 Jul 2020
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law
Damien Teney
Kushal Kafle
Robik Shrestha
Ehsan Abbasnejad
Christopher Kanan
Anton Van Den Hengel
OODD
OOD
33
145
0
19 May 2020
Visual Relationship Detection using Scene Graphs: A Survey
Aniket Agarwal
Ayush Mangal
Vipul
GNN
25
20
0
16 May 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
31
91
0
24 Feb 2020
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
T. Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
29
12
0
26 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
32
155
0
16 Dec 2019
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
Cheng Zhang
Wei-Lun Chao
D. Xuan
23
50
0
28 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
25
133
0
22 Jul 2019
Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Zhu Zhang
Zhou Zhao
Zhijie Lin
Jingkuan Song
Xiaofei He
BDL
27
14
0
28 Jun 2019
Integrating Knowledge and Reasoning in Image Understanding
Somak Aditya
Yezhou Yang
Chitta Baral
OCL
39
40
0
24 Jun 2019
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
24
3
0
24 Jun 2019
"My Way of Telling a Story": Persona based Grounded Story Generation
Shrimai Prabhumoye
Khyathi Raghavi Chandu
Ruslan Salakhutdinov
A. Black
27
35
0
14 Jun 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
96
1,413
0
24 May 2019
Show, Price and Negotiate: A Negotiator with Online Value Look-Ahead
Amin Parvaneh
Ehsan Abbasnejad
Qi Wu
Javen Qinfeng Shi
Anton van den Hengel
OffRL
29
5
0
07 May 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
25
82
0
01 Mar 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
53
322
0
20 Jan 2019
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018