Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1703.09684
Cited By
An Analysis of Visual Question Answering Algorithms
28 March 2017
Kushal Kafle
Christopher Kanan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"An Analysis of Visual Question Answering Algorithms"
48 / 48 papers shown
Title
Conditional Temporal Neural Processes with Covariance Loss
Boseon Yoo
Jiwoo Lee
Janghoon Ju
Seijun Chung
Soyeon Kim
Jaesik Choi
70
15
0
01 Apr 2025
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA
S M Sarwar
77
1
0
25 Feb 2025
MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-Text Decoding
Weikang Qiu
Zheng Huang
Haoyu Hu
Aosong Feng
Yujun Yan
Rex Ying
47
0
0
18 Feb 2025
The Quest for Visual Understanding: A Journey Through the Evolution of Visual Question Answering
Anupam Pandey
Deepjyoti Bodo
Arpan Phukan
Asif Ekbal
36
0
0
13 Jan 2025
A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future
Shilin Sun
Wenbin An
Feng Tian
Fang Nan
Qidong Liu
J. Liu
N. Shah
Ping Chen
96
2
0
18 Dec 2024
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
Siyuan Wang
Dianyi Wang
Chengxing Zhou
Zejun Li
Zhihao Fan
Xuanjing Huang
Zhongyu Wei
VLM
179
0
0
17 Dec 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
29
1
0
17 Nov 2024
SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Sunil Aryal
Imran Razzak
Hakim Hacid
26
0
0
30 Oct 2024
Can MLLMs Understand the Deep Implication Behind Chinese Images?
Chenhao Zhang
Xi Feng
Yuelin Bai
Xinrun Du
Jinchang Hou
...
Min Yang
Wenhao Huang
Chenghua Lin
Ge Zhang
Shiwen Ni
ELM
VLM
38
3
0
17 Oct 2024
MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling
Jian Yang
Dacheng Yin
Yizhou Zhou
Fengyun Rao
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
DiffM
28
2
0
14 Oct 2024
ING-VP: MLLMs cannot Play Easy Vision-based Games Yet
Haoran Zhang
Hangyu Guo
Shuyue Guo
Meng Cao
Wenhao Huang
Jiaheng Liu
Ge Zhang
VLM
MLLM
LRM
35
2
0
09 Oct 2024
QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems
Zhixian He
Pengcheng Zhao
Fuwei Zhang
Shujin Lin
41
0
0
14 Sep 2024
Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions
Junzhang Liu
Zhecan Wang
Hammad A. Ayyubi
Haoxuan You
Chris Thomas
Rui Sun
Shih-Fu Chang
Kai-Wei Chang
39
0
0
18 May 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
95
139
0
29 Apr 2024
Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models
Jesse Atuhurra
Iqra Ali
Tatsuya Hiraoka
Hidetaka Kamigaito
Tomoya Iwakura
Taro Watanabe
44
1
0
29 Mar 2024
COCO is "ALL'' You Need for Visual Instruction Fine-tuning
Xiaotian Han
Yiqi Wang
Bohan Zhai
Quanzeng You
Hongxia Yang
VLM
MLLM
33
2
0
17 Jan 2024
Dolphins: Multimodal Language Model for Driving
Yingzi Ma
Yulong Cao
Jiachen Sun
Marco Pavone
Chaowei Xiao
MLLM
30
50
0
01 Dec 2023
Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning
Mustafa Shukor
Alexandre Ramé
Corentin Dancette
Matthieu Cord
LRM
MLLM
38
20
0
01 Oct 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
21
1
0
15 Jun 2023
An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics
Saba Ahmadi
Aishwarya Agrawal
27
6
0
24 May 2023
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
Zhiyang Xu
Ying Shen
Lifu Huang
MLLM
32
110
0
21 Dec 2022
Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning
Q. Si
Yuanxin Liu
Fandong Meng
Zheng Lin
Peng Fu
Yanan Cao
Weiping Wang
Jie Zhou
34
23
0
10 Oct 2022
Interactive Question Answering Systems: Literature Review
Giovanni Maria Biancofiore
Yashar Deldjoo
T. D. Noia
E. Sciascio
F. Narducci
34
13
0
04 Sep 2022
All You May Need for VQA are Image Captions
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
32
70
0
04 May 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
Grounding Answers for Visual Questions Asked by Visually Impaired People
Chongyan Chen
Samreen Anjum
Danna Gurari
23
50
0
04 Feb 2022
Multimodal Integration of Human-Like Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Philippe Muller
Dominike Thomas
Mihai Bâce
Andreas Bulling
35
16
0
27 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
40
19
0
27 Sep 2021
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
Corentin Dancette
Rémi Cadène
Damien Teney
Matthieu Cord
CML
28
75
0
07 Apr 2021
WeaQA: Weak Supervision via Captions for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
19
35
0
04 Dec 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
23
17
0
20 Jan 2020
REMIND Your Neural Network to Prevent Catastrophic Forgetting
Tyler L. Hayes
Kushal Kafle
Robik Shrestha
Manoj Acharya
Christopher Kanan
CLL
31
294
0
06 Oct 2019
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
33
59
0
26 Sep 2019
Why Does a Visual Question Have Different Answers?
Nilavra Bhattacharya
Qing Li
Danna Gurari
23
65
0
12 Aug 2019
Question-Agnostic Attention for Visual Question Answering
M. Farazi
Salman H Khan
Nick Barnes
13
10
0
09 Aug 2019
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
Cheng Zhang
Wei-Lun Chao
D. Xuan
23
50
0
28 Jul 2019
RUBi: Reducing Unimodal Biases in Visual Question Answering
Rémi Cadène
Corentin Dancette
H. Ben-younes
Matthieu Cord
Devi Parikh
CML
19
368
0
24 Jun 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
17
82
0
01 Mar 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
271
0
25 Feb 2019
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
20
235
0
08 Oct 2018
Did the Model Understand the Question?
Pramod Kaushik Mudrakarta
Ankur Taly
Mukund Sundararajan
Kedar Dhamdhere
ELM
OOD
FAtt
27
196
0
14 May 2018
DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle
Brian L. Price
Scott D. Cohen
Christopher Kanan
AIMat
33
363
0
24 Jan 2018
Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
P. Li
Jiangtao Xie
Qilong Wang
Zilin Gao
25
255
0
04 Dec 2017
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
Aniruddha Kembhavi
OOD
51
581
0
01 Dec 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
22
37
0
24 Apr 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
101
3,120
0
02 Dec 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
152
1,465
0
06 Jun 2016
1