ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1606.07356
  4. Cited By
Analyzing the Behavior of Visual Question Answering Models

Analyzing the Behavior of Visual Question Answering Models

23 June 2016
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
ArXivPDFHTML

Papers citing "Analyzing the Behavior of Visual Question Answering Models"

50 / 79 papers shown
Title
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Zhiyu Tan
Hao Li
Jingjing Chen
MLLM
61
18
0
25 Sep 2024
Causal Understanding For Video Question Answering
Causal Understanding For Video Question Answering
Bhanu Prakash Reddy Guda
Tanmay Kulkarni
Adithya Sampath
Swarnashree Mysore Sathyendra
CML
54
0
0
23 Jul 2024
VideoDistill: Language-aware Vision Distillation for Video Question
  Answering
VideoDistill: Language-aware Vision Distillation for Video Question Answering
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
VGen
50
1
0
01 Apr 2024
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang
Wufei Ma
Zhuowan Li
Adam Kortylewski
Alan L. Yuille
CoGe
27
12
0
27 Oct 2023
GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models'
  Over-Reliance on Superficial Clue
GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue
Yanrui Du
Sendong Zhao
Yuhan Chen
Rai Bai
Jing Liu
Huaqin Wu
Haifeng Wang
Bing Qin
45
2
0
08 Sep 2023
A Joint Study of Phrase Grounding and Task Performance in Vision and
  Language Models
A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models
Noriyuki Kojima
Hadar Averbuch-Elor
Yoav Artzi
26
2
0
06 Sep 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing
  Modality Fusion
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
21
2
0
15 Jun 2023
Are Diffusion Models Vision-And-Language Reasoners?
Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer
Elinor Poole-Dayan
Vikram S. Voleti
Christopher Pal
Siva Reddy
45
13
0
25 May 2023
Pento-DIARef: A Diagnostic Dataset for Learning the Incremental
  Algorithm for Referring Expression Generation from Examples
Pento-DIARef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples
P. Sadler
David Schlangen
23
2
0
24 May 2023
Combo of Thinking and Observing for Outside-Knowledge VQA
Combo of Thinking and Observing for Outside-Knowledge VQA
Q. Si
Yuchen Mo
Zheng Lin
Huishan Ji
Weiping Wang
46
13
0
10 May 2023
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution
  Generalization of VQA Models
BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models
Ali Borji
CoGe
10
1
0
28 Jan 2023
Feature-Level Debiased Natural Language Understanding
Feature-Level Debiased Natural Language Understanding
Yougang Lyu
Piji Li
Yechang Yang
Maarten de Rijke
Pengjie Ren
Yukun Zhao
Dawei Yin
Z. Ren
32
10
0
11 Dec 2022
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual
  Reasoning
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li
Xingrui Wang
Elias Stengel-Eskin
Adam Kortylewski
Wufei Ma
Benjamin Van Durme
Max Planck Institute for Informatics
OOD
LRM
26
58
0
01 Dec 2022
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual
  Question Answering
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
Q. Si
Yuanxin Liu
Zheng Lin
Peng Fu
Weiping Wang
VLM
42
1
0
26 Oct 2022
Towards Robust Visual Question Answering: Making the Most of Biased
  Samples via Contrastive Learning
Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning
Q. Si
Yuanxin Liu
Fandong Meng
Zheng Lin
Peng Fu
Yanan Cao
Weiping Wang
Jie Zhou
37
23
0
10 Oct 2022
Critical Learning Periods for Multisensory Integration in Deep Networks
Critical Learning Periods for Multisensory Integration in Deep Networks
Michael Kleinman
Alessandro Achille
Stefano Soatto
35
10
0
06 Oct 2022
ChemAlgebra: Algebraic Reasoning on Chemical Reactions
ChemAlgebra: Algebraic Reasoning on Chemical Reactions
Andrea Valenti
D. Bacciu
Antonio Vergari
OOD
LRM
35
0
0
05 Oct 2022
Revisiting the Importance of Amplifying Bias for Debiasing
Revisiting the Importance of Amplifying Bias for Debiasing
Jungsoo Lee
Jeonghoon Park
Daeyoung Kim
Juyoung Lee
Edward Choi
Jaegul Choo
47
21
0
29 May 2022
Learning to Retrieve Videos by Asking Questions
Learning to Retrieve Videos by Asking Questions
Avinash Madasu
Junier Oliva
Gedas Bertasius
VGen
32
16
0
11 May 2022
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
Leonard Salewski
A. Sophia Koepke
Hendrik P. A. Lensch
Zeynep Akata
LRM
NAI
30
20
0
05 Apr 2022
Delving Deeper into Cross-lingual Visual Question Answering
Delving Deeper into Cross-lingual Visual Question Answering
Chen Cecilia Liu
Jonas Pfeiffer
Anna Korhonen
Ivan Vulić
Iryna Gurevych
31
8
0
15 Feb 2022
Saving Dense Retriever from Shortcut Dependency in Conversational Search
Saving Dense Retriever from Shortcut Dependency in Conversational Search
Sungdong Kim
Gangwoo Kim
25
26
0
15 Feb 2022
On the Efficacy of Co-Attention Transformer Layers in Visual Question
  Answering
On the Efficacy of Co-Attention Transformer Layers in Visual Question Answering
Ankur Sikarwar
Gabriel Kreiman
ViT
21
1
0
11 Jan 2022
3D Question Answering
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
26
47
0
15 Dec 2021
Understanding Out-of-distribution: A Perspective of Data Dynamics
Understanding Out-of-distribution: A Perspective of Data Dynamics
Dyah Adila
Dongyeop Kang
38
12
0
29 Nov 2021
Diagnosing Errors in Video Relation Detectors
Diagnosing Errors in Video Relation Detectors
Shuo Chen
Pascal Mettes
Cees G. M. Snoek
36
5
0
25 Oct 2021
Counterfactual Samples Synthesizing and Training for Robust Visual
  Question Answering
Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering
Long Chen
Yuhang Zheng
Yulei Niu
Hanwang Zhang
Jun Xiao
AAML
OOD
21
36
0
03 Oct 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real
  Images
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
Zhuowan Li
Elias Stengel-Eskin
Yixiao Zhang
Cihang Xie
Q. Tran
Benjamin Van Durme
Alan Yuille
VLM
24
15
0
01 Oct 2021
Symbolic Brittleness in Sequence Models: on Systematic Generalization in
  Symbolic Mathematics
Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics
Sean Welleck
Peter West
Jize Cao
Yejin Choi
21
28
0
28 Sep 2021
Multimodal Integration of Human-Like Attention in Visual Question
  Answering
Multimodal Integration of Human-Like Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Philippe Muller
Dominike Thomas
Mihai Bâce
Andreas Bulling
41
16
0
27 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual
  Question Answering
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
40
19
0
27 Sep 2021
Raising context awareness in motion forecasting
Raising context awareness in motion forecasting
H. Ben-younes
Éloi Zablocki
Mickaël Chen
P. Pérez
Matthieu Cord
TTA
39
11
0
16 Sep 2021
Discovering the Unknown Knowns: Turning Implicit Knowledge in the
  Dataset into Explicit Training Examples for Visual Question Answering
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
Jihyung Kil
Cheng Zhang
D. Xuan
Wei-Lun Chao
61
20
0
13 Sep 2021
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Xingdi Yuan
34
3
0
31 Aug 2021
On the Significance of Question Encoder Sequence Model in the
  Out-of-Distribution Performance in Visual Question Answering
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering
K. Gouthaman
Anurag Mittal
CML
45
0
0
28 Aug 2021
Chest ImaGenome Dataset for Clinical Reasoning
Chest ImaGenome Dataset for Clinical Reasoning
Joy T. Wu
Nkechinyere N. Agu
Ismini Lourentzou
Arjun Sharma
J. Paguio
...
William Mitchell
Satyananda Kashyap
Andrea Giovannini
Leo Anthony Celi
Mehdi Moradi
21
64
0
31 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
32
159
0
15 Jul 2021
Learning Debiased Representation via Disentangled Feature Augmentation
Learning Debiased Representation via Disentangled Feature Augmentation
Jungsoo Lee
Eungyeup Kim
Juyoung Lee
Jihyeon Janel Lee
Jaegul Choo
CML
17
148
0
03 Jul 2021
Ethics Sheets for AI Tasks
Ethics Sheets for AI Tasks
Saif M. Mohammad
17
31
0
02 Jul 2021
Towards Robust Classification Model by Counterfactual and Invariant Data
  Generation
Towards Robust Classification Model by Counterfactual and Invariant Data Generation
C. Chang
George Adam
Anna Goldenberg
OOD
CML
27
31
0
02 Jun 2021
A First Look: Towards Explainable TextVQA Models via Visual and Textual
  Explanations
A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations
Varun Nagaraj Rao
Xingjian Zhen
K. Hovsepian
Mingwei Shen
29
18
0
29 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language
  Tasks
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
26
19
0
16 Apr 2021
VisQA: X-raying Vision and Language Reasoning in Transformers
VisQA: X-raying Vision and Language Reasoning in Transformers
Theo Jaunet
Corentin Kervadec
Romain Vuillemot
G. Antipov
M. Baccouche
Christian Wolf
19
26
0
02 Apr 2021
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical
  Visual Question Answering
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering
Bo Liu
Li-Ming Zhan
Li Xu
Lin Ma
Y. Yang
Xiao-Ming Wu
31
236
0
18 Feb 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded
  Dialogue
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
31
18
0
01 Jan 2021
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
  Functional Entropies
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
A. Schwing
Tamir Hazan
55
90
0
21 Oct 2020
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
21
74
0
01 Sep 2020
Visual Question Answering on Image Sets
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
16
40
0
27 Aug 2020
Not all Failure Modes are Created Equal: Training Deep Neural Networks
  for Explicable (Mis)Classification
Not all Failure Modes are Created Equal: Training Deep Neural Networks for Explicable (Mis)Classification
Alberto Olmo
Sailik Sengupta
S. Kambhampati
25
6
0
26 Jun 2020
Estimating semantic structure for the VQA answer space
Estimating semantic structure for the VQA answer space
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
26
4
0
10 Jun 2020
12
Next