ResearchTrend.AI

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"

50 / 2,037 papers shown
Title
Visual Grounding Methods for VQA are Working for the Wrong Reasons!
Robik Shrestha
Kushal Kafle
Christopher Kanan
CML
66
35
0
12 Apr 2020
An Entropy Clustering Approach for Assessing Visual Question Difficulty
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
Shun'ichi Satoh
OOD, AAML
67
1
0
12 Apr 2020
Rephrasing visual questions by specifying the entropy of the answer distribution
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
S. Satoh
OOD
49
2
0
10 Apr 2020
Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns
Donghyun Kim
Derry Wijaya
Kate Saenko
Bryan A. Plummer
50
35
0
09 Apr 2020
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing
Goonmeet Bajaj
Bortik Bandyopadhyay
Daniela Schmidt
Pranav Maneriker
Christopher Myers
Srinivasan Parthasarathy
39
2
0
08 Apr 2020
SHOP-VRB: A Visual Reasoning Benchmark for Object Perception
Michal Nazarczuk
K. Mikolajczyk
72
21
0
06 Apr 2020
Generating Rationales in Visual Question Answering
Hammad A. Ayyubi
Md. Mehrab Tanjim
Julian McAuley
G. Cottrell
LRM
47
6
0
04 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
200
440
0
02 Apr 2020
DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Kyomin Jung
69
15
0
01 Apr 2020
Ontology-based Interpretable Machine Learning for Textual Data
Phung Lai
Nhathai Phan
Han Hu
Anuja Badeti
David Newman
Dejing Dou
33
8
0
01 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
97
113
0
31 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM, VGen
124
70
0
25 Mar 2020
Linguistically Driven Graph Capsule Network for Visual Question Reasoning
Qingxing Cao
Xiaodan Liang
Keze Wang
Liang Lin
GNN
52
3
0
23 Mar 2020
Visual Question Answering for Cultural Heritage
P. Bongini
Federico Becattini
Andrew D. Bagdanov
A. Bimbo
484
24
0
22 Mar 2020
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry
Diego Marcos
J. Murray
D. Tuia
129
223
0
16 Mar 2020
Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
L. Arras
Ahmed Osman
Wojciech Samek
XAI, AAML
97
157
0
16 Mar 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD, AAML
224
294
0
14 Mar 2020
Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog
Shen Gao
Preslav Nakov
Chang Liu
Li Liu
Dongyan Zhao
Rui Yan
90
34
0
10 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
79
127
0
09 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
80
246
0
07 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
128
24
0
05 Mar 2020
A Study on Multimodal and Interactive Explanations for Visual Question Answering
Kamran Alipour
J. Schulze
Yi Yao
Avi Ziskind
Giedrius Burachas
64
27
0
01 Mar 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL, ObjD, CML
86
252
0
27 Feb 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
77
78
0
27 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
101
97
0
24 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
78
75
0
19 Feb 2020
Sparse and Structured Visual Attention
Pedro Henrique Martins
S. Becker
Zita Marinho
Michael Arens
81
8
0
13 Feb 2020
Component Analysis for Visual Question Answering Architectures
Camila Kolling
Jonatas Wehrmann
Rodrigo C. Barros
CoGe
41
2
0
12 Feb 2020
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
176
223
0
10 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
143
37
0
01 Feb 2020
Uncertainty based Class Activation Maps for Visual Question Answering
Badri N. Patro
Mayank Lunayach
Vinay P. Namboodiri
FAtt, UQCV
44
1
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD, AAML
73
19
0
23 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
93
60
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
81
18
0
20 Jan 2020
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions
Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Tulio Ribeiro
Besmira Nushi
Ece Kamar
LRM
57
14
0
20 Jan 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
Li Wang
Zechen Bai
Yonghua Zhang
Hongtao Lu
77
67
0
15 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD, ObjD
100
320
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
54
22
0
10 Jan 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Peng Gao
Sen Su
88
11
0
03 Jan 2020
All-in-One Image-Grounded Conversational Agents
Da Ju
Kurt Shuster
Y-Lan Boureau
Jason Weston
LLMAG
87
8
0
28 Dec 2019
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
Theodore Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
73
12
0
26 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
31
4
0
19 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
36
4
0
19 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML, AAML
95
159
0
16 Dec 2019
Knowledge-based Conversational Search
Svitlana Vakulenko
64
13
0
14 Dec 2019
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
60
15
0
06 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM, ObjD
150
481
0
05 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL, OOD, UQCV
96
14
0
02 Dec 2019
Exposing and Correcting the Gender Bias in Image Captioning Datasets and Models
Shruti Bhargava
David A. Forsyth
FaML
77
50
0
02 Dec 2019
A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop
Jihyeon Janel Lee
S. Arora
29
1
0
30 Nov 2019