VQA: Visual Question Answering

3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
    CoGe

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown
Probing Contextual Language Models for Common Ground with Visual Representations
Gabriel Ilharco
Rowan Zellers
Ali Farhadi
Hannaneh Hajishirzi
118
14
0
01 May 2020
HLVU : A New Challenge to Test Deep Understanding of Movies the Way Humans do
Keith Curtis
G. Awad
Shahzad Rajput
I. Soboroff
16
32
0
01 May 2020
Visuo-Linguistic Question Answering (VLQA) Challenge
Shailaja Keyur Sampat
Yezhou Yang
Chitta Baral
CoGe
28
1
0
01 May 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML XAI
120
382
0
30 Apr 2020
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
107
234
0
29 Apr 2020
Pragmatic Issue-Sensitive Image Captioning
Allen Nie
Reuben Cohn-Gordon
Christopher Potts
53
24
0
29 Apr 2020
Cross-modal Speaker Verification and Recognition: A Multilingual Perspective
M. S. Saeed
Shah Nawaz
Pietro Morerio
Arif Mahmood
I. Gallo
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
84
27
0
28 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Guosheng Lin
116
104
0
28 Apr 2020
MCQA: Multimodal Co-attention Based Network for Question Answering
Abhishek Kumar
Trisha Mittal
Tianyi Zhou
40
14
0
25 Apr 2020
Deep Multimodal Neural Architecture Search
Zhou Yu
Yuhao Cui
Jun-chen Yu
Meng Wang
Dacheng Tao
Qi Tian
70
100
0
25 Apr 2020
Explicit Domain Adaptation with Loosely Coupled Samples
Oliver Scheel
L. Schwarz
Nassir Navab
Federico Tombari
OOD
40
2
0
24 Apr 2020
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Duy-Kien Nguyen
Vedanuj Goswami
Xinlei Chen
71
23
0
24 Apr 2020
Debiasing Skin Lesion Datasets and Models? Not So Fast
Alceu Bissoto
Eduardo Valle
Sandra Avila
102
55
0
23 Apr 2020
Visual Question Answering Using Semantic Information from Image Descriptions
Tasmia Tasrin
Md Sultan al Nahian
Brent Harrison
28
0
0
23 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM LRM
75
6
0
22 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
126
361
0
21 Apr 2020
A Revised Generative Evaluation of Visual Dialogue
Daniela Massiceti
Viveka Kulharia
P. Dokania
N. Siddharth
Philip Torr
40
0
0
20 Apr 2020
Variational Inference for Learning Representations of Natural Language Edits
Edison Marrese-Taylor
Machel Reid
Y. Matsuo
BDL DRL KELM
101
8
0
20 Apr 2020
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
Yixin Zhu
Tao Gao
Lifeng Fan
Siyuan Huang
Mark Edmonds
...
Fangqiu Yi
Siyuan Qi
Ying Nian Wu
J. Tenenbaum
Song-Chun Zhu
112
130
0
20 Apr 2020
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD SSL CML
93
119
0
20 Apr 2020
Are we pretraining it right? Digging deeper into visio-linguistic pretraining
Amanpreet Singh
Vedanuj Goswami
Devi Parikh
VLM
78
48
0
19 Apr 2020
Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
Huy Manh Nguyen
Tomo Miyazaki
Yoshihiro Sugaya
S. Omachi
144
1
0
16 Apr 2020
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Gi-Cheon Kang
Junseok Park
Hwaran Lee
Byoung-Tak Zhang
Jin-Hwa Kim
VLM
62
10
0
14 Apr 2020
Visual Grounding Methods for VQA are Working for the Wrong Reasons!
Robik Shrestha
Kushal Kafle
Christopher Kanan
CML
66
35
0
12 Apr 2020
An Entropy Clustering Approach for Assessing Visual Question Difficulty
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
Shun'ichi Satoh
OOD AAML
60
1
0
12 Apr 2020
Rephrasing visual questions by specifying the entropy of the answer distribution
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
S. Satoh
OOD
44
2
0
10 Apr 2020
Multimodal Categorization of Crisis Events in Social Media
Mahdi Abavisani
Liwei Wu
Shengli Hu
Joel R. Tetreault
A. Jaimes
98
88
0
10 Apr 2020
SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks
Laetitia Teodorescu
Katja Hofmann
Pierre-Yves Oudeyer
58
1
0
09 Apr 2020
Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns
Donghyun Kim
Derry Wijaya
Kate Saenko
Bryan A. Plummer
50
35
0
09 Apr 2020
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing
Goonmeet Bajaj
Bortik Bandyopadhyay
Daniela Schmidt
Pranav Maneriker
Christopher Myers
Srinivasan Parthasarathy
35
2
0
08 Apr 2020
Query-controllable Video Summarization
Jia-Hong Huang
Marcel Worring
47
46
0
07 Apr 2020
Iterative Context-Aware Graph Inference for Visual Dialog
Dan Guo
Haibo Wang
Hanwang Zhang
Zhengjun Zha
Meng Wang
79
49
0
05 Apr 2020
Generating Rationales in Visual Question Answering
Hammad A. Ayyubi
Md. Mehrab Tanjim
Julian McAuley
G. Cottrell
LRM
47
6
0
04 Apr 2020
Open Domain Dialogue Generation with Latent Images
Ze Yang
Wei Wu
Huang Hu
Can Xu
Wei Wang
Zhoujun Li
76
30
0
04 Apr 2020
Benchmarking Machine Reading Comprehension: A Psychological Perspective
Saku Sugawara
Pontus Stenetorp
Akiko Aizawa
54
2
0
04 Apr 2020
Evaluating Multimodal Representations on Visual Semantic Textual Similarity
Oier López de Lacalle
Ander Salaberria
Aitor Soroa Etxabe
Gorka Azkune
Eneko Agirre
41
2
0
04 Apr 2020
Learning Representations For Images With Hierarchical Labels
Ankit Dhall
SSL
50
2
0
02 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
197
440
0
02 Apr 2020
Consistent Multiple Sequence Decoding
Bicheng Xu
Leonid Sigal
57
0
0
02 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
92
113
0
31 Mar 2020
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters
İlker Kesen
Ozan Arkan Can
Erkut Erdem
Aykut Erdem
Deniz Yuret
VLM
55
1
0
28 Mar 2020
P $\approx$ NP, at least in Visual Question Answering
Shailza Jolly
Sebastián M. Palacio
Joachim Folz
Federico Raue
Jörn Hees
Andreas Dengel
24
0
0
26 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM VGen
102
70
0
25 Mar 2020
Linguistically Driven Graph Capsule Network for Visual Question Reasoning
Qingxing Cao
Xiaodan Liang
Keze Wang
Liang Lin
GNN
47
3
0
23 Mar 2020
Visual Question Answering for Cultural Heritage
P. Bongini
Federico Becattini
Andrew D. Bagdanov
A. Bimbo
479
24
0
22 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
201
192
0
19 Mar 2020
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry
Diego Marcos
J. Murray
D. Tuia
126
223
0
16 Mar 2020
Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
L. Arras
Ahmed Osman
Wojciech Samek
XAI AAML
97
157
0
16 Mar 2020
Vision-Dialog Navigation by Exploring Cross-modal Memory
Yi Zhu
Fengda Zhu
Zhaohuan Zhan
Bingqian Lin
Jianbin Jiao
Xiaojun Chang
Xiaodan Liang
VLM
91
49
0
15 Mar 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD AAML
219
294
0
14 Mar 2020