ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1505.01121
  4. Cited By
Ask Your Neurons: A Neural-based Approach to Answering Questions about
  Images

Ask Your Neurons: A Neural-based Approach to Answering Questions about Images

5 May 2015
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
ArXivPDFHTML

Papers citing "Ask Your Neurons: A Neural-based Approach to Answering Questions about Images"

50 / 76 papers shown
Title
LOVA3: Learning to Visual Question Answering, Asking and Assessment
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
77
8
0
21 Feb 2025
Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering
Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering
Zhicheng Zhao
Changfu Zhou
Yu Zhang
Chenglong Li
Xiaoliang Ma
Jin Tang
76
0
0
24 Nov 2024
LOIS: Looking Out of Instance Semantics for Visual Question Answering
LOIS: Looking Out of Instance Semantics for Visual Question Answering
Siyu Zhang
Ye Chen
Yaoru Sun
Fang Wang
Haibo Shi
Haoran Wang
23
4
0
26 Jul 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large
  Language Models
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
33
7
0
14 Jun 2023
AlignVE: Visual Entailment Recognition Based on Alignment Relations
AlignVE: Visual Entailment Recognition Based on Alignment Relations
Biwei Cao
Jiuxin Cao
Jie Gui
Jiayun Shen
Bo Liu
Lei He
Yuan Yan Tang
James T. Kwok
18
7
0
16 Nov 2022
From Pixels to Objects: Cubic Visual Attention for Visual Question
  Answering
From Pixels to Objects: Cubic Visual Attention for Visual Question Answering
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Heng Tao Shen
29
62
0
04 Jun 2022
Attention Mechanism based Cognition-level Scene Understanding
Attention Mechanism based Cognition-level Scene Understanding
Xuejiao Tang
Tai Le Quy
LRM
25
0
0
17 Apr 2022
Multimodal Integration of Human-Like Attention in Visual Question
  Answering
Multimodal Integration of Human-Like Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Philippe Muller
Dominike Thomas
Mihai Bâce
Andreas Bulling
33
16
0
27 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual
  Question Answering
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
29
19
0
27 Sep 2021
Longer Version for "Deep Context-Encoding Network for Retinal Image
  Captioning"
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Jia-Hong Huang
Ting-Wei Wu
Chao-Han Huck Yang
M. Worring
MedIm
15
28
0
30 May 2021
Learning to Respond with Your Favorite Stickers: A Framework of Unifying
  Multi-Modality and User Preference in Multi-Turn Dialog
Learning to Respond with Your Favorite Stickers: A Framework of Unifying Multi-Modality and User Preference in Multi-Turn Dialog
Shen Gao
Xiuying Chen
Li Liu
Dongyan Zhao
Rui Yan
19
14
0
05 Nov 2020
VMSMO: Learning to Generate Multimodal Summary for Video-based News
  Articles
VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles
Li Mingzhe
Xiuying Chen
Shen Gao
Zhangming Chan
Dongyan Zhao
Rui Yan
25
82
0
12 Oct 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual
  Question Answering
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
J. Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
17
125
0
16 Jun 2020
Towards Causal VQA: Revealing and Reducing Spurious Correlations by
  Invariant and Covariant Semantic Editing
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
21
155
0
16 Dec 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning
  Baselines
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
28
9
0
31 Oct 2019
Compact Trilinear Interaction for Visual Question Answering
Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do
Thanh-Toan Do
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
28
59
0
26 Sep 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs
Aesthetic Image Captioning From Weakly-Labelled Photographs
Koustav Ghosal
A. Rana
A. Smolic
19
25
0
29 Aug 2019
Why Does a Visual Question Have Different Answers?
Why Does a Visual Question Have Different Answers?
Nilavra Bhattacharya
Qing Li
Danna Gurari
23
65
0
12 Aug 2019
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
19
3
0
24 Jun 2019
Factor Graph Attention
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
A. Schwing
19
110
0
11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
19
69
0
11 Apr 2019
Constructing Hierarchical Q&A Datasets for Video Story Understanding
Constructing Hierarchical Q&A Datasets for Video Story Understanding
Y. Heo
Kyoung-Woon On
Seong-Ho Choi
Jaeseo Lim
Jinah Kim
Jeh-Kwang Ryu
Byung-Chull Bae
Byoung-Tak Zhang
23
5
0
01 Apr 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
L. V. D. van der Maaten
24
22
0
19 Jan 2019
Semi-interactive Attention Network for Answer Understanding in
  Reverse-QA
Semi-interactive Attention Network for Answer Understanding in Reverse-QA
Qing Yin
Guan Luo
Xiao-Dan Zhu
Q. Hu
Ou Wu
21
5
0
12 Jan 2019
Textually Enriched Neural Module Networks for Visual Question Answering
Textually Enriched Neural Module Networks for Visual Question Answering
Khyathi Raghavi Chandu
Mary Arpita Pyreddy
Matthieu Felix
N. Joshi
24
6
0
23 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees
Interpretable Visual Question Answering by Reasoning on Dependency Trees
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
25
55
0
06 Sep 2018
Multimodal Grounding for Language Processing
Multimodal Grounding for Language Processing
Lisa Beinborn
Teresa Botschen
Iryna Gurevych
14
32
0
17 Jun 2018
Learning Visual Knowledge Memory Networks for Visual Question Answering
Learning Visual Knowledge Memory Networks for Visual Question Answering
Zhou Su
Chen Zhu
Yinpeng Dong
Dongqi Cai
Yurong Chen
Jianguo Li
29
62
0
13 Jun 2018
Joint Image Captioning and Question Answering
Joint Image Captioning and Question Answering
Jialin Wu
Zeyuan Hu
Raymond J. Mooney
22
12
0
22 May 2018
Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal
  Data
Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data
Yi Yu
Suhua Tang
Kiyoharu Aizawa
Akiko Aizawa
4
100
0
08 May 2018
Unsupervised Textual Grounding: Linking Words to Image Concepts
Unsupervised Textual Grounding: Linking Words to Image Concepts
Raymond A. Yeh
Minh Do
A. Schwing
22
40
0
29 Mar 2018
Motion-Appearance Co-Memory Networks for Video Question Answering
Motion-Appearance Co-Memory Networks for Video Question Answering
J. Gao
Runzhou Ge
Kan Chen
Ram Nevatia
27
240
0
29 Mar 2018
A Survey on Deep Learning Methods for Robot Vision
A Survey on Deep Learning Methods for Robot Vision
Javier Ruiz-del-Solar
P. Loncomilla
Naiomi Soto
26
60
0
28 Mar 2018
Multimodal Explanations: Justifying Decisions and Pointing to the
  Evidence
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park
Lisa Anne Hendricks
Zeynep Akata
Anna Rohrbach
Bernt Schiele
Trevor Darrell
Marcus Rohrbach
35
418
0
15 Feb 2018
Tell-and-Answer: Towards Explainable Visual Question Answering using
  Attributes and Captions
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Qing Li
Jianlong Fu
D. Yu
Tao Mei
Jiebo Luo
FAtt
XAI
CoGe
46
60
0
27 Jan 2018
Active Learning for Visual Question Answering: An Empirical Study
Active Learning for Visual Question Answering: An Empirical Study
Xiaoyu Lin
Devi Parikh
36
31
0
06 Nov 2017
FiLM: Visual Reasoning with a General Conditioning Layer
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
70
2,144
0
22 Sep 2017
Visual Question Generation as Dual Task of Visual Question Answering
Visual Question Generation as Dual Task of Visual Question Answering
Yikang Li
Nan Duan
Bolei Zhou
Xiao Chu
Wanli Ouyang
Xiaogang Wang
29
165
0
21 Sep 2017
Tips and Tricks for Visual Question Answering: Learnings from the 2017
  Challenge
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Damien Teney
Peter Anderson
Xiaodong He
A. Hengel
45
380
0
09 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Erik Cambria
33
2,822
0
09 Aug 2017
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for
  Visual Question Answering
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu
Jun-chen Yu
Jianping Fan
Dacheng Tao
41
663
0
04 Aug 2017
Best of Both Worlds: Transferring Knowledge from Discriminative Learning
  to a Generative Visual Dialog Model
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
Jiasen Lu
A. Kannan
Jianwei Yang
Devi Parikh
Dhruv Batra
BDL
15
136
0
05 Jun 2017
Multimodal Machine Learning: A Survey and Taxonomy
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
13
2,856
0
26 May 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
19
545
0
14 Apr 2017
Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR)
  Approach to Understanding Deep Neural Networks
Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks
Devinder Kumar
Alexander Wong
Graham W. Taylor
21
59
0
13 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core
  tasks, applications and evaluation
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
21
809
0
29 Mar 2017
An Analysis of Visual Question Answering Algorithms
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
19
230
0
28 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe-nan Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
36
234
0
23 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement
  Learning
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
Abhishek Das
Satwik Kottur
J. M. F. Moura
Stefan Lee
Dhruv Batra
OffRL
31
423
0
20 Mar 2017
Task-driven Visual Saliency and Attention-based Visual Question
  Answering
Task-driven Visual Saliency and Attention-based Visual Question Answering
Yuetan Lin
Zhangyang Pang
Donghui Wang
Yueting Zhuang
27
26
0
22 Feb 2017
12
Next