1603.02814
Cited By
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
9 March 2016
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
Papers citing
"Image Captioning and Visual Question Answering Based on Attributes and External Knowledge"
34 / 34 papers shown
Title
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Wenjie Qu
Chuan Qin
Jing Chen
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
54
0
0
23 Apr 2025
The curse of language biases in remote sensing VQA: the role of spatial attributes, language diversity, and the need for clear evaluation
Christel Chappuis
Eliot Walt
Vincent Mendez
Sylvain Lobry
B. L. Saux
D. Tuia
31
3
0
28 Nov 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
David M. Chan
Shalini Ghosh
Ariya Rastrow
Björn Hoffmeister
OffRL
18
6
0
06 Jan 2023
What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes
Shivam Sharma
Siddhant Agarwal
Tharun Suresh
Preslav Nakov
Md. Shad Akhtar
Tanmoy Chakraborty
VLM
28
18
0
01 Dec 2022
A survey on the development status and application prospects of knowledge graph in smart grids
Jian Wang
Xi Wang
Chaoqun Ma
Lei Kou
33
74
0
02 Nov 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
30
26
0
17 Oct 2022
Generating image captions with external encyclopedic knowledge
S. Nikiforova
Tejaswini Deoskar
Denis Paperno
Yoad Winter
30
1
0
10 Oct 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
22
3
0
16 Jun 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
136
0
26 Mar 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
26
15
0
11 Feb 2022
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
30
20
0
16 Sep 2021
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection
Steven Lang
Fabrizio G. Ventola
Kristian Kersting
34
14
0
13 Sep 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Zechen Bai
Yuta Nakashima
Noa Garcia
68
43
0
13 Sep 2021
Zero-Shot Scene Graph Relation Prediction through Commonsense Knowledge Integration
Xuan Kan
Hejie Cui
Carl Yang
76
40
0
11 Jul 2021
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
29
39
0
29 Dec 2020
Dual ResGCN for Balanced Scene Graph Generation
Jingyi Zhang
Yong Zhang
Baoyuan Wu
Yanbo Fan
Fumin Shen
Heng Tao Shen
28
12
0
09 Nov 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
118
31
0
16 Oct 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
33
56
0
19 Jul 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
27
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
25
16
0
15 Feb 2020
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
T. Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
29
12
0
26 Dec 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
23
70
0
17 Nov 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
20
132
0
22 Jul 2019
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu
Yupeng Gao
Xiaoxiao Guo
Ziad Al-Halah
Steven J. Rennie
Kristen Grauman
Rogerio Feris
EgoV
25
63
0
30 May 2019
Object Detection in 20 Years: A Survey
Zhengxia Zou
Keyan Chen
Zhenwei Shi
Yuhong Guo
Jieping Ye
VLM
ObjD
AI4TS
32
2,285
0
13 May 2019
3G structure for image caption generation
Aihong Yuan
Xuelong Li
Xiaoqiang Lu
13
34
0
21 Apr 2019
Pedestrian Attribute Recognition: A Survey
Tianlin Li
Shaofei Zheng
Rui Yang
Aihua Zheng
Zhe Chen
Jin Tang
Bin Luo
CVBM
28
127
0
22 Jan 2019
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
28
20
0
12 Nov 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
45
760
0
06 Oct 2018
Graph R-CNN for Scene Graph Generation
Jianwei Yang
Jiasen Lu
Stefan Lee
Dhruv Batra
Devi Parikh
GNN
42
836
0
01 Aug 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
25
79
0
24 May 2018
Defoiling Foiled Image Captions
Pranava Madhyastha
Josiah Wang
Lucia Specia
24
9
0
16 May 2018
AI Challenger: A Large-scale Dataset for Going Deeper in Image Understanding
Jiahong Wu
He Zheng
Bo Zhao
Yixin Li
Baoming Yan
...
Shipei Zhou
G. Lin
Yanwei Fu
Yizhou Wang
Yonggang Wang
VLM
38
149
0
17 Nov 2017
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
230
0
10 Oct 2016