ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1603.02814
  4. Cited By
Image Captioning and Visual Question Answering Based on Attributes and
  External Knowledge

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

9 March 2016
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
ArXivPDFHTML

Papers citing "Image Captioning and Visual Question Answering Based on Attributes and External Knowledge"

30 / 30 papers shown
Title
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Yuhang Zhang
Chuan Qin
Jing Chen
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
54
0
0
23 Apr 2025
The curse of language biases in remote sensing VQA: the role of spatial
  attributes, language diversity, and the need for clear evaluation
The curse of language biases in remote sensing VQA: the role of spatial attributes, language diversity, and the need for clear evaluation
Christel Chappuis
Eliot Walt
Vincent Mendez
Sylvain Lobry
B. L. Saux
D. Tuia
28
3
0
28 Nov 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual
  End-To-End Automated Speech Recognition
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
David M. Chan
Shalini Ghosh
Ariya Rastrow
Björn Hoffmeister
OffRL
18
6
0
06 Jan 2023
What do you MEME? Generating Explanations for Visual Semantic Role
  Labelling in Memes
What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes
Shivam Sharma
Siddhant Agarwal
Tharun Suresh
Preslav Nakov
Md. Shad Akhtar
Tanmoy Charkraborty
VLM
25
18
0
01 Dec 2022
A survey on the development status and application prospects of
  knowledge graph in smart grids
A survey on the development status and application prospects of knowledge graph in smart grids
Jian Wang
Xi Wang
Chaoqun Ma
Lei Kou
33
74
0
02 Nov 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
30
26
0
17 Oct 2022
Generating image captions with external encyclopedic knowledge
Generating image captions with external encyclopedic knowledge
S. Nikiforova
Tejaswini Deoskar
Denis Paperno
Yoad Winter
30
1
0
10 Oct 2022
Image Captioning based on Feature Refinement and Reflective Decoding
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
16
3
0
16 Jun 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
136
0
26 Mar 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient
  Image Captioning
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
24
15
0
11 Feb 2022
Knowledge-based Embodied Question Answering
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
30
20
0
16 Sep 2021
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection
Steven Lang
Fabrizio G. Ventola
Kristian Kersting
34
14
0
13 Sep 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description
  Generation
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Zechen Bai
Yuta Nakashima
Noa Garcia
68
43
0
13 Sep 2021
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
29
39
0
29 Dec 2020
Dual ResGCN for Balanced Scene GraphGeneration
Dual ResGCN for Balanced Scene GraphGeneration
Jingyi Zhang
Yong Zhang
Baoyuan Wu
Yanbo Fan
Fumin Shen
Heng Tao Shen
28
12
0
09 Nov 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei Chen
Weiping Wang
Li Liu
M. Lew
VLM
115
31
0
16 Oct 2020
Length-Controllable Image Captioning
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
30
56
0
19 Jul 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic
  Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO
  Framework
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
27
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image
  Captioning With R-CNN Feature Distribution Composition (FDC)
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
25
16
0
15 Feb 2020
A Review on Intelligent Object Perception Methods Combining
  Knowledge-based Reasoning and Machine Learning
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
T. Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
29
12
0
26 Dec 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in
  Visual Dialogue
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
23
70
0
17 Nov 2019
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language
  Feedback
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu
Yupeng Gao
Xiaoxiao Guo
Ziad Al-Halah
Steven J. Rennie
Kristen Grauman
Rogerio Feris
EgoV
20
63
0
30 May 2019
Object Detection in 20 Years: A Survey
Object Detection in 20 Years: A Survey
Zhengxia Zou
Keyan Chen
Zhenwei Shi
Yuhong Guo
Jieping Ye
VLM
ObjD
AI4TS
32
2,285
0
13 May 2019
Pedestrian Attribute Recognition: A Survey
Pedestrian Attribute Recognition: A Survey
Tianlin Li
Shaofei Zheng
Rui Yang
Aihua Zheng
Zhe Chen
Jin Tang
B. Luo
CVBM
28
127
0
22 Jan 2019
A Comprehensive Survey of Deep Learning for Image Captioning
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
45
760
0
06 Oct 2018
Graph R-CNN for Scene Graph Generation
Graph R-CNN for Scene Graph Generation
Jianwei Yang
Jiasen Lu
Stefan Lee
Dhruv Batra
Devi Parikh
GNN
33
836
0
01 Aug 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual
  Question Answering
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
19
79
0
24 May 2018
Defoiling Foiled Image Captions
Defoiling Foiled Image Captions
Pranava Madhyastha
Josiah Wang
Lucia Specia
22
9
0
16 May 2018
AI Challenger : A Large-scale Dataset for Going Deeper in Image
  Understanding
AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding
Jiahong Wu
He Zheng
Bo-Lu Zhao
Yixin Li
Baoming Yan
...
Shipei Zhou
G. Lin
Yanwei Fu
Yizhou Wang
Yonggang Wang
VLM
38
149
0
17 Nov 2017
End-to-end Concept Word Detection for Video Captioning, Retrieval, and
  Question Answering
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
229
0
10 Oct 2016
1