ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1506.01144
  4. Cited By
What value do explicit high level concepts have in vision to language
  problems?

What value do explicit high level concepts have in vision to language problems?

3 June 2015
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
ArXivPDFHTML

Papers citing "What value do explicit high level concepts have in vision to language problems?"

50 / 56 papers shown
Title
Reminding Multimodal Large Language Models of Object-aware Knowledge
  with Retrieved Tags
Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags
Daiqing Qi
Handong Zhao
Zijun Wei
Sheng Li
46
2
0
16 Jun 2024
Stacked Cross-modal Feature Consolidation Attention Networks for Image
  Captioning
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
An Image captioning algorithm based on the Hybrid Deep Learning
  Technique (CNN+GRU)
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
26
10
0
06 Jan 2023
How to Describe Images in a More Funny Way? Towards a Modular Approach
  to Cross-Modal Sarcasm Generation
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
29
1
0
20 Nov 2022
FACT: Learning Governing Abstractions Behind Integer Sequences
FACT: Learning Governing Abstractions Behind Integer Sequences
Peter Belcak
Ard Kastrati
Flavio Schenker
Roger Wattenhofer
38
5
0
20 Sep 2022
A Comprehensive Survey of Natural Language Generation Advances from the
  Perspective of Digital Deception
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes ALTUNCU
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
39
3
0
11 Aug 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
22
89
0
31 Jan 2022
An Integrated Approach for Video Captioning and Applications
An Integrated Approach for Video Captioning and Applications
Soheyla Amirian
T. Taha
Khaled Rasheed
H. Arabnia
31
1
0
23 Jan 2022
A Survey of Natural Language Generation
A Survey of Natural Language Generation
Chenhe Dong
Hai-Tao Zheng
Haifan Gong
Mengzhao Chen
Junxin Li
Ying Shen
Min Yang
3DV
27
43
0
22 Dec 2021
ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition
ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition
Xinyu Wang
Min Gui
Yong-jia Jiang
Zixia Jia
Nguyen Bach
Tao Wang
Zhongqiang Huang
Fei Huang
Kewei Tu
44
52
0
13 Dec 2021
Dynamic Visual Reasoning by Learning Differentiable Physics Models from
  Video and Language
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding
Zhenfang Chen
Tao Du
Ping Luo
J. Tenenbaum
Chuang Gan
VGen
PINN
OCL
30
74
0
28 Oct 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
Grounding Physical Concepts of Objects and Events Through Dynamic Visual
  Reasoning
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen
Jiayuan Mao
Jiajun Wu
Kwan-Yee K. Wong
J. Tenenbaum
Chuang Gan
VGen
36
92
0
30 Mar 2021
Cross-Media Keyphrase Prediction: A Unified Framework with
  Multi-Modality Multi-Head Attention and Image Wordings
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
Yue Wang
Jing Li
M. Lyu
Irwin King
11
16
0
03 Nov 2020
Teacher-Critical Training Strategies for Image Captioning
Teacher-Critical Training Strategies for Image Captioning
Yiqing Huang
Jiansheng Chen
VLM
29
8
0
30 Sep 2020
Improving Image Captioning with Better Use of Captions
Improving Image Captioning with Better Use of Captions
Zhan Shi
Xu Zhou
Xipeng Qiu
Xiao-Dan Zhu
30
122
0
21 Jun 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic
  Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO
  Framework
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
27
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image
  Captioning With R-CNN Feature Distribution Composition (FDC)
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
25
16
0
15 Feb 2020
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in
  Visual Dialogue
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
23
70
0
17 Nov 2019
Exploring Overall Contextual Information for Image Captioning in
  Human-Like Cognitive Style
Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style
Hongwei Ge
Zehang Yan
Kai Zhang
Mingde Zhao
Liang Sun
30
24
0
15 Oct 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image
  Representations
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
31
81
0
15 May 2019
Pointing Novel Objects in Image Captioning
Pointing Novel Objects in Image Captioning
Yehao Li
Ting Yao
Yingwei Pan
Hongyang Chao
Tao Mei
33
69
0
25 Apr 2019
Multi-modal gated recurrent units for image description
Multi-modal gated recurrent units for image description
Xuelong Li
Aihong Yuan
Xiaoqiang Lu
GAN
21
26
0
20 Apr 2019
Describing like humans: on diversity in image captioning
Describing like humans: on diversity in image captioning
Qingzhong Wang
Antoni B. Chan
24
98
0
28 Mar 2019
Coarse-to-fine: A RNN-based hierarchical attention model for vehicle
  re-identification
Coarse-to-fine: A RNN-based hierarchical attention model for vehicle re-identification
Xiu-Shen Wei
Chen-Da Liu-Zhang
Lingqiao Liu
Chunhua Shen
Jianxin Wu
19
43
0
11 Dec 2018
A Comprehensive Survey of Deep Learning for Image Captioning
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
45
760
0
06 Oct 2018
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and
  Comprehensive Image Captions
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Houfeng Wang
Xu Sun
98
65
0
27 Aug 2018
Distinctive-attribute Extraction for Image Captioning
Distinctive-attribute Extraction for Image Captioning
Boeun Kim
Young Han Lee
Hyedong Jung
C. Cho
19
6
0
25 Jul 2018
Topic-Guided Attention for Image Captioning
Topic-Guided Attention for Image Captioning
Zhihao Zhu
Zhan Xue
Zejian Yuan
30
23
0
10 Jul 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual
  Question Answering
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
25
79
0
24 May 2018
Object Counts! Bringing Explicit Detections Back into Image Captioning
Object Counts! Bringing Explicit Detections Back into Image Captioning
Josiah Wang
Pranava Madhyastha
Lucia Specia
ObjD
19
37
0
23 Apr 2018
Learning to Guide Decoding for Image Captioning
Learning to Guide Decoding for Image Captioning
Wenhao Jiang
Lin Ma
Xinpeng Chen
Hanwang Zhang
Wei Liu
16
69
0
03 Apr 2018
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual
  Questions
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions
Qing Li
Qingyi Tao
Chenyu You
Jianfei Cai
Jiebo Luo
34
106
0
20 Mar 2018
Neural Aesthetic Image Reviewer
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
22
38
0
28 Feb 2018
Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks
Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
Youngjin Yoon
In So Kweon
24
27
0
14 Feb 2018
Tell-and-Answer: Towards Explainable Visual Question Answering using
  Attributes and Captions
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Qing Li
Jianlong Fu
D. Yu
Tao Mei
Jiebo Luo
FAtt
XAI
CoGe
51
60
0
27 Jan 2018
Attacking Visual Language Grounding with Adversarial Examples: A Case
  Study on Neural Image Captioning
Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
Hongge Chen
Huan Zhang
Pin-Yu Chen
Jinfeng Yi
Cho-Jui Hsieh
GAN
AAML
35
49
0
06 Dec 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training
  dataset for image captioning
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
25
22
0
15 Sep 2017
VQS: Linking Segmentations to Questions and Answers for Supervised
  Attention in VQA and Question-Focused Semantic Segmentation
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
27
126
0
15 Aug 2017
Fluency-Guided Cross-Lingual Image Captioning
Fluency-Guided Cross-Lingual Image Captioning
Weiyu Lan
Xirong Li
Jianfeng Dong
19
93
0
15 Aug 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
34
546
0
14 Apr 2017
Recurrent Models for Situation Recognition
Recurrent Models for Situation Recognition
Arun Mallya
Svetlana Lazebnik
20
30
0
18 Mar 2017
MAT: A Multimodal Attentive Translator for Image Captioning
MAT: A Multimodal Attentive Translator for Image Captioning
Chang Liu
F. Sun
Changhu Wang
Feng Wang
Alan Yuille
20
58
0
18 Feb 2017
An Empirical Study of Language CNN for Image Captioning
An Empirical Study of Language CNN for Image Captioning
Jiuxiang Gu
G. Wang
Jianfei Cai
Tsuhan Chen
31
132
0
21 Dec 2016
The VQA-Machine: Learning How to Use Existing Vision Algorithms to
  Answer New Questions
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
OOD
32
86
0
16 Dec 2016
Areas of Attention for Image Captioning
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
33
205
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
21
232
0
02 Dec 2016
Semantic Regularisation for Recurrent Image Annotation
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
29
103
0
16 Nov 2016
Boosting Image Captioning with Attributes
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
48
620
0
05 Nov 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and
  Question Answering
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
230
0
10 Oct 2016
12
Next