Image Captioning with Semantic Attention

12 March 2016
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
    VLM

Papers citing "Image Captioning with Semantic Attention"

50 / 562 papers shown
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
Zhen Huang
Yihao Li
Dong Pei
Jiapeng Zhou
Xuliang Ning
Jianlin Han
Xiaoguang Han
Xuejun Chen
40
3
0
13 Nov 2023
ITEm: Unsupervised Image-Text Embedding Learning for eCommerce
Baohao Liao
Michael Kozielski
Sanjika Hewavitharana
Jiangbo Yuan
Shahram Khadivi
Tomer Lancewicki
SSL
18
0
0
22 Oct 2023
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation
Rashid Khan
Bingding Huang
Haseeb Hassan
Asim Zaman
Z. Ye
29
2
0
11 Oct 2023
HOI4ABOT: Human-Object Interaction Anticipation for Human Intention Reading Collaborative roBOTs
Esteve Valls Mascaro
Daniel Sliwowski
Dongheui Lee
27
8
0
28 Sep 2023
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness
Valentin Barriere
Felipe del Rio
Andres Carvallo De Ferari
Carlos Aspillaga
Eugenio Herrera-Berg
Cristian Buc Calderon
DiffM
27
0
0
27 Sep 2023
Towards Answering Health-related Questions from Medical Videos: Datasets and Approaches
Deepak Gupta
Kush Attal
Dina Demner-Fushman
LM&MA
27
1
0
21 Sep 2023
R2GenGPT: Radiology Report Generation with Frozen LLMs
Zhanyu Wang
Lingqiao Liu
Lei Wang
Luping Zhou
MedIm
LM&MA
VLM
22
64
0
18 Sep 2023
From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications
Shreyank N. Gowda
Dheeraj Pandey
Shashank Narayana Gowda
52
3
0
30 Aug 2023
DLIP: Distilling Language-Image Pre-training
Huafeng Kuang
Jie Wu
Xiawu Zheng
Ming Li
Xuefeng Xiao
Rui Wang
Min Zheng
Rongrong Ji
VLM
44
4
0
24 Aug 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
19
0
23 Aug 2023
IIHT: Medical Report Generation with Image-to-Indicator Hierarchical Transformer
Keqi Fan
Xiaohao Cai
M. Niranjan
MedIm
ViT
11
3
0
10 Aug 2023
Asynchronous Evolution of Deep Neural Network Architectures
J. Liang
H. Shahrzad
Risto Miikkulainen
28
0
0
08 Aug 2023
A Comprehensive Analysis of Real-World Image Captioning and Scene Identification
Sai Suprabhanu Nallapaneni
Subrahmanyam Konakanchi
30
2
0
05 Aug 2023
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
VLM
28
34
0
31 Jul 2023
Enhancing image captioning with depth information using a Transformer-based framework
Aya Mahmoud Ahmed
Mohamed Yousef
K. Hussain
Yousef B. Mahdy
ViT
24
4
0
24 Jul 2023
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes
Guoyun Tu
Ying Liu
Vladimir Vlassov
139
1
0
14 Jul 2023
GEST: the Graph of Events in Space and Time as a Common Representation between Vision and Language
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
14
0
0
22 May 2023
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment
Shengqiong Wu
Hao Fei
Wei Ji
Tat-Seng Chua
24
28
0
20 May 2023
Image-to-Text Translation for Interactive Image Recognition: A Comparative User Study with Non-Expert Users
Wataru Kawabe
Yusuke Sugano
VLM
35
2
0
11 May 2023
Transforming Visual Scene Graphs to Image Captions
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
49
19
0
03 May 2023
A Review of Deep Learning for Video Captioning
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Min Zhang
Fatih Porikli
3DV
37
21
0
22 Apr 2023
Interactive and Explainable Region-guided Radiology Report Generation
Tim Tanida
Philip Muller
Georgios Kaissis
Daniel Rueckert
MedIm
37
110
0
17 Apr 2023
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection
Jeeseung Park
Jin Woo Park
Jongseok Lee
ViT
37
44
0
17 Apr 2023
Model-Agnostic Gender Debiased Image Captioning
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
35
18
0
07 Apr 2023
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
Zhanyu Wang
Lingqiao Liu
Lei Wang
Luping Zhou
MedIm
13
71
0
05 Apr 2023
Multi-modal reward for visual relationships-based image captioning
Ali Abedi
Hossein Karshenas
Peyman Adibi
44
2
0
19 Mar 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
21
0
0
14 Mar 2023
ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models
Sheng Wang
Zihao Zhao
Xi Ouyang
Qian Wang
Dinggang Shen
LM&MA
MedIm
29
140
0
14 Feb 2023
On The Coherence of Quantitative Evaluation of Visual Explanations
Benjamin Vandersmissen
José Oramas
XAI
FAtt
36
3
0
14 Feb 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
21
71
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning
Jingqiang Chen
25
3
0
04 Feb 2023
Training Integer-Only Deep Recurrent Neural Networks
V. Nia
Eyyub Sari
Vanessa Courville
M. Asgharian
MQ
53
2
0
22 Dec 2022
Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks
Khondoker Murad Hossain
Tim Oates
AAML
26
4
0
15 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
30
62
0
06 Dec 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
28
17
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
29
1
0
20 Nov 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning
Pengpeng Zeng
Jinkuan Zhu
Jingkuan Song
Lianli Gao
VLM
24
27
0
17 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
28
3
0
04 Nov 2022
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
24
46
0
19 Oct 2022
Improving Radiology Summarization with Radiograph and Anatomy Prompts
Jinpeng Hu
Zhihong Chen
Yang Liu
Xiang Wan
Tsung-Hui Chang
MedIm
34
8
0
15 Oct 2022
What Should the System Do Next?: Operative Action Captioning for Estimating System Actions
Taiki Nakamura
Seiya Kawano
Akishige Yuguchi
Yasutomo Kawanishi
Koichiro Yoshino
14
0
0
06 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
33
16
0
05 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
40
10
0
04 Oct 2022
Medical Image Captioning via Generative Pretrained Transformers
Alexander Selivanov
Oleg Y. Rogov
Daniil Chesakov
Artem Shelmanov
Irina Fedulova
Dmitry V. Dylov
MedIm
57
55
0
28 Sep 2022
STING: Self-attention based Time-series Imputation Networks using GAN
Eunkyu Oh
Taehun Kim
Yunhu Ji
Sushil Khyalia
AI4TS
29
25
0
22 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
36
10
0
21 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
29
2
0
16 Sep 2022
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
39
26
0
15 Sep 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment
Mustafa Shukor
Guillaume Couairon
Matthieu Cord
VLM
CLIP
24
27
0
29 Aug 2022