ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1603.03925
  4. Cited By
Image Captioning with Semantic Attention

Image Captioning with Semantic Attention

12 March 2016
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
    VLM
ArXivPDFHTML

Papers citing "Image Captioning with Semantic Attention"

50 / 562 papers shown
Title
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
39
0
0
09 Apr 2025
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
45
0
0
03 Apr 2025
Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning
Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning
Yubo Zhang
Pedro Botelho
Trevor Gordon
Gil Zussman
I. Kadota
55
0
0
31 Mar 2025
ChatBEV: A Visual Language Model that Understands BEV Maps
ChatBEV: A Visual Language Model that Understands BEV Maps
Qingyao Xu
Tian Jin
Guang Chen
Yanfeng Wang
Yuyao Zhang
51
0
0
18 Mar 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu
Yaojie Shen
Chenxi Luo
Tiejian Luo
Yan Huang
Yuewei Lin
Heng Fan
L. Zhang
66
1
0
16 Feb 2025
An Ensemble Model with Attention Based Mechanism for Image Captioning
Israa Al Badarneh
Bassam Hammo
Omar Al-Kadi
50
3
0
28 Jan 2025
Classifier-Guided Captioning Across Modalities
Ariel Shaulov
Tal Shaharabany
E. Shaar
Gal Chechik
Lior Wolf
33
0
0
03 Jan 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
53
0
0
03 Jan 2025
Progress-Aware Video Frame Captioning
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
100
1
0
03 Dec 2024
FINECAPTION: Compositional Image Captioning Focusing on Wherever You
  Want at Any Granularity
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Hang Hua
Qing Liu
Lingzhi Zhang
Jing Shi
Zhifei Zhang
Yilin Wang
Jianming Zhang
Jiebo Luo
CoGe
VLM
95
6
0
23 Nov 2024
A Monte Carlo Framework for Calibrated Uncertainty Estimation in
  Sequence Prediction
A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction
Qidong Yang
Weicheng Zhu
Joseph Keslin
L. Zanna
Tim G. J. Rudner
Carlos Fernandez-Granda
BDL
UQCV
AI4TS
46
0
0
30 Oct 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
35
0
0
09 Aug 2024
PC$^2$: Pseudo-Classification Based Pseudo-Captioning for Noisy
  Correspondence Learning in Cross-Modal Retrieval
PC2^22: Pseudo-Classification Based Pseudo-Captioning for Noisy Correspondence Learning in Cross-Modal Retrieval
Yue Duan
Zhangxuan Gu
ZhenZhe Ying
Wei Li
Yu Zhang
Zibin Zheng
26
2
0
02 Aug 2024
A Plug-and-Play Method for Rare Human-Object Interactions Detection by
  Bridging Domain Gap
A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap
Lijun Zhang
Wei Suo
Yiyan Qi
Yanning Zhang
27
2
0
31 Jul 2024
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
Zequn Zeng
Jianqiao Sun
Hao Zhang
Tiansheng Wen
Yudi Su
Yan Xie
Zhengjue Wang
Boli Chen
46
3
0
26 Jul 2024
HERGen: Elevating Radiology Report Generation with Longitudinal Data
HERGen: Elevating Radiology Report Generation with Longitudinal Data
Fuying Wang
Shenghui Du
Lequan Yu
MedIm
45
5
0
21 Jul 2024
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
Weitai Kang
Mengxue Qu
Yunchao Wei
Yan Yan
41
6
0
03 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
45
4
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual
  Grounding
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
38
9
0
03 Jul 2024
Explainable Image Captioning using CNN- CNN architecture and
  Hierarchical Attention
Explainable Image Captioning using CNN- CNN architecture and Hierarchical Attention
Rishi Mohan
Sanjay Sureshkumar
Vignesh Sivasubramaniam
31
1
0
28 Jun 2024
Enhancing Scientific Figure Captioning Through Cross-modal Learning
Enhancing Scientific Figure Captioning Through Cross-modal Learning
Mateo Alejandro Rojas
Rafael Carranza
44
0
0
24 Jun 2024
Reminding Multimodal Large Language Models of Object-aware Knowledge
  with Retrieved Tags
Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags
Daiqing Qi
Handong Zhao
Zijun Wei
Sheng Li
46
2
0
16 Jun 2024
What do MLLMs hear? Examining reasoning with text and sound components
  in Multimodal Large Language Models
What do MLLMs hear? Examining reasoning with text and sound components in Multimodal Large Language Models
Enis Berk Çoban
Michael I. Mandel
Johanna Devaney
AuLLM
LRM
38
0
0
07 Jun 2024
Enhancing Multimodal Large Language Models with Multi-instance Visual
  Prompt Generator for Visual Representation Enrichment
Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment
Wenliang Zhong
Wenyi Wu
Qi Li
Rob Barton
Boxin Du
Shioulin Sam
Karim Bouyarmane
Ismail B. Tutar
Junzhou Huang
33
3
0
05 Jun 2024
Image Captioning via Dynamic Path Customization
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
34
0
0
01 Jun 2024
Enhancing Near OOD Detection in Prompt Learning: Maximum Gains, Minimal
  Costs
Enhancing Near OOD Detection in Prompt Learning: Maximum Gains, Minimal Costs
M. Jung
He Zhao
Joanna Dipnall
Belinda Gabbe
Lan Du
VLM
OODD
34
1
0
25 May 2024
Towards Retrieval-Augmented Architectures for Image Captioning
Towards Retrieval-Augmented Architectures for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
VLM
32
9
0
21 May 2024
Faithful Attention Explainer: Verbalizing Decisions Based on
  Discriminative Features
Faithful Attention Explainer: Verbalizing Decisions Based on Discriminative Features
Yao Rong
David Scheerer
Enkelejda Kasneci
48
0
0
16 May 2024
Topicwise Separable Sentence Retrieval for Medical Report Generation
Topicwise Separable Sentence Retrieval for Medical Report Generation
Junting Zhao
Yang Zhou
Zhihao Chen
Huazhu Fu
Liang Wan
MedIm
25
1
0
07 May 2024
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision
  Language Models
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models
M. Kapadnis
Sohan Patnaik
Abhilash Nandy
Sourjyadip Ray
Pawan Goyal
Debdoot Sheet
VLM
33
3
0
27 Apr 2024
Understanding attention-based encoder-decoder networks: a case study
  with chess scoresheet recognition
Understanding attention-based encoder-decoder networks: a case study with chess scoresheet recognition
Sergio Y. Hayashi
N. Hirata
57
0
0
23 Apr 2024
Optimal Kernel Tuning Parameter Prediction using Deep Sequence Models
Optimal Kernel Tuning Parameter Prediction using Deep Sequence Models
Khawir Mahmood
Jehandad Khan
Hammad Afzal
23
0
0
15 Apr 2024
Memory-based Cross-modal Semantic Alignment Network for Radiology Report
  Generation
Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation
Yitian Tao
Liyan Ma
Jing Yu
Han Zhang
MedIm
34
6
0
31 Mar 2024
Text Data-Centric Image Captioning with Interactive Prompts
Text Data-Centric Image Captioning with Interactive Prompts
Yiyu Wang
Hao Luo
Jungang Xu
Yingfei Sun
Fan Wang
VLM
38
0
0
28 Mar 2024
Semi-Supervised Image Captioning Considering Wasserstein Graph Matching
Semi-Supervised Image Captioning Considering Wasserstein Graph Matching
Yang Yang
41
0
0
26 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
150
310
0
21 Mar 2024
Advancing Security in AI Systems: A Novel Approach to Detecting
  Backdoors in Deep Neural Networks
Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks
Khondoker Murad Hossain
Tim Oates
AAML
29
1
0
13 Mar 2024
How to Understand Named Entities: Using Common Sense for News Captioning
How to Understand Named Entities: Using Common Sense for News Captioning
Ning Xu
Yanhui Wang
Tingting Zhang
Hongshuo Tian
Mohan S. Kankanhalli
An-An Liu
32
0
0
11 Mar 2024
Intensive Vision-guided Network for Radiology Report Generation
Intensive Vision-guided Network for Radiology Report Generation
Fudan Zheng
Mengfei Li
Ying Wang
Weijiang Yu
Ruixuan Wang
Zhiguang Chen
Nong Xiao
Yutong Lu
33
1
0
06 Feb 2024
Context-Guided Spatio-Temporal Video Grounding
Context-Guided Spatio-Temporal Video Grounding
Xin Gu
Hengrui Fan
Yan Huang
Tiejian Luo
Libo Zhang
35
14
0
03 Jan 2024
MIVC: Multiple Instance Visual Component for Visual-Language Models
MIVC: Multiple Instance Visual Component for Visual-Language Models
Wenyi Wu
Qi Li
Leon Wenliang Zhong
Junzhou Huang
33
3
0
28 Dec 2023
User-Aware Prefix-Tuning is a Good Learner for Personalized Image
  Captioning
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
37
4
0
08 Dec 2023
Towards Knowledge-driven Autonomous Driving
Towards Knowledge-driven Autonomous Driving
Xin Li
Yeqi Bai
Pinlong Cai
Licheng Wen
Daocheng Fu
...
Yikang Li
Botian Shi
Yong-Jin Liu
Liang He
Yu Qiao
34
26
0
07 Dec 2023
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
Sherwin Bahmani
Ivan Skorokhodov
Victor Rong
Gordon Wetzstein
Leonidas J. Guibas
Peter Wonka
Sergey Tulyakov
Jeong Joon Park
Andrea Tagliasacchi
David B. Lindell
DiffM
54
103
0
29 Nov 2023
EgoThink: Evaluating First-Person Perspective Thinking Capability of
  Vision-Language Models
EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models
Sijie Cheng
Zhicheng Guo
Jingwen Wu
Kechen Fang
Peng Li
Huaping Liu
Yang Liu
EgoV
LRM
36
16
0
27 Nov 2023
Generating Human-Centric Visual Cues for Human-Object Interaction
  Detection via Large Vision-Language Models
Generating Human-Centric Visual Cues for Human-Object Interaction Detection via Large Vision-Language Models
Yu-Wei Zhan
Fan Liu
Xin Luo
Liqiang Nie
Xin-Shun Xu
Mohan S. Kankanhalli
VLM
38
0
0
26 Nov 2023
Multimodal Large Language Models: A Survey
Multimodal Large Language Models: A Survey
Jiayang Wu
Wensheng Gan
Zefeng Chen
Shicheng Wan
Philip S. Yu
36
169
0
22 Nov 2023
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report
  Generation
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation
Nurbanu Aksoy
Serge Sharoff
Selçuk Başer
Nishant Ravikumar
Alejandro F Frangi
MedIm
19
4
0
18 Nov 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini
  Decoder
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Abdelrahman Mohamed
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Alcides Alcoba Inciarte
Muhammad Abdul-Mageed
VLM
MLLM
30
7
0
15 Nov 2023
To See is to Believe: Prompting GPT-4V for Better Visual Instruction
  Tuning
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
Junke Wang
Lingchen Meng
Zejia Weng
Bo He
Zuxuan Wu
Yu-Gang Jiang
MLLM
VLM
32
94
0
13 Nov 2023
1234...101112
Next