ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1611.01646
  4. Cited By
Boosting Image Captioning with Attributes

Boosting Image Captioning with Attributes

5 November 2016
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
    VLM
ArXiv (abs)PDFHTML

Papers citing "Boosting Image Captioning with Attributes"

50 / 222 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
131
0
0
23 Apr 2025
ChatBEV: A Visual Language Model that Understands BEV Maps
ChatBEV: A Visual Language Model that Understands BEV Maps
Qingyao Xu
Tian Jin
Guang Chen
Yanfeng Wang
Yize Zhang
70
1
0
18 Mar 2025
A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning
Qing Zhou
Tao Yang
Junyu Gao
W. Ni
Junzheng Wu
Qi Wang
78
0
0
06 Mar 2025
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
Chunbai Zhang
Chao Wang
Yang Zhou
Yan Peng
LRMReLM
154
0
0
02 Feb 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffMVLM
139
0
0
03 Jan 2025
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
73
0
0
09 Aug 2024
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for
  Image-Text Matching
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
Jie Wang
Joemon M. Jose
67
1
0
05 Jun 2024
Enhancing Multimodal Large Language Models with Multi-instance Visual
  Prompt Generator for Visual Representation Enrichment
Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment
Wenliang Zhong
Wenyi Wu
Qi Li
Rob Barton
Boxin Du
Shioulin Sam
Karim Bouyarmane
Ismail B. Tutar
Junzhou Huang
92
3
0
05 Jun 2024
Image Captioning via Dynamic Path Customization
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
81
1
0
01 Jun 2024
To See is to Believe: Prompting GPT-4V for Better Visual Instruction
  Tuning
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
Junke Wang
Lingchen Meng
Zejia Weng
Bo He
Zuxuan Wu
Yu-Gang Jiang
MLLMVLM
121
108
0
13 Nov 2023
Targeted Image Data Augmentation Increases Basic Skills Captioning
  Robustness
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness
Valentin Barriere
Felipe del Rio
Andres Carvallo De Ferari
Carlos Aspillaga
Eugenio Herrera-Berg
Cristian Buc Calderon
DiffM
71
0
0
27 Sep 2023
Informative Scene Graph Generation via Debiasing
Informative Scene Graph Generation via Debiasing
Lianli Gao
Xinyu Lyu
Yuyu Guo
Yuxuan Hu
Yuanyou Li
Lu Xu
Hengtao Shen
Jingkuan Song
65
5
0
10 Aug 2023
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention
  and Text Attributes
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes
Guoyun Tu
Ying Liu
Vladimir Vlassov
155
1
0
14 Jul 2023
Learning Knowledge-Rich Sequential Model for Planar Homography
  Estimation in Aerial Video
Learning Knowledge-Rich Sequential Model for Planar Homography Estimation in Aerial Video
Pu Li
Xiaobai Liu
30
1
0
05 Apr 2023
One Small Step for Generative AI, One Giant Leap for AGI: A Complete
  Survey on ChatGPT in AIGC Era
One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
Chaoning Zhang
Chenshuang Zhang
Chenghao Li
Yu Qiao
Sheng Zheng
...
Sung-Ho Bae
Lik-Hang Lee
Pan Hui
In So Kweon
Choong Seon Hong
LM&MAAI4MHLRMELM
106
137
0
04 Apr 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning
  Evaluation
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
84
60
0
21 Mar 2023
Towards Local Visual Modeling for Image Captioning
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
100
79
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image
  Captioning
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
84
4
0
08 Feb 2023
Summarize the Past to Predict the Future: Natural Language Descriptions
  of Context Boost Multimodal Object Interaction Anticipation
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
169
14
0
22 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning
  Technique (CNN+GRU)
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
119
10
0
06 Jan 2023
Semantic-Conditional Diffusion Networks for Image Captioning
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
91
74
0
06 Dec 2022
Exploring Discrete Diffusion Models for Image Captioning
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffMVLM
100
24
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach
  to Cross-Modal Sarcasm Generation
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
69
1
0
20 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
85
4
0
04 Nov 2022
DiMBERT: Learning Vision-Language Grounded Representations with
  Disentangled Multimodal-Attention
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention
Fenglin Liu
Xian Wu
Shen Ge
Xuancheng Ren
Wei Fan
Xu Sun
Yuexian Zou
VLM
108
13
0
28 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image
  Captioning
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
111
48
0
19 Oct 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
59
26
0
17 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image
  Captioning
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
85
10
0
04 Oct 2022
M^4I: Multi-modal Models Membership Inference
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
99
27
0
15 Sep 2022
Distinctive Image Captioning via CLIP Guided Group Optimization
Distinctive Image Captioning via CLIP Guided Group Optimization
Youyuan Zhang
Jiuniu Wang
Hao Wu
Wenjia Xu
VLM
101
8
0
08 Aug 2022
Boosting Video-Text Retrieval with Explicit High-Level Semantics
Boosting Video-Text Retrieval with Explicit High-Level Semantics
Haoran Wang
Di Xu
Dongliang He
Fu Li
Zhong Ji
Jungong Han
Errui Ding
75
13
0
08 Aug 2022
Efficient Modeling of Future Context for Image Captioning
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
76
15
0
22 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual
  Features
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
84
114
0
20 Jul 2022
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised
  Referring Expression Grounding
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qi Tian
Qingming Huang
94
62
0
18 Jul 2022
Learning To Generate Scene Graph from Head to Tail
Learning To Generate Scene Graph from Head to Tail
Chao Zheng
Xinyu Lyu
Yuyu Guo
Pengpeng Zeng
Jingkuan Song
Lianli Gao
69
12
0
23 Jun 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
122
1
0
21 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
82
92
0
14 Jun 2022
Prompt-based Learning for Unpaired Image Captioning
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
97
33
0
26 May 2022
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity
  Resolution
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution
Georgios Tziafas
S. Kasaei
117
2
0
24 May 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
52
5
0
15 Apr 2022
Image Captioning In the Transformer Age
Image Captioning In the Transformer Age
Yangliu Xu
Li Li
Haiyang Xu
Songfang Huang
Fei Huang
Jianfei Cai
ViT
59
6
0
15 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
91
16
0
08 Apr 2022
Exploiting long-term temporal dynamics for video captioning
Exploiting long-term temporal dynamics for video captioning
Yuyu Guo
Jingqiu Zhang
Lianli Gao
53
18
0
22 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
136
101
0
31 Jan 2022
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular
  Vision-Language Pre-training
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training
Yehao Li
Jiahao Fan
Yingwei Pan
Ting Yao
Weiyao Lin
Tao Mei
MLLMObjD
81
19
0
11 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
67
17
0
06 Jan 2022
Translational Concept Embedding for Generalized Compositional Zero-shot
  Learning
Translational Concept Embedding for Generalized Compositional Zero-shot Learning
He Huang
Wei Tang
Jiawei Zhang
Philip S. Yu
CoGe
48
2
0
20 Dec 2021
Calorie Aware Automatic Meal Kit Generation from an Image
Calorie Aware Automatic Meal Kit Generation from an Image
Ahmad Babaeian Jelodar
Yu Sun
37
2
0
18 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
95
47
0
29 Nov 2021
Scaling Up Vision-Language Pre-training for Image Captioning
Scaling Up Vision-Language Pre-training for Image Captioning
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Zhengyuan Yang
Zicheng Liu
Yumao Lu
Lijuan Wang
MLLMVLM
173
249
0
24 Nov 2021
12345
Next