arXiv: 1612.00563
Self-critical Sequence Training for Image Captioning

2 December 2016
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel

Papers citing "Self-critical Sequence Training for Image Captioning"

50 / 862 papers shown
Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning
Chunlei Li
Jingyang Hou
Yilei Shi
Jingliang Hu
Xiao Xiang Zhu
Lichao Mou
LM&MA
38
0
0
18 Jun 2025
Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
Ankan Deria
Adinath Madhavrao Dukre
Feilong Tang
Sara Atito
Sudipta Roy
Muhammad Awais
Muhammad Haris Khan
Imran Razzak
VLM
66
0
0
18 Jun 2025
Panoptic Captioning: Seeking An Equivalency Bridge for Image and Text
Kun-Yu Lin
Hongjun Wang
Weining Ren
Kai Han
308
0
0
22 May 2025
ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search
Hyunseok Lee
Jeonghoon Kim
Beomjun Kim
Jihoon Tack
Chansong Jo
Jaehong Lee
Cheonbok Park
Sookyo In
Jinwoo Shin
Kang Min Yoo
150
0
0
21 May 2025
BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Yapei Chang
Yekyung Kim
Michael Krumdick
Amir Zadeh
Chuan Li
Chris Tanner
Mohit Iyyer
ALM
182
0
0
16 May 2025
Anatomical Attention Alignment representation for Radiology Report Generation
Quang Vinh Nguyen
Minh Duc Nguyen
Thanh Hoang Son Vo
Hyung-Jeong Yang
Soo-Hyung Kim
MedIm
63
0
0
12 May 2025
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
138
0
0
23 Apr 2025
Zero-Shot, But at What Cost? Unveiling the Hidden Overhead of MILS's LLM-CLIP Framework for Image Captioning
Yassir Benhammou
Alessandro Tiberio
Gabriel Trautmann
Suman Kalyan
MLLM, VLM
78
0
0
21 Apr 2025
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park
Keun-Soo Heo
Dong-Hee Shin
Young-Han Son
Ji-Hye Oh
Tae-Eui Kam
MedIm
65
0
0
16 Apr 2025
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
Shuai Zhao
Linchao Zhu
Yi Yang
106
3
0
14 Apr 2025
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
183
0
0
03 Apr 2025
Semantic-Spatial Feature Fusion with Dynamic Graph Refinement for Remote Sensing Image Captioning
Maofu Liu
Jiahui Liu
Xiaokang Zhang
117
1
0
30 Mar 2025
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Zhaochong An
Guolei Sun
Yun Liu
Runjia Li
Junlin Han
Ender Konukoglu
Serge Belongie
VLM
188
3
0
20 Mar 2025
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
Sara Sarto
Marcella Cornia
Rita Cucchiara
90
1
0
18 Mar 2025
SuperCap: Multi-resolution Superpixel-based Image Captioning
Henry Senior
Luca Rossi
Gregory Slabaugh
Shanxin Yuan
VLM
110
0
0
11 Mar 2025
Measuring directional bias amplification in image captions using predictability
Rahul Nair
Bhanu Tokas
Neel Shah
Hannah Kerner
117
0
0
10 Mar 2025
SED2AM: Solving Multi-Trip Time-Dependent Vehicle Routing Problem using Deep Reinforcement Learning
Arash Mozhdehi
Yansen Wang
Sun Sun
Xin Eric Wang
AI4TS
115
0
0
06 Mar 2025
Q&C: When Quantization Meets Cache in Efficient Image Generation
Xin Ding
Xiaochen Li
Haotong Qin
Zhibo Chen
DiffM, MQ
176
0
0
04 Mar 2025
Group Relative Policy Optimization for Image Captioning
Xu Liang
95
1
0
03 Mar 2025
AC-Lite : A Lightweight Image Captioning Model for Low-Resource Assamese Language
Pankaj Choudhury
Yogesh Aggarwal
Prabhanjan Jadhav
Prithwijit Guha
Sukumar Nandi
210
0
0
03 Mar 2025
Improved Diffusion-based Generative Model with Better Adversarial Robustness
Zekun Wang
Mingyang Yi
Shuchen Xue
Zhiyu Li
Ming Liu
Bing Qin
Zhi-Ming Ma
DiffM
121
0
0
24 Feb 2025
Pretrained Image-Text Models are Secretly Video Captioners
Chunhui Zhang
Yiren Jian
Z. Ouyang
Soroush Vosoughi
VLM
149
8
0
20 Feb 2025
Performance Analysis of Traditional VQA Models Under Limited Computational Resources
Jihao Gu
164
0
0
09 Feb 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
210
0
0
28 Jan 2025
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
Yueqin Yin
Shentao Yang
Yujia Xie
Ziyi Yang
Yuting Sun
Hany Awadalla
Weizhu Chen
Mingyuan Zhou
137
2
0
07 Jan 2025
Classifier-Guided Captioning Across Modalities
Ariel Shaulov
Tal Shaharabany
E. Shaar
Gal Chechik
Lior Wolf
96
0
0
03 Jan 2025
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
266
3
0
01 Nov 2024
Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models
Donghoon Kim
Gusang Lee
Kyuhong Shim
B. Shim
104
1
0
29 Oct 2024
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration
Yun-Yen Chuang
Hung-Min Hsu
Kevin Lin
Chen-Sheng Gu
Ling Zhen Li
Ray-I Chang
Hung-yi Lee
DiffM, VLM
80
1
0
17 Oct 2024
Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference
William Thorne
Ambrose Robinson
Bohua Peng
Chenghua Lin
Diana Maynard
76
2
0
10 Oct 2024
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
87
4
0
09 Oct 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur
Darshan Singh
Makarand Tapaswi
471
1
0
04 Sep 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli
Davide Caffagni
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
CLIP
102
3
0
26 Aug 2024
Shifted Window Fourier Transform And Retention For Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
VLM
122
1
0
25 Aug 2024
DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation
Xiaowei Mao
Yan Lin
Shengnan Guo
Yubin Chen
Xingyu Xian
Haomin Wen
Qisen Xu
Youfang Lin
Huaiyu Wan
128
1
0
23 Aug 2024
TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model
Yuhao Wang
Chao Hao
Yawen Cui
Xinqi Su
Weicheng Xie
Tao Tan
Zitong Yu
LM&MA, MedIm
84
0
0
22 Aug 2024
See It All: Contextualized Late Aggregation for 3D Dense Captioning
Minjung Kim
Hyung Suk Lim
Seung Hwan Kim
Soonyoung Lee
Bumsoo Kim
Gunhee Kim
94
4
0
14 Aug 2024
Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim
Hyung Suk Lim
Soonyoung Lee
Bumsoo Kim
Gunhee Kim
87
3
0
13 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
84
0
0
09 Aug 2024
e-Health CSIRO at RRG24: Entropy-Augmented Self-Critical Sequence Training for Radiology Report Generation
Aaron Nicolson
Jinghui Liu
Jason Dowling
Anthony N. Nguyen
Bevan Koopman
103
5
0
07 Aug 2024
Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation
Hui Ma
Bo Zhang
Bo Xu
Jian Wang
Hongfei Lin
Xiao Sun
142
1
0
06 Aug 2024
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
Xianyu Chen
Ming Jiang
Qi Zhao
77
3
0
05 Aug 2024
Positive Text Reframing under Multi-strategy Optimization
Shutong Jia
Biwei Cao
Qingqing Gao
Jiuxin Cao
Bo Liu
85
1
0
25 Jul 2024
Take a Step and Reconsider: Sequence Decoding for Self-Improved Neural Combinatorial Optimization
Jonathan Pirnay
D. G. Grimm
BDL
123
3
0
24 Jul 2024
Conversational Query Reformulation with the Guidance of Retrieved Documents
Jeonghyun Park
Hwanhee Lee
116
0
0
17 Jul 2024
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang
Aparna Balagopalan
Nassim Oufattole
Hyewon Jeong
Yan Wu
Jiacheng Zhu
Marzyeh Ghassemi
144
0
0
10 Jul 2024
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei
Wei Zeng
Zhenyang Li
Dawei Yin
Lixin Duan
Wen Li
EGVM
82
3
0
09 Jul 2024
Edge-DIRECT: A Deep Reinforcement Learning-based Method for Solving Heterogeneous Electric Vehicle Routing Problem with Time Window Constraints
Arash Mozhdehi
Mahdi Mohammadizadeh
Xin Eric Wang
63
2
0
28 Jun 2024
The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It
Aaron Nicolson
Shengyao Zhuang
Jason Dowling
Bevan Koopman
65
1
0
19 Jun 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Miaosen Zhang
Yixuan Wei
Zhen Xing
Yifei Ma
Zuxuan Wu
...
Zheng Zhang
Qi Dai
Chong Luo
Xin Geng
Baining Guo
VLM
104
1
0
13 Jun 2024