ResearchTrend.AI
Self-critical Sequence Training for Image Captioning

2 December 2016
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
arXiv: 1612.00563

Papers citing "Self-critical Sequence Training for Image Captioning"

50 / 862 papers shown
Title
Input Perturbation Reduces Exposure Bias in Diffusion Models
Mang Ning
E. Sangineto
Angelo Porrello
Simone Calderara
Rita Cucchiara
DiffM
105
67
0
27 Jan 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Yucheng Zhou
Guodong Long
68
23
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL, VLM
59
4
0
26 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
71
7
0
17 Jan 2023
RMM: Reinforced Memory Management for Class-Incremental Learning
Yaoyao Liu
Bernt Schiele
Qianru Sun
CLL
111
97
0
14 Jan 2023
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen
Erik Cambria
Xin Chen
Yinjie Lei
Tao Chen
YU Gang
ViT
79
60
0
06 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
182
0
0
05 Jan 2023
Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization
Dongmin Hyun
Xiting Wang
Chanyoung Park
Xing Xie
Hwanjo Yu
65
8
0
21 Dec 2022
Inverse Reinforcement Learning for Text Summarization
Yujiao Fu
Deyi Xiong
Yue Dong
99
4
0
19 Dec 2022
EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation
Yunhao Ge
Lyne Tchapmi
Brian Nlong Zhao
Laurent Itti
Vibhav Vineet
DiffM
77
5
0
15 Dec 2022
NLIP: Noise-robust Language-Image Pre-training
Runhu Huang
Yanxin Long
Jianhua Han
Hang Xu
Xiwen Liang
Chunjing Xu
Xiaodan Liang
VLM
111
30
0
14 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
94
28
0
10 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
63
9
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
97
74
0
06 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
90
2
0
05 Dec 2022
Multilingual Communication System with Deaf Individuals Utilizing Natural and Visual Languages
Tuan-Luc Huynh
Khoi-Nguyen Nguyen-Ngoc
Chi-Bien Chu
Minh-Triet Tran
Trung-Nghia Le
SLR
60
0
0
01 Dec 2022
Uncertainty-Aware Image Captioning
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
71
13
0
30 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM, VLM
105
24
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
74
1
0
20 Nov 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning
Pengpeng Zeng
Jinkuan Zhu
Jingkuan Song
Lianli Gao
VLM
69
30
0
17 Nov 2022
CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge
Linli Yao
Wei Chen
Qin Jin
VLM
132
11
0
17 Nov 2022
Toward expanding the scope of radiology report summarization to multiple anatomies and modalities
Zhihong Chen
M. Varma
Xiang Wan
C. Langlotz
Jean-Benoit Delbrouck
68
19
0
15 Nov 2022
A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation
Shijia Huang
Feng Li
Hao Zhang
Siyi Liu
Lei Zhang
Liwei Wang
70
5
0
15 Nov 2022
Hierarchical Phrase-based Sequence-to-Sequence Learning
Bailin Wang
Ivan Titov
Jacob Andreas
Yoon Kim
75
7
0
15 Nov 2022
VieCap4H-VLSP 2021: ObjectAoA-Enhancing performance of Object Relation Transformer with Attention on Attention for Vietnamese image captioning
Nghia Hieu Nguyen
Duong T.D. Vo
Minh-Quan Ha
ViT
73
1
0
10 Nov 2022
Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
Gaurav Verma
Vishwa Vinay
Ryan A. Rossi
Srijan Kumar
VLM, AAML
62
8
0
04 Nov 2022
Evaluating and Improving Factuality in Multimodal Abstractive Summarization
David Wan
Mohit Bansal
100
10
0
04 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
98
4
0
04 Nov 2022
CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation
Jun Wang
A. Bhalerao
Terry Yin
Simon See
Yulan He
MedIm
99
18
0
02 Nov 2022
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention
Fenglin Liu
Xian Wu
Shen Ge
Xuancheng Ren
Wei Fan
Xu Sun
Yuexian Zou
VLM
110
13
0
28 Oct 2022
Reinforced Question Rewriting for Conversational Question Answering
Zhiyu Zoey Chen
Jie Zhao
Anjie Fang
B. Fetahu
Oleg Rokhlenko
S. Malmasi
74
27
0
27 Oct 2022
Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards
Jean-Benoit Delbrouck
Pierre J. Chambon
Christian Blüthgen
E. Tsai
Omar Almusa
C. Langlotz
MedIm
116
81
0
21 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
81
16
0
20 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
122
48
0
19 Oct 2022
On effects of Knowledge Distillation on Transfer Learning
Sushil Thapa
46
1
0
18 Oct 2022
Probing Cross-modal Semantics Alignment Capability from the Textual Perspective
Zheng Ma
Shi Zong
Mianzhi Pan
Jianbing Zhang
Shujian Huang
Xinyu Dai
Jiajun Chen
61
4
0
18 Oct 2022
Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty
Wenting Xu
Zhenghua Xu
Junyang Chen
Chang Qi
Thomas Lukasiewicz
MedIm
86
8
0
14 Oct 2022
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Jane Polak Scowcroft
Pascale Fung
MLLM, VLM
101
67
0
14 Oct 2022
Contextual Modeling for 3D Dense Captioning on Point Clouds
Yufeng Zhong
Longdao Xu
Jiebo Luo
Lin Ma
94
15
0
08 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
91
10
0
04 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
139
251
0
03 Oct 2022
Human-in-the-loop Robotic Grasping using BERT Scene Representation
Yaoxian Song
Penglei Sun
Pengfei Fang
Linyi Yang
Yanghua Xiao
Yue Zhang
122
5
0
28 Sep 2022
Paraphrasing Is All You Need for Novel Object Captioning
Cheng Yang
Yao-Hung Hubert Tsai
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
96
4
0
25 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
75
11
0
21 Sep 2022
Learning Distinct and Representative Styles for Image Captioning
Qi Chen
Chaorui Deng
Qi Wu
VLM
98
24
0
17 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
80
2
0
16 Sep 2022
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
A. Maharana
Darryl Hannan
Mohit Bansal
DiffM
117
83
0
13 Sep 2022
Checklist Models for Improved Output Fluency in Piano Fingering Prediction
Nikita Srivatsan
Taylor Berg-Kirkpatrick
63
2
0
12 Sep 2022
Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation
Yu-Jen Chen
Wei-Hsiang Shen
Hao-Wei Chung
Jing-Hao Chiu
Da-Cheng Juan
T. Ho
Chin-Tung Cheng
Meng Li
Tsung-Yi Ho
MedIm
117
12
0
04 Sep 2022
A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang
Mingkang Tang
Lei Wang
Xiu Li
Luping Zhou
ViT, MedIm
89
58
0
22 Aug 2022