1612.00563
Cited By
Self-critical Sequence Training for Image Captioning
2 December 2016
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
Papers citing
"Self-critical Sequence Training for Image Captioning"
50 / 858 papers shown
RMM: Reinforced Memory Management for Class-Incremental Learning
Yaoyao Liu
Bernt Schiele
Qianru Sun
CLL
39
93
0
14 Jan 2023
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen
Erik Cambria
Xin Chen
Yinjie Lei
Tao Chen
YU Gang
ViT
28
52
0
06 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
88
0
0
05 Jan 2023
Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization
Dongmin Hyun
Xiting Wang
Chanyoung Park
Xing Xie
Hwanjo Yu
19
7
0
21 Dec 2022
Inverse Reinforcement Learning for Text Summarization
Yujiao Fu
Deyi Xiong
Yue Dong
45
4
0
19 Dec 2022
EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation
Yunhao Ge
Lyne Tchapmi
Brian Nlong Zhao
Laurent Itti
Vibhav Vineet
DiffM
39
5
0
15 Dec 2022
NLIP: Noise-robust Language-Image Pre-training
Runhu Huang
Yanxin Long
Jianhua Han
Hang Xu
Xiwen Liang
Chunjing Xu
Xiaodan Liang
VLM
41
30
0
14 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
33
26
0
10 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
16
9
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
30
62
0
06 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
41
2
0
05 Dec 2022
Multilingual Communication System with Deaf Individuals Utilizing Natural and Visual Languages
Tuan-Luc Huynh
Khoi-Nguyen Nguyen-Ngoc
Chi-Bien Chu
Minh-Triet Tran
Trung-Nghia Le
SLR
15
0
0
01 Dec 2022
Uncertainty-Aware Image Captioning
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
23
10
0
30 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
36
17
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
34
1
0
20 Nov 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning
Pengpeng Zeng
Jinkuan Zhu
Jingkuan Song
Lianli Gao
VLM
24
27
0
17 Nov 2022
CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge
Linli Yao
Wei Chen
Qin Jin
VLM
30
10
0
17 Nov 2022
Toward expanding the scope of radiology report summarization to multiple anatomies and modalities
Zhihong Chen
M. Varma
Xiang Wan
C. Langlotz
Jean-Benoit Delbrouck
20
18
0
15 Nov 2022
A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation
Shijia Huang
Feng Li
Hao Zhang
Siyi Liu
Lei Zhang
Liwei Wang
30
5
0
15 Nov 2022
Hierarchical Phrase-based Sequence-to-Sequence Learning
Bailin Wang
Ivan Titov
Jacob Andreas
Yoon Kim
26
7
0
15 Nov 2022
VieCap4H-VLSP 2021: ObjectAoA-Enhancing performance of Object Relation Transformer with Attention on Attention for Vietnamese image captioning
Nghia Hieu Nguyen
Duong T.D. Vo
Minh-Quan Ha
ViT
35
1
0
10 Nov 2022
Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
Gaurav Verma
Vishwa Vinay
Ryan A. Rossi
Srijan Kumar
VLM
AAML
15
8
0
04 Nov 2022
Evaluating and Improving Factuality in Multimodal Abstractive Summarization
David Wan
Joey Tianyi Zhou
26
10
0
04 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
36
3
0
04 Nov 2022
CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation
Jun Wang
A. Bhalerao
Terry Yin
Simon See
Yulan He
MedIm
36
16
0
02 Nov 2022
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention
Fenglin Liu
Xian Wu
Shen Ge
Xuancheng Ren
Wei Fan
Xu Sun
Yuexian Zou
VLM
77
12
0
28 Oct 2022
Reinforced Question Rewriting for Conversational Question Answering
Zhiyu Zoey Chen
Jie Zhao
Anjie Fang
B. Fetahu
Oleg Rokhlenko
S. Malmasi
25
27
0
27 Oct 2022
Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards
Jean-Benoit Delbrouck
Pierre J. Chambon
Christian Blüthgen
E. Tsai
Omar Almusa
C. Langlotz
MedIm
61
76
0
21 Oct 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
Hao Fei
25
16
0
20 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
29
46
0
19 Oct 2022
On effects of Knowledge Distillation on Transfer Learning
Sushil Thapa
24
1
0
18 Oct 2022
Probing Cross-modal Semantics Alignment Capability from the Textual Perspective
Zheng Ma
Shi Zong
Mianzhi Pan
Jianbing Zhang
Shujian Huang
Xinyu Dai
Jiajun Chen
30
4
0
18 Oct 2022
Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty
Wenting Xu
Zhenghua Xu
Junyang Chen
Chang Qi
Thomas Lukasiewicz
MedIm
34
7
0
14 Oct 2022
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Dan Su
Pascale Fung
MLLM
VLM
32
63
0
14 Oct 2022
Contextual Modeling for 3D Dense Captioning on Point Clouds
Yufeng Zhong
Longdao Xu
Jiebo Luo
Lin Ma
44
15
0
08 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
45
10
0
04 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
240
0
03 Oct 2022
Human-in-the-loop Robotic Grasping using BERT Scene Representation
Yaoxian Song
Penglei Sun
Pengfei Fang
Linyi Yang
Yanghua Xiao
Yue Zhang
73
5
0
28 Sep 2022
Paraphrasing Is All You Need for Novel Object Captioning
Cheng Yang
Yao-Hung Hubert Tsai
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
51
4
0
25 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
36
10
0
21 Sep 2022
Learning Distinct and Representative Styles for Image Captioning
Qi Chen
Chaorui Deng
Qi Wu
VLM
45
23
0
17 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
32
2
0
16 Sep 2022
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
A. Maharana
Darryl Hannan
Joey Tianyi Zhou
DiffM
37
78
0
13 Sep 2022
Checklist Models for Improved Output Fluency in Piano Fingering Prediction
Nikita Srivatsan
Taylor Berg-Kirkpatrick
29
2
0
12 Sep 2022
Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation
Yu-Jen Chen
Wei-Hsiang Shen
Hao-Wei Chung
Jing-Hao Chiu
Da-Cheng Juan
T. Ho
Chin-Tung Cheng
Meng Li
Tsung-Yi Ho
MedIm
13
12
0
04 Sep 2022
A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang
Mingkang Tang
Lei Wang
Xiu Li
Luping Zhou
ViT
MedIm
29
57
0
22 Aug 2022
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Teruko Mitamura
Alexander G. Hauptmann
16
34
0
18 Aug 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
29
21
0
13 Aug 2022
Towards Sequence-Level Training for Visual Tracking
Minji Kim
Seungkwang Lee
Jungseul Ok
Bohyung Han
Minsu Cho
29
31
0
11 Aug 2022
Attribute Controllable Beautiful Caucasian Face Generation by Aesthetics Driven Reinforcement Learning
Xin Jin
Shu Zhao
Le Zhang
Xin Zhao
Qiang Deng
Chaoen Xiao
EGVM
CVBM
42
2
0
09 Aug 2022