CIDEr: Consensus-based Image Description Evaluation

20 November 2014

Ramakrishna Vedantam

C. L. Zitnick

Devi Parikh

Papers citing "CIDEr: Consensus-based Image Description Evaluation"

50 / 2,184 papers shown

VirTex: Learning Visual Representations from Textual Annotations Karan Desai Justin Johnson SSL VLM 173 437 0 11 Jun 2020
Toward Building Safer Smart Homes for the People with Disabilities Shahinur Alam M. Mahmud M. Yeasin 30 4 0 10 Jun 2020
CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training Qipeng Guo Zhijing Jin Xipeng Qiu Weinan Zhang David Wipf Zheng Zhang 129 61 0 08 Jun 2020
NITS-VC System for VATEX Video Captioning Challenge 2020 Alok Singh Thoudam Doren Singh Sivaji Bandyopadhyay 46 16 0 07 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation Mingjie Li Fuyu Wang Xiaojun Chang Xiaodan Liang MedIm 86 107 0 06 Jun 2020
Audio Captioning using Gated Recurrent Units Aysegül Özkaya Eren M. Sert 74 10 0 05 Jun 2020
Pick-Object-Attack: Type-Specific Adversarial Attack for Object Detection Omid Mohamad Nezami Akshay Chaturvedi Mark Dras Utpal Garain AAML ObjD 61 19 0 05 Jun 2020
Graph-Stega: Semantic Controllable Steganographic Text Generation Guided by Knowledge Graph Zhongliang Yang Baitao Gong Yamin Li Jinshuai Yang Zhiwen Hu Yongfeng Huang 60 7 0 02 Jun 2020
Controlling Length in Image Captioning Ruotian Luo G. Shakhnarovich VLM 99 3 0 29 May 2020
TIME: Text and Image Mutual-Translation Adversarial Networks Bingchen Liu Kunpeng Song Yizhe Zhu Gerard de Melo Ahmed Elgammal 63 32 0 27 May 2020
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding Fenglin Liu Xuancheng Ren Guangxiang Zhao Chenyu You Xuewei Ma Xian Wu Xu Sun 77 2 0 16 May 2020
Visual Relationship Detection using Scene Graphs: A Survey Aniket Agarwal Ayush Mangal Vipul GNN 70 21 0 16 May 2020
C3VQG: Category Consistent Cyclic Visual Question Generation Shagun Uppal Anish Madan Sarthak Bhagat Yi Yu R. Shah 57 19 0 15 May 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning Jie Lei Liwei Wang Yelong Shen Dong Yu Tamara L. Berg Joey Tianyi Zhou 72 191 0 11 May 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes Douwe Kiela Hamed Firooz Aravind Mohan Vedanuj Goswami Amanpreet Singh Pratik Ringshia Davide Testuggine 109 612 0 10 May 2020
Posterior Control of Blackbox Generation Xiang Lisa Li Alexander M. Rush 67 25 0 10 May 2020
Improving Adversarial Text Generation by Modeling the Distant Future Ruiyi Zhang Changyou Chen Zhe Gan Wenlin Wang Dinghan Shen Guoyin Wang Zheng Wen Lawrence Carin 71 12 0 04 May 2020
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence Xiaoyu Shen Ernie Chang Hui Su Jie Zhou Dietrich Klakow 77 49 0 03 May 2020
Clue: Cross-modal Coherence Modeling for Caption Generation Malihe Alikhani Piyush Sharma Shengjie Li Radu Soricut Matthew Stone 122 57 0 02 May 2020
Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage Ashish V. Thapliyal Radu Soricut 44 12 0 01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training Linjie Li Yen-Chun Chen Yu Cheng Zhe Gan Licheng Yu Jingjing Liu MLLM VLM OffRL AI4TS 133 507 0 01 May 2020
NUBIA: NeUral Based Interchangeability Assessor for Text Generation Hassan Kané Muhammed Yusuf Kocyigit Ali Abdalla Pelkins Ajanoh Mohamed Coulibali 79 59 0 30 Apr 2020
Towards Embodied Scene Description Sinan Tan Huaping Liu Di Guo Xinyu Zhang F. Sun LM&Ro 52 9 0 30 Apr 2020
Image Captioning through Image Transformer Sen He Wentong Liao Hamed R. Tavakoli M. Yang Bodo Rosenhahn N. Pugeault ViT 95 94 0 29 Apr 2020
BLEU Neighbors: A Reference-less Approach to Automatic Evaluation Kawin Ethayarajh Dorsa Sadigh 30 4 0 27 Apr 2020
Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports Baoyu Jing Zeya Wang Eric Xing 102 142 0 26 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image J. S. Park Chandra Bhagavatula Roozbeh Mottaghi Ali Farhadi Yejin Choi ReLM LRM 75 6 0 22 Apr 2020
A Revised Generative Evaluation of Visual Dialogue Daniela Massiceti Viveka Kulharia P. Dokania N. Siddharth Philip Torr 40 0 0 20 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning Alasdair Tran A. Mathews Lexing Xie VLM 60 97 0 17 Apr 2020
Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity Hamza Harkous Isabel Groves Amir Saffari 86 89 0 08 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning Zhaojiang Lin Andrea Madotto Pascale Fung 105 163 0 08 Apr 2020
Context-Aware Group Captioning via Self-Attention and Contrastive Features Zhuowan Li Quan Hung Tran Long Mai Zhe Lin Alan Yuille VLM 81 44 0 07 Apr 2020
B-SCST: Bayesian Self-Critical Sequence Training for Image Captioning Shashank Bujimalla Mahesh Subedar Omesh Tickoo BDL UQCV 25 10 0 06 Apr 2020
Machine Translation Pre-training for Data-to-Text Generation -- A Case Study in Czech Mihir Kale Scott Roy 51 14 0 05 Apr 2020
More Grounded Image Captioning by Distilling Image-Text Matching Model Yuanen Zhou Meng Wang Daqing Liu Zhenzhen Hu Hanwang Zhang 90 126 0 01 Apr 2020
X-Linear Attention Networks for Image Captioning Yingwei Pan Ting Yao Yehao Li Tao Mei 134 519 0 31 Mar 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation Boxiao Pan Haoye Cai De-An Huang Kuan-Hui Lee Adrien Gaidon Ehsan Adeli Juan Carlos Niebles 79 236 0 31 Mar 2020
Detection and Description of Change in Visual Streams Davis Gilton Ruotian Luo Rebecca Willett Gregory Shakhnarovich AI4TS 52 4 0 27 Mar 2020
Assessing Image Quality Issues for Real-World Problems Tai-Yin Chiu Yinan Zhao Danna Gurari 137 54 0 27 Mar 2020
Grounded Situation Recognition Sarah M Pratt Mark Yatskar Luca Weihs Ali Farhadi Aniruddha Kembhavi 99 112 0 26 Mar 2020
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models Pranav Agarwal Alejandro Betancourt V. Panagiotou Natalia Díaz Rodríguez EGVM 82 10 0 26 Mar 2020
TextCaps: a Dataset for Image Captioning with Reading Comprehension Oleksii Sidorov Ronghang Hu Marcus Rohrbach Amanpreet Singh 103 418 0 24 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning Longteng Guo Jing Liu Xinxin Zhu Peng Yao Shichen Lu Hanqing Lu ViT 201 192 0 19 Mar 2020
Video Caption Dataset for Describing Human Actions in Japanese Yutaro Shigeto Yuya Yoshikawa Jiaqing Lin A. Takeuchi 34 3 0 10 Mar 2020
Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading Mingshuang Luo Shuang Yang Shiguang Shan Xilin Chen 87 41 0 09 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect Xu Yang Hanwang Zhang Jianfei Cai CML 79 127 0 09 Mar 2020
Better Captioning with Sequence-Level Exploration Jia Chen Qin Jin 61 12 0 08 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement Fangyi Zhu Lei Li Zhanyu Ma Guang Chen Jun Guo 36 1 0 08 Mar 2020
Captioning Images with Novel Objects via Online Vocabulary Expansion Mikihiro Tanaka Tatsuya Harada 3DV 77 2 0 06 Mar 2020
Show, Edit and Tell: A Framework for Editing Image Captions Fawaz Sammani Luke Melas-Kyriazi KELM DiffM 108 59 0 06 Mar 2020