CIDEr: Consensus-based Image Description Evaluation

20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
arXiv: 1411.5726

Papers citing "CIDEr: Consensus-based Image Description Evaluation"

50 / 2,142 papers shown
Dual-Stream Transformer for Generic Event Boundary Captioning
Xin Gu
Hanhua Ye
Guang Chen
Yufei Wang
Libo Zhang
Longyin Wen
19
4
0
07 Jul 2022
Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation
Bin Li
Yixuan Weng
Ziyu Ma
Bin Sun
Shutao Li
VLM
17
2
0
05 Jul 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Steven C.H. Hoi
SyDa
ALM
135
243
0
05 Jul 2022
Are metrics measuring what they should? An evaluation of image captioning task metrics
Othón González-Chávez
Guillermo Ruiz
Daniela Moctezuma
Tania A. Ramirez-delreal
21
9
0
04 Jul 2022
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chuan Guo
Xinxin Zuo
Sen Wang
Li Cheng
VGen
87
230
0
04 Jul 2022
Attributed Abnormality Graph Embedding for Clinically Accurate X-Ray Report Generation
Sixing Yan
William K. Cheung
Keith W H Chiu
Terence M. Tong
Charles K. Cheung
Simon See
MedIm
38
14
0
04 Jul 2022
Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang
Bin Guo
Y. Zeng
Yasan Ding
Chen Qiu
Ying Zhang
Li Yao
Zhiwen Yu
45
2
0
02 Jul 2022
Syntax Controlled Knowledge Graph-to-Text Generation with Order and Semantic Consistency
Jin Liu
Chongfeng Fan
Feng Zhou
Huijuan Xu
36
5
0
02 Jul 2022
Rethinking Surgical Captioning: End-to-End Window-Based MLP Transformer Using Patches
Mengya Xu
Mobarakol Islam
Hongliang Ren
MedIm
32
11
0
30 Jun 2022
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Zanyar Zohourianshahzadi
Jugal Kalita
36
0
0
28 Jun 2022
VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
Kashu Yamazaki
Sang Truong
Khoa T. Vo
Michael Kidd
Chase Rainwater
Khoa Luu
Ngan Le
VLM
CoGe
15
25
0
26 Jun 2022
MVP: Multi-task Supervised Pre-training for Natural Language Generation
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
54
24
0
24 Jun 2022
Surgical-VQA: Visual Question Answering in Surgical Scenes using Transformer
Lalithkumar Seenivasan
Mobarakol Islam
Adithya K. Krishna
Hongliang Ren
MedIm
21
45
0
22 Jun 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
137
1,076
0
22 Jun 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
32
1
0
21 Jun 2022
REVECA -- Rich Encoder-decoder framework for Video Event CAptioner
Jaehyuk Heo
YongGi Jeong
Sunwoo Kim
Jaehee Kim
Pilsung Kang
18
0
0
18 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
45
132
0
18 Jun 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
22
3
0
16 Jun 2022
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
Zi-Yi Dou
Aishwarya Kamath
Zhe Gan
Pengchuan Zhang
Jianfeng Wang
...
Ce Liu
Yann LeCun
Nanyun Peng
Jianfeng Gao
Lijuan Wang
VLM
ObjD
35
124
0
15 Jun 2022
Measuring Representational Harms in Image Captioning
Angelina Wang
Solon Barocas
Kristen Laird
Hanna M. Wallach
21
51
0
14 Jun 2022
Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
37
69
0
14 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
28
88
0
14 Jun 2022
Language Models are General-Purpose Interfaces
Y. Hao
Haoyu Song
Li Dong
Shaohan Huang
Zewen Chi
Wenhui Wang
Shuming Ma
Furu Wei
MLLM
35
96
0
13 Jun 2022
CoSe-Co: Text Conditioned Generative CommonSense Contextualizer
Rachit Bansal
Milan Aggarwal
S. Bhatia
Jivat Neet Kaur
Balaji Krishnamurthy
19
4
0
12 Jun 2022
Bridging the Gap Between Training and Inference of Bayesian Controllable Language Models
Han Liu
Bingning Wang
Ting Yao
Haijin Liang
Jianjin Xu
Xiaolin Hu
BDL
40
1
0
11 Jun 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
22
0
0
07 Jun 2022
Intra-agent speech permits zero-shot task acquisition
Chen Yan
Federico Carnevale
Petko Georgiev
Adam Santoro
Aurelia Guy
Alistair Muldal
Chia-Chun Hung
Josh Abramson
Timothy Lillicrap
Greg Wayne
LM&Ro
48
9
0
07 Jun 2022
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Mingjie Li
Wenjia Cai
Karin Verspoor
Shirui Pan
Xiaodan Liang
Xiaojun Chang
MedIm
41
35
0
04 Jun 2022
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Andrew Koh
Soham Dinesh Tiwari
Chng Eng Siong
25
1
0
04 Jun 2022
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
Yujia Xie
Luowei Zhou
Xiyang Dai
Lu Yuan
Nguyen Bach
Ce Liu
Michael Zeng
VLM
MLLM
37
28
0
03 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
35
51
0
01 Jun 2022
CLIP4IDC: CLIP for Image Difference Captioning
Zixin Guo
Tong Wang
Jorma T. Laaksonen
VLM
29
27
0
01 Jun 2022
HierarchyNet: Learning to Summarize Source Code with Heterogeneous Representations
Minh Huynh Nguyen
Nghi D. Q. Bui
Truong-Son Hy
Long Tran-Thanh
Tien N. Nguyen
40
4
0
31 May 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe
VLM
37
13
0
30 May 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
16
6
0
28 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
64
531
0
27 May 2022
Revisiting Generative Commonsense Reasoning: A Pre-Ordering Approach
Chao Zhao
Faeze Brahman
Tenghao Huang
Snigdha Chaturvedi
LRM
29
3
0
26 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
29
31
0
26 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Mohit Bansal
CLIP
134
77
0
26 May 2022
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
Prakhar Gupta
Cathy Jiao
Yi-Ting Yeh
Shikib Mehri
M. Eskénazi
Jeffrey P. Bigham
ALM
49
47
0
25 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
123
36
0
25 May 2022
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Jin-Hwa Kim
Yunji Kim
Jiyoung Lee
Kang Min Yoo
Sang-Woo Lee
EGVM
45
34
0
25 May 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
90
72
0
25 May 2022
TempLM: Distilling Language Models into Template-Based Generators
Tianyi Zhang
Mina Lee
Lisa Li
Ende Shen
Tatsunori B. Hashimoto
VLM
50
5
0
23 May 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Mohit Bansal
Heng Ji
MLLM
VLM
170
138
0
22 May 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
Xinming Zhang
Dongfang Liu
31
59
0
22 May 2022
Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics
Elisa Kreiss
Cynthia L. Bennett
Shayan Hooshmand
E. Zelikman
Meredith Ringel Morris
Christopher Potts
53
27
0
21 May 2022
What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
Bryan Seybold
John F. Canny
33
6
0
12 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
34
38
0
12 May 2022
Explainable Deep Learning Methods in Medical Image Classification: A Survey
Cristiano Patrício
João C. Neves
Luís F. Teixeira
XAI
29
53
0
10 May 2022