CIDEr: Consensus-based Image Description Evaluation

20 November 2014
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh

Papers citing "CIDEr: Consensus-based Image Description Evaluation"

50 / 2,142 papers shown
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
38
30
0
06 Mar 2022
RACE: Retrieval-Augmented Commit Message Generation
Ensheng Shi
Yanlin Wang
Wei Tao
Lun Du
Hongyu Zhang
Shi Han
Dongmei Zhang
Hongbin Sun
VLM
27
41
0
05 Mar 2022
FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context
Pinaki Nath Chowdhury
Aneeshan Sain
A. Bhunia
Tao Xiang
Yulia Gryaditskaya
Yi-Zhe Song
3DV
48
52
0
04 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
Shixuan Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
24
37
0
03 Mar 2022
A Deep Neural Framework for Image Caption Generation Using GRU-Based Attention Mechanism
Rashid Khan
Shujah Islam
Khadija Kanwal
Mansoor Iqbal
Md. Imran Hossain
Z. Ye
3DV
28
16
0
03 Mar 2022
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
Lianhui Qin
Sean Welleck
Daniel Khashabi
Yejin Choi
AI4CE
63
145
0
23 Feb 2022
Exploiting long-term temporal dynamics for video captioning
Yuyu Guo
Jingqiu Zhang
Lianli Gao
19
18
0
22 Feb 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
43
27
0
21 Feb 2022
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Jianqiao Zhao
Yanyang Li
Wanyu Du
Yangfeng Ji
Dong Yu
M. Lyu
Liwei Wang
33
4
0
14 Feb 2022
I-Tuning: Tuning Frozen Language Models with Image for Lightweight Image Captioning
Ziyang Luo
Zhipeng Hu
Yadong Xi
Rongsheng Zhang
Jing Ma
VLM
28
13
0
14 Feb 2022
Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs
Daniel Louzada Fernandes
Marcos Henrique Fonseca Ribeiro
F. Cerqueira
Michel Melo Silva
22
6
0
10 Feb 2022
Image Difference Captioning with Pre-training and Contrastive Learning
Linli Yao
Weiying Wang
Qin Jin
SSL
VLM
33
41
0
09 Feb 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
ViT
145
171
0
08 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
74
852
0
07 Feb 2022
Webly Supervised Concept Expansion for General Purpose Vision Models
Amita Kamath
Christopher Clark
Tanmay Gupta
Eric Kolve
Derek Hoiem
Aniruddha Kembhavi
VLM
37
54
0
04 Feb 2022
Joint Speech Recognition and Audio Captioning
Chaitanya Narisetty
E. Tsunoo
Xuankai Chang
Yosuke Kashiwagi
Michael Hentschel
Shinji Watanabe
27
10
0
03 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
30
89
0
31 Jan 2022
A Frustratingly Simple Approach for End-to-End Image Captioning
Ziyang Luo
Yadong Xi
Rongsheng Zhang
Jing Ma
VLM
MLLM
25
16
0
30 Jan 2022
BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment
Luis Lebron
Yvette Graham
Kevin McGuinness
K. Kouramas
Noel E. O'Connor
46
3
0
25 Jan 2022
Transformers in Medical Imaging: A Survey
Fahad Shamshad
Salman Khan
Syed Waqas Zamir
Muhammad Haris Khan
Munawar Hayat
Fahad Shahbaz Khan
Huazhu Fu
ViT
LM&MA
MedIm
111
663
0
24 Jan 2022
Improving Chest X-Ray Report Generation by Leveraging Warm Starting
Aaron Nicolson
Jason Dowling
Bevan Koopman
ViT
LM&MA
MedIm
32
90
0
24 Jan 2022
WIDAR -- Weighted Input Document Augmented ROUGE
Raghav Jain
Vaibhav Mavi
Anubhav Jangra
S. Saha
22
4
0
23 Jan 2022
End-to-end Generative Pretraining for Multimodal Video Captioning
Paul Hongsuck Seo
Arsha Nagrani
Anurag Arnab
Cordelia Schmid
29
166
0
20 Jan 2022
Instance-aware Prompt Learning for Language Understanding and Generation
Feihu Jin
Jinliang Lu
Jiajun Zhang
Chengqing Zong
27
32
0
18 Jan 2022
What Makes the Story Forward? Inferring Commonsense Explanations as Prompts for Future Event Generation
Li Lin
Yixin Cao
Lifu Huang
Shuang Li
Xuming Hu
Lijie Wen
Jianmin Wang
AI4TS
45
17
0
18 Jan 2022
Prior Knowledge Enhances Radiology Report Generation
Song Wang
Liyan Tang
Mingquan Lin
George Shih
Ying Ding
Yifan Peng
MedIm
37
20
0
11 Jan 2022
Local Information Assisted Attention-free Decoder for Audio Captioning
Feiyang Xiao
Jian Guan
Haiyan Lan
Qiaoxi Zhu
Wenwu Wang
40
11
0
10 Jan 2022
Glance and Focus Networks for Dynamic Visual Recognition
Gao Huang
Yulin Wang
Kangchen Lv
Haojun Jiang
Wenhui Huang
Pengfei Qi
S. Song
3DH
79
49
0
09 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
22
16
0
06 Jan 2022
All You Need In Sign Language Production
R. Rastgoo
Kourosh Kiani
Sergio Escalera
V. Athitsos
Mohammad Sabokrou
17
8
0
05 Jan 2022
StyleM: Stylized Metrics for Image Captioning Built with Contrastive N-grams
Chengxi Li
Brent Harrison
30
3
0
04 Jan 2022
Radiology Report Generation with a Learned Knowledge Base and Multi-modal Alignment
Shuxin Yang
Xian Wu
Shen Ge
S. Kevin Zhou
Li Xiao
MedIm
39
90
0
30 Dec 2021
Knowledge Matters: Radiology Report Generation with General and Specific Knowledge
Shuxin Yang
Xian Wu
Shen Ge
S. Kevin Zhou
Li Xiao
MedIm
27
110
0
30 Dec 2021
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
42
2
0
28 Dec 2021
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
36
49
0
27 Dec 2021
ScanQA: 3D Question Answering for Spatial Scene Understanding
Daich Azuma
Taiki Miyanishi
Shuhei Kurita
M. Kawanabe
32
179
0
20 Dec 2021
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
Ximing Lu
Sean Welleck
Peter West
Liwei Jiang
Jungo Kasai
...
Lianhui Qin
Youngjae Yu
Rowan Zellers
Noah A. Smith
Yejin Choi
11
158
0
16 Dec 2021
Dense Video Captioning Using Unsupervised Semantic Information
Valter Estevam
Rayson Laroca
Hélio Pedrini
David Menotti
14
9
0
15 Dec 2021
KGR^4: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation
Xin Liu
Dayiheng Liu
Baosong Yang
Haibo Zhang
Junwei Ding
Wenqing Yao
Weihua Luo
Haiying Zhang
Jinsong Su
LRM
32
8
0
15 Dec 2021
CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Hongyang Chao
Tao Mei
VLM
18
42
0
14 Dec 2021
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-based Image Captioning
Wenqiao Zhang
Haochen Shi
Jiannan Guo
Shengyu Zhang
Qingpeng Cai
Juncheng Li
Sihui Luo
Yueting Zhuang
DiffM
31
46
0
13 Dec 2021
Contextualized Scene Imagination for Generative Commonsense Reasoning
Peifeng Wang
Jonathan Zamora
Junfeng Liu
Filip Ilievski
Muhao Chen
Xiang Ren
ReLM
LRM
40
16
0
12 Dec 2021
Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation
Tianyi Liu
Zuxuan Wu
Wenhan Xiong
Jingjing Chen
Yu-Gang Jiang
VLM
MLLM
32
10
0
10 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
32
86
0
09 Dec 2021
Self-Supervised Image-to-Text and Text-to-Image Synthesis
Anindya Sundar Das
S. Saha
SSL
21
5
0
09 Dec 2021
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Lavinia Dunagan
Jacob Morrison
Alexander R. Fabbri
Yejin Choi
Noah A. Smith
57
39
0
08 Dec 2021
Search and Learn: Improving Semantic Coverage for Data-to-Text Generation
Shailza Jolly
Zi Xuan Zhang
Andreas Dengel
Lili Mou
39
11
0
06 Dec 2021
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation
Pierre Colombo
Chloe Clave
Pablo Piantanida
40
41
0
02 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
23
29
0
02 Dec 2021
Object-Centric Unsupervised Image Captioning
Zihang Meng
David Yang
Xuefei Cao
Ashish Shah
Ser-Nam Lim
OCL
VLM
27
11
0
02 Dec 2021