ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1607.08822
  4. Cited By
SPICE: Semantic Propositional Image Caption Evaluation

SPICE: Semantic Propositional Image Caption Evaluation

29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
    EGVM
ArXiv (abs)PDFHTML

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

50 / 949 papers shown
Title
An overview on the evaluated video retrieval tasks at TRECVID 2022
An overview on the evaluated video retrieval tasks at TRECVID 2022
G. Awad
Keith Curtis
A. Butt
Jonathan G. Fiscus
A. Godil
...
Eliot Godard
Lukas L. Diduch
Jeffrey Liu
Yvette Graham
Georges Quénot
41
10
0
22 Jun 2023
SituatedGen: Incorporating Geographical and Temporal Contexts into
  Generative Commonsense Reasoning
SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning
Yunxiang Zhang
Xiaojun Wan
AILawLRM
72
7
0
21 Jun 2023
Learning to Generate Better Than Your LLM
Learning to Generate Better Than Your LLM
Jonathan D. Chang
Kianté Brantley
Rajkumar Ramamurthy
Dipendra Kumar Misra
Wen Sun
68
49
0
20 Jun 2023
Improving Image Captioning Descriptiveness by Ranking and LLM-based
  Fusion
Improving Image Captioning Descriptiveness by Ranking and LLM-based Fusion
Simone Bianco
Luigi Celona
Marco Donzella
Paolo Napoletano
75
20
0
20 Jun 2023
Improving Audio Caption Fluency with Automatic Error Correction
Improving Audio Caption Fluency with Automatic Error Correction
Hanxue Zhang
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
48
0
0
16 Jun 2023
Listener Model for the PhotoBook Referential Game with CLIPScores as
  Implicit Reference Chain
Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain
Shih-Lun Wu
Yi-Hui Chou
Liang Li
58
0
0
16 Jun 2023
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
51
3
0
13 Jun 2023
Embodied Executable Policy Learning with Language-based Scene
  Summarization
Embodied Executable Policy Learning with Language-based Scene Summarization
Jielin Qiu
Mengdi Xu
William Jongwon Han
Seungwhan Moon
Ding Zhao
LM&Ro
83
8
0
09 Jun 2023
Towards Adaptable and Interactive Image Captioning with Data
  Augmentation and Episodic Memory
Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
CLLVLM
56
0
0
06 Jun 2023
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of
  Scientific Figure Captioning
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning
Zhishen Yang
Raj Dabre
Hideki Tanaka
Naoaki Okazaki
135
19
0
06 Jun 2023
Enhance Temporal Relations in Audio Captioning with Sound Event
  Detection
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
68
10
0
02 Jun 2023
Adapting a ConvNeXt model to audio classification on AudioSet
Adapting a ConvNeXt model to audio classification on AudioSet
Thomas Pellegrini
Ismail Khalfaoui-Hassani
Etienne Labbé
T. Masquelier
90
23
0
01 Jun 2023
CapText: Large Language Model-based Caption Generation From Image
  Context and Description
CapText: Large Language Model-based Caption Generation From Image Context and Description
Shinjini Ghosh
Sagnik Anupam
VLM
64
3
0
01 Jun 2023
DisCLIP: Open-Vocabulary Referring Expression Generation
DisCLIP: Open-Vocabulary Referring Expression Generation
Lior Bracha
E. Shaar
Aviv Shamsian
Ethan Fetaya
Gal Chechik
ObjD
123
7
0
30 May 2023
Dual Transformer Decoder based Features Fusion Network for Automated
  Audio Captioning
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning
Jianyuan Sun
Xubo Liu
Xinhao Mei
V. Kılıç
Mark D. Plumbley
Wenwu Wang
58
3
0
30 May 2023
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph
  Parsing
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing
Zhuang Li
Yuyang Chai
Terry Yue Zhuo
Zhuang Li
Gholamreza Haffari
Fei Li
Donghong Ji
Quan Hung Tran
115
33
0
27 May 2023
Learning to Imagine: Visually-Augmented Natural Language Generation
Learning to Imagine: Visually-Augmented Natural Language Generation
Tianyi Tang
Yushuo Chen
Yifan Du
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
DiffM
75
9
0
26 May 2023
Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion
  Data and Natural Language
Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language
Nicola Messina
J. Sedmidubský
Fabrizio Falchi
Tomávs Rebok
EGVM
61
12
0
25 May 2023
Visual Programming for Text-to-Image Generation and Evaluation
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
MLLM
114
51
0
24 May 2023
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying
  References
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References
Tianyi Tang
Hongyuan Lu
Yuchen Eleanor Jiang
Haoyang Huang
Dongdong Zhang
Wayne Xin Zhao
Tom Kocmi
Furu Wei
58
7
0
24 May 2023
#REVAL: a semantic evaluation framework for hashtag recommendation
#REVAL: a semantic evaluation framework for hashtag recommendation
Areej Alsini
D. Huynh
A. Datta
31
0
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
112
16
0
24 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based
  Text-to-Image Generation by Selection
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
86
21
0
22 May 2023
GEST: the Graph of Events in Space and Time as a Common Representation
  between Vision and Language
GEST: the Graph of Events in Space and Time as a Common Representation between Vision and Language
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
70
0
0
22 May 2023
A request for clarity over the End of Sequence token in the
  Self-Critical Sequence Training
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
73
6
0
20 May 2023
What Makes for Good Visual Tokenizers for Large Language Models?
What Makes for Good Visual Tokenizers for Large Language Models?
Guangzhi Wang
Yixiao Ge
Xiaohan Ding
Mohan S. Kankanhalli
Ying Shan
MLLMVLM
93
39
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
73
7
0
20 May 2023
PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For
  Vision-and-Language Navigation
PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation
Liuyi Wang
Chengju Liu
Zongtao He
Shu Li
Qingqing Yan
Huiyi Chen
Qi Chen
71
10
0
19 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image
  Synthesis Evaluation
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
140
79
0
18 May 2023
Listen, Think, and Understand
Listen, Think, and Understand
Yuan Gong
Hongyin Luo
Alexander H. Liu
Leonid Karlinsky
James R. Glass
ELMMLLMLRM
126
161
0
18 May 2023
Foundations of Spatial Perception for Robotics: Hierarchical
  Representations and Real-time Systems
Foundations of Spatial Perception for Robotics: Hierarchical Representations and Real-time Systems
Nathan Hughes
Yun Chang
Siyi Hu
Rajat Talak
Rumaisa Abdulhai
Jared Strader
Luca Carlone
63
54
0
11 May 2023
Simple Token-Level Confidence Improves Caption Correctness
Simple Token-Level Confidence Improves Caption Correctness
Suzanne Petryk
Spencer Whitehead
Joseph E. Gonzalez
Trevor Darrell
Anna Rohrbach
Marcus Rohrbach
85
7
0
11 May 2023
InfoMetIC: An Informative Metric for Reference-free Image Caption
  Evaluation
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation
Anwen Hu
Shizhe Chen
Liang Zhang
Qin Jin
61
22
0
10 May 2023
Transforming Visual Scene Graphs to Image Captions
Transforming Visual Scene Graphs to Image Captions
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
97
21
0
03 May 2023
Diverse and Vivid Sound Generation from Text Descriptions
Diverse and Vivid Sound Generation from Text Descriptions
Guangwei Li
Xuenan Xu
Lingfeng Dai
Mengyue Wu
K. Yu
95
4
0
03 May 2023
Visual Transformation Telling
Visual Transformation Telling
Wanqing Cui
Mustafa Nasir-Moin
Yanyan Lan
Viola J. Chen
Jiafeng Guo
Xueqi Cheng
LRM
105
1
0
03 May 2023
Multimodal Data Augmentation for Image Captioning using Diffusion Models
Multimodal Data Augmentation for Image Captioning using Diffusion Models
Changrong Xiao
S. Xu
Kunpeng Zhang
DiffM
73
10
0
03 May 2023
Multitask learning in Audio Captioning: a sentence embedding regression
  loss acts as a regularizer
Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
Etienne Labbé
J. Pinquier
Thomas Pellegrini
92
5
0
02 May 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs
VPGTrans: Transfer Visual Prompt Generator across LLMs
Ao Zhang
Hao Fei
Yuan Yao
Wei Ji
Li Li
Zhiyuan Liu
Tat-Seng Chua
MLLMVLM
82
89
0
02 May 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision
  Impairment
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment
Lu Yu
Malvina Nikandrou
Jiali Jin
Verena Rieser
88
6
0
28 Apr 2023
From Association to Generation: Text-only Captioning by Unsupervised
  Cross-modal Mapping
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping
Junyan Wang
Ming Yan
Yi Zhang
Jitao Sang
CLIPVLM
74
9
0
26 Apr 2023
A Review of Deep Learning for Video Captioning
A Review of Deep Learning for Video Captioning
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Min Zhang
Fatih Porikli
3DV
115
22
0
22 Apr 2023
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Sihan Chen
Xingjian He
Longteng Guo
Xinxin Zhu
Weining Wang
Jinhui Tang
Jinhui Tang
VLM
126
111
0
17 Apr 2023
Tractable Control for Autoregressive Language Generation
Tractable Control for Autoregressive Language Generation
Honghua Zhang
Meihua Dang
Nanyun Peng
Guy Van den Broeck
BDL
107
45
0
15 Apr 2023
A-CAP: Anticipation Captioning with Commonsense Knowledge
A-CAP: Anticipation Captioning with Commonsense Knowledge
D. Vo
Quoc-An Luong
Akihiro Sugimoto
Hideki Nakayama
65
1
0
13 Apr 2023
Model-Agnostic Gender Debiased Image Captioning
Model-Agnostic Gender Debiased Image Captioning
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
109
18
0
07 Apr 2023
Graph Attention for Automated Audio Captioning
Graph Attention for Automated Audio Captioning
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Wenwu Wang
64
8
0
07 Apr 2023
Cross-Domain Image Captioning with Discriminative Finetuning
Cross-Domain Image Captioning with Discriminative Finetuning
Roberto Dessì
Michele Bevilacqua
Eleonora Gualdoni
Nathanaël Carraz Rakotonirina
Francesca Franzon
Marco Baroni
CLIP
92
19
0
04 Apr 2023
Prefix tuning for automated audio captioning
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
100
45
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for
  Audio-Language Multimodal Research
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
175
220
0
30 Mar 2023
Previous
123...678...171819
Next