ResearchTrend.AI
SPICE: Semantic Propositional Image Caption Evaluation

29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
    EGVM
arXiv: 1607.08822

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

50 / 949 papers shown
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
81
59
0
26 Jul 2022
Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations
Qian Yang
Yunxin Li
Baotian Hu
Lin Ma
Yuxin Ding
Min Zhang
86
10
0
23 Jul 2022
Zero-Shot Video Captioning with Evolving Pseudo-Tokens
Yoad Tewel
Yoav Shalev
Roy Nadler
Idan Schwartz
Lior Wolf
58
27
0
22 Jul 2022
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
76
15
0
22 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
104
306
0
20 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
84
113
0
20 Jul 2022
Explicit Image Caption Editing
Zhen Wang
Long Chen
Wenbo Ma
G. Han
Yulei Niu
Jian Shao
Jun Xiao
55
12
0
20 Jul 2022
Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation
Chao Zheng
Lianli Gao
Xinyu Lyu
Pengpeng Zeng
Abdulmotaleb El Saddik
Hengtao Shen
89
16
0
16 Jul 2022
Adaptive Fine-Grained Predicates Learning for Scene Graph Generation
Xinyu Lyu
Lianli Gao
Pengpeng Zeng
Hengtao Shen
Jingkuan Song
94
21
0
11 Jul 2022
Predicting Word Learning in Children from the Performance of Computer Vision Systems
Sunayana Rane
Mira L. Nencheva
Zeyu Wang
C. Lew‐Williams
Olga Russakovsky
Thomas Griffiths
99
3
0
07 Jul 2022
Dual-Stream Transformer for Generic Event Boundary Captioning
Xin Gu
Hanhua Ye
Guang Chen
Yufei Wang
Libo Zhang
Longyin Wen
29
4
0
07 Jul 2022
Are metrics measuring what they should? An evaluation of image captioning task metrics
Othón González-Chávez
Guillermo Ruiz
Daniela Moctezuma
Tania A. Ramirez-delreal
73
9
0
04 Jul 2022
Rethinking Surgical Captioning: End-to-End Window-Based MLP Transformer Using Patches
Mengya Xu
Mobarakol Islam
Hongliang Ren
MedIm
68
12
0
30 Jun 2022
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Zanyar Zohourianshahzadi
Jugal Kalita
89
0
0
28 Jun 2022
From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering
Zihao Zhu
NAI ReLM GNN
115
3
0
25 Jun 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
211
1,134
0
22 Jun 2022
REVECA -- Rich Encoder-decoder framework for Video Event CAptioner
Jaehyuk Heo
YongGi Jeong
Sunwoo Kim
Jaehee Kim
Pilsung Kang
28
0
0
18 Jun 2022
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
Zi-Yi Dou
Aishwarya Kamath
Zhe Gan
Pengchuan Zhang
Jianfeng Wang
...
Ce Liu
Yann LeCun
Nanyun Peng
Jianfeng Gao
Lijuan Wang
VLM ObjD
115
129
0
15 Jun 2022
Measuring Representational Harms in Image Captioning
Angelina Wang
Solon Barocas
Kristen Laird
Hanna M. Wallach
111
54
0
14 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
77
91
0
14 Jun 2022
Language Models are General-Purpose Interfaces
Y. Hao
Haoyu Song
Li Dong
Shaohan Huang
Zewen Chi
Wenhui Wang
Shuming Ma
Furu Wei
MLLM
73
102
0
13 Jun 2022
CoSe-Co: Text Conditioned Generative CommonSense Contextualizer
Rachit Bansal
Milan Aggarwal
S. Bhatia
Jivat Neet Kaur
Balaji Krishnamurthy
30
4
0
12 Jun 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
76
0
0
07 Jun 2022
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Andrew Koh
Soham Dinesh Tiwari
Chng Eng Siong
53
1
0
04 Jun 2022
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
Yujia Xie
Luowei Zhou
Xiyang Dai
Lu Yuan
Nguyen Bach
Ce Liu
Michael Zeng
VLM MLLM
69
28
0
03 Jun 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
46
6
0
28 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
172
561
0
27 May 2022
A Survey on Long-Tailed Visual Recognition
Lu Yang
He Jiang
Q. Song
Jun Guo
93
134
0
27 May 2022
Revisiting Generative Commonsense Reasoning: A Pre-Ordering Approach
Chao Zhao
Faeze Brahman
Tenghao Huang
Snigdha Chaturvedi
LRM
62
5
0
26 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
97
33
0
26 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
226
79
0
26 May 2022
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Jin-Hwa Kim
Yunji Kim
Jiyoung Lee
Kang Min Yoo
Sang-Woo Lee
EGVM
101
35
0
25 May 2022
Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics
Elisa Kreiss
Cynthia L. Bennett
Shayan Hooshmand
E. Zelikman
Meredith Ringel Morris
Christopher Potts
74
27
0
21 May 2022
What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics
David M. Chan
Austin Myers
Sudheendra Vijayanarasimhan
David A. Ross
Bryan Seybold
John F. Canny
62
6
0
12 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
104
44
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
84
16
0
11 May 2022
RoViST: Learning Robust Metrics for Visual Storytelling
Eileen Wang
S. Han
Josiah Poon
49
10
0
08 May 2022
Language Models Can See: Plugging Visual Controls in Text Generation
Yixuan Su
Tian Lan
Yahui Liu
Fangyu Liu
Dani Yogatama
Yan Wang
Lingpeng Kong
Nigel Collier
VLM MLLM
102
98
0
05 May 2022
Reducing Predictive Feature Suppression in Resource-Constrained Contrastive Image-Caption Retrieval
Maurits J. R. Bleeker
Andrew Yates
Maarten de Rijke
79
4
0
28 Apr 2022
Controllable Image Captioning
Luka Maxwell
92
0
0
28 Apr 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
94
30
0
25 Apr 2022
Caption Feature Space Regularization for Audio Captioning
Yiming Zhang
Hong Yu
Ruoyi Du
Zhanyu Ma
Yuan Dong
122
1
0
18 Apr 2022
Non-Parallel Text Style Transfer with Self-Parallel Supervision
Ruibo Liu
Chongyang Gao
Chenyan Jia
Guangxuan Xu
Soroush Vosoughi
VLM
82
16
0
18 Apr 2022
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yan Wang
Liujuan Cao
Yongjian Wu
Feiyue Huang
Rongrong Ji
ViT
64
47
0
16 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
87
16
0
08 Apr 2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Yuxuan Wang
Difei Gao
Licheng Yu
Stan Weixian Lei
Matt Feiszli
Mike Zheng Shou
98
25
0
01 Apr 2022
Reproducibility Issues for BERT-based Evaluation Metrics
Yanran Chen
Jonas Belouadi
Steffen Eger
118
17
0
30 Mar 2022
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation
Hongru Wang
Wei Liang
Jianbing Shen
Luc Van Gool
Wenguan Wang
95
58
0
30 Mar 2022
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
84
21
0
29 Mar 2022
End-to-End Transformer Based Model for Image Captioning
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM ViT
62
124
0
29 Mar 2022