ResearchTrend.AI
Show and Tell: A Neural Image Caption Generator

17 November 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
    3DV

Papers citing "Show and Tell: A Neural Image Caption Generator"

50 / 2,022 papers shown
vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM
Thanh Van Nguyen
Long H. Nguyen
Nhat Truong Pham
Liu Tai Nguyen
Van Huong Do
Hai Nguyen
Ngoc Duy Nguyen
VLM
ViT
20
1
0
03 Sep 2022
A Deep Perceptual Measure for Lens and Camera Calibration
Yannick Hold-Geoffroy
Dominique Piché-Meunier
Kalyan Sunkavalli
Jean-Charles Bazin
François Rameau
Jean-François Lalonde
HAI
14
10
0
25 Aug 2022
Large-Scale Traffic Congestion Prediction based on Multimodal Fusion and Representation Mapping
Bo Zhou
Jiahui Liu
Songyi Cui
Yaping Zhao
26
5
0
23 Aug 2022
A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang
Mingkang Tang
Lei Wang
Xiu Li
Luping Zhou
ViT
MedIm
24
57
0
22 Aug 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
29
21
0
13 Aug 2022
Aesthetic Attributes Assessment of Images with AMANv2 and DPC-CaptionsV2
Xinghui Zhou
Xin Jin
Jianwen Lv
Heng Huang
Ming Mao
Shuai Cui
CoGe
18
0
0
09 Aug 2022
Distinctive Image Captioning via CLIP Guided Group Optimization
Youyuan Zhang
Jiuniu Wang
Hao Wu
Wenjia Xu
VLM
40
8
0
08 Aug 2022
Neural Message Passing for Visual Relationship Detection
Yue Hu
Siheng Chen
Xu Chen
Ya Zhang
Xiao Gu
44
17
0
08 Aug 2022
Pro-tuning: Unified Prompt Tuning for Vision Tasks
Xing Nie
Bolin Ni
Jianlong Chang
Gaomeng Meng
Chunlei Huo
Zhaoxiang Zhang
Shiming Xiang
Qi Tian
Chunhong Pan
AAML
VPVLM
VLM
34
70
0
28 Jul 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
29
57
0
26 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
30
22
0
22 Jul 2022
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
42
14
0
22 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
36
106
0
20 Jul 2022
Explicit Image Caption Editing
Zhen Wang
Long Chen
Wenbo Ma
G. Han
Yulei Niu
Jian Shao
Jun Xiao
25
12
0
20 Jul 2022
Unifying Event Detection and Captioning as Sequence Generation via Pre-Training
Qi Zhang
Yuqing Song
Qin Jin
30
24
0
18 Jul 2022
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qi Tian
Qingming Huang
30
60
0
18 Jul 2022
A Baseline for Detecting Out-of-Distribution Examples in Image Captioning
Gabi Shalev
Gal-Lev Shalev
Joseph Keshet
OODD
27
7
0
12 Jul 2022
Towards Multimodal Vision-Language Models Generating Non-Generic Text
Wes Robbins
Zanyar Zohourianshahzadi
Jugal Kalita
14
1
0
09 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
38
3
0
07 Jul 2022
Zero-shot Cross-Linguistic Learning of Event Semantics
Malihe Alikhani
Thomas Kober
Bashar Alhafni
Yue (Eleanor) Chen
Mert Inan
Elizabeth Nielsen
Shahab Raji
Mark Steedman
Matthew Stone
34
3
0
05 Jul 2022
Are metrics measuring what they should? An evaluation of image captioning task metrics
Othón González-Chávez
Guillermo Ruiz
Daniela Moctezuma
Tania A. Ramirez-delreal
21
9
0
04 Jul 2022
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chuan Guo
Xinxin Zuo
Sen Wang
Li Cheng
VGen
87
230
0
04 Jul 2022
Attributed Abnormality Graph Embedding for Clinically Accurate X-Ray Report Generation
Sixing Yan
William K. Cheung
Keith W H Chiu
Terence M. Tong
Charles K. Cheung
Simon See
MedIm
33
14
0
04 Jul 2022
Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation Learning and Retrieval
Keyu Wen
Zhenshan Tan
Qingrong Cheng
Cheng Chen
X. Gu
VLM
29
0
0
02 Jul 2022
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Zanyar Zohourianshahzadi
Jugal Kalita
36
0
0
28 Jun 2022
Automatic Generation of Product-Image Sequence in E-commerce
Xiaochuan Fan
Chi Zhang
Yong-Jie Yang
Yue Shang
Xueying Zhang
Zhen He
Yun Xiao
Bo Long
Lingfei Wu
28
4
0
26 Jun 2022
Competence-based Multimodal Curriculum Learning for Medical Report Generation
Fenglin Liu
Shen Ge
Yuexian Zou
Xian Wu
MedIm
27
132
0
24 Jun 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
32
1
0
21 Jun 2022
DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Yunhao Ge
Lyne Tchapmi
Brian Nlong Zhao
Neel Joshi
Laurent Itti
Vibhav Vineet
DiffM
ObjD
28
16
0
20 Jun 2022
A Self-Guided Framework for Radiology Report Generation
Jun Li
Shibo Li
Ying Hu
Huiren Tao
MedIm
22
21
0
19 Jun 2022
REVECA -- Rich Encoder-decoder framework for Video Event CAptioner
Jaehyuk Heo
YongGi Jeong
Sunwoo Kim
Jaehee Kim
Pilsung Kang
18
0
0
18 Jun 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
22
3
0
16 Jun 2022
Psychologically-Inspired, Unsupervised Inference of Perceptual Groups of GUI Widgets from GUI Images
Mulong Xie
Zhenchang Xing
Sidong Feng
Chunyang Chen
Liming Zhu
Xiwei Xu
32
28
0
15 Jun 2022
Measuring Representational Harms in Image Captioning
Angelina Wang
Solon Barocas
Kristen Laird
Hanna M. Wallach
21
51
0
14 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
26
88
0
14 Jun 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
15
0
0
07 Jun 2022
Norm Participation Grounds Language
David Schlangen
12
6
0
06 Jun 2022
Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation
Mingjie Li
Wenjia Cai
Karin Verspoor
Shirui Pan
Xiaodan Liang
Xiaojun Chang
MedIm
36
35
0
04 Jun 2022
CLIP4IDC: CLIP for Image Difference Captioning
Zixin Guo
T. Wang
Jorma T. Laaksonen
VLM
29
27
0
01 Jun 2022
Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases
Geraldo F. Oliveira
Amirali Boroumand
Saugata Ghose
Juan Gómez Luna
O. Mutlu
28
7
0
29 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
51
22
0
28 May 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
16
6
0
28 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chia-Ju Chen
VLM
27
31
0
26 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Mohit Bansal
CLIP
134
76
0
26 May 2022
Face2Text revisited: Improved data set and baseline results
Marc Tanti
Shaun Abdilla
A. Muscat
Claudia Borg
R. Farrugia
Albert Gatt
CVBM
13
3
0
24 May 2022
Let's Talk! Striking Up Conversations via Conversational Visual Question Generation
Shih-Han Chan
Tsai-Lun Yang
Yun-Wei Chu
Chi-Yang Hsu
Ting-Hao 'Kenneth' Huang
Yu-Shian Chiu
Lun-Wei Ku
21
1
0
19 May 2022
It Isn't Sh!tposting, It's My CAT Posting
Parthsarthi Rawat
Sayan Das
Jorge Aguirre
Akhil Daphara
ViT
25
0
0
18 May 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
24
52
0
09 May 2022
Language Models Can See: Plugging Visual Controls in Text Generation
Yixuan Su
Tian Lan
Yahui Liu
Fangyu Liu
Dani Yogatama
Yan Wang
Lingpeng Kong
Nigel Collier
VLM
MLLM
51
97
0
05 May 2022