ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.10407
  4. Cited By
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for
  Image Captioning

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

20 February 2021
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
    VLM
ArXivPDFHTML

Papers citing "VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning"

26 / 76 papers shown
Title
Bottom-Up and Top-Down Attention for Image Captioning and Visual
  Question Answering
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
111
4,208
0
25 Jul 2017
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
626
130,942
0
12 Jun 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Yufei Wang
Zhe Lin
Xiaohui Shen
Scott D. Cohen
G. Cottrell
69
106
0
23 Apr 2017
Attend to You: Personalized Image Captioning with Context Sequence
  Memory Networks
Attend to You: Personalized Image Captioning with Context Sequence Memory Networks
C. C. Park
Byeongchang Kim
Gunhee Kim
50
173
0
21 Apr 2017
Speaking the Same Language: Matching Machine to Human Captions by
  Adversarial Training
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
43
144
0
30 Mar 2017
Towards Diverse and Natural Image Descriptions via a Conditional GAN
Towards Diverse and Natural Image Descriptions via a Conditional GAN
Bo Dai
Sanja Fidler
R. Urtasun
Dahua Lin
GAN
49
453
0
17 Mar 2017
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image
  Captioning
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
115
1,450
0
06 Dec 2016
Self-critical Sequence Training for Image Captioning
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
105
1,886
0
02 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
120
446
0
01 Dec 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks
  for Image Captioning
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
63
1,655
0
17 Nov 2016
Pointer Sentinel Mixture Models
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
262
2,844
0
26 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning
  Challenge
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
91
854
0
21 Sep 2016
SPICE: Semantic Propositional Image Caption Evaluation
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
84
1,909
0
29 Jul 2016
Layer Normalization
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
336
10,467
0
21 Jul 2016
Captioning Images with Diverse Objects
Captioning Images with Diverse Objects
Subhashini Venugopalan
Lisa Anne Hendricks
Marcus Rohrbach
Raymond J. Mooney
Trevor Darrell
Kate Saenko
VLM
53
178
0
24 Jun 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense
  Image Annotations
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
194
5,726
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
1.9K
193,426
0
10 Dec 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
124
1,167
0
24 Nov 2015
Deep Compositional Captioning: Describing Novel Object Categories
  without Paired Training Data
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
58
284
0
17 Nov 2015
Neural Machine Translation of Rare Words with Subword Units
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
195
7,729
0
31 Aug 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal
  Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
463
62,122
0
04 Jun 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual
  Attention
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
312
10,050
0
10 Feb 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
91
5,578
0
07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
252
4,471
0
20 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and
  Description
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
138
6,048
0
17 Nov 2014
Microsoft COCO: Common Objects in Context
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
369
43,524
0
01 May 2014
Previous
12