Hierarchical LSTMs with Adaptive Attention for Visual Captioning

26 December 2018
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen

Papers citing "Hierarchical LSTMs with Adaptive Attention for Visual Captioning"

49 / 49 papers shown
Discriminability objective for training descriptive captions
Ruotian Luo
Brian L. Price
Scott D. Cohen
Gregory Shakhnarovich
100
203
0
12 Mar 2018
Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder
Jingkuan Song
Hanwang Zhang
Xiangpeng Li
Lianli Gao
Meng Wang
Richang Hong
62
247
0
07 Feb 2018
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
271
8,888
0
21 Nov 2017
From Deterministic to Generative: Multi-Modal Stochastic RNNs for Video Captioning
Jingkuan Song
Yuyu Guo
Lianli Gao
Xuelong Li
Alan Hanjalic
Heng Tao Shen
58
221
0
08 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
113
4,208
0
25 Jul 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
52
324
0
12 Apr 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
76
359
0
11 Jan 2017
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
118
1,450
0
06 Dec 2016
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
105
1,886
0
02 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
120
446
0
01 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
65
88
0
01 Dec 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
92
426
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
63
328
0
23 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
63
1,655
0
17 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
83
621
0
05 Nov 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
115
231
0
10 Oct 2016
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
98
3,825
0
02 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
90
1,909
0
29 Jul 2016
A Survey on Learning to Hash
Jingdong Wang
Ting Zhang
Jingkuan Song
N. Sebe
Heng Tao Shen
91
969
0
01 Jun 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
73
357
0
12 May 2016
Incorporating Copying Mechanism in Sequence-to-Sequence Learning
Jiatao Gu
Zhengdong Lu
Hang Li
Victor O.K. Li
173
1,539
0
21 Mar 2016
Image Captioning with Semantic Attention
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
VLM
164
1,662
0
12 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Li Fei-Fei
194
5,726
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.0K
193,426
0
10 Dec 2015
Video captioning with recurrent networks based on frame- and video-level features and visual content classification
Rakshith Shetty
Jorma T. Laaksonen
42
31
0
09 Dec 2015
Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato
S. Chopra
Michael Auli
Wojciech Zaremba
98
1,614
0
20 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
43
385
0
11 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
103
1,877
0
07 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wei Xu
85
560
0
26 Oct 2015
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
362
7,955
0
17 Aug 2015
Describing Multimedia Content using Attention-based Encoder--Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
66
411
0
04 Jul 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
473
62,122
0
04 Jun 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
75
534
0
07 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
125
1,417
0
03 May 2015
Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research
Atousa Torabi
C. Pal
Hugo Larochelle
Aaron Courville
VGen
83
205
0
03 Mar 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
139
1,063
0
27 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
322
10,050
0
10 Feb 2015
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
105
499
0
12 Jan 2015
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
120
952
0
15 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
101
5,578
0
07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
262
4,471
0
20 Nov 2014
Learning a Recurrent Visual Representation for Image Caption Generation
Xinlei Chen
C. L. Zitnick
SSL
GAN
93
195
0
20 Nov 2014
From Captions to Visual Concepts and Back
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
...
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
VLM
91
1,309
0
18 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
223
6,018
0
17 Nov 2014
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
411
43,589
0
17 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.5K
100,213
0
04 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
A. Karpathy
Armand Joulin
Li Fei-Fei
VLM
91
936
0
22 Jun 2014
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
237
7,526
0
09 Jun 2014
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
132
6,623
0
22 Dec 2012