ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1505.00487
  4. Cited By
Sequence to Sequence -- Video to Text

Sequence to Sequence -- Video to Text

3 May 2015
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
ArXivPDFHTML

Papers citing "Sequence to Sequence -- Video to Text"

50 / 459 papers shown
Title
Semantic Compositional Networks for Visual Captioning
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
53
425
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
41
14
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
27
329
0
23 Nov 2016
Recurrent Memory Addressing for describing videos
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
38
10
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks
  for Image Captioning
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
27
1,650
0
17 Nov 2016
Multimodal Memory Modelling for Video Captioning
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
Tieniu Tan
32
142
0
17 Nov 2016
Learning long-term dependencies for action recognition with a
  biologically-inspired deep network
Learning long-term dependencies for action recognition with a biologically-inspired deep network
Yemin Shi
Yonghong Tian
Yaowei Wang
Tiejun Huang
29
63
0
16 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering
Leveraging Video Descriptions to Learn Video Question Answering
Kuo-Hao Zeng
Tseng-Hung Chen
Ching-Yao Chuang
Yuan-Hong Liao
Juan Carlos Niebles
Min Sun
32
175
0
12 Nov 2016
Memory-augmented Attention Modelling for Videos
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
48
20
0
07 Nov 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
35
50
0
17 Oct 2016
Video Fill in the Blank with Merging LSTMs
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong-Ming Zhang
M. Shah
29
18
0
13 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and
  Question Answering
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
230
0
10 Oct 2016
Learning Spatial-Semantic Context with Fully Convolutional Recurrent
  Network for Online Handwritten Chinese Text Recognition
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Zecheng Xie
Zenghui Sun
Lianwen Jin
Hao Ni
Terry Lyons
43
122
0
09 Oct 2016
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence
  Models
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models
Ashwin K. Vijayakumar
Michael Cogswell
Ramprasaath R. Selvaraju
Q. Sun
Stefan Lee
David J. Crandall
Dhruv Batra
22
542
0
07 Oct 2016
A Survey of Multi-View Representation Learning
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
37
509
0
03 Oct 2016
Learning Language-Visual Embedding for Movie Understanding with
  Natural-Language
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
22
97
0
26 Sep 2016
Deep Learning for Video Classification and Captioning
Deep Learning for Video Classification and Captioning
Zuxuan Wu
Ting Yao
Yanwei Fu
Yu-Gang Jiang
3DV
VLM
30
123
0
22 Sep 2016
Title Generation for User Generated Videos
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
35
69
0
25 Aug 2016
Frame- and Segment-Level Features and Candidate Pool Evaluation for
  Video Caption Generation
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Rakshith Shetty
Jorma T. Laaksonen
15
94
0
17 Aug 2016
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams
Chenyou Fan
David J. Crandall
DiffM
14
5
0
12 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
37
144
0
01 Aug 2016
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
De-An Huang
Li Fei-Fei
Juan Carlos Niebles
24
237
0
28 Jul 2016
A Comprehensive Survey on Cross-modal Retrieval
A Comprehensive Survey on Cross-modal Retrieval
Kun Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
42
294
0
21 Jul 2016
Network Trimming: A Data-Driven Neuron Pruning Approach towards
  Efficient Deep Architectures
Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures
Hengyuan Hu
Rui Peng
Yu-Wing Tai
Chi-Keung Tang
18
881
0
12 Jul 2016
Weakly Supervised Learning of Heterogeneous Concepts in Videos
Weakly Supervised Learning of Heterogeneous Concepts in Videos
Sohil Shah
K. Kulkarni
Arijit Biswas
Ankit Gandhi
Om Deshmukh
L. Davis
32
2
0
12 Jul 2016
VideoMCC: a New Benchmark for Video Comprehension
VideoMCC: a New Benchmark for Video Comprehension
Du Tran
Maksim Bolonkin
Manohar Paluri
Lorenzo Torresani
29
1
0
23 Jun 2016
Bidirectional Long-Short Term Memory for Video Description
Bidirectional Long-Short Term Memory for Video Description
Yi Bin
Yang Yang
Zi Huang
Fumin Shen
Xing Xu
Heng Tao Shen
39
60
0
15 Jun 2016
Sequence-to-Sequence Learning as Beam-Search Optimization
Sequence-to-Sequence Learning as Beam-Search Optimization
Sam Wiseman
Alexander M. Rush
44
589
0
09 Jun 2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent
  Neural Network
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network
Yu Liu
Jianlong Fu
Tao Mei
C. Chen
13
4
0
02 Jun 2016
Video Summarization with Long Short-term Memory
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
38
682
0
26 May 2016
Movie Description
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
24
101
0
09 May 2016
Convolutional Two-Stream Network Fusion for Video Action Recognition
Convolutional Two-Stream Network Fusion for Video Action Recognition
Christoph Feichtenhofer
A. Pinz
Andrew Zisserman
36
2,605
0
22 Apr 2016
Fully Convolutional Recurrent Network for Handwritten Chinese Text
  Recognition
Fully Convolutional Recurrent Network for Handwritten Chinese Text Recognition
Zecheng Xie
Zenghui Sun
Lianwen Jin
Ziyong Feng
Shuye Zhang
19
47
0
18 Apr 2016
Learning Visual Storylines with Skipping Recurrent Neural Networks
Learning Visual Storylines with Skipping Recurrent Neural Networks
Gunnar A. Sigurdsson
Xinlei Chen
Abhinav Gupta
29
38
0
14 Apr 2016
Video Description using Bidirectional Recurrent Neural Networks
Video Description using Bidirectional Recurrent Neural Networks
Álvaro Peris
Marc Bolaños
Petia Radeva
F. Casacuberta
20
33
0
12 Apr 2016
Attributes as Semantic Units between Natural Language and Visual
  Recognition
Attributes as Semantic Units between Natural Language and Visual Recognition
Marcus Rohrbach
VLM
22
3
0
12 Apr 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
TGIF: A New Dataset and Benchmark on Animated GIF Description
Yuncheng Li
Yale Song
Liangliang Cao
Joel R. Tetreault
Larry Goldberg
A. Jaimes
Jiebo Luo
25
270
0
10 Apr 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity
  Understanding
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar A. Sigurdsson
Gül Varol
Xueliang Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
43
1,224
0
06 Apr 2016
Improving LSTM-based Video Description with Linguistic Knowledge Mined
  from Text
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Subhashini Venugopalan
Lisa Anne Hendricks
Raymond J. Mooney
Kate Saenko
VLM
28
117
0
06 Apr 2016
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for
  Automated Image Annotation
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation
Hoo-Chang Shin
Kirk Roberts
Le Lu
Dina Demner-Fushman
Jianhua Yao
Ronald M. Summers
18
347
0
28 Mar 2016
Abstractive Text Summarization Using Sequence-to-Sequence RNNs and
  Beyond
Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond
Ramesh Nallapati
Bowen Zhou
Cicero Nogueira dos Santos
Çağlar Gülçehre
Bing Xiang
AIMat
85
2,522
0
19 Feb 2016
Recognition of Visually Perceived Compositional Human Actions by
  Multiple Spatio-Temporal Scales Recurrent Neural Networks
Recognition of Visually Perceived Compositional Human Actions by Multiple Spatio-Temporal Scales Recurrent Neural Networks
Haanvid Lee
Minju Jung
Jun Tani
17
20
0
05 Feb 2016
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Suraj Srinivas
Ravi Kiran Sarvadevabhatla
Konda Reddy Mopuri
N. Prabhu
S. Kruthiventi
R. Venkatesh Babu
OOD
35
215
0
25 Jan 2016
Learning Articulated Motion Models from Visual and Lingual Signals
Learning Articulated Motion Models from Visual and Lingual Signals
Zhengyang Wu
Joey Tianyi Zhou
Matthew R. Walter
27
0
0
17 Nov 2015
Deep Compositional Captioning: Describing Novel Object Categories
  without Paired Training Data
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
16
284
0
17 Nov 2015
Uncovering Temporal Context for Video Question and Answering
Uncovering Temporal Context for Video Question and Answering
Linchao Zhu
Zhongwen Xu
Yi Yang
Alexander G. Hauptmann
BDL
27
44
0
15 Nov 2015
Oracle performance for visual captioning
Oracle performance for visual captioning
L. Yao
Nicolas Ballas
Kyunghyun Cho
John R. Smith
Yoshua Bengio
VLM
39
8
0
14 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with
  Application to Captioning
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
24
385
0
11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wenyuan Xu
44
560
0
26 Oct 2015
Previous
123...1089
Next