Sequence to Sequence -- Video to Text


3 May 2015
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
arXiv: 1505.00487

Papers citing "Sequence to Sequence -- Video to Text"

50 / 459 papers shown
Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning
Jingwen Chen
Yingwei Pan
Yehao Li
Ting Yao
Hongyang Chao
Tao Mei
21
104
0
03 May 2019
Hierarchical Recurrent Neural Network for Video Summarization
Bin Zhao
Xuelong Li
Xiaoqiang Lu
23
174
0
28 Apr 2019
Knowing When to Stop: Evaluation and Verification of Conformity to Output-size Specifications
Chenglong Wang
Rudy Bunel
Krishnamurthy Dvijotham
Po-Sen Huang
Edward Grefenstette
Pushmeet Kohli
30
5
0
26 Apr 2019
FishNet: A Camera Localizer using Deep Recurrent Networks
Hsin-I Chen
Sebastian Agethen
Chia-Min Wu
Winston H. Hsu
Bing-Yu Chen
16
0
0
22 Apr 2019
Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs
Matthew A. Wright
Simon F. G. Ehlers
R. Horowitz
AI4CE
GNN
14
4
0
18 Apr 2019
Referring to Objects in Videos using Spatio-Temporal Identifying Descriptions
Peratham Wiriyathammabhum
Abhinav Shrivastava
Vlad I. Morariu
L. Davis
25
4
0
08 Apr 2019
Streamlined Dense Video Captioning
Jonghwan Mun
L. Yang
Zhou Ren
N. Xu
Bohyung Han
28
136
0
08 Apr 2019
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Eric Wang
Jiawei Wu
Junkun Chen
Lei Li
Yuan-fang Wang
William Yang Wang
32
539
0
06 Apr 2019
The Steep Road to Happily Ever After: An Analysis of Current Visual Storytelling Models
Yatri Modi
Natalie Parde
21
16
0
06 Apr 2019
Weakly Supervised Video Moment Retrieval From Text Queries
Niluthpol Chowdhury Mithun
S. Paul
A. Roy-Chowdhury
30
193
0
05 Apr 2019
Scene Understanding for Autonomous Manipulation with Deep Learning
A. Nguyen
22
6
0
23 Mar 2019
V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation
A. Nguyen
Thanh-Toan Do
Ian Reid
D. Caldwell
Nikos G. Tsagarakis
29
21
0
23 Mar 2019
M-VAD Names: a Dataset for Video Captioning with Naming
S. Pini
Marcella Cornia
Federico Bolelli
Lorenzo Baraldi
Rita Cucchiara
21
29
0
04 Mar 2019
Spatiotemporal Pyramid Network for Video Action Recognition
Yunbo Wang
Mingsheng Long
Jianmin Wang
Philip S. Yu
32
227
0
04 Mar 2019
Video Summarization via Actionness Ranking
Mohamed Elfeki
Ali Borji
19
42
0
01 Mar 2019
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning
Nayyer Aafaq
Naveed Akhtar
Wei Liu
Syed Zulqarnain Gilani
Ajmal Mian
31
204
0
27 Feb 2019
Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning
Youngeun Kwon
Minsoo Rhu
19
56
0
18 Feb 2019
Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey
Longlong Jing
Yingli Tian
SSL
20
1,689
0
16 Feb 2019
Hierarchical Photo-Scene Encoder for Album Storytelling
Bairui Wang
Lin Ma
Wei Zhang
Wenhao Jiang
Feng-Li Zhang
11
28
0
02 Feb 2019
Not All Words are Equal: Video-specific Information Loss for Video Captioning
Jiarong Dong
Ke Gao
Xiaokai Chen
Junbo Guo
Juan Cao
Yongdong Zhang
21
7
0
01 Jan 2019
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen
23
221
0
26 Dec 2018
Context, Attention and Audio Feature Explorations for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Juan Jose Alvarado Leanos
Jonathan Huang
L. Nachman
8
10
0
20 Dec 2018
Adversarial Inference for Multi-Sentence Video Description
J. S. Park
Marcus Rohrbach
Trevor Darrell
Anna Rohrbach
21
79
0
13 Dec 2018
Weakly Supervised Dense Event Captioning in Videos
Xuguang Duan
Wen-bing Huang
Chuang Gan
Jingdong Wang
Wenwu Zhu
Junzhou Huang
33
148
0
10 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
36
20
0
07 Dec 2018
Zero-Shot Anticipation for Instructional Activities
Fadime Sener
Angela Yao
LM&Ro
25
68
0
06 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos
Shaojie Wang
Wentian Zhao
Ziyi Kou
Chenliang Xu
9
5
0
02 Dec 2018
Multi-Stream Dynamic Video Summarization
Mohamed Elfeki
Liqiang Wang
Ali Borji
EgoV
34
15
0
01 Dec 2018
A deep neural network to enhance prediction of 1-year mortality using echocardiographic videos of the heart
Alvaro E. Ulloa
Linyuan Jing
Christopher W. Good
David P. vanMaanen
S. Raghunath
...
Aalpen A. Patel
H. Kirchner
Marios S. Pattichis
C. Haggerty
Brandon K. Fornwalt
22
3
0
26 Nov 2018
Chat More If You Like: Dynamic Cue Words Planning to Flow Longer Conversations
Lili Yao
Ruijian Xu
Chong Li
Dongyan Zhao
Rui Yan
14
9
0
19 Nov 2018
A Perceptual Prediction Framework for Self Supervised Event Segmentation
Sathyanarayanan N. Aakur
Sudeep Sarkar
19
69
0
12 Nov 2018
Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
Yoonchang Sung
Jiawei Wu
Da Zhang
Yu-Chuan Su
Pratap Tokekar
32
38
0
07 Nov 2018
Y^2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
Simon Denman
Mingyang Shang
Sabesan Sivapalan
Yu-Shen Liu
Matthias Zwicker
3DV
19
53
0
07 Nov 2018
Middle-Out Decoding
Shikib Mehri
Leonid Sigal
24
22
0
28 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Shubham Agarwal
Ondrej Dusek
Ioannis Konstas
Verena Rieser
31
22
0
20 Oct 2018
Cross-Modal and Hierarchical Modeling of Video and Text
Bowen Zhang
Hexiang Hu
Fei Sha
BDL
AI4TS
23
188
0
16 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
25
145
0
15 Oct 2018
Deep Photovoltaic Nowcasting
Jinsong Zhang
Rodrigo Verschae
S. Nobuhara
Jean-François Lalonde
20
158
0
15 Oct 2018
Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings
Zhongwei Xie
Lin Li
Xian Zhong
Luo Zhong
14
2
0
04 Oct 2018
Vector Learning for Cross Domain Representations
Shagan Sah
Chi Zhang
Thang Nguyen
D. Peri
Ameya Shringi
R. Ptucha
GAN
21
3
0
27 Sep 2018
Semantic Sentence Embeddings for Paraphrasing and Text Summarization
Chi Zhang
Shagan Sah
Thang Nguyen
D. Peri
A. Loui
C. Salvaggio
R. Ptucha
29
31
0
26 Sep 2018
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description
Oliver A. Nina
Washington Garcia
Scott Clouse
Alper Yilmaz
20
4
0
19 Sep 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Shuming Ma
Lei Cui
Damai Dai
Furu Wei
Xu Sun
VGen
23
61
0
13 Sep 2018
Game-Based Video-Context Dialogue
Ramakanth Pasunuru
Joey Tianyi Zhou
31
33
0
12 Sep 2018
Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Machine Translation System Report
Renjie Zheng
Yilin Yang
Mingbo Ma
Liang Huang
12
8
0
31 Aug 2018
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation
Renjie Zheng
Mingbo Ma
Liang Huang
41
35
0
28 Aug 2018
Natural Language Generation with Neural Variational Models
Hareesh Bahuleyan
DRL
16
6
0
27 Aug 2018
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions
Ke Ning
Linchao Zhu
Ming Cai
Yi Yang
Di Xie
Fei Wu
21
2
0
27 Aug 2018
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Houfeng Wang
Xu Sun
98
65
0
27 Aug 2018
Deep Adaptive Temporal Pooling for Activity Recognition
Sibo Song
Ngai-man Cheung
V. Chandrasekhar
Bappaditya Mandal
16
16
0
22 Aug 2018