1505.00487
Cited By
Sequence to Sequence -- Video to Text
3 May 2015
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
Papers citing
"Sequence to Sequence -- Video to Text"
50 / 459 papers shown
Title
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Ziqi Zhang
Yaya Shi
Chunfen Yuan
Bing Li
Peijin Wang
Weiming Hu
Zhengjun Zha
VLM
37
271
0
26 Feb 2020
Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition
Hao Zhou
Wen-gang Zhou
Yun Zhou
Houqiang Li
NoLa
32
195
0
08 Feb 2020
Multimodal Matching Transformer for Live Commenting
Chaoqun Duan
Lei Cui
Shuming Ma
Furu Wei
Conghui Zhu
Tiejun Zhao
6
12
0
07 Feb 2020
Convolutional Hierarchical Attention Network for Query-Focused Video Summarization
Shuwen Xiao
Zhou Zhao
Zijian Zhang
Ziyu Guan
Deng Cai
21
48
0
31 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
22
19
0
17 Jan 2020
Actions as Moving Points
Yixuan Li
Zixu Wang
Limin Wang
Gangshan Wu
22
104
0
14 Jan 2020
Exploiting Event Cameras for Spatio-Temporal Prediction of Fast-Changing Trajectories
Marco Monforte
A. Arriandiaga
Arren J. Glover
Chiara Bartolozzi
26
10
0
05 Jan 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
W. Ramos
M. Silva
Edson Roteia Araujo Junior
Alan C. Neves
Erickson R. Nascimento
22
6
0
29 Dec 2019
Vision and Language: from Visual Perception to Content Creation
Tao Mei
Wei Zhang
Ting Yao
VLM
17
8
0
26 Dec 2019
Meaning guided video captioning
Rushi J. Babariya
Toru Tamaki
30
3
0
12 Dec 2019
Forecasting future action sequences with attention: a new approach to weakly supervised action forecasting
Yan Bin Ng
Basura Fernando
AI4TS
19
33
0
10 Dec 2019
Recurrent Neural Networks (RNNs): A gentle Introduction and Overview
Robin M. Schmidt
8
149
0
23 Nov 2019
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models
Menatallh Hammad
May Hammad
Mohamed Elshenawy
24
2
0
22 Nov 2019
Crowd Video Captioning
Liqi Yan
Mingjian Zhu
Changbin (Brad) Yu
11
4
0
13 Nov 2019
Video Captioning with Text-based Dynamic Attention and Step-by-Step Learning
Huanhou Xiao
Jinglun Shi
11
24
0
05 Nov 2019
On Compositionality in Neural Machine Translation
Vikas Raunak
Vaibhav Kumar
Florian Metze
13
17
0
04 Nov 2019
SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
R. Yazdani
Olatunji Ruwase
Minjia Zhang
Yuxiong He
J. Arnau
Antonio González
27
4
0
04 Nov 2019
Diverse Video Captioning Through Latent Variable Expansion
Huanhou Xiao
Jinglun Shi
DiffM
35
15
0
26 Oct 2019
Imperial College London Submission to VATEX Video Captioning Task
Ozan Caglayan
Zixiu "Alex" Wu
Pranava Madhyastha
Josiah Wang
Lucia Specia
17
0
0
16 Oct 2019
Human Action Sequence Classification
Yan Bin Ng
Basura Fernando
30
4
0
07 Oct 2019
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
Kexin Yi
Yuta Saito
Yunzhu Li
Pushmeet Kohli
Jiajun Wu
Antonio Torralba
J. Tenenbaum
NAI
43
457
0
03 Oct 2019
Translation, Sentiment and Voices: A Computational Model to Translate and Analyse Voices from Real-Time Video Calling
A. Roy
22
1
0
28 Sep 2019
Learning Actions from Human Demonstration Video for Robotic Manipulation
Shuo Yang
Wei Zhang
Weizhi Lu
Hesheng Wang
Yibin Li
14
26
0
10 Sep 2019
Visual Semantic Reasoning for Image-Text Matching
Kunpeng Li
Yulun Zhang
Keqin Li
Yuanyuan Li
Y. Fu
VLM
17
499
0
06 Sep 2019
A Better Way to Attend: Attention with Trees for Video Question Answering
Hongyang Xue
Wenqing Chu
Zhou Zhao
Deng Cai
25
33
0
05 Sep 2019
A Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling
Haoran Chen
Ke Lin
A. Maye
Jianmin Li
Xiaoling Hu
25
47
0
31 Aug 2019
Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network
Bairui Wang
Lin Ma
Wei Zhang
Wenhao Jiang
Jingwen Wang
Wei Liu
74
163
0
27 Aug 2019
Language Features Matter: Effective Language Representations for Vision-Language Tasks
Andrea Burns
Reuben Tan
Kate Saenko
Stan Sclaroff
Bryan A. Plummer
VLM
27
27
0
17 Aug 2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition
Zhaoyang Yang
Zhenmei Shi
Xiaoyong Shen
Yu-Wing Tai
SLR
27
63
0
04 Aug 2019
Prediction and Description of Near-Future Activities in Video
T. Mahmud
Mohammad Billah
Mahmudul Hasan
A. Roy-Chowdhury
28
16
0
02 Aug 2019
Falls Prediction Based on Body Keypoints and Seq2Seq Architecture
Minjie Hua
Yibing Nan
Shiguo Lian
3DH
33
12
0
01 Aug 2019
ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences
Zhizhong Han
Chao Chen
Yu-Shen Liu
Matthias Zwicker
3DPC
27
46
0
31 Jul 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos
Sebastian Agethen
Winston H. Hsu
HAI
24
25
0
30 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
25
132
0
22 Jul 2019
Watch It Twice: Video Captioning with a Refocused Video Encoder
Xiangxi Shi
Jianfei Cai
Chenyu You
Jiuxiang Gu
21
29
0
21 Jul 2019
Activity2Vec: Learning ADL Embeddings from Sensor Data with a Sequence-to-Sequence Model
Alireza Ghods
D. Cook
HAI
AI4TS
26
17
0
12 Jul 2019
Video Question Generation via Cross-Modal Self-Attention Networks Learning
Yu-Siang Wang
Hung-Ting Su
Chen-Hsi Chang
Zhe-Yu Liu
Winston H. Hsu
32
9
0
05 Jul 2019
Expressing Visual Relationships via Language
Hao Tan
Franck Dernoncourt
Zhe-nan Lin
Trung Bui
Joey Tianyi Zhou
26
63
0
18 Jun 2019
Object-aware Aggregation with Bidirectional Temporal Graph for Video Captioning
Junchao Zhang
Yuxin Peng
24
170
0
11 Jun 2019
FASTER Recurrent Networks for Efficient Video Classification
Linchao Zhu
Laura Sevilla-Lara
Du Tran
Matt Feiszli
Yi Yang
Heng Wang
49
6
0
10 Jun 2019
Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers
Manjot Bilkhu
Siyang Wang
Tushar Dobhal
ViT
11
15
0
06 Jun 2019
Two-Stream Region Convolutional 3D Network for Temporal Activity Detection
Huijuan Xu
Abir Das
Kate Saenko
3DPC
19
46
0
05 Jun 2019
Relational Reasoning using Prior Knowledge for Visual Captioning
Jingyi Hou
Xinxiao Wu
Yayun Qi
Wentian Zhao
Jiebo Luo
Yunde Jia
17
14
0
04 Jun 2019
Reconstruct and Represent Video Contents for Captioning via Reinforcement Learning
Wei Zhang
Bairui Wang
Lin Ma
Wei Liu
20
67
0
03 Jun 2019
Learning to Generate Grounded Visual Captions without Localization Supervision
Chih-Yao Ma
Yannis Kalantidis
Ghassan AlRegib
Peter Vajda
Marcus Rohrbach
Z. Kira
SSL
19
10
0
01 Jun 2019
AttentionRNN: A Structured Spatial Attention Mechanism
Siddhesh Khandelwal
Leonid Sigal
24
3
0
22 May 2019
Memory-Attended Recurrent Network for Video Captioning
Wenjie Pei
Jiyuan Zhang
Xiangrong Wang
Lei Ke
Xiaoyong Shen
Yu-Wing Tai
14
200
0
10 May 2019
Multimodal Semantic Attention Network for Video Captioning
Liang Sun
Bing Li
Chunfen Yuan
Zhengjun Zha
Weiming Hu
29
11
0
08 May 2019
Towards More Realistic Human-Robot Conversation: A Seq2Seq-based Body Gesture Interaction System
Minjie Hua
Fuyuan Shi
Yibing Nan
Kai Wang
Hao Chen
Shiguo Lian
8
10
0
05 May 2019