Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1505.00487
Cited By
Sequence to Sequence -- Video to Text
3 May 2015
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Sequence to Sequence -- Video to Text"
50 / 459 papers shown
Title
Improved Actor Relation Graph based Group Activity Recognition
Zijian Kuang
Xinran Tie
23
5
0
24 Oct 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumdar
Soujanya Poria
Roger Zimmermann
Amir Zadeh
23
6
0
19 Oct 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
47
11
0
12 Oct 2020
Boosting Continuous Sign Language Recognition via Cross Modality Augmentation
Junfu Pu
Wen-gang Zhou
Hezhen Hu
Houqiang Li
43
108
0
11 Oct 2020
Support-set bottlenecks for video-text representation learning
Mandela Patrick
Po-Yao (Bernie) Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João Henriques
Andrea Vedaldi
22
244
0
06 Oct 2020
Video captioning with stacked attention and semantic hard pull
Md. Mushfiqur Rahman
Thasinul Abedin
Khondokar S. S. Prottoy
Ayana Moshruba
Fazlul Hasan Siddiqui
27
2
0
15 Sep 2020
Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability
Anelise Newman
Camilo Luciano Fosco
Vincent Casser
Allen Lee
Mcnamara
A. Oliva
13
49
0
05 Sep 2020
Video Captioning Using Weak Annotation
Jingyi Hou
Yunde Jia
Xinxiao Wu
Yayun Qi
37
2
0
02 Sep 2020
Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning
Yinghua Zhang
Yangqiu Song
Jian Liang
Kun Bai
Qiang Yang
AAML
34
28
0
25 Aug 2020
In-Home Daily-Life Captioning Using Radio Signals
Lijie Fan
Tianhong Li
Yuan. Yuan
Dina Katabi
40
47
0
25 Aug 2020
Identity-Aware Multi-Sentence Video Description
J. S. Park
Trevor Darrell
Anna Rohrbach
23
17
0
22 Aug 2020
Poet: Product-oriented Video Captioner for E-commerce
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Jie Liu
Jingren Zhou
Hongxia Yang
Fei Wu
14
34
0
16 Aug 2020
Domain-specific Communication Optimization for Distributed DNN Training
Hao Wang
Jingrong Chen
Xinchen Wan
Han Tian
Jiacheng Xia
Gaoxiong Zeng
Weiyan Wang
Kai Chen
Wei Bai
Junchen Jiang
AI4CE
6
15
0
16 Aug 2020
Enriching Video Captions With Contextual Text
Philipp Rimle
Pelin Dogan
Markus Gross
30
3
0
29 Jul 2020
Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos
Shaoxiang Chen
Wenhao Jiang
Wei Liu
Yu-Gang Jiang
25
101
0
28 Jul 2020
Active Learning for Video Description With Cluster-Regularized Ensemble Ranking
David M. Chan
Sudheendra Vijayanarasimhan
David A. Ross
John F. Canny
VLM
14
6
0
27 Jul 2020
Fully Convolutional Networks for Continuous Sign Language Recognition
Ka Leong Cheng
Zhaoyang Yang
Qifeng Chen
Yu-Wing Tai
SLR
44
143
0
24 Jul 2020
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
Tao Jin
Siyu Huang
Ming Chen
Yingming Li
Zhongfei Zhang
32
52
0
23 Jul 2020
Learning to Discretely Compose Reasoning Module Networks for Video Captioning
Ganchao Tan
Daqing Liu
Meng Wang
Zhengjun Zha
LRM
25
73
0
17 Jul 2020
Visual Relation Grounding in Videos
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
20
40
0
17 Jul 2020
Analyzing and Mitigating Data Stalls in DNN Training
Jayashree Mohan
Amar Phanishayee
Ashish Raniwala
Vijay Chidambaram
28
104
0
14 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
27
11
0
08 Jul 2020
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training
Yingwei Pan
Yehao Li
Jianjie Luo
Jun Xu
Ting Yao
Tao Mei
38
57
0
05 Jul 2020
Modality Shifting Attention Network for Multi-modal Video Question Answering
Junyeong Kim
Minuk Ma
T. Pham
Kyungsu Kim
Chang D. Yoo
26
72
0
04 Jul 2020
Efficient Algorithms for Device Placement of DNN Graph Operators
Jakub Tarnawski
Amar Phanishayee
Nikhil R. Devanur
Divya Mahajan
Fanny Nina Paravecino
27
66
0
29 Jun 2020
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning
C. Sur
4
7
0
25 Jun 2020
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals
Jing Shi
Xuankai Chang
Pengcheng Guo
Shinji Watanabe
Yusuke Fujita
Jiaming Xu
Bo Xu
Lei Xie
34
21
0
25 Jun 2020
Comprehensive Information Integration Modeling Framework for Video Titling
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Tan Jiang
Jingren Zhou
Hongxia Yang
Fei Wu
31
40
0
24 Jun 2020
Sub-Seasonal Climate Forecasting via Machine Learning: Challenges, Analysis, and Advances
Sijie He
Xinyan Li
T. DelSole
Pradeep Ravikumar
A. Banerjee
AI4Cl
29
43
0
14 Jun 2020
NITS-VC System for VATEX Video Captioning Challenge 2020
Alok Singh
Thoudam Doren Singh
Sivaji Bandyopadhyay
6
16
0
07 Jun 2020
Multi-modal Feature Fusion with Feature Attention for VATEX Captioning Challenge 2020
Ke Lin
Zhuoxin Gan
Liwei Wang
11
8
0
05 Jun 2020
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer
Vladimir E. Iashin
Esa Rahtu
22
127
0
17 May 2020
Text Synopsis Generation for Egocentric Videos
Aidean Sharghi
N. Lobo
M. Shah
DiffM
EgoV
11
1
0
08 May 2020
Learning from Noisy Labels with Noise Modeling Network
Zhuolin Jiang
J. Silovský
M. Siu
William Hartmann
H. Gish
Sancar Adali
NoLa
17
2
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
46
494
0
01 May 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
31
235
0
31 Mar 2020
Detection and Description of Change in Visual Streams
Davis Gilton
Ruotian Luo
Rebecca Willett
Gregory Shakhnarovich
AI4TS
18
4
0
27 Mar 2020
Action Localization through Continual Predictive Learning
Sathyanarayanan N. Aakur
Sudeep Sarkar
22
12
0
26 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
43
68
0
25 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
135
189
0
19 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
22
164
0
17 Mar 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
23
60
0
11 Mar 2020
Video Caption Dataset for Describing Human Actions in Japanese
Yutaro Shigeto
Yuya Yoshikawa
Jiaqing Lin
A. Takeuchi
17
3
0
10 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement
Fangyi Zhu
Lei Li
Zhanyu Ma
Guang Chen
Jun Guo
19
1
0
08 Mar 2020
RODNet: Radar Object Detection Using Cross-Modal Supervision
Yizhou Wang
Zhongyu Jiang
Xiangyu Gao
Lei Li
Guanbin Xing
Hui Liu
24
9
0
03 Mar 2020
Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream
Chen Jiang
Masood Dehghan
Martin Jägersand
LM&Ro
41
8
0
02 Mar 2020
Matching Neuromorphic Events and Color Images via Adversarial Learning
Fang Xu
Shijie Lin
Wen Yang
Lei Yu
Dengxin Dai
Guisong Xia
25
1
0
02 Mar 2020
Hierarchical Memory Decoding for Video Captioning
Aming Wu
Yahong Han
22
2
0
27 Feb 2020
CLARA: Clinical Report Auto-completion
Siddharth Biswal
Cao Xiao
Lucas Glass
M. P. M. Brandon Westover
Jimeng Sun
24
27
0
26 Feb 2020
Previous
1
2
3
4
5
...
8
9
10
Next