Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1505.00487
Cited By
Sequence to Sequence -- Video to Text
3 May 2015
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Sequence to Sequence -- Video to Text"
50 / 170 papers shown
Title
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Asmar Nadeem
Faegheh Sardari
R. Dawes
Syed Sameed Husain
Adrian Hilton
Armin Mustafa
57
4
0
10 Jun 2024
Video ReCap: Recursive Captioning of Hour-Long Videos
Md. Mohaiminul Islam
Ngan Ho
Xitong Yang
Tushar Nagarajan
Lorenzo Torresani
Gedas Bertasius
VGen
VLM
35
47
0
20 Feb 2024
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model
Jianhao Yuan
Shuyang Sun
Daniel Omeiza
Bo Zhao
Paul Newman
Lars Kunze
Matthew Gadd
LRM
36
49
0
16 Feb 2024
Multi Sentence Description of Complex Manipulation Action Videos
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
F. Worgotter
31
1
0
13 Nov 2023
Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation
Xinyu Lyu
Jingwei Liu
Yuyu Guo
Lianli Gao
26
1
0
10 Aug 2023
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian
Willy Fitra Hendria
29
2
0
20 Jun 2023
GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation
Ji Qi
Jifan Yu
Teng Tu
Kunyu Gao
Yifan Xu
...
Juanzi Li
Jie Tang
Weidong Guo
Hui Liu
Yu-Syuan Xu
38
19
0
26 Mar 2023
ADAPT: Action-aware Driving Caption Transformer
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
30
69
0
01 Feb 2023
Visual Commonsense-aware Representation Network for Video Captioning
Pengpeng Zeng
Haonan Zhang
Lianli Gao
Xiangpeng Li
Jin Qian
Hengtao Shen
29
16
0
17 Nov 2022
VTC: Improving Video-Text Retrieval with User Comments
Laura Hanu
James Thewlis
Yuki M. Asano
Christian Rupprecht
VGen
35
7
0
19 Oct 2022
Selective Query-guided Debiasing for Video Corpus Moment Retrieval
Sunjae Yoon
Jiajing Hong
Eunseop Yoon
Dahyun Kim
Junyeong Kim
Hee Suk Yoon
Changdong Yoo
41
21
0
17 Oct 2022
Thinking Hallucination for Video Captioning
Nasib Ullah
Partha Pratim Mohanta
VLM
36
4
0
28 Sep 2022
MuMUR : Multilingual Multimodal Universal Retrieval
Avinash Madasu
Estelle Aflalo
Gabriela Ben-Melech Stan
Shachar Rosenman
Shao-Yen Tseng
Gedas Bertasius
Vasudev Lal
44
3
0
24 Aug 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
32
10
0
25 Jul 2022
Meta Spatio-Temporal Debiasing for Video Scene Graph Generation
Li Xu
Haoxuan Qu
Jason Kuen
Jiuxiang Gu
Jun Liu
CML
31
27
0
23 Jul 2022
GL-RG: Global-Local Representation Granularity for Video Captioning
Liqi Yan
Qifan Wang
Yiming Cui
Fuli Feng
Xiaojun Quan
Xinming Zhang
Dongfang Liu
31
59
0
22 May 2022
Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration
Jing Lin
Xiaowan Hu
Yuanhao Cai
Haoqian Wang
Youliang Yan
X. Zou
Yulun Zhang
Luc Van Gool
54
17
0
20 May 2022
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Chengpeng Dai
Fuhai Chen
Xiaoshuai Sun
Rongrong Ji
QiXiang Ye
Yongjian Wu
22
1
0
13 Mar 2022
Cross-modal Contrastive Distillation for Instructional Activity Anticipation
Zhengyuan Yang
Jingen Liu
Jing-ling Huang
Xiaodong He
Tao Mei
Chenliang Xu
Jiebo Luo
31
6
0
18 Jan 2022
InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer
Chin-Tung Lin
Mu Yang
ViT
25
1
0
31 Dec 2021
Cross Modal Retrieval with Querybank Normalisation
Simion-Vlad Bogolin
Ioana Croitoru
Hailin Jin
Yang Liu
Samuel Albanie
27
84
0
23 Dec 2021
Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices
Hariom A. Pandya
Brijesh S. Bhatt
40
27
0
07 Dec 2021
Syntax Customized Video Captioning by Imitating Exemplar Sentences
Yitian Yuan
Lin Ma
Wenwu Zhu
22
6
0
02 Dec 2021
Relational Graph Learning for Grounded Video Description Generation
Wenqiao Zhang
Qing Guo
Siliang Tang
Haizhou Shi
Haochen Shi
Jun Xiao
Yueting Zhuang
Luu Anh Tuan
27
33
0
02 Dec 2021
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
Hierarchical Modular Network for Video Captioning
Hanhua Ye
Guorong Li
Yuankai Qi
Shuhui Wang
Qingming Huang
Ming-Hsuan Yang
27
67
0
24 Nov 2021
DVCFlow: Modeling Information Flow Towards Human-like Video Captioning
Xu Yan
Zhengcong Fei
Shuhui Wang
Qingming Huang
Qi Tian
VGen
40
4
0
19 Nov 2021
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfen Yuan
Bing Li
Weiming Hu
Zhengjun Zha
39
33
0
17 Nov 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021
Visual-aware Attention Dual-stream Decoder for Video Captioning
Zhixin Sun
Xian Zhong
Shuqin Chen
Lin Li
Luo Zhong
31
3
0
16 Oct 2021
CLIP4Caption: CLIP for Video Caption
Mingkang Tang
Zhanyu Wang
Zhenhua Liu
Fengyun Rao
Dian Li
Xiu Li
CLIP
VLM
35
150
0
13 Oct 2021
Beam Search with Bidirectional Strategies for Neural Response Generation
Pierre Colombo
Chouchang Yang
Giovanna Varni
Chloé Clavel
41
13
0
07 Oct 2021
Attention Gate in Traffic Forecasting
Anh-Phuong Lam
A. Nguyen
H. Le
25
0
0
27 Sep 2021
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes
Martin Schmitt
Vladimir Golkov
Hinrich Schütze
Daniel Cremers
GNN
29
3
0
23 Sep 2021
EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation
Yanjun Gao
Lulu Liu
Jason Wang
Xin Chen
Huayan Wang
Rui Zhang
31
1
0
10 Sep 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
31
12
0
07 Sep 2021
End-to-End Dense Video Captioning with Parallel Decoding
Teng Wang
Ruimao Zhang
Zhichao Lu
Feng Zheng
Ran Cheng
Ping Luo
3DV
47
179
0
17 Aug 2021
Hybrid Reasoning Network for Video-based Commonsense Captioning
Weijiang Yu
Jian Liang
Lei Ji
Lu Li
Yuejian Fang
Nong Xiao
Nan Duan
16
10
0
05 Aug 2021
Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Chiori Hori
Takaaki Hori
Jonathan Le Roux
25
4
0
04 Aug 2021
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
Juncheng Li
Siliang Tang
Linchao Zhu
Haochen Shi
Xuanwen Huang
Fei Wu
Yi Yang
Yueting Zhuang
25
28
0
26 Jul 2021
A comparison of LSTM and GRU networks for learning symbolic sequences
Roberto Cahuantzi
Xinye Chen
S. Güttel
28
136
0
05 Jul 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
21
37
0
30 May 2021
NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao
Xindi Shang
Angela Yao
Tat-Seng Chua
45
444
0
18 May 2021
Video Corpus Moment Retrieval with Contrastive Learning
Hao Zhang
Aixin Sun
Wei Jing
Guoshun Nan
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
44
81
0
13 May 2021
Reconstructive Sequence-Graph Network for Video Summarization
Bin Zhao
Haopeng Li
Xiaoqiang Lu
Xuelong Li
18
101
0
10 May 2021
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
31
66
0
18 Mar 2021
RODNet: A Real-Time Radar Object Detection Network Cross-Supervised by Camera-Radar Fused Object 3D Localization
Yizhou Wang
Zhongyu Jiang
Yudong Li
Lei Li
Guanbin Xing
Hui Liu
39
158
0
09 Feb 2021
Understanding Action Sequences based on Video Captioning for Learning-from-Observation
Iori Yanokura
Naoki Wake
Kazuhiro Sasabuchi
Katsushi Ikeuchi
Masayuki Inaba
27
4
0
09 Dec 2020
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling
Jing Su
Qingyun Dai
Frank Guerin
Mian Zhou
30
24
0
03 Dec 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Joey Tianyi Zhou
OffRL
32
6
0
15 Nov 2020
1
2
3
4
Next