Cross-Modal and Hierarchical Modeling of Video and Text

16 October 2018

Papers citing "Cross-Modal and Hierarchical Modeling of Video and Text"

7 / 57 papers shown

Title
Dual Encoding for Video Retrieval by Text Jianfeng Dong Xirong Li Chaoxi Xu Xun Yang Gang Yang Xun Wang Meng Wang 24 2 0 10 Sep 2020
Multi-modal Transformer for Video Retrieval Valentin Gabeur Chen Sun Alahari Karteek Cordelia Schmid ViT 427 596 0 21 Jul 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning Jie Lei Liwei Wang Yelong Shen Dong Yu Tamara L. Berg Joey Tianyi Zhou 27 186 0 11 May 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Biagio Brattoli Joseph Tighe Fedor Zhdanov Pietro Perona Krzysztof Chalupka VLM 137 127 0 03 Mar 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Shizhe Chen Yida Zhao Qin Jin Qi Wu 36 310 0 01 Mar 2020
Use What You Have: Video Retrieval Using Representations From Collaborative Experts Yang Liu Samuel Albanie Arsha Nagrani Andrew Zisserman 36 387 0 31 Jul 2019
Effective Approaches to Attention-based Neural Machine Translation Thang Luong Hieu H. Pham Christopher D. Manning 218 7,929 0 17 Aug 2015