2110.05146
ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation
11 October 2021
Aiden Seung Joon Lee
Hanseok Oh
Minjoon Seo
Papers citing
"ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation"
12 / 12 papers shown
Title
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
97
101
0
08 Jun 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
428
810
0
18 Apr 2021
CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning
Luowei Zhou
Jingjing Liu
Yu Cheng
Zhe Gan
Lei Zhang
37
7
0
01 Apr 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
74
133
0
19 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
999
29,871
0
26 Feb 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
542
610
0
21 Jul 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
129
504
0
01 May 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
205
286
0
24 Jan 2020
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
122
1,208
0
07 Jun 2019
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Eric Wang
Jiawei Wu
Junkun Chen
Lei Li
Yuan-fang Wang
William Yang Wang
101
556
0
06 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,229
0
11 Oct 2018
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
79
831
0
28 Mar 2017