arXiv:2208.07664
M2HF: Multi-level Multi-modal Hybrid Fusion for Text-Video Retrieval
16 August 2022
Shuo Liu
Weize Quan
Mingyuan Zhou
Sihong Chen
Jian Kang
Zhenlan Zhao
Chen Chen
Dong-Ming Yan
Papers citing
"M2HF: Multi-level Multi-modal Hybrid Fusion for Text-Video Retrieval"
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang
Linchao Zhu
Yi Yang
167
170
0
20 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
317
780
0
18 Apr 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
152
1,464
0
06 Jun 2016