arXiv:2110.15609
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
29 October 2021
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
Papers citing "BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval"
17 / 17 papers shown
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
A. Fragomeni
Dima Damen
Michael Wray
68
0
0
02 Apr 2025
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qi Tian
Qingming Huang
55
61
0
18 Jul 2022
Token Shift Transformer for Video Classification
Hao Zhang
Y. Hao
Chong-Wah Ngo
ViT
58
117
0
05 Aug 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
70
294
0
21 Jun 2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang
Linchao Zhu
Yi Yang
170
170
0
20 Apr 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
89
651
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
303
2,016
0
09 Feb 2021
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
37
142
0
16 Jun 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen
Yida Zhao
Qin Jin
Qi Wu
63
311
0
01 Mar 2020
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
83
1,186
0
07 Jun 2019
Learning Actor Relation Graphs for Group Activity Recognition
Jianchao Wu
Limin Wang
Li Wang
Jie Guo
Gangshan Wu
51
241
0
23 Apr 2019
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Sijie Yan
Yuanjun Xiong
Dahua Lin
GNN
188
4,124
0
23 Jan 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
95
4,201
0
25 Jul 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
182
7,961
0
22 May 2017
Enhancing Person Re-identification in a Self-trained Subspace
Xun Yang
Meng Wang
Richang Hong
Q. Tian
Yong Rui
69
85
0
20 Apr 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
199
10,412
0
21 Jul 2016
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Christian Szegedy
Sergey Ioffe
Vincent Vanhoucke
Alexander A. Alemi
256
14,196
0
23 Feb 2016