HierVL: Learning Hierarchical Video-Language Embeddings

5 January 2023
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
    VLM, AI4TS

Papers citing "HierVL: Learning Hierarchical Video-Language Embeddings"

24 / 74 papers shown
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
185
480
0
12 Dec 2018
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
169
3,282
0
10 Dec 2018
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
98
1,692
0
20 Nov 2018
Cross-Modal and Hierarchical Modeling of Video and Text
Bowen Zhang
Hexiang Hu
Fei Sha
BDL, AI4TS
64
191
0
16 Oct 2018
Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos
Gunnar Sigurdsson
Abhinav Gupta
Cordelia Schmid
Ali Farhadi
Alahari Karteek
SLR, EgoV
68
171
0
25 Apr 2018
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
185
499
0
24 Apr 2018
When will you do what? - Anticipating Temporal Occurrences of Activities
Yazan Abu Farha
Alexander Richard
Juergen Gall
68
192
0
03 Apr 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
101
1,156
0
21 Mar 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
3DH
155
1,333
0
13 Dec 2017
Compressed Video Action Recognition
Chao-Yuan Wu
Manzil Zaheer
Hexiang Hu
R. Manmatha
Alex Smola
Philipp Krahenbuhl
156
325
0
02 Dec 2017
Temporal Relational Reasoning in Videos
Bolei Zhou
A. Andonian
Aude Oliva
Antonio Torralba
NAI
102
1,040
0
22 Nov 2017
RED: Reinforced Encoder-Decoder Networks for Action Anticipation
J. Gao
Zhenheng Yang
Ram Nevatia
92
196
0
16 Jul 2017
Learnable pooling with Context Gating for video classification
Antoine Miech
Ivan Laptev
Josef Sivic
74
327
0
21 Jun 2017
Detecting Visual Relationships with Deep Relational Networks
Bo Dai
Yuqi Zhang
Dahua Lin
GNN
100
502
0
11 Apr 2017
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
75
830
0
28 Mar 2017
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
374
7,561
0
02 Dec 2016
YouTube-8M: A Large-Scale Video Classification Benchmark
Sami Abu-El-Haija
Nisarg Kothari
Joonseok Lee
Apostol Natsev
G. Toderici
Balakrishnan Varadarajan
Sudheendra Vijayanarasimhan
VLM
153
1,272
0
27 Sep 2016
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
120
3,840
0
02 Aug 2016
Long-term Temporal Convolutions for Action Recognition
Gül Varol
Ivan Laptev
Cordelia Schmid
83
912
0
15 Apr 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
120
752
0
09 Dec 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
136
1,357
0
07 Nov 2015
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng
Matthew J. Hausknecht
Sudheendra Vijayanarasimhan
Oriol Vinyals
R. Monga
G. Toderici
147
2,338
0
31 Mar 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
167
6,056
0
17 Nov 2014
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Ian Goodfellow
M. Berk Mirza
Xia Da
Aaron Courville
Yoshua Bengio
151
1,455
0
21 Dec 2013