Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.11488
Cited By
Opening the Vocabulary of Egocentric Actions
22 August 2023
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Opening the Vocabulary of Egocentric Actions"
13 / 13 papers shown
Title
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee
Edoardo Remelli
Yale Song
Bugra Tekin
Abhay Mittal
...
Shreyas Hampali
Eric Sauser
Shugao Ma
Angela Yao
Fadime Sener
VLM
40
0
0
10 Apr 2025
CaRe-Ego: Contact-aware Relationship Modeling for Egocentric Interactive Hand-object Segmentation
Yuejiao Su
Yi Wang
Lap-Pui Chau
60
1
0
08 Jul 2024
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva
Fadime Sener
Edoardo Remelli
Bugra Tekin
Eric Sauser
Bernt Schiele
Shugao Ma
VLM
EgoV
34
1
0
28 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
32
5
0
14 Mar 2024
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
Ahmad Darkhalil
Dandan Shan
Bin Zhu
Jian Ma
Amlan Kar
Richard E. L. Higgins
Sanja Fidler
David Fouhey
Dima Damen
VOS
42
98
0
26 Sep 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
224
1,018
0
13 Oct 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
247
558
0
28 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
362
0
17 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
325
2,263
0
02 Sep 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
223
898
0
28 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
309
780
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,693
0
11 Feb 2021
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
204
308
0
19 Oct 2020
1