2110.07058
Ego4D: Around the World in 3,000 Hours of Egocentric Video
13 October 2021
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
Rohit Girdhar
Jackson Hamburger
Hao Jiang
Miao Liu
Xingyu Liu
Miguel Martin
Tushar Nagarajan
Ilija Radosavovic
Santhosh Kumar Ramakrishnan
Fiona Ryan
J. Sharma
Michael Wray
Mengmeng Xu
Eric Z. Xu
Chen Zhao
Siddhant Bansal
Dhruv Batra
Vincent Cartillier
Sean Crane
Tien Do
Morrie Doulaty
Akshay Erapalli
Christoph Feichtenhofer
A. Fragomeni
Qichen Fu
A. Gebreselasie
Cristina González
James M. Hillis
Xuhua Huang
Yifei Huang
Wenqi Jia
Weslie Khoo
J. Kolár
Satwik Kottur
Anurag Kumar
F. Landini
Chao Li
Yanghao Li
Zhenqiang Li
K. Mangalam
Raghava Modhugu
Jonathan Munro
Tullie Murrell
Takumi Nishiyasu
Will Price
Paola Ruiz Puentes
Merey Ramazanova
Leda Sari
Kiran Somasundaram
Audrey Southerland
Yusuke Sugano
Ruijie Tao
Minh Vo
Yuchen Wang
Xindi Wu
Takuma Yagi
Ziwei Zhao
Yunyi Zhu
Pablo Arbelaez
David J. Crandall
Dima Damen
G. Farinella
Christian Fuegen
Guohao Li
V. Ithapu
C. V. Jawahar
Hanbyul Joo
Kris M. Kitani
Haizhou Li
Richard Newcombe
A. Oliva
H. Park
James M. Rehg
Yoichi Sato
Jianbo Shi
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
Papers citing
"Ego4D: Around the World in 3,000 Hours of Egocentric Video"
50 / 791 papers shown
Title
Zero-Shot Robot Manipulation from Passive Human Videos
Homanga Bharadhwaj
Abhi Gupta
Shubham Tulsiani
Vikash Kumar
26
35
0
03 Feb 2023
Egocentric Video Task Translation @ Ego4D Challenge 2022
Zihui Xue
Yale Song
Kristen Grauman
Lorenzo Torresani
EgoV
14
2
0
03 Feb 2023
Towards Continual Egocentric Activity Recognition: A Multi-modal Egocentric Activity Dataset for Continual Learning
Linfeng Xu
Qingbo Wu
Lili Pan
Fanman Meng
Hongliang Li
Chiyuan He
Hanxin Wang
Shaoxu Cheng
Yunshu Dai
EgoV
HAI
31
23
0
26 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
27
14
0
22 Jan 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
Mephisto: A Framework for Portable, Reproducible, and Iterative Crowdsourcing
Jack Urbanek
Pratik Ringshia
FedML
6
7
0
12 Jan 2023
HyRSM++: Hybrid Relation Guided Temporal Set Matching for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Zhe Zuo
Changxin Gao
Rong Jin
Nong Sang
31
23
0
09 Jan 2023
EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset
Hao Tang
Kevin J Liang
Matt Feiszli
Weiyao Wang
EgoV
30
10
0
09 Jan 2023
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
VLM
AI4TS
22
52
0
05 Jan 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
23
21
0
05 Jan 2023
GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition
V. V. Ramaswamy
S. Lin
Dora Zhao
Aaron B. Adcock
L. V. D. van der Maaten
Deepti Ghadiyaram
Olga Russakovsky
27
32
0
05 Jan 2023
PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan
Anmol Kalia
Vladan Petrovic
Yiqian Wen
Baixue Zheng
...
Abhishek Kadian
Amir Mousavi
Yi-Zhe Song
Abhimanyu Dubey
D. Mahajan
VLM
24
94
0
04 Jan 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
35
7
0
04 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
72
23
0
03 Jan 2023
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
Santhosh Kumar Ramakrishnan
Ziad Al-Halah
Kristen Grauman
117
39
0
02 Jan 2023
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering
Difei Gao
Luowei Zhou
Lei Ji
Linchao Zhu
Yezhou Yang
Mike Zheng Shou
44
60
0
19 Dec 2022
Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning
Zhecheng Yuan
Zhengrong Xue
Bo Yuan
Xueqian Wang
Yi Wu
Yang Gao
Huazhe Xu
SSL
OffRL
43
70
0
17 Dec 2022
Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games
Bolin Lai
Hongxin Zhang
Miao Liu
Aryan Pariani
Fiona Ryan
Wenqi Jia
Shirley Anugrah Hayati
James M. Rehg
Diyi Yang
15
8
0
16 Dec 2022
Policy Adaptation from Foundation Model Feedback
Yuying Ge
Annabella Macaluso
Erran L. Li
Ping Luo
Xiaolong Wang
LM&Ro
27
12
0
14 Dec 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
Jinjie Mai
Abdullah Hamdi
Silvio Giancola
Chen Zhao
Guohao Li
EgoV
38
14
0
14 Dec 2022
3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding
Lorenzo Pellegrini
Chenchen Zhu
Fanyi Xiao
Zhicheng Yan
Antonio Carta
Matthias De Lange
Vincenzo Lomonaco
Roshan Sumbaly
Pau Rodríguez López
David Vazquez
CLL
27
6
0
13 Dec 2022
Egocentric Video Task Translation
Zihui Xue
Yale Song
Kristen Grauman
Lorenzo Torresani
EgoV
29
13
0
13 Dec 2022
Doubly Right Object Recognition: A Why Prompt for Visual Rationales
Chengzhi Mao
Revant Teotia
Amrutha Sundar
Sachit Menon
Junfeng Yang
Xin Eric Wang
Carl Vondrick
18
29
0
12 Dec 2022
Breaking the "Object" in Video Object Segmentation
P. Tokmakov
Jie Li
Adrien Gaidon
VOS
29
39
0
12 Dec 2022
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Nicklas Hansen
Zhecheng Yuan
Yanjie Ze
Tongzhou Mu
Aravind Rajeswaran
H. Su
Huazhe Xu
Xiaolong Wang
32
65
0
12 Dec 2022
CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning
Zhao Mandi
Homanga Bharadhwaj
Vincent Moens
Shuran Song
Aravind Rajeswaran
Vikash Kumar
LM&Ro
28
68
0
12 Dec 2022
Learning Video Representations from Large Language Models
Yue Zhao
Ishan Misra
Philipp Krähenbühl
Rohit Girdhar
VLM
AI4TS
28
165
0
08 Dec 2022
VideoDex: Learning Dexterity from Internet Videos
Kenneth Shaw
Shikhar Bahl
Deepak Pathak
24
89
0
08 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
41
16
0
08 Dec 2022
HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration
Xingyu Liu
Deepak Pathak
Kris M. Kitani
21
7
0
08 Dec 2022
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation
Jiangyong Huang
William Zhu
Baoxiong Jia
Zan Wang
Xiaojian Ma
Qing Li
Siyuan Huang
37
5
0
28 Nov 2022
Interaction Region Visual Transformer for Egocentric Action Anticipation
Debaditya Roy
Ramanathan Rajendiran
Basura Fernando
36
15
0
25 Nov 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao
Shuming Liu
K. Mangalam
Guohao Li
38
17
0
25 Nov 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
34
11
0
24 Nov 2022
Learning to Imitate Object Interactions from Internet Videos
Austin Patel
Andrew E. Wang
Ilija Radosavovic
Jitendra Malik
29
21
0
23 Nov 2022
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
Ted Xiao
Harris Chan
P. Sermanet
Ayzaan Wahid
Anthony Brohan
Karol Hausman
Sergey Levine
Jonathan Tompson
VLM
LM&Ro
38
65
0
21 Nov 2022
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
Sun-Kyoo Hwang
Jaehong Yoon
Youngwan Lee
Sung Ju Hwang
31
6
0
19 Nov 2022
Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization
Mengmeng Xu
Yanghao Li
Cheng-Yang Fu
Guohao Li
Tao Xiang
Juan-Manuel Perez-Rua
25
13
0
18 Nov 2022
Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022
Jiachen Lei
Shuang Ma
Zhongjie Ba
Sai H. Vemprala
Ashish Kapoor
Kui Ren
EgoV
12
0
0
18 Nov 2022
Estimating more camera poses for ego-centric videos is essential for VQ3D
Jinjie Mai
Chen Zhao
Abdullah Hamdi
Silvio Giancola
Guohao Li
EgoV
19
4
0
18 Nov 2022
AVATAR submission to the Ego4D AV Transcription Challenge
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
22
0
0
18 Nov 2022
ReLER@ZJU Submission to the Ego4D Moment Queries Challenge 2022
Jiayi Shao
Xiaohan Wang
Yi Yang
20
1
0
17 Nov 2022
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges
Guo Chen
Sen Xing
Zhe Chen
Yi Wang
Kunchang Li
...
Hongjie Zhang
Tong Lu
Yali Wang
Liming Wang
Yu Qiao
38
46
0
17 Nov 2022
Exploring adaptation of VideoMAE for Audio-Visual Diarization & Social @ Ego4d Looking at me Challenge
Yinan He
Guo Chen
12
0
0
17 Nov 2022
Where a Strong Backbone Meets Strong Features -- ActionFormer for Ego4D Moment Queries Challenge
Fangzhou Mu
Sicheng Mo
Gillian Wang
Yin Li
30
3
0
16 Nov 2022
Learning Reward Functions for Robotic Manipulation by Observing Humans
Minttu Alakuijala
Gabriel Dulac-Arnold
Julien Mairal
Jean Ponce
Cordelia Schmid
OffRL
37
26
0
16 Nov 2022
An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022
Zhijian Hou
Wanjun Zhong
Lei Ji
Difei Gao
Kun Yan
W. Chan
Chong-Wah Ngo
Zheng Shou
Nan Duan
6
6
0
16 Nov 2022
Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022
Yin-Dong Zheng
Guo Chen
Jiahao Wang
Tong Lu
Liming Wang
37
0
0
16 Nov 2022
A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge
Sicheng Mo
Fangzhou Mu
Yin Li
22
7
0
16 Nov 2022
Towards Long-Tailed 3D Detection
Neehar Peri
Achal Dave
Deva Ramanan
Shu Kong
3DPC
19
20
0
16 Nov 2022