2110.07058
Cited By
Ego4D: Around the World in 3,000 Hours of Egocentric Video
13 October 2021
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
Rohit Girdhar
Jackson Hamburger
Hao Jiang
Miao Liu
Xingyu Liu
Miguel Martin
Tushar Nagarajan
Ilija Radosavovic
Santhosh Kumar Ramakrishnan
Fiona Ryan
J. Sharma
Michael Wray
Mengmeng Xu
Eric Z. Xu
Chen Zhao
Siddhant Bansal
Dhruv Batra
Vincent Cartillier
Sean Crane
Tien Do
Morrie Doulaty
Akshay Erapalli
Christoph Feichtenhofer
A. Fragomeni
Qichen Fu
A. Gebreselasie
Cristina González
James M. Hillis
Xuhua Huang
Yifei Huang
Wenqi Jia
Weslie Khoo
J. Kolár
Satwik Kottur
Anurag Kumar
F. Landini
Chao Li
Yanghao Li
Zhenqiang Li
K. Mangalam
Raghava Modhugu
Jonathan Munro
Tullie Murrell
Takumi Nishiyasu
Will Price
Paola Ruiz Puentes
Merey Ramazanova
Leda Sari
Kiran Somasundaram
Audrey Southerland
Yusuke Sugano
Ruijie Tao
Minh Vo
Yuchen Wang
Xindi Wu
Takuma Yagi
Ziwei Zhao
Yunyi Zhu
Pablo Arbelaez
David J. Crandall
Dima Damen
G. Farinella
Christian Fuegen
Guohao Li
V. Ithapu
C. V. Jawahar
Hanbyul Joo
Kris M. Kitani
Haizhou Li
Richard Newcombe
A. Oliva
H. Park
James M. Rehg
Yoichi Sato
Jianbo Shi
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
Papers citing
"Ego4D: Around the World in 3,000 Hours of Egocentric Video"
50 / 791 papers shown
Using Human Perception to Regularize Transfer Learning
Justin Dulay
Walter J. Scheirer
27
8
0
15 Nov 2022
Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
Gaurav Verma
Vishwa Vinay
Ryan A. Rossi
Srijan Kumar
VLM
AAML
11
8
0
04 Nov 2022
Human in the loop approaches in multi-modal conversational task guidance system development
R. Manuvinakurike
Sovan Biswas
G. Raffa
R. Beckwith
A. Rhodes
Meng Shi
Gesem Gudino Mejia
Saurav Sahay
L. Nachman
38
2
0
03 Nov 2022
IMU2CLIP: Multimodal Contrastive Learning for IMU Motion Sensors from Egocentric Videos and Text
Seungwhan Moon
Andrea Madotto
Zhaojiang Lin
Alireza Dirafzoon
Aparajita Saraf
Amy Bearman
Babak Damavandi
VLM
20
36
0
26 Oct 2022
Refining Action Boundaries for One-stage Detection
Hanyuan Wang
Majid Mirmehdi
Dima Damen
Toby Perrett
ObjD
32
1
0
25 Oct 2022
Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Soroush Nasiriany
Tian Gao
Ajay Mandlekar
Yuke Zhu
SSL
44
47
0
20 Oct 2022
Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization
Kyle Min
VLM
14
9
0
14 Oct 2022
Retrospectives on the Embodied AI Workshop
Matt Deitke
Dhruv Batra
Yonatan Bisk
Tommaso Campari
Angel X. Chang
...
Jesse Thomason
Alexander Toshev
Joanne Truong
Luca Weihs
Jiajun Wu
LM&Ro
37
51
0
13 Oct 2022
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Albert Yu
Raymond J. Mooney
LM&Ro
32
19
0
10 Oct 2022
EgoTaskQA: Understanding Human Tasks in Egocentric Videos
Baoxiong Jia
Ting Lei
Song-Chun Zhu
Siyuan Huang
EgoV
30
62
0
08 Oct 2022
GNM: A General Navigation Model to Drive Any Robot
Dhruv Shah
A. Sridhar
Arjun Bhorkar
Noriaki Hirose
Sergey Levine
24
104
0
07 Oct 2022
Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic
Tete Xiao
Stephen James
Pieter Abbeel
Jitendra Malik
Trevor Darrell
SSL
156
241
0
06 Oct 2022
Compressed Vision for Efficient Video Understanding
Olivia Wiles
João Carreira
Iain Barr
Andrew Zisserman
Mateusz Malinowski
27
7
0
06 Oct 2022
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Yecheng Jason Ma
Shagun Sodhani
Dinesh Jayaraman
Osbert Bastani
Vikash Kumar
Amy Zhang
SSL
OffRL
33
284
0
30 Sep 2022
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Weicheng Kuo
Huayu Chen
Xiuye Gu
A. Piergiovanni
A. Angelova
MLLM
VLM
ObjD
51
134
0
30 Sep 2022
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
78
23
0
27 Sep 2022
Visual Object Tracking in First Person Vision
Matteo Dunnhofer
Antonino Furnari
G. Farinella
C. Micheloni
31
33
0
27 Sep 2022
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from Egocentric Video Data
Yichen Ding
Ziming Zhang
Yanhua Li
Xun Zhou
42
3
0
27 Sep 2022
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
Ahmad Darkhalil
Dandan Shan
Bin Zhu
Jian Ma
Amlan Kar
Richard E. L. Higgins
Sanja Fidler
David Fouhey
Dima Damen
VOS
50
98
0
26 Sep 2022
T2FPV: Dataset and Method for Correcting First-Person View Errors in Pedestrian Trajectory Prediction
Ben Stoler
Meghdeep Jana
Soonmin Hwang
Jean Oh
35
3
0
22 Sep 2022
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
Zhijian Hou
Wanjun Zhong
Lei Ji
Difei Gao
Kun Yan
W. Chan
Chong-Wah Ngo
Zheng Shou
Nan Duan
AI4TS
39
24
0
22 Sep 2022
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain
Francesco Ragusa
Antonino Furnari
G. Farinella
EgoV
43
24
0
19 Sep 2022
WildQA: In-the-Wild Video Question Answering
Santiago Castro
Naihao Deng
Pingxuan Huang
Mihai Burzo
Rada Mihalcea
72
7
0
14 Sep 2022
Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations
Victoria Manousaki
K. Papoutsakis
Antonis Argyros
19
3
0
12 Sep 2022
Grounded Affordance from Exocentric View
Hongcheng Luo
Wei Zhai
Jing Zhang
Yang Cao
Dacheng Tao
19
17
0
28 Aug 2022
Clustering Egocentric Images in Passive Dietary Monitoring with Self-Supervised Learning
Jiachuan Peng
Peilun Shi
Jianing Qiu
Xinwei Ju
Frank P.-W. Lo
...
M. McCrory
Edward Sazonov
M. Sun
Gary Frost
Benny Lo
20
4
0
25 Aug 2022
Self-Contained Entity Discovery from Captioned Videos
M. Ayoughi
P. Mettes
Paul T. Groth
28
2
0
13 Aug 2022
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation
Bolin Lai
Miao Liu
Fiona Ryan
James M. Rehg
ViT
40
33
0
08 Aug 2022
Fine-Grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications
Lingzhi Zhang
Shenghao Zhou
Simon Stent
Jianbo Shi
EgoV
34
60
0
07 Aug 2022
Negative Frames Matter in Egocentric Visual Query 2D Localization
Mengmeng Xu
Cheng-Yang Fu
Yanghao Li
Guohao Li
Juan-Manuel Perez-Rua
Tao Xiang
EgoV
10
11
0
03 Aug 2022
UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture
Hiroyasu Akada
Jian Wang
Soshi Shimada
Masaki Takahashi
Christian Theobalt
Vladislav Golyanik
EgoV
54
44
0
02 Aug 2022
Intention-Conditioned Long-Term Human Egocentric Action Forecasting
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
EgoV
24
28
0
25 Jul 2022
Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism
Md. Mohaiminul Islam
Gedas Bertasius
19
7
0
24 Jul 2022
EgoEnv: Human-centric environment representations from egocentric video
Tushar Nagarajan
Santhosh Kumar Ramakrishnan
Ruta Desai
James M. Hillis
Kristen Grauman
EgoV
36
19
0
22 Jul 2022
Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022
María Escobar
Laura Alexandra Daza
Cristina González
Jordi Pont-Tuset
Pablo Arbelaez
13
8
0
22 Jul 2022
Compound Prototype Matching for Few-shot Action Recognition
Yifei Huang
Lijin Yang
Yoichi Sato
27
43
0
12 Jul 2022
Fine-grained Activities of People Worldwide
J. Byrne
Greg Castañón
Zhongheng Li
G. Ettinger
18
3
0
11 Jul 2022
Egocentric Video-Language Pretraining @ Ego4D Challenge 2022
Kevin Qinghong Lin
Alex Jinpeng Wang
Mattia Soldan
Michael Wray
Rui Yan
...
Hongfa Wang
Dima Damen
Guohao Li
Wei Liu
Mike Zheng Shou
EgoV
32
7
0
04 Jul 2022
Video + CLIP Baseline for Ego4D Long-term Action Anticipation
Srijan Das
Michael S. Ryoo
VLM
CLIP
19
17
0
01 Jul 2022
ReLER@ZJU-Alibaba Submission to the Ego4D Natural Language Queries Challenge 2022
Na Liu
Xiaohan Wang
Xiaobo Li
Yi Yang
Yueting Zhuang
24
18
0
01 Jul 2022
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
112
111
0
23 Jun 2022
Behavior Transformers: Cloning k modes with one stone
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
28
223
0
22 Jun 2022
Context-aware Proposal Network for Temporal Action Detection
Xiang Wang
H. Zhang
Shiwei Zhang
Changxin Gao
Yuanjie Shao
Nong Sang
17
2
0
18 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
51
352
0
17 Jun 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
36
80
0
16 Jun 2022
Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
13
3
0
15 Jun 2022
The Metaverse Data Deluge: What Can We Do About It?
Beng Chin Ooi
Gang Chen
Mike Zheng Shou
K. Tan
A. Tung
X. Xiao
J. Yip
Meihui Zhang
31
10
0
14 Jun 2022
Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation
Wouter Van Gansbeke
Simon Vandenhende
Luc Van Gool
44
55
0
13 Jun 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
19
0
0
13 Jun 2022
Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey
Takehiko Ohkawa
Ryosuke Furuta
Yoichi Sato
3DH
27
20
0
05 Jun 2022