2110.07058
Cited By
Ego4D: Around the World in 3,000 Hours of Egocentric Video
13 October 2021
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
Rohit Girdhar
Jackson Hamburger
Hao Jiang
Miao Liu
Xingyu Liu
Miguel Martin
Tushar Nagarajan
Ilija Radosavovic
Santhosh Kumar Ramakrishnan
Fiona Ryan
J. Sharma
Michael Wray
Mengmeng Xu
Eric Z. Xu
Chen Zhao
Siddhant Bansal
Dhruv Batra
Vincent Cartillier
Sean Crane
Tien Do
Morrie Doulaty
Akshay Erapalli
Christoph Feichtenhofer
A. Fragomeni
Qichen Fu
A. Gebreselasie
Cristina González
James M. Hillis
Xuhua Huang
Yifei Huang
Wenqi Jia
Weslie Khoo
J. Kolár
Satwik Kottur
Anurag Kumar
F. Landini
Chao Li
Yanghao Li
Zhenqiang Li
K. Mangalam
Raghava Modhugu
Jonathan Munro
Tullie Murrell
Takumi Nishiyasu
Will Price
Paola Ruiz Puentes
Merey Ramazanova
Leda Sari
Kiran Somasundaram
Audrey Southerland
Yusuke Sugano
Ruijie Tao
Minh Vo
Yuchen Wang
Xindi Wu
Takuma Yagi
Ziwei Zhao
Yunyi Zhu
Pablo Arbelaez
David J. Crandall
Dima Damen
G. Farinella
Christian Fuegen
Guohao Li
V. Ithapu
C. V. Jawahar
Hanbyul Joo
Kris M. Kitani
Haizhou Li
Richard Newcombe
A. Oliva
H. Park
James M. Rehg
Yoichi Sato
Jianbo Shi
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
Papers citing
"Ego4D: Around the World in 3,000 Hours of Egocentric Video"
50 / 791 papers shown
Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes
Aran Nayebi
R. Rajalingham
M. Jazayeri
G. R. Yang
36
19
0
19 May 2023
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Siyuan Huang
Zhengkai Jiang
Hao Dong
Yu Qiao
Peng Gao
Hongsheng Li
LM&Ro
33
93
0
18 May 2023
Going Denser with Open-Vocabulary Part Segmentation
Pei Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
ObjD
VLM
27
45
0
18 May 2023
Paxion: Patching Action Knowledge in Video-Language Foundation Models
Zhenhailong Wang
Ansel Blume
Sha Li
Genglin Liu
Jaemin Cho
Zineng Tang
Joey Tianyi Zhou
Heng Ji
KELM
VGen
25
26
0
18 May 2023
MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition
Xinyu Gong
S. Mohan
Naina Dhingra
Jean-Charles Bazin
Yilei Li
Zhangyang Wang
Rakesh Ranjan
EgoV
56
17
0
12 May 2023
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
ViT
VLM
33
73
0
11 May 2023
Towards a Better Understanding of the Computer Vision Research Community in Africa
Abdul-Hakeem Omotayo
Mai Gamal
E. Ehab
G. Dovonon
Zainab Akinjobi
...
Abigail Oppong
Yvan Pimi
Karim Gamal
Ro'ya-CV4Africa
Mennatullah Siam
33
4
0
11 May 2023
Learning Video-Conditioned Policies for Unseen Manipulation Tasks
Elliot Chane-Sane
Cordelia Schmid
Ivan Laptev
27
18
0
10 May 2023
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar
Alaaeldin El-Nouby
Zhuang Liu
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
VLM
39
850
0
09 May 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai
Fiona Ryan
Wenqi Jia
Miao Liu
James M. Rehg
EgoV
32
8
0
06 May 2023
Learning Hand-Held Object Reconstruction from In-The-Wild Videos
Aditya Prakash
Matthew Chang
Matthew Jin
Saurabh Gupta
3DH
20
3
0
04 May 2023
ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation
Zehao Zhu
Jiashun Wang
Yuzhe Qin
Deqing Sun
Varun Jampani
Xiaolong Wang
DiffM
45
14
0
02 May 2023
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
Takehiko Ohkawa
Kun He
Fadime Sener
Tomás Hodan
Luan Tran
Cem Keskin
27
38
0
24 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
50
274
0
24 Apr 2023
SViTT: Temporal Learning of Sparse Video-Text Transformers
Yi Li
Kyle Min
Subarna Tripathi
Nuno Vasconcelos
28
12
0
18 Apr 2023
Grounding Classical Task Planners via Vision-Language Models
Xiaohan Zhang
Yan Ding
S. Amiri
Hao Yang
Andy Kaminski
Chad Esselink
Shiqi Zhang
23
17
0
17 Apr 2023
Pretrained Language Models as Visual Planners for Human Assistance
Dhruvesh Patel
H. Eghbalzadeh
Nitin Kamra
Michael L. Iuzzolino
Unnat Jain
Ruta Desai
LM&Ro
19
24
0
17 Apr 2023
Affordances from Human Videos as a Versatile Representation for Robotics
Shikhar Bahl
Russell Mendonca
Lili Chen
Unnat Jain
Deepak Pathak
44
164
0
17 Apr 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
37
3
0
15 Apr 2023
WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
Marius Bock
Hilde Kuehne
Kristof Van Laerhoven
Michael Moeller
EgoV
40
24
0
11 Apr 2023
SoccerNet-Caption: Dense Video Captioning for Soccer Broadcasts Commentaries
Hassan Mkhallati
A. Cioppa
Silvio Giancola
Guohao Li
Marc Van Droogenbroeck
30
33
0
10 Apr 2023
StillFast: An End-to-End Approach for Short-Term Object Interaction Anticipation
Francesco Ragusa
G. Farinella
Antonino Furnari
21
18
0
08 Apr 2023
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes
A. Patil
Supriya Gadi Patil
Manyi Li
Matthew Fisher
Manolis Savva
Haotong Zhang
3DV
32
17
0
06 Apr 2023
Boundary-Denoising for Video Activity Localization
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
34
10
0
06 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
60
6,839
0
05 Apr 2023
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
Arjun Majumdar
Karmesh Yadav
Sergio Arnaud
Yecheng Jason Ma
Claire Chen
...
Dhruv Batra
Yixin Lin
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
LM&Ro
24
173
0
31 Mar 2023
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
Brian Chen
Nina Shvetsova
Andrew Rouditchenko
D. Kondermann
Samuel Thomas
Shih-Fu Chang
Rogerio Feris
James R. Glass
Hilde Kuehne
40
7
0
29 Mar 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
29
15
0
29 Mar 2023
XAIR: A Framework of Explainable AI in Augmented Reality
Xuhai Xu
Anna Yu
Tanya R. Jonker
Kashyap Todi
Feiyu Lu
...
Narine Kokhlikyan
Fulton Wang
P. Sorenson
Sophie Kahyun Kim
Hrvoje Benko
39
49
0
28 Mar 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
29
16
0
28 Mar 2023
GeoNet: Benchmarking Unsupervised Adaptation across Geographies
Tarun Kalluri
Wangdong Xu
Manmohan Chandraker
OOD
34
15
0
27 Mar 2023
Toward Human-Like Social Robot Navigation: A Large-Scale, Multi-Modal, Social Human Navigation Dataset
Duc M. Nguyen
Mohammad Nazeri
Amirreza Payandeh
A. Datar
Xuesu Xiao
52
30
0
27 Mar 2023
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
Andong Deng
Taojiannan Yang
Chong Chen
AI4TS
27
13
0
23 Mar 2023
Boosting Reinforcement Learning and Planning with Demonstrations: A Survey
Tongzhou Mu
H. Su
OffRL
35
1
0
23 Mar 2023
Egocentric Audio-Visual Object Localization
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
EgoV
29
30
0
23 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH
LM&MA
42
127
0
21 Mar 2023
PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining
G. Thomas
Ching-An Cheng
Ricky Loynd
Felipe Vieira Frujeri
Vibhav Vineet
Mihai Jalobeanu
Andrey Kolobov
SSL
29
8
0
15 Mar 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
43
12
0
15 Mar 2023
Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations
Jianren Wang
Sudeep Dasari
Mohan Kumar Srirama
Shubham Tulsiani
Abhi Gupta
SSL
61
15
0
14 Mar 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
91
77
0
13 Mar 2023
The Audio-Visual BatVision Dataset for Research on Sight and Sound
Amandine Brunetto
Sascha Hornauer
Stella X. Yu
Fabien Moutarde
41
3
0
13 Mar 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
223
320
0
08 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
98
156
0
07 Mar 2023
Decoupling Human and Camera Motion from Videos in the Wild
Vickie Ye
Georgios Pavlakos
Jitendra Malik
Angjoo Kanazawa
3DH
28
83
0
24 Feb 2023
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Percy Liang
LM&Ro
SSL
47
145
0
24 Feb 2023
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
Chen Wang
Linxi Fan
Jiankai Sun
Ruohan Zhang
Li Fei-Fei
Danfei Xu
Yuke Zhu
Anima Anandkumar
44
184
0
24 Feb 2023
Connecting Vision and Language with Video Localized Narratives
P. Voigtlaender
Soravit Changpinyo
Jordi Pont-Tuset
Radu Soricut
V. Ferrari
VGen
49
21
0
22 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
26
7
0
16 Feb 2023
Anticipating Next Active Objects for Egocentric Videos
Sanket Thakur
Cigdem Beyan
Pietro Morerio
Vittorio Murino
Alessio Del Bue
EgoV
39
6
0
13 Feb 2023
Ethical Considerations for Responsible Data Curation
Jerone T. A. Andrews
Dora Zhao
William Thong
Apostolos Modas
Orestis Papakyriakopoulos
Alice Xiang
17
19
0
07 Feb 2023