2110.07058
Cited By
Ego4D: Around the World in 3,000 Hours of Egocentric Video
13 October 2021
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
Rohit Girdhar
Jackson Hamburger
Hao Jiang
Miao Liu
Xingyu Liu
Miguel Martin
Tushar Nagarajan
Ilija Radosavovic
Santhosh Kumar Ramakrishnan
Fiona Ryan
J. Sharma
Michael Wray
Mengmeng Xu
Eric Z. Xu
Chen Zhao
Siddhant Bansal
Dhruv Batra
Vincent Cartillier
Sean Crane
Tien Do
Morrie Doulaty
Akshay Erapalli
Christoph Feichtenhofer
A. Fragomeni
Qichen Fu
A. Gebreselasie
Cristina González
James M. Hillis
Xuhua Huang
Yifei Huang
Wenqi Jia
Weslie Khoo
J. Kolár
Satwik Kottur
Anurag Kumar
F. Landini
Chao Li
Yanghao Li
Zhenqiang Li
K. Mangalam
Raghava Modhugu
Jonathan Munro
Tullie Murrell
Takumi Nishiyasu
Will Price
Paola Ruiz Puentes
Merey Ramazanova
Leda Sari
Kiran Somasundaram
Audrey Southerland
Yusuke Sugano
Ruijie Tao
Minh Vo
Yuchen Wang
Xindi Wu
Takuma Yagi
Ziwei Zhao
Yunyi Zhu
Pablo Arbelaez
David J. Crandall
Dima Damen
G. Farinella
Christian Fuegen
Guohao Li
V. Ithapu
C. V. Jawahar
Hanbyul Joo
Kris Kitani
Haizhou Li
Richard Newcombe
A. Oliva
H. Park
James M. Rehg
Yoichi Sato
Jianbo Shi
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
Papers citing
"Ego4D: Around the World in 3,000 Hours of Egocentric Video"
50 / 791 papers shown
Title
A Survey on Robotics with Foundation Models: toward Embodied AI
Zhiyuan Xu
Kun Wu
Junjie Wen
Jinming Li
Ning Liu
Zhengping Che
Jian Tang
AI4CE
LRM
LM&Ro
31
24
0
04 Feb 2024
Large Language Models for Time Series: A Survey
Xiyuan Zhang
Ranak Roy Chowdhury
Rajesh K. Gupta
Jingbo Shang
AI4TS
85
55
0
02 Feb 2024
FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation
Takuma Yagi
Misaki Ohashi
Yifei Huang
Ryosuke Furuta
Shungo Adachi
Toutai Mitsuyama
Yoichi Sato
28
5
0
01 Feb 2024
RESPRECT: Speeding-up Multi-fingered Grasping with Residual Reinforcement Learning
Federico Ceola
Lorenzo Rosasco
Lorenzo Natale
40
5
0
26 Jan 2024
Incorporating simulated spatial context information improves the effectiveness of contrastive learning models
Lizhen Zhu
James Z. Wang
Wonseuk Lee
Bradley P. Wyble
40
2
0
26 Jan 2024
A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver Interaction in Los Angeles
Benjamin A.T. Graham
Lauren Brown
Georgios Chochlakis
Morteza Dehghani
Raquel Delerme
...
Mayaguez Salinas
Michael Sierra-Arévalo
Jackson Trager
Nicholas Weller
Shrikanth Narayan
HAI
18
3
0
24 Jan 2024
On the Efficacy of Text-Based Input Modalities for Action Anticipation
Apoorva Beedu
Karan Samel
Irfan Essa
55
2
0
23 Jan 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
21
7
0
22 Jan 2024
A Survey on African Computer Vision Datasets, Topics and Researchers
Abdul-Hakeem Omotayo
Ashery Mbilinyi
L. Ismaila
Houcemeddine Turki
Mahmoud Abdien
...
G. Dovonon
Zainab Akinjobi
Daniel A. Ajisafe
O. Adegboro
Mennatullah Siam
24
2
0
21 Jan 2024
Exploring Missing Modality in Multimodal Egocentric Datasets
Merey Ramazanova
Alejandro Pardo
Humam Alwassel
Guohao Li
EgoV
41
4
0
21 Jan 2024
General Flow as Foundation Affordance for Scalable Robot Learning
Chengbo Yuan
Chuan Wen
Tong Zhang
Yang Gao
AI4CE
26
31
0
21 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
46
11
0
18 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
29
5
0
18 Jan 2024
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization
Weiyao Wang
Pierre Gleize
Hao Tang
Xingyu Chen
Kevin J Liang
Matt Feiszli
28
1
0
17 Jan 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
Alon Vinnikov
Amir Ivry
Aviv Hurvitz
Igor Abramovski
S. Koubi
...
S. Sivasankaran
Yifan Gong
Min Tang
Huaming Wang
Eyal Krupka
41
20
0
16 Jan 2024
EgoGen: An Egocentric Synthetic Data Generator
Gen Li
Kai Zhao
Siwei Zhang
X. Lyu
Mihai Dusmanu
Yan Zhang
Marc Pollefeys
Siyu Tang
EgoV
VGen
42
14
0
16 Jan 2024
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding
Yun-Hai Liu
Haolin Yang
Xu Si
Ling Liu
Zipeng Li
Yuxiang Zhang
Yebin Liu
Li Yi
63
22
0
16 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024
FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos
S. DarshanSingh
Zeeshan Khan
Makarand Tapaswi
VLM
CLIP
36
3
0
15 Jan 2024
Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation
Yuanchen Ju
Kaizhe Hu
Guowei Zhang
Gu Zhang
Mingrun Jiang
Huazhe Xu
41
41
0
15 Jan 2024
Distilling Vision-Language Models on Millions of Videos
Yue Zhao
Long Zhao
Xingyi Zhou
Jialin Wu
Chun-Te Chu
...
Hartwig Adam
Ting Liu
Boqing Gong
Philipp Krahenbuhl
Liangzhe Yuan
VLM
34
13
0
11 Jan 2024
Dr²Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao
Shuming Liu
K. Mangalam
Guocheng Qian
Fatimah Zohra
Abdulmohsen Alghannam
Jitendra Malik
Guohao Li
54
3
0
08 Jan 2024
Detours for Navigating Instructional Videos
Kumar Ashutosh
Zihui Xue
Tushar Nagarajan
Kristen Grauman
34
6
0
03 Jan 2024
GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation
Zifan Wang
Junyu Chen
Ziqing Chen
Pengwei Xie
Rui Chen
Li Yi
34
9
0
01 Jan 2024
Retrieval-Augmented Egocentric Video Captioning
Jilan Xu
Yifei Huang
Junlin Hou
Guo Chen
Yue Zhang
Rui Feng
Weidi Xie
EgoV
62
30
0
01 Jan 2024
3D Human Pose Perception from Egocentric Stereo Videos
Hiroyasu Akada
Jian Wang
Vladislav Golyanik
Christian Theobalt
EgoV
40
13
0
30 Dec 2023
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
56
84
0
29 Dec 2023
A Simple LLM Framework for Long-Range Video Question-Answering
Ce Zhang
Taixi Lu
Md. Mohaiminul Islam
Ziyang Wang
Shoubin Yu
Mohit Bansal
Gedas Bertasius
110
82
0
28 Dec 2023
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
40
147
0
28 Dec 2023
HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings
Nikhil Mehta
Kevin J Liang
Jing Huang
Fu-Jen Chu
Li Yin
Tal Hassner
OODD
38
2
0
22 Dec 2023
CaptainCook4D: A dataset for understanding errors in procedural activities
Rohith Peddi
Shivvrat Arya
B. Challa
Likhitha Pallapothula
Akshay Vyas
...
Vasundhara Komaragiri
Eric D. Ragan
Nicholas Ruozzi
Yu Xiang
Vibhav Gogate
69
8
0
22 Dec 2023
Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
Hongtao Wu
Ya Jing
Chi-Hou Cheang
Guangzeng Chen
Jiafeng Xu
Xinghang Li
Minghuan Liu
Hang Li
Tao Kong
35
96
0
20 Dec 2023
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia
Miao Liu
Hao Jiang
Ishwarya Ananthabhotla
James M. Rehg
V. Ithapu
Ruohan Gao
EgoV
23
6
0
20 Dec 2023
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar
Yongqin Xian
A. Tonioni
Andrew Zisserman
Federico Tombari
38
12
0
19 Dec 2023
Learning Object State Changes in Videos: An Open-World Perspective
Zihui Xue
Kumar Ashutosh
Kristen Grauman
VGen
36
18
0
19 Dec 2023
Grounded Question-Answering in Long Egocentric Videos
Shangzhe Di
Weidi Xie
37
23
0
11 Dec 2023
RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos
Tanveer Hannan
Md. Mohaiminul Islam
Thomas Seidl
Gedas Bertasius
28
3
0
11 Dec 2023
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
Yi Chen
Yuying Ge
Yixiao Ge
Mingyu Ding
Bohao Li
Rui Wang
Rui-Lan Xu
Ying Shan
Xihui Liu
LLMAG
ELM
LRM
27
10
0
11 Dec 2023
BaRiFlex: A Robotic Gripper with Versatility and Collision Robustness for Robot Learning
Gu-Cheol Jeong
Arpit Bahety
Gabriel Pedraza
Ashish D. Deshpande
Roberto Martín-Martín
24
2
0
08 Dec 2023
Reconstructing Hands in 3D with Transformers
Georgios Pavlakos
Dandan Shan
Ilija Radosavovic
Angjoo Kanazawa
David Fouhey
Jitendra Malik
3DH
27
103
0
08 Dec 2023
MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding
Hongjie Zhang
Yi Liu
Lu Dong
Yifei Huang
Z. Ling
Yali Wang
Limin Wang
Yu Qiao
23
25
0
08 Dec 2023
LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
Ying Wang
Yanlai Yang
Mengye Ren
49
15
0
07 Dec 2023
Instance Tracking in 3D Scenes from Egocentric Videos
Yunhan Zhao
Haoyu Ma
Shu Kong
Charless C. Fowlkes
3DPC
36
4
0
07 Dec 2023
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning
Bolin Lai
Xiaoliang Dai
Lawrence Chen
Guan Pang
James M. Rehg
Miao Liu
41
15
0
06 Dec 2023
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin
Antonino Furnari
Kyle Min
Subarna Tripathi
G. Farinella
EgoV
27
12
0
06 Dec 2023
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
Trong-Thuan Nguyen
Pha Nguyen
Khoa Luu
37
12
0
05 Dec 2023
Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?
Rosario Leonardi
Antonino Furnari
Francesco Ragusa
G. Farinella
EgoV
23
3
0
05 Dec 2023
Zero-Shot Video Question Answering with Procedural Programs
Rohan Choudhury
Koichiro Niinuma
Kris M. Kitani
László A. Jeni
24
21
0
01 Dec 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
Yutong Bai
Xinyang Geng
K. Mangalam
Amir Bar
Alan Yuille
Trevor Darrell
Jitendra Malik
Alexei A. Efros
MLLM
VLM
29
156
0
01 Dec 2023
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans
Homanga Bharadhwaj
Abhi Gupta
Vikash Kumar
Shubham Tulsiani
LM&Ro
38
38
0
01 Dec 2023