arXiv:2005.00343
The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines
29 April 2020
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
Evangelos Kazakos
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
Papers citing "The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines"
50 / 146 papers shown
Title
EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction
Irving Fang
Yuzhong Chen
Yifan Wang
Jianghan Zhang
Qiushi Zhang
...
Xibo He
Weibo Gao
Hao Su
Yiming Li
Chen Feng
EgoV
61
2
0
08 Mar 2024
A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives
Simone Alberto Peirone
Francesca Pistilli
A. Alliegro
Giuseppe Averta
EgoV
115
7
0
05 Mar 2024
Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection
Pengfei Zhou
Weiqing Min
Jiajun Song
Yang Zhang
Shuqiang Jiang
76
11
0
14 Feb 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
92
8
0
22 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
108
7
0
18 Jan 2024
EgoGen: An Egocentric Synthetic Data Generator
Gen Li
Kai Zhao
Siwei Zhang
X. Lyu
Mihai Dusmanu
Yan Zhang
Marc Pollefeys
Siyu Tang
EgoV
VGen
109
15
0
16 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
87
5
0
08 Jan 2024
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
60
4
0
08 Jan 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Jiebo Luo
Chenliang Xu
VLM
222
100
0
29 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
71
0
0
24 Dec 2023
CaptainCook4D: A dataset for understanding errors in procedural activities
Rohith Peddi
Shivvrat Arya
B. Challa
Likhitha Pallapothula
Akshay Vyas
...
Vasundhara Komaragiri
Eric D. Ragan
Nicholas Ruozzi
Yu Xiang
Vibhav Gogate
102
14
0
22 Dec 2023
Early Action Recognition with Action Prototypes
G. Camporese
Alessandro Bergamo
Xunyu Lin
Joseph Tighe
Davide Modolo
EgoV
33
0
0
11 Dec 2023
PALM: Predicting Actions through Language Models
Sanghwan Kim
Daoji Huang
Yongqin Xian
Otmar Hilliges
Luc Van Gool
Xi Wang
VLM
81
14
0
29 Nov 2023
Object-based (yet Class-agnostic) Video Domain Adaptation
Dantong Niu
Amir Bar
Roei Herzig
Trevor Darrell
Anna Rohrbach
73
1
0
29 Nov 2023
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition
Jiaming Zhou
Hanjun Li
Kun-Yu Lin
Junwei Liang
70
1
0
28 Nov 2023
DiffAnt: Diffusion Models for Action Anticipation
Zeyun Zhong
Chengzhi Wu
Manuel Martin
Michael Voit
Juergen Gall
Jürgen Beyerer
DiffM
VGen
68
6
0
27 Nov 2023
D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction
Bowen Fu
Gu Wang
Chenyangguang Zhang
Yan Di
Ziqin Huang
Zhiying Leng
Fabian Manhardt
Xiangyang Ji
F. Tombari
77
2
0
23 Nov 2023
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration
Naoki Wake
Atsushi Kanehira
Kazuhiro Sasabuchi
Jun Takamatsu
Katsushi Ikeuchi
LM&Ro
81
69
0
20 Nov 2023
Multi Sentence Description of Complex Manipulation Action Videos
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
Florentin Wörgötter
77
3
0
13 Nov 2023
Aria-NeRF: Multimodal Egocentric View Synthesis
Jiankai Sun
Jianing Qiu
Chuanyang Zheng
Johnathan Tucker
Javier Yu
Mac Schwager
EgoV
96
5
0
11 Nov 2023
OtterHD: A High-Resolution Multi-modality Model
Yue Liu
Peiyuan Zhang
Jingkang Yang
Yuanhan Zhang
Fanyi Pu
Ziwei Liu
VLM
MLLM
100
66
0
07 Nov 2023
Characterizing Barriers and Technology Needs in the Kitchen for Blind and Low Vision People
Ru Wang
Nihan Zhou
Tam Nguyen
Sanbrita Mondal
Bilge Mutlu
Yuhang Zhao
72
2
0
09 Oct 2023
A Survey on Deep Learning Techniques for Action Anticipation
Zeyun Zhong
Manuel Martin
Michael Voit
Juergen Gall
Jürgen Beyerer
116
8
0
29 Sep 2023
CaSAR: Contact-aware Skeletal Action Recognition
Junan Lin
Zhichao Sun
Enjie Cao
Taein Kwon
Mahdi Rad
Marc Pollefeys
107
1
0
17 Sep 2023
Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion
Syed Waleed Hyder
Muhammad Usama
Anas Zafar
Muhammad Naufil
Fawad Javed Fateh
Andrey Konin
M. Zia
Quoc-Huy Tran
97
5
0
12 Sep 2023
SSVOD: Semi-Supervised Video Object Detection with Sparse Annotations
Tanvir Mahmud
Chun-Hao Liu
Burhaneddin Yaman
Diana Marculescu
54
4
0
04 Sep 2023
MOFO: MOtion FOcused Self-Supervision for Video Understanding
Mona Ahmadian
Frank Guerin
Andrew Gilbert
72
2
0
23 Aug 2023
Deep Metric Loss for Multimodal Learning
Sehwan Moon
Hyun-Yong Lee
55
0
0
21 Aug 2023
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
Qi Zhao
Shijie Wang
Ce Zhang
Changcheng Fu
Minh Quan Do
Nakul Agarwal
Kwonjoon Lee
Chen Sun
LM&Ro
137
51
0
31 Jul 2023
PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking
Yang Zheng
Adam W. Harley
Bokui Shen
Gordon Wetzstein
Leonidas Guibas
101
135
0
27 Jul 2023
Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting
Wentao Bao
Lele Chen
Libing Zeng
Zhong Li
Yinghao Xu
Junsong Yuan
Yubo Kong
68
16
0
17 Jul 2023
Streaming egocentric action anticipation: An evaluation scheme and approach
Antonino Furnari
G. Farinella
EgoV
62
3
0
29 Jun 2023
Action Anticipation with Goal Consistency
Olga Zatsarynna
Juergen Gall
112
10
0
26 Jun 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
108
18
0
20 Jun 2023
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LRM
LM&Ro
71
3
0
26 May 2023
Learning Hand-Held Object Reconstruction from In-The-Wild Videos
Aditya Prakash
Matthew Chang
Matthew Jin
Saurabh Gupta
3DH
58
5
0
04 May 2023
MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Changxin Gao
Yingya Zhang
Deli Zhao
Nong Sang
84
45
0
03 Apr 2023
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
Yiwu Zhong
Licheng Yu
Yang Bai
Shangwen Li
Xueting Yan
Yin Li
AI4TS
106
34
0
31 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM
VGen
64
43
0
27 Mar 2023
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
Wenjie Zhu
M. Omar
85
22
0
19 Mar 2023
EgoViT: Pyramid Video Transformer for Egocentric Action Recognition
Chen-Ming Pan
Zhiqi Zhang
Senem Velipasalar
Yi Tian Xu
ViT
66
1
0
15 Mar 2023
HyRSM++: Hybrid Relation Guided Temporal Set Matching for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Zhe Zuo
Changxin Gao
Rong Jin
Nong Sang
103
26
0
09 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
126
26
0
03 Jan 2023
FEVA: Fast Event Video Annotation Tool
Snehesh Shrestha
William Sentosatio
Huiashu Peng
Cornelia Fermuller
Yiannis Aloimonos
79
5
0
01 Jan 2023
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
Oswald Lanz
81
1
0
17 Dec 2022
3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding
Lorenzo Pellegrini
Chenchen Zhu
Fanyi Xiao
Zhicheng Yan
Antonio Carta
Matthias De Lange
Vincenzo Lomonaco
Roshan Sumbaly
Pau Rodríguez López
David Vazquez
CLL
105
7
0
13 Dec 2022
Egocentric Video Task Translation
Zihui Xue
Yale Song
Kristen Grauman
Lorenzo Torresani
EgoV
73
13
0
13 Dec 2022
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
Zeyun Zhong
David Schneider
Michael Voit
Rainer Stiefelhagen
Jürgen Beyerer
121
47
0
23 Oct 2022
Robot Learning Theory of Mind through Self-Observation: Exploiting the Intentions-Beliefs Synergy
Francesca Bianco
D. Ognibene
66
2
0
17 Oct 2022
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain
Francesco Ragusa
Antonino Furnari
G. Farinella
EgoV
109
26
0
19 Sep 2022