The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines

29 April 2020
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
Evangelos Kazakos
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
    EgoV

Papers citing "The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines"

50 / 146 papers shown
Proactive Assistant Dialogue Generation from Streaming Egocentric Videos
Yichi Zhang
Xin Luna Dong
Zhaojiang Lin
Andrea Madotto
Anuj Kumar
Babak Damavandi
J. Chai
Seungwhan Moon
55
0
0
06 Jun 2025
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision
Yuping He
Yifei Huang
Guo Chen
Lidong Lu
Baoqi Pei
Jilan Xu
Tong Lu
Yoichi Sato
EgoV
84
0
0
06 Jun 2025
Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos
Aditi Tiwari
Farzaneh Masoud
Dac Trong Nguyen
Jill Kraft
Heng Ji
Klara Nahrstedt
34
0
0
02 Jun 2025
Deep Temporal Reasoning in Video Language Models: A Cross-Linguistic Evaluation of Action Duration and Completion through Perfect Times
Olga Loginova
Sofía Ortega Loguinova
LRM
35
0
0
01 Jun 2025
Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames
Sahithya Ravi
Gabriel Sarch
Vibhav Vineet
A. D. Wilson
Balasaravanan Thoravi Kumaravel
39
0
0
30 May 2025
A Probabilistic Jump-Diffusion Framework for Open-World Egocentric Activity Recognition
Sanjoy Kundu
Shanmukha Vellamcheti
Sathyanarayanan N. Aakur
EgoV
32
0
0
28 May 2025
Predicting Implicit Arguments in Procedural Video Instructions
Anil Batra
Laura Sevilla-Lara
Marcus Rohrbach
Frank Keller
61
0
0
27 May 2025
Vision and Intention Boost Large Language Model in Long-Term Action Anticipation
Congqi Cao
Lanshu Hu
Yating Yu
Y. Zhang
VLM
441
0
0
03 May 2025
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
Sanjoy Kundu
Shanmukha Vellamchetti
Sathyanarayanan N. Aakur
EgoV
94
2
0
04 Apr 2025
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
139
1
0
27 Mar 2025
EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos
Nathan Darjana
Ryo Fujii
Hideo Saito
Hiroki Kajita
114
0
0
24 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
108
0
0
19 Mar 2025
DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction
Rui Wang
Q. Lohmeyer
Mirko Meboldt
Siyu Tang
3DGS
106
1
0
17 Mar 2025
EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera
Luming Wang
Hao-miao Shi
X. Yin
Kailun Yang
Kaiwei Wang
Jian Bai
EgoV
SLR
132
0
0
16 Mar 2025
Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model
Ali Vosoughi
Dimitra Emmanouilidou
H. Gamper
131
1
0
12 Mar 2025
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding
Haoyu Zhang
Qiaohui Chu
Meng Liu
Yunxiao Wang
Bin Wen
Fan Yang
Yan Li
Di Zhang
Yaowei Wang
Liqiang Nie
EgoV
112
5
0
12 Mar 2025
EgoLife: Towards Egocentric Life Assistant
Jingkang Yang
Shuai Liu
Hongming Guo
Yuhao Dong
Xinyu Zhang
...
Joerg Widmer
Francesco Gringoli
Lei Yang
Bo Li
Ziwei Liu
EgoV
107
6
0
05 Mar 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei
Yuanmin Huang
Jilan Xu
Guo Chen
Yuping He
...
Yali Wang
Weidi Xie
Yu Qiao
Leilei Gan
Limin Wang
96
2
0
02 Mar 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
214
0
0
12 Feb 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
332
2
0
18 Dec 2024
Detecting Activities of Daily Living in Egocentric Video to Contextualize Hand Use at Home in Outpatient Neurorehabilitation Settings
Adesh Kadambi
José Zariffa
EgoV
102
2
0
14 Dec 2024
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
Lu Qiu
Yuying Ge
Yi Chen
Yixiao Ge
Ying Shan
Xihui Liu
LLMAG
LRM
211
8
0
05 Dec 2024
Streaming Detection of Queried Event Start
Cristobal Eyzaguirre
Eric Tang
S. Buch
Adrien Gaidon
Jiajun Wu
Juan Carlos Niebles
116
0
0
04 Dec 2024
EgoCast: Forecasting Egocentric Human Pose in the Wild
María Escobar
Juanita Puentes
Cristhian Forigua
Jordi Pont-Tuset
Kevis-Kokitsi Maninis
Pablo Arbeláez
EgoV
130
2
0
03 Dec 2024
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos Referring to Procedural Texts
Yuto Haneji
Taichi Nishimura
Hirotaka Kameko
Keisuke Shirai
Tomoya Yoshida
Keiya Kajimura
Koki Yamamoto
Taiyu Cui
Tomohiro Nishimoto
Shinsuke Mori
EgoV
84
0
0
07 Oct 2024
EgoLM: Multi-Modal Language Model of Egocentric Motions
Fangzhou Hong
Vladimir Guzov
Hyo Jin Kim
Yuting Ye
Richard Newcombe
Ziwei Liu
Lingni Ma
78
4
0
26 Sep 2024
Open-Vocabulary Action Localization with Iterative Visual Prompting
Naoki Wake
Atsushi Kanehira
Kazuhiro Sasabuchi
Jun Takamatsu
Katsushi Ikeuchi
VLM
84
1
0
30 Aug 2024
Unveiling Visual Biases in Audio-Visual Localization Benchmarks
Liangyu Chen
Zihao Yue
Boshen Xu
Qin Jin
SSL
92
0
0
25 Aug 2024
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
Gen Li
Nikolaos Tsagkas
Jifei Song
Ruaridh Mon-Williams
S. Vijayakumar
Kun Shao
Laura Sevilla-Lara
80
9
0
19 Aug 2024
3D-Aware Instance Segmentation and Tracking in Egocentric Videos
Yash Bhalgat
Vadim Tschernezki
Iro Laina
João F. Henriques
Andrea Vedaldi
Andrew Zisserman
VOS
69
2
0
19 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
65
1
0
05 Aug 2024
PEAR: Phrase-Based Hand-Object Interaction Anticipation
Zichen Zhang
Hongcheng Luo
Wei Zhai
N. A. Ushakov
Yu Kang
99
6
0
31 Jul 2024
EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos
Aashish Rai
Srinath Sridhar
DiffM
75
4
0
30 Jul 2024
Decoupled Prompt-Adapter Tuning for Continual Activity Recognition
Di Fu
Thanh Vinh Vo
Haozhe Ma
Tze-Yun Leong
54
1
0
20 Jul 2024
QuIIL at T3 challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View
T. Vuong
Doanh C. Bui
Jin Tae Kwak
54
0
0
18 Jul 2024
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation
Olga Zatsarynna
Emad Bahrami
Yazan Abu Farha
Gianpiero Francesca
Juergen Gall
132
2
0
16 Jul 2024
Open-Event Procedure Planning in Instructional Videos
Yilu Wu
Hanlin Wang
Jing Wang
Limin Wang
93
1
0
06 Jul 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
149
7
0
20 Jun 2024
Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild
Lingni Ma
Yuting Ye
Fangzhou Hong
Vladimir Guzov
Yifeng Jiang
...
C. Karen Liu
Ziwei Liu
Jakob Engel
R. D. Nardi
Richard Newcombe
94
25
0
14 Jun 2024
PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Steven Abreu
Tiffany D. Do
Ruofei Du
Eric J. Gonzalez
Lee Payne
Daniel J. McDuff
Mar Gonzalez-Franco
76
2
0
14 Jun 2024
OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
Ming Hu
Peng Xia
Lin Wang
Siyuan Yan
Feilong Tang
...
Xuelian Cheng
Jun Cheng
Chi Liu
Kaijing Zhou
Zongyuan Ge
98
21
0
11 Jun 2024
FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination
Pengfei Zhou
Weiqing Min
Chaoran Fu
Ying Jin
Mingyu Huang
Xiangyang Li
Shuhuan Mei
Shuqiang Jiang
88
10
0
11 Jun 2024
ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LM&Ro
LRM
59
1
0
09 Jun 2024
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
Zitao Song
Chao Yang
Chaojie Wang
Bo An
Shuang Li
125
7
0
03 Jun 2024
EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views
Yuhang Yang
Wei Zhai
Chengfeng Wang
Chengjun Yu
Yang Cao
Zheng-jun Zha
108
6
0
22 May 2024
Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations
Puhao Li
Tengyu Liu
Yuyang Li
Muzhi Han
Haoran Geng
Shu Wang
Yixin Zhu
Song-Chun Zhu
Siyuan Huang
106
18
0
26 Apr 2024
CrossScore: Towards Multi-View Image Evaluation and Scoring
Zirui Wang
Wenjing Bian
Omkar M. Parkhi
Yuheng Ren
V. Prisacariu
99
1
0
22 Apr 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He
Hengduo Li
Young Kyun Jang
Menglin Jia
Xuefei Cao
Ashish Shah
Abhinav Shrivastava
Ser-Nam Lim
MLLM
133
101
0
08 Apr 2024
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Ho-Joong Kim
Jung-Ho Hong
Heejo Kong
Seong-Whan Lee
78
5
0
03 Apr 2024
LITA: Language Instructed Temporal-Localization Assistant
De-An Huang
Shijia Liao
Subhashree Radhakrishnan
Hongxu Yin
Pavlo Molchanov
Zhiding Yu
Jan Kautz
VLM
97
56
0
27 Mar 2024