arXiv: 2203.14712
Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

28 March 2022
Fadime Sener
Dibyadip Chatterjee
Daniel Shelepov
Kun He
Dipika Singhania
Robert Y. Wang
Angela Yao
    VGen

Papers citing "Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities"

50 / 132 papers shown
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
Yunseok Jang
Yeda Song
Sungryull Sohn
Lajanugen Logeswaran
Tiange Luo
Dong-Ki Kim
Kyunghoon Bae
Honglak Lee
VGen
0
0
0
19 May 2025
Thoughts without Thinking: Reconsidering the Explanatory Value of Chain-of-Thought Reasoning in LLMs through Agentic Pipelines
R. Manuvinakurike
Emanuel Moss
E. A. Watkins
Saurav Sahay
G. Raffa
L. Nachman
LRM
31
0
0
01 May 2025
Hierarchical and Multimodal Data for Daily Activity Understanding
Ghazal Kaviani
Yavuz Yarici
Seulgi Kim
Mohit Prabhushankar
Ghassan AlRegib
Mashhour Solh
Ameya Patil
54
0
0
24 Apr 2025
Action Anticipation from SoccerNet Football Video Broadcasts
Mohamad Dalal
Artur Xarles
A. Cioppa
Silvio Giancola
Marc Van Droogenbroeck
Bernard Ghanem
Albert Clapés
Sergio Escalera
T. Moeslund
AI4TS
36
0
0
16 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
J. Xu
Y. Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Y. Zhang
Rui Feng
Weidi Xie
DiffM
51
1
0
16 Apr 2025
HUMOTO: A 4D Dataset of Mocap Human Object Interactions
Jiaxin Lu
Chun-Hao Paul Huang
Uttaran Bhattacharya
Qixing Huang
Yi Zhou
42
0
0
14 Apr 2025
The Invisible EgoHand: 3D Hand Forecasting through EgoBody Pose Estimation
Masashi Hatano
Zhifan Zhu
Hideo Saito
Dima Damen
EgoV
26
0
0
11 Apr 2025
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee
Edoardo Remelli
Yale Song
Bugra Tekin
Abhay Mittal
...
Shreyas Hampali
Eric Sauser
Shugao Ma
Angela Yao
Fadime Sener
VLM
46
0
0
10 Apr 2025
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
Sanjoy Kundu
Shanmukha Vellamchetti
Sathyanarayanan N. Aakur
EgoV
52
0
0
04 Apr 2025
Towards Generalizing Temporal Action Segmentation to Unseen Views
Emad Bahrami
Olga Zatsarynna
Gianpiero Francesca
Juergen Gall
EgoV
46
0
0
03 Apr 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
47
0
0
28 Mar 2025
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
45
0
0
27 Mar 2025
ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning
Kailin Li
Puhao Li
Tengyu Liu
Yuyang Li
Siyuan Huang
48
3
0
27 Mar 2025
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang
Fadime Sener
Angela Yao
OffRL
56
1
0
24 Mar 2025
LLaVAction: evaluating and training multi-modal large language models for action recognition
Shaokai Ye
Haozhe Qi
Alexander Mathis
Mackenzie W. Mathis
68
1
0
24 Mar 2025
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation
Zhanzhong Pang
Fadime Sener
Shrinivas Ramasubramanian
Angela Yao
56
1
0
24 Mar 2025
The CASTLE 2024 Dataset: Advancing the Art of Multimodal Understanding
Luca Rossetto
Werner Bailer
Duc-Tien Dang-Nguyen
Graham Healy
Björn Þór Jónsson
...
Florian Spiess
Allie Tran
Minh-Triet Tran
Quang-Linh Tran
C. Gurrin
VGen
35
0
0
21 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
68
25
0
18 Mar 2025
EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera
Luming Wang
Hao-miao Shi
X. Yin
Kailun Yang
Kaiwei Wang
Jian Bai
EgoV
SLR
82
0
0
16 Mar 2025
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding
Haoyu Zhang
Qiaohui Chu
Meng Liu
Yunxiao Wang
Bin Wen
Fan Yang
Tingting Gao
Di Zhang
Yaowei Wang
Liqiang Nie
EgoV
75
0
0
12 Mar 2025
CLAD: Constrained Latent Action Diffusion for Vision-Language Procedure Planning
Lei Shi
Andreas Bulling
DiffM
52
1
0
09 Mar 2025
SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic
Yi Yang
Wei Wang
Yifei Liu
Linfeng Dong
Hao Wu
Mingxin Zhang
Zhihang Zhong
Xiao-Fu Sun
54
1
0
09 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
39
0
0
08 Mar 2025
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Mark YU
Wenbo Hu
Jinbo Xing
Ying Shan
VGen
87
3
0
07 Mar 2025
EgoLife: Towards Egocentric Life Assistant
Jingkang Yang
Shuai Liu
Hongming Guo
Yuhao Dong
X. Zhang
...
Joerg Widmer
Francesco Gringoli
Lei Yang
Bo Li
Z. Liu
EgoV
54
2
0
05 Mar 2025
Data Augmentation for Instruction Following Policies via Trajectory Segmentation
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
47
0
0
25 Feb 2025
Task Graph Maximum Likelihood Estimation for Procedural Activity Understanding in Egocentric Videos
Luigi Seminara
G. Farinella
Antonino Furnari
77
0
0
25 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
119
0
0
12 Feb 2025
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos
Luigi Seminara
G. Farinella
Antonino Furnari
64
7
0
10 Jan 2025
Graph-Based Multimodal and Multi-view Alignment for Keystep Recognition
Julia Lee Romero
Kyle Min
Subarna Tripathi
Morteza Karimzadeh
35
0
0
07 Jan 2025
CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition
Yuhang Wen
Mengyuan Liu
Songtao Wu
Beichen Ding
45
0
0
31 Dec 2024
Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model
Y. Huang
Jilan Xu
Baoqi Pei
Yuping He
Guo Chen
...
Kunpeng Li
C. Yuan
Yidan Wang
Yu Qiao
L. Wang
78
4
0
31 Dec 2024
emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation
Sasha Salter
Richard Warren
Collin Schlager
Adrian Spurr
Shangchen Han
...
Robert Y. Wang
Nathan Danielson
Josh Merel
Eftychios Pnevmatikakis
Jesse Marshall
63
1
0
02 Dec 2024
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
Rundi Wu
Ruiqi Gao
Ben Poole
Alex Trevithick
Changxi Zheng
Jonathan T. Barron
Aleksander Hołyński
VGen
85
24
0
27 Nov 2024
ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos
Reza Ghoddoosian
Nakul Agarwal
Isht Dwivedi
Behzad Dariush
68
0
0
23 Nov 2024
Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation
Rohith Peddi
Saurabh
Ayush Abhay Shrivastava
Parag Singla
Vibhav Gogate
82
0
0
20 Nov 2024
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
Yunong Liu
Cristobal Eyzaguirre
Manling Li
Shubh Khanna
Juan Carlos Niebles
Vineeth Ravi
Saumitra Mishra
Weiyu Liu
Jiajun Wu
78
1
0
18 Nov 2024
TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos
Leonardo Plini
Luca Scofano
Edoardo De Matteis
Guido Maria D'Amely di Melendugno
Alessandro Flaborea
Andrea Sanchietti
G. Farinella
Fabio Galasso
Antonino Furnari
EgoV
LRM
48
1
0
04 Nov 2024
Egocentric and Exocentric Methods: A Short Survey
Anirudh Thatipelli
Shao-Yuan Lo
Amit K. Roy-Chowdhury
EgoV
42
2
0
27 Oct 2024
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris M. Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
41
1
0
17 Oct 2024
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos Referring to Procedural Texts
Yuto Haneji
Taichi Nishimura
Hirotaka Kameko
Keisuke Shirai
Tomoya Yoshida
Keiya Kajimura
Koki Yamamoto
Taiyu Cui
Tomohiro Nishimoto
Shinsuke Mori
EgoV
49
0
0
07 Oct 2024
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
Han Lin
Tushar Nagarajan
Nicolas Ballas
Mido Assran
Mojtaba Komeili
Joey Tianyi Zhou
Koustuv Sinha
AI4TS
54
3
0
04 Oct 2024
EAGLE: Egocentric AGgregated Language-video Engine
Jing Bi
Yunlong Tang
Luchuan Song
A. Vosoughi
Nguyen Nguyen
Chenliang Xu
45
8
0
26 Sep 2024
A vision-based framework for human behavior understanding in industrial assembly lines
Konstantinos Papoutsakis
Nikolaos Bakalos
Konstantinos Fragkoulis
Athena Zacharia
Georgia Kapetadimitri
Maria Pateraki
31
1
0
25 Sep 2024
Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild
Nie Lin
Takehiko Ohkawa
Mingfang Zhang
Yifei Huang
Ryosuke Furuta
Yoichi Sato
3DH
28
2
0
15 Sep 2024
ChildPlay-Hand: A Dataset of Hand Manipulations in the Wild
Arya Farkhondeh
Samy Tafasca
J. Odobez
27
0
0
14 Sep 2024
Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment
Zhanzhong Pang
Fadime Sener
Shrinivas Ramasubramanian
Angela Yao
35
2
0
19 Aug 2024
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark
Koki Maeda
Tosho Hirasawa
Atsushi Hashimoto
Jun Harashima
Leszek Rybicki
Yusuke Fukasawa
Yoshitaka Ushiku
48
0
0
05 Aug 2024
PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch
Riku Arakawa
Hiromu Yakura
Mayank Goel
19
5
0
23 Jul 2024
Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models
Thinesh Thiyakesan Ponbagavathi
Kunyu Peng
Alina Roitberg
42
1
0
22 Jul 2024