In this report, we address the task of online mistake detection, which is vital in domains such as industrial automation and education, where real-time video analysis allows human operators to correct errors as they occur. While previous work focuses on procedural errors involving action order, broader error types must be handled for real-world use. We introduce an online mistake detection framework that covers both procedural and execution errors (e.g., motor slips or tool misuse). Upon detecting an error, we use a large language model (LLM) to generate explanatory feedback. Experiments on the HoloAssist benchmark confirm the effectiveness of our approach, which places second on the mistake detection task.
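To make the pipeline concrete, the following is a minimal sketch of the overall loop described above: an online detector processes the egocentric video stream frame by frame, and whenever a procedural or execution error is flagged, an LLM is prompted for explanatory feedback. All names here (MistakeDetector, llm_explain, the dummy frame stream) are hypothetical placeholders under assumed interfaces, not the actual implementation used in the report.

    # Sketch of an online mistake-detection loop with LLM feedback.
    # MistakeDetector and llm_explain are illustrative stubs, not the
    # authors' models.

    from dataclasses import dataclass
    from typing import Iterator, Optional


    @dataclass
    class Detection:
        """Result for one incoming frame (or short clip)."""
        is_mistake: bool
        error_type: str    # e.g., "procedural" or "execution"
        action_label: str  # recognized action at this time step


    class MistakeDetector:
        """Stand-in for an online detector over streaming egocentric video."""

        def update(self, frame) -> Optional[Detection]:
            # A real detector would encode the frame, update its temporal
            # context, and classify the current step as correct, a
            # procedural error, or an execution error. Here we simply
            # report no mistake.
            return None


    def llm_explain(detection: Detection) -> str:
        """Placeholder for prompting an LLM to explain a detected mistake."""
        prompt = (
            f"The user performed '{detection.action_label}', flagged as a "
            f"{detection.error_type} error. Explain the mistake and how to fix it."
        )
        # A real system would send `prompt` to an LLM API; we only echo it.
        return f"[LLM feedback for prompt: {prompt}]"


    def run_online(frames: Iterator[object]) -> None:
        """Process frames as they arrive and emit feedback on each mistake."""
        detector = MistakeDetector()
        for frame in frames:
            detection = detector.update(frame)
            if detection is not None and detection.is_mistake:
                print(llm_explain(detection))


    if __name__ == "__main__":
        run_online(iter(range(10)))  # dummy "frames" for illustration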
@article{patsch2025_2506.06174,
  title   = {Technical Report for Egocentric Mistake Detection for the HoloAssist Challenge},
  author  = {Constantin Patsch and Marsil Zakour and Yuankai Wu and Eckehard Steinbach},
  journal = {arXiv preprint arXiv:2506.06174},
  year    = {2025}
}