Balancing Accuracy and Training Time in Federated Learning for Violence Detection in Surveillance Videos: A Study of Neural Network Architectures

This paper presents an investigation into machine learning techniques for violence detection in videos and their adaptation to a federated learning context. The study includes experiments with spatio-temporal features extracted from benchmark video datasets, comparison of different methods, and proposal of a modified version of the "Flow-Gated" architecture called "Diff-Gated." Additionally, various machine learning techniques, including super-convergence and transfer learning, are explored, and a method for adapting centralized datasets to a federated learning context is developed. The research achieves better accuracy results compared to state-of-the-art models by training the best violence detection model in a federated learning context.
View on arXiv