Artificial neural networks have achieved state-of-the-art performance in fault detection on the Tennessee Eastman process, but they often require enormous memory to store their massive numbers of parameters. To enable online, real-time fault detection, three deep compression techniques (pruning, clustering, and quantization) are applied to reduce the computational burden. We extensively study seven different combinations of these compression techniques; all combinations achieve model compression rates of over 64% while maintaining high fault detection accuracy. The best result comes from applying all three techniques together, which reduces the model size by 91.5% while keeping accuracy above 94%. This lowers storage requirements in production environments and makes real-world deployment smoother.
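As a rough illustration of how the three compression steps (pruning, weight clustering, and quantization) can be chained on a trained Keras classifier, the sketch below uses the TensorFlow Model Optimization Toolkit. This is not stated to be the paper's implementation; the `model`, `train_ds`/`val_ds` datasets, the 80% sparsity target, and the 16 clusters are illustrative assumptions.

```python
# Minimal sketch: prune -> cluster -> quantize a trained Keras model.
# Sparsity target, cluster count, and fine-tuning epochs are placeholders,
# not the settings reported in the paper.
import tensorflow as tf
import tensorflow_model_optimization as tfmot


def compress(model, train_ds, val_ds):
    # 1) Pruning: zero out low-magnitude weights during a short fine-tuning pass.
    pruning_params = {
        "pruning_schedule": tfmot.sparsity.keras.ConstantSparsity(
            target_sparsity=0.8, begin_step=0)
    }
    pruned = tfmot.sparsity.keras.prune_low_magnitude(model, **pruning_params)
    pruned.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
    pruned.fit(train_ds, validation_data=val_ds, epochs=2,
               callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
    pruned = tfmot.sparsity.keras.strip_pruning(pruned)

    # 2) Weight clustering: share weights among a small set of centroids.
    clustered = tfmot.clustering.keras.cluster_weights(
        pruned,
        number_of_clusters=16,
        cluster_centroids_init=tfmot.clustering.keras.CentroidInitialization.LINEAR)
    clustered.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
    clustered.fit(train_ds, validation_data=val_ds, epochs=2)
    clustered = tfmot.clustering.keras.strip_clustering(clustered)

    # 3) Post-training quantization: convert weights to 8-bit integers
    #    when exporting to a TFLite flatbuffer for deployment.
    converter = tf.lite.TFLiteConverter.from_keras_model(clustered)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()
```

The size of the returned flatbuffer can be compared against the original saved model to estimate the compression rate; each step can also be applied independently to reproduce the other combinations.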