Synthetic generation of 2D data records based on Autoencoders

Gas Chromatography coupled with Ion Mobility Spectrometry (GC-IMS) is a dual-separation analytical technique widely used to identify components in gaseous samples by separating their constituent species and analysing their arrival times. GC-IMS data are typically represented as two-dimensional spectra, which provide rich information but are difficult to analyse with data-driven methods because labelled datasets are limited. This study introduces a novel method for generating synthetic 2D spectra using a deep learning framework based on Autoencoders. Although applied here to GC-IMS data, the approach is broadly applicable to any two-dimensional spectral measurements where labelled data are scarce. In a component classification task on a labelled dataset of GC-IMS records, adding synthesized records significantly improved classification performance, demonstrating the method's potential for overcoming dataset limitations in machine learning frameworks.
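The abstract does not describe the architecture, so the following is only a minimal sketch of the general idea of autoencoder-based augmentation, not the authors' method. It assumes a tiny linear autoencoder trained with plain gradient descent in NumPy, toy Gaussian-peak images standing in for real GC-IMS spectra, and latent-space interpolation between pairs of records as the generation step. All function names and parameters (make_toy_spectra, train_linear_autoencoder, synthesize, latent_dim, lr, epochs) are hypothetical.

```python
import numpy as np

def make_toy_spectra(n, size=16, seed=0):
    """Hypothetical stand-in for 2D records: one Gaussian peak per image."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    spectra = []
    for _ in range(n):
        cy, cx = rng.uniform(4, size - 4, 2)  # random peak position
        spectra.append(np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 8.0))
    return np.stack(spectra)

def train_linear_autoencoder(X, latent_dim=8, lr=0.01, epochs=500, seed=0):
    """Fit encoder/decoder matrices by gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0, 0.1, (d, latent_dim))
    W_dec = rng.normal(0, 0.1, (latent_dim, d))
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc            # encode to latent codes
        err = Z @ W_dec - X      # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Gradients of 0.5 * sum(err**2) / n with respect to each matrix.
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ (err @ W_dec.T) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

def synthesize(X, W_enc, W_dec, mean, n_new, seed=0):
    """Decode random convex combinations of pairs of latent codes."""
    rng = np.random.default_rng(seed)
    Z = (X - mean) @ W_enc
    i = rng.integers(0, len(X), n_new)
    j = rng.integers(0, len(X), n_new)
    t = rng.uniform(0, 1, (n_new, 1))     # interpolation weights
    Z_new = (1 - t) * Z[i] + t * Z[j]
    return Z_new @ W_dec + mean

spectra = make_toy_spectra(64)
X = spectra.reshape(len(spectra), -1)     # flatten each 2D record
mean = X.mean(axis=0)
W_enc, W_dec, losses = train_linear_autoencoder(X - mean)
synthetic = synthesize(X, W_enc, W_dec, mean, n_new=10).reshape(-1, 16, 16)
```

With labelled data, one would presumably interpolate only between records of the same class so that each synthetic record inherits a well-defined label. A practical model would likely use a nonlinear (for example convolutional) autoencoder rather than this linear one.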
@article{couchard2025_2502.13183,
  title={Synthetic generation of 2D data records based on Autoencoders},
  author={Darius Couchard and Oscar Olarte and Rob Haelterman},
  journal={arXiv preprint arXiv:2502.13183},
  year={2025}
}