Machine learning approach for identification of release sources in advection-diffusion systems

Mathematically, contaminant transport in an aquifer is described by an advection-diffusion equation, and identifying the contamination sources relies on solving a complex, ill-posed inverse problem using the available observed data. Contaminant migration is usually monitored by spatially discrete detectors (e.g., monitoring wells) that provide temporal records of sampling events. These records are then used to estimate properties of the contaminant sources (e.g., locations and release strengths) and model parameters representing contaminant migration (e.g., velocity and dispersivity). These estimates are essential for a reliable assessment of contamination hazards and risks. If there is more than one contaminant source (with different locations and strengths), the observed records represent contaminant mixtures; typically, the number of sources is unknown. The mixing ratios of the different contaminant sources at the detectors are also unknown; this further reduces the reliability and increases the complexity of the inverse-model analyses. To circumvent some of these challenges, we have developed a novel hybrid source identification method, called Green-NMFk, that couples machine learning and inverse analysis methods. Our method is capable of identifying the unknown number, locations, and properties of a set of contaminant sources from measured contaminant-source mixtures with unknown mixing ratios, without any additional information. It also estimates the contaminant transport properties, such as velocity and dispersivity.
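
To make the mixing problem concrete, the following is a minimal sketch of a forward model only, not the authors' Green-NMFk implementation. It assumes a simplified setting that the abstract does not specify: a 2D infinite domain, uniform velocity `u` along x, isotropic diffusion coefficient `D`, and instantaneous point releases at t = 0. The function names, detector coordinates, and parameter values are hypothetical and chosen for illustration.

```python
import numpy as np

def green_2d(x, y, t, xs, ys, q, u, D):
    """Concentration at (x, y) and times t > 0 from an instantaneous point
    release of mass q at (xs, ys) at t = 0 (assumed 2D infinite domain,
    uniform velocity u along x, isotropic diffusion coefficient D)."""
    t = np.asarray(t, dtype=float)
    r2 = (x - xs - u * t) ** 2 + (y - ys) ** 2
    return q / (4.0 * np.pi * D * t) * np.exp(-r2 / (4.0 * D * t))

def detector_records(detectors, times, sources, u, D):
    """Record at each detector: the sum of the plumes from all sources.
    The transport equation is linear, so contributions superpose."""
    records = np.zeros((len(detectors), len(times)))
    for i, (x, y) in enumerate(detectors):
        for xs, ys, q in sources:
            records[i] += green_2d(x, y, times, xs, ys, q, u, D)
    return records

# Hypothetical setup: three detectors, two sources given as (xs, ys, q).
detectors = [(1.0, 0.0), (2.0, 0.5), (3.0, -0.5)]
times = np.linspace(0.5, 20.0, 40)  # strictly positive sampling times
sources = [(0.0, 0.0, 1.0), (0.5, 1.0, 2.0)]

records = detector_records(detectors, times, sources, u=0.2, D=0.05)
print(records.shape)
```

Each row of `records` is a mixture of the two plumes with detector-dependent weights. The inverse problem described above is to recover the number of sources, their locations and strengths, and the transport parameters from `records` alone.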