Synthetic Data Augmentation for Enhancing Harmful Algal Bloom Detection with Machine Learning

5 March 2025
Tianyi Huang
Abstract

Harmful Algal Blooms (HABs) pose severe threats to aquatic ecosystems and public health, resulting in substantial economic losses globally. Early detection is crucial but often hindered by the scarcity of high-quality datasets necessary for training reliable machine learning (ML) models. This study investigates the use of synthetic data augmentation using Gaussian Copulas to enhance ML-based HAB detection systems. Synthetic datasets of varying sizes (100-1,000 samples) were generated using relevant environmental features (water temperature, salinity, and UVB radiation) with corrected Chlorophyll-a concentration as the target variable. Experimental results demonstrate that moderate synthetic augmentation significantly improves model performance (RMSE reduced from 0.4706 to 0.1850; p < 0.001). However, excessive synthetic data introduces noise and reduces predictive accuracy, emphasizing the need for a balanced approach to data augmentation. These findings highlight the potential of synthetic data to enhance HAB monitoring systems, offering a scalable and cost-effective method for early detection and mitigation of ecological and public health risks.
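The abstract does not say how the Gaussian copula was implemented, so the following is only a minimal sketch of the general technique, not the author's code. It assumes each row holds four numeric values (water temperature, salinity, UVB radiation, corrected Chlorophyll-a), models the marginals empirically, and captures dependence through the correlation of rank-based normal scores. The function names (`sample_gaussian_copula` and its helpers) are hypothetical, and the sketch assumes no two columns are perfectly rank-correlated, since the Cholesky step would otherwise fail.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def _normal_scores(column):
    """Map a column to standard-normal scores via its ranks (ties broken by order)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def _correlation(columns):
    """Pearson correlation matrix of equally long columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    d = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(d)
        ]
        for i in range(d)
    ]


def _cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    d = len(matrix)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = (matrix[i][i] - s) ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def _empirical_quantile(sorted_values, u):
    """Linearly interpolated empirical quantile for u in [0, 1]."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def sample_gaussian_copula(rows, n_samples, seed=0):
    """Draw synthetic rows that mimic the marginals and rank correlation of `rows`."""
    rng = random.Random(seed)
    columns = [list(c) for c in zip(*rows)]
    # Dependence structure: correlation of the normal scores, factored for sampling.
    chol = _cholesky(_correlation([_normal_scores(c) for c in columns]))
    sorted_cols = [sorted(c) for c in columns]
    d = len(columns)
    synthetic = []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlated normals -> uniforms -> values on each column's empirical scale.
        z = [sum(chol[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        synthetic.append(
            [_empirical_quantile(sorted_cols[i], _STD_NORMAL.cdf(z[i])) for i in range(d)]
        )
    return synthetic
```

Calling `sample_gaussian_copula(rows, 100)` on a list of real measurements would return 100 synthetic rows that can be appended to the training set. Because the marginals are empirical, synthetic values never leave the observed range of each feature, which is one simple design choice among several; a parametric marginal fit would extrapolate instead.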

@article{huang2025_2503.03794,
  title={Synthetic Data Augmentation for Enhancing Harmful Algal Bloom Detection with Machine Learning},
  author={Tianyi Huang},
  journal={arXiv preprint arXiv:2503.03794},
  year={2025}
}