ECOSoundSet: a finely annotated dataset for the automated acoustic identification of Orthoptera and Cicadidae in North, Central and temperate Western Europe

Currently available tools for the automated acoustic recognition of European insects in natural soundscapes are limited in scope. For these algorithms to recognize the subtle and complex acoustic signatures of each species across contexts, they need large and ecologically heterogeneous acoustic datasets, which makes the availability of such datasets a key requisite for their development. Here we present ECOSoundSet (European Cicadidae and Orthoptera Sound dataSet), a dataset containing 10,653 recordings of 200 orthopteran and 24 cicada species (217 and 26 respective taxa when including subspecies) present in North, Central, and temperate Western Europe (Andorra, Belgium, Denmark, mainland France and Corsica, Germany, Ireland, Luxembourg, Monaco, Netherlands, United Kingdom, Switzerland), collected partly through targeted fieldwork in southern France and Catalonia and partly through contributions from various European entomologists. The dataset combines coarsely labeled recordings, for which we know only that the target species is present at some point in the recording (weak labeling), with finely annotated recordings, for which we know the specific time and frequency range of each insect sound present in the recording (strong labeling). We also provide a train/validation/test split of the strongly labeled recordings, with respective approximate proportions of 0.8, 0.1 and 0.1, to facilitate their incorporation into the training and evaluation of deep learning algorithms. This dataset could serve as a meaningful complement to recordings already available online for the training of deep learning algorithms for the acoustic classification of orthopterans and cicadas in North, Central, and temperate Western Europe.
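To make the distinction between weak and strong labeling concrete, the following minimal Python sketch models both label types and produces an approximate 0.8/0.1/0.1 split at the recording level. All class, field, and function names, the placeholder species name, and the time and frequency values are assumptions made for illustration; they do not reflect the dataset's actual file layout, and the random shuffle used here is not necessarily how the published split was built.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WeakLabel:
    """Recording-level label: the species occurs somewhere in the file."""
    recording_id: str
    species: str


@dataclass(frozen=True)
class StrongAnnotation:
    """One annotated sound event with its time and frequency bounds."""
    recording_id: str
    species: str
    start_s: float
    end_s: float
    low_hz: float
    high_hz: float


def split_recordings(recording_ids, proportions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle unique recording ids and cut them into train/validation/test."""
    ids = sorted(set(recording_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(proportions[0] * len(ids))
    n_val = round(proportions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


# Hypothetical annotations: placeholder species and made-up bounds.
annotations = [
    StrongAnnotation(f"rec_{i:03d}", "Species A", 1.0, 2.5, 8000.0, 20000.0)
    for i in range(20)
]
splits = split_recordings(a.recording_id for a in annotations)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting on recording ids rather than on individual annotations keeps every sound event from a given recording in the same subset, which avoids leaking near-identical audio between training and evaluation.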
@article{funosas2025_2504.20776,
  title={ECOSoundSet: a finely annotated dataset for the automated acoustic identification of Orthoptera and Cicadidae in North, Central and temperate Western Europe},
  author={David Funosas and Elodie Massol and Yves Bas and Svenja Schmidt and Dominik Arend and Alexander Gebhard and Luc Barbaro and Sebastian König and Rafael Carbonell Font and David Sannier and Fernand Deroussen and Jérôme Sueur and Christian Roesti and Tomi Trilar and Wolfgang Forstmeier and Lucas Roger and Eloïsa Matheu and Piotr Guzik and Julien Barataud and Laurent Pelozuelo and Stéphane Puissant and Sandra Mueller and Björn Schuller and Jose M. Montoya and Andreas Triantafyllopoulos and Maxime Cauchoix},
  journal={arXiv preprint arXiv:2504.20776},
  year={2025}
}