1. Williams B, Balvanera SM, Sethi SS, Lamont TA, Jompa J, Prasetya M, Richardson L, Chapuis L, Weschke E, Hoey A, Beldade R, Mills SC, Haguenauer A, Zuberer F, Simpson SD, Curnick D, Jones KE. Unlocking the soundscape of coral reefs with artificial intelligence: pretrained networks and unsupervised learning win out. PLoS Comput Biol 2025; 21:e1013029. PMID: 40294093; PMCID: PMC12064026; DOI: 10.1371/journal.pcbi.1013029.
Abstract
Passive acoustic monitoring can offer insights into the state of coral reef ecosystems at low cost and over extended periods. Comparison of whole-soundscape properties can rapidly deliver broad insights from acoustic data, in contrast to the detailed but time-consuming analysis of individual bioacoustic events. However, a lack of effective automated analysis for whole-soundscape data has impeded progress in this field. Here, we show that machine learning (ML) can be used to unlock greater insights from reef soundscapes. We showcase this on a diverse set of tasks using three biogeographically independent datasets, each containing fish community (high or low), coral cover (high or low) or depth zone (shallow or mesophotic) classes. We show that supervised learning can be used to train models that identify ecological classes and individual sites from whole soundscapes. However, we report that unsupervised clustering achieves this whilst providing a more detailed understanding of ecological and site groupings within soundscape data. We also compare three different approaches for extracting feature embeddings from soundscape recordings for input into ML algorithms: acoustic indices commonly used by soundscape ecologists, a pretrained convolutional neural network (P-CNN) trained on 5.2 million hours of YouTube audio, and CNNs trained on each individual task (T-CNN). Although the T-CNN performs marginally better across tasks, we reveal that the P-CNN offers a powerful tool for generating insights from marine soundscape data: it requires orders of magnitude fewer computational resources whilst achieving near-comparable performance to the T-CNN, with significant performance improvements over the acoustic indices. Our findings have implications for soundscape ecology in any habitat.
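For readers new to the embedding-plus-clustering approach this abstract describes, a minimal Python sketch follows. It stands in for the paper's pretrained-CNN embeddings with simple MFCC summary statistics computed via librosa, then applies unsupervised k-means clustering with scikit-learn; the folder path, feature choice and cluster count are illustrative assumptions rather than the authors' pipeline.

```python
# Minimal sketch: summarise each whole recording as a fixed-length feature
# vector, then cluster recordings without labels. MFCC summary statistics
# stand in here for the pretrained-CNN embeddings used in the paper.
import glob
import numpy as np
import librosa
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def embed(path, sr=16000, n_mfcc=20):
    """Return a fixed-length embedding (MFCC means and stds) for one file."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

files = sorted(glob.glob("reef_recordings/*.wav"))   # hypothetical folder
X = StandardScaler().fit_transform([embed(f) for f in files])

# Unsupervised grouping of whole soundscapes, e.g. into two candidate classes
# (high vs low fish community); the cluster count is an illustrative choice.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for path, label in zip(files, labels):
    print(label, path)
```

Swapping the `embed` function for embeddings from a pretrained audio network would leave the downstream clustering step unchanged.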
Affiliation(s)
- Ben Williams: Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom; Zoological Society of London, Regents Park, London, United Kingdom
- Santiago M. Balvanera: Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Sarab S. Sethi: Department of Life Sciences, Imperial College London, London, United Kingdom
- Timothy A.C. Lamont: Lancaster Environment Centre, Lancaster University, Lancaster, United Kingdom
- Laura Richardson: School of Ocean Sciences, Bangor University, Askew Street, Menai Bridge, Anglesey, United Kingdom
- Lucille Chapuis: School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Emma Weschke: School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- Andrew Hoey: Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
- Ricardo Beldade: Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia; Estación Costera de Investigaciones Marinas, Millennium Nucleus for Ecology and Conservation of Temperate Mesophotic Reef Ecosystems, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile
- Suzanne C. Mills: CRIOBE, PSL Research University, Moorea, French Polynesia; Laboratoire d’Excellence “CORAIL”, Perpignan, France
- Stephen D. Simpson: School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- David Curnick: Zoological Society of London, Regents Park, London, United Kingdom
- Kate E. Jones: Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
2. Sethi SS, Bick A, Chen MY, Crouzeilles R, Hillier BV, Lawson J, Lee CY, Liu SH, Parruco CHDF, Rosten CM, Somveille M, Tuanmu MN, Banks-Leite C. Reply to Araújo: Good science requires focus. Proc Natl Acad Sci U S A 2024; 121:e2420476121. PMID: 39661057; DOI: 10.1073/pnas.2420476121.
Affiliation(s)
- Sarab S Sethi: Department of Life Sciences, Imperial College London, London W12 0BZ, United Kingdom
- Avery Bick: Norwegian Institute for Nature Research, Trondheim 7034, Norway
- Ming-Yuan Chen: Department of Forestry and Resources Conservation, National Taiwan University, Taipei 106, Taiwan
- Ben V Hillier: Division of Biosciences, University College London, London, United Kingdom
- Jenna Lawson: Department of Life Sciences, Imperial College London, London W12 0BZ, United Kingdom
- Chia-Yun Lee: Biodiversity Research Center, Academia Sinica, Taipei 11529, Taiwan
- Shih-Hao Liu: Biodiversity Research Center, Academia Sinica, Taipei 11529, Taiwan
- Marius Somveille: Division of Biosciences, University College London, London, United Kingdom
- Mao-Ning Tuanmu: Biodiversity Research Center, Academia Sinica, Taipei 11529, Taiwan
- Cristina Banks-Leite: Department of Life Sciences, Imperial College London, London W12 0BZ, United Kingdom
3. Rasmussen JH, Stowell D, Briefer EF. Sound evidence for biodiversity monitoring. Science 2024; 385:138-140. PMID: 38991079; DOI: 10.1126/science.adh2716.
Abstract
Bioacoustics and artificial intelligence facilitate ecological studies of animal populations.
Affiliation(s)
- Jeppe H Rasmussen: Behavioural Ecology Group, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Center for Coastal Research and Center for Artificial Intelligence Research, University of Agder, Kristiansand, Norway
- Dan Stowell: Tilburg University, Tilburg, Netherlands; Naturalis Biodiversity Center, Leiden, Netherlands
- Elodie F Briefer: Behavioural Ecology Group, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
4. Ghani B, Denton T, Kahl S, Klinck H. Global birdsong embeddings enable superior transfer learning for bioacoustic classification. Sci Rep 2023; 13:22876. PMID: 38129622; PMCID: PMC10739890; DOI: 10.1038/s41598-023-49989-z.
Abstract
Automated bioacoustic analysis aids understanding and protection of both marine and terrestrial animals and their habitats across extensive spatiotemporal scales, and typically involves analyzing vast collections of acoustic data. With the advent of deep learning models, classification of important signals from these datasets has markedly improved. These models power critical data analyses for research and decision-making in biodiversity monitoring, animal behaviour studies, and natural resource management. However, deep learning models are often data-hungry and require a significant amount of labeled training data to perform well. While sufficient training data is available for certain taxonomic groups (e.g., common bird species), many classes (such as rare and endangered species, many non-bird taxa, and call types) lack enough data to train a robust model from scratch. This study investigates the utility of feature embeddings extracted from audio classification models to identify bioacoustic classes other than the ones these models were originally trained on. We evaluate models on diverse datasets, including different bird calls and dialect types, bat calls, marine mammal calls, and amphibian calls. The embeddings extracted from models trained on bird vocalization data consistently allowed higher-quality classification than embeddings from models trained on general audio datasets. The results of this study indicate that high-quality feature embeddings from large-scale acoustic bird classifiers can be harnessed for few-shot transfer learning, enabling the learning of new classes from a limited quantity of training data. Our findings reveal the potential for efficient analyses of novel bioacoustic tasks, even in scenarios where available training data is limited to a few samples.
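The few-shot transfer recipe evaluated here can be illustrated with a linear probe on frozen embeddings: extract an embedding per clip from a large pretrained classifier, then fit a small classifier on a handful of labelled examples per new class. In the sketch below the embeddings are simulated Gaussian clusters purely to keep the example self-contained; the class names and embedding dimension are hypothetical.

```python
# Few-shot transfer sketch: a linear probe fitted on frozen audio embeddings.
# Embeddings are simulated here; in practice they would be extracted from a
# pretrained birdsong or general-audio classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_per_class, dim = 10, 1280          # e.g. ten labelled clips per new class
classes = ["call_type_a", "call_type_b", "call_type_c"]   # hypothetical

X = np.vstack([rng.normal(loc=3.0 * i, scale=1.0, size=(n_per_class, dim))
               for i in range(len(classes))])
y = np.repeat(classes, n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("few-shot probe accuracy:", accuracy_score(y_te, probe.predict(X_te)))
```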
Affiliation(s)
- Burooj Ghani: Naturalis Biodiversity Center, Leiden, The Netherlands
- Tom Denton: Google Research, San Francisco, California, USA
- Stefan Kahl: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, USA; Chemnitz University of Technology, Chemnitz, Germany
- Holger Klinck: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, USA
5. Müller J, Mitesser O, Schaefer HM, Seibold S, Busse A, Kriegel P, Rabl D, Gelis R, Arteaga A, Freile J, Leite GA, de Melo TN, LeBien J, Campos-Cerqueira M, Blüthgen N, Tremlett CJ, Böttger D, Feldhaar H, Grella N, Falconí-López A, Donoso DA, Moriniere J, Buřivalová Z. Soundscapes and deep learning enable tracking biodiversity recovery in tropical forests. Nat Commun 2023; 14:6191. PMID: 37848442; PMCID: PMC10582010; DOI: 10.1038/s41467-023-41693-w.
Abstract
Tropical forest recovery is fundamental to addressing the intertwined climate and biodiversity loss crises. While regenerating trees sequester carbon relatively quickly, the pace of biodiversity recovery remains contentious. Here, we use bioacoustics and metabarcoding to measure forest recovery post-agriculture in a global biodiversity hotspot in Ecuador. We show that the community composition, and not species richness, of vocalizing vertebrates identified by experts reflects the restoration gradient. Two automated measures - an acoustic index model and a bird community composition derived from an independently developed Convolutional Neural Network - correlated well with restoration (adj-R² = 0.62 and 0.69, respectively). Importantly, both measures reflected composition of non-vocalizing nocturnal insects identified via metabarcoding. We show that such automated monitoring tools, based on new technologies, can effectively monitor the success of forest recovery, using robust and reproducible data.
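For clarity on the adj-R² values reported above, a minimal sketch of the underlying calculation (regressing an automated acoustic measure on a restoration gradient and adjusting R² for the number of predictors) is given below, using simulated data rather than the study's measurements.

```python
# Sketch: regress an automated acoustic measure on a restoration gradient and
# report adjusted R^2. Values are simulated; only the calculation is shown.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 40                                    # number of plots (illustrative)
restoration = rng.uniform(0, 35, n)       # e.g. years since agricultural use
acoustic_measure = 0.04 * restoration + rng.normal(0, 0.4, n)

X = restoration.reshape(-1, 1)
r2 = LinearRegression().fit(X, acoustic_measure).score(X, acoustic_measure)
p = X.shape[1]                            # number of predictors
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f"R^2 = {r2:.2f}, adjusted R^2 = {adj_r2:.2f}")
```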
Affiliation(s)
- Jörg Müller: Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstr. 5, 96181, Rauhenebrach, Germany; Bavarian Forest National Park, Freyungerstr. 2, 94481, Grafenau, Germany
- Oliver Mitesser: Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstr. 5, 96181, Rauhenebrach, Germany
- H Martin Schaefer: Fundación Jocotoco, Valladolid N24-414 y Luis Cordero, Quito, Ecuador
- Sebastian Seibold: Technical University of Munich, School of Life Sciences, Ecosystem Dynamics and Forest Management Research Group, Hans-Carl-von-Carlowitz-Platz 2, 85354, Freising, Germany; Berchtesgaden National Park, Doktorberg 6, Berchtesgaden, 83471, Germany
- Annika Busse: Saxon-Switzerland National Park, An der Elbe 4, 01814, Bad Schandau, Germany
- Peter Kriegel: Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstr. 5, 96181, Rauhenebrach, Germany
- Dominik Rabl: Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstr. 5, 96181, Rauhenebrach, Germany
- Rudy Gelis: Yanayacu Research Center, Cosanga, Ecuador
- Juan Freile: Pasaje El Moro E4-216 y Norberto Salazar, EC 170902, Tumbaco, DMQ, Ecuador
- Gabriel Augusto Leite: Rainforest Connection, Science Department, 440 Cobia Drive, Suite 1902, Katy, TX, 77494, USA
- Jack LeBien: Rainforest Connection, Science Department, 440 Cobia Drive, Suite 1902, Katy, TX, 77494, USA
- Nico Blüthgen: Ecological Networks Lab, Department of Biology, Technische Universität Darmstadt, Schnittspahnstr. 3, 64287, Darmstadt, Germany
- Constance J Tremlett: Ecological Networks Lab, Department of Biology, Technische Universität Darmstadt, Schnittspahnstr. 3, 64287, Darmstadt, Germany
- Dennis Böttger: Phyletisches Museum, Institute for Zoology and Evolutionary Research, Friedrich-Schiller-University Jena, Jena, Germany
- Heike Feldhaar: Animal Population Ecology, Bayreuth Center for Ecology and Environmental Research (BayCEER), University of Bayreuth, 95440, Bayreuth, Germany
- Nina Grella: Animal Population Ecology, Bayreuth Center for Ecology and Environmental Research (BayCEER), University of Bayreuth, 95440, Bayreuth, Germany
- Ana Falconí-López: Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstr. 5, 96181, Rauhenebrach, Germany; Grupo de Investigación en Biodiversidad, Medio Ambiente y Salud-BIOMAS-Universidad de las Américas, Quito, Ecuador
- David A Donoso: Grupo de Investigación en Biodiversidad, Medio Ambiente y Salud-BIOMAS-Universidad de las Américas, Quito, Ecuador; Departamento de Biología, Facultad de Ciencias, Escuela Politécnica Nacional, Av. Ladrón de Guevara E11-253, CP 17-01-2759, Quito, Ecuador
- Jerome Moriniere: AIM - Advanced Identification Methods GmbH, Niemeyerstr. 1, 04179, Leipzig, Germany
- Zuzana Buřivalová: University of Wisconsin-Madison, Department of Forest and Wildlife Ecology and The Nelson Institute for Environmental Studies, 1630 Linden Drive, Madison, WI, 53706, USA
6. Sethi SS, Bick A, Ewers RM, Klinck H, Ramesh V, Tuanmu MN, Coomes DA. Limits to the accurate and generalizable use of soundscapes to monitor biodiversity. Nat Ecol Evol 2023; 7:1373-1378. PMID: 37524796; PMCID: PMC10482675; DOI: 10.1038/s41559-023-02148-z.
Abstract
Although eco-acoustic monitoring has the potential to deliver biodiversity insight on vast scales, existing analytical approaches behave unpredictably across studies. We collated 8,023 audio recordings with paired manual avifaunal point counts to investigate whether soundscapes could be used to monitor biodiversity across diverse ecosystems. We found that neither univariate indices nor machine learning models were predictive of species richness across datasets but soundscape change was consistently indicative of community change. Our findings indicate that there are no common features of biodiverse soundscapes and that soundscape monitoring should be used cautiously and in conjunction with more reliable in-person ecological surveys.
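The "across datasets" qualifier is the crux of this result: a model can appear predictive within one dataset yet fail on data from another study. A hedged sketch of that evaluation pattern, leave-one-dataset-out cross-validation with simulated soundscape features and richness values, is shown below; it illustrates the analysis design, not the authors' exact models.

```python
# Sketch of leave-one-dataset-out evaluation: train a model linking soundscape
# features to species richness on all datasets but one, test on the held-out
# dataset. Data are simulated so that each dataset has its own feature-richness
# relationship, the situation in which cross-study transfer fails.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_per_dataset, n_datasets, n_features = 200, 4, 8
X = rng.normal(size=(n_per_dataset * n_datasets, n_features))
groups = np.repeat(np.arange(n_datasets), n_per_dataset)
coefs = rng.normal(size=(n_datasets, n_features))   # dataset-specific signal
richness = np.einsum("ij,ij->i", X, coefs[groups]) + rng.normal(0, 0.5, len(X))

for train, test in LeaveOneGroupOut().split(X, richness, groups):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X[train], richness[train])
    r2 = r2_score(richness[test], model.predict(X[test]))
    print(f"held-out dataset {groups[test][0]}: R^2 = {r2:.2f}")
```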
Affiliation(s)
- Sarab S Sethi: Conservation Research Institute and Department of Plant Sciences, University of Cambridge, Cambridge, UK; Centre for Biodiversity and Environment Research, University College London, London, UK
- Avery Bick: Norwegian Institute for Nature Research, Trondheim, Norway
- Robert M Ewers: Georgina Mace Centre for the Living Planet, Department of Life Sciences, Imperial College London, London, UK
- Holger Klinck: K Lisa Yang Center for Conservation Bioacoustics, Cornell University, Ithaca, NY, USA
- Vijay Ramesh: K Lisa Yang Center for Conservation Bioacoustics, Cornell University, Ithaca, NY, USA; Project Dhvani, Bangalore, India
- Mao-Ning Tuanmu: Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- David A Coomes: Conservation Research Institute and Department of Plant Sciences, University of Cambridge, Cambridge, UK
7. Clink DJ, Kier I, Ahmad AH, Klinck H. A workflow for the automated detection and classification of female gibbon calls from long-term acoustic recordings. Front Ecol Evol 2023. DOI: 10.3389/fevo.2023.1071640.
Abstract
Passive acoustic monitoring (PAM) allows for the study of vocal animals on temporal and spatial scales that are difficult to achieve using only human observers. Recent improvements in recording technology, data storage, and battery capacity have led to increased use of PAM. One of the main obstacles to implementing wide-scale PAM programs is the lack of open-source programs that efficiently process terabytes of sound recordings and do not require large amounts of training data. Here we describe a workflow for detecting, classifying, and visualizing female Northern grey gibbon calls in Sabah, Malaysia. Our approach detects sound events using band-limited energy summation and performs binary classification of these events (gibbon female or not) using machine learning algorithms (support vector machine and random forest). We then applied an unsupervised approach (affinity propagation clustering) to see whether we could further differentiate between true and false positives, or estimate the number of gibbon females in our dataset. We used this workflow to address three questions: (1) does this automated approach provide reliable estimates of temporal patterns of gibbon calling activity; (2) can unsupervised approaches be applied as a post-processing step to improve the performance of the system; and (3) can unsupervised approaches be used to estimate how many female individuals (or clusters) there are in our study area? We found that performance plateaued with >160 clips of training data for each of our two classes. Using optimized settings, our automated approach achieved a satisfactory performance (F1 score ~80%). The unsupervised approach did not effectively differentiate between true and false positives or return clusters that appeared to correspond to the number of females in our study area. Our results indicate that more work needs to be done before unsupervised approaches can be reliably used to estimate the number of individual animals occupying an area from PAM data. Future work applying these methods across sites and different gibbon species, and comparisons to deep learning approaches, will be crucial for future gibbon conservation initiatives across Southeast Asia.
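A minimal sketch of the band-limited energy summation step is given below: sum spectrogram energy within a target frequency band and flag frames whose energy exceeds a threshold above the background level. The band limits and threshold are illustrative assumptions, not the settings used in the paper; flagged frames would subsequently be grouped into events and passed to the SVM/random-forest classifier.

```python
# Sketch of band-limited energy detection: sum spectrogram energy in a target
# band and mark frames above a threshold as candidate call frames. Band limits
# and threshold are illustrative, not the paper's settings.
import numpy as np
import librosa

def detect_candidate_frames(path, fmin=500, fmax=1800, thresh_db=12.0):
    y, sr = librosa.load(path, sr=None, mono=True)
    S = np.abs(librosa.stft(y, n_fft=2048, hop_length=512)) ** 2
    freqs = librosa.fft_frequencies(sr=sr, n_fft=2048)
    band = (freqs >= fmin) & (freqs <= fmax)
    band_energy_db = librosa.power_to_db(S[band].sum(axis=0))
    above = band_energy_db > (np.median(band_energy_db) + thresh_db)
    # Return the times (s) of frames flagged as candidate call activity.
    return librosa.frames_to_time(np.flatnonzero(above), sr=sr, hop_length=512)

# times = detect_candidate_frames("long_term_recording.wav")  # hypothetical file
```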
8. Alcocer I, Lima H, Sugai LSM, Llusia D. Acoustic indices as proxies for biodiversity: a meta-analysis. Biol Rev Camb Philos Soc 2022; 97:2209-2236. PMID: 35978471; DOI: 10.1111/brv.12890.
Abstract
As biodiversity decreases worldwide, the development of effective techniques to track changes in ecological communities becomes an urgent challenge. Together with other emerging methods in ecology, acoustic indices are increasingly being used as novel tools for rapid biodiversity assessment. These indices are based on mathematical formulae that summarise the acoustic features of audio samples, with the aim of extracting meaningful ecological information from soundscapes. However, the application of this automated method has revealed conflicting results across the literature, with conceptual and empirical controversies regarding its primary assumption: a correlation between acoustic and biological diversity. After more than a decade of research, we still lack a statistically informed synthesis of the power of acoustic indices that elucidates whether they effectively function as proxies for biological diversity. Here, we reviewed studies testing the relationship between diversity metrics (species abundance, species richness, species diversity, abundance of sounds, and diversity of sounds) and the 11 most commonly used acoustic indices. From 34 studies, we extracted 364 effect sizes that quantified the magnitude of the direct link between acoustic and biological estimates and conducted a meta-analysis. Overall, acoustic indices had a moderate positive relationship with the diversity metrics (r = 0.33, CI [0.23, 0.43]) and showed inconsistent performance, with highly variable effect sizes both within and among studies. Over time, studies have increasingly disregarded validation of the acoustic estimates, and those examining this link have progressively reported smaller effect sizes. Some of the studied indices [acoustic entropy index (H), normalised difference soundscape index (NDSI), and acoustic complexity index (ACI)] performed better in retrieving biological information, with abundance of sounds (number of sounds from identified or unidentified species) being the best estimated diversity facet of local communities. We found no effect of the type of monitored environment (terrestrial versus aquatic) or the procedure for extracting biological information (acoustic versus non-acoustic) on the performance of acoustic indices, suggesting some potential to generalise their application across research contexts. We also identified common statistical issues and knowledge gaps that remain to be addressed in future research, such as a high rate of pseudoreplication and multiple unexplored combinations of metrics, taxa, and regions. Our findings confirm the limitations of acoustic indices for efficiently quantifying alpha diversity and highlight that caution is necessary when using them as surrogates for diversity metrics, especially if employed as single predictors. Although these tools can partially capture changes in diversity metrics, endorsing to some extent the rationale behind acoustic indices and suggesting them as promising bases for future developments, they are far from being direct proxies for biodiversity. To guide more efficient use and future research, we review their principal theoretical and practical shortcomings, as well as the prospects and challenges of acoustic indices in biodiversity assessment. Altogether, we provide the first comprehensive and statistically based overview of the relationship between acoustic indices and biodiversity and pave the way for a more standardised and informed application in biodiversity monitoring.
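As a concrete example of the indices under review, the sketch below computes one of the better-performing indices mentioned above, the acoustic complexity index (ACI), in its usual form: for each frequency bin, the summed absolute intensity differences between adjacent spectrogram frames divided by the summed intensity, then summed over bins. It is a minimal illustration rather than a validated replacement for established packages such as scikit-maad (Python) or soundecology (R).

```python
# Minimal acoustic complexity index (ACI) sketch: per frequency bin, sum of
# |I(t+1) - I(t)| divided by sum of I, then summed over bins. Temporal
# sub-stepping used in some implementations is omitted for brevity.
import numpy as np
import librosa

def acoustic_complexity_index(path, n_fft=1024, hop_length=512):
    y, sr = librosa.load(path, sr=None, mono=True)
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop_length))  # intensity
    diffs = np.abs(np.diff(S, axis=1)).sum(axis=1)    # per-bin frame-to-frame variation
    totals = S.sum(axis=1) + 1e-12                    # per-bin total intensity
    return float((diffs / totals).sum())

# aci = acoustic_complexity_index("site_recording.wav")  # hypothetical file
```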
Affiliation(s)
- Irene Alcocer: Terrestrial Ecology Group, Departamento de Ecología, Universidad Autónoma de Madrid, C/ Darwin, 2, Ciudad Universitaria de Cantoblanco, Facultad de Ciencias, Edificio de Biología, 28049, Madrid, Spain; Centro de Investigación en Biodiversidad y Cambio Global, Universidad Autónoma de Madrid, C/ Darwin 2, Ciudad Universitaria de Cantoblanco, 28049, Madrid, Spain
- Herlander Lima: Department of Life Sciences, GloCEE Global Change Ecology and Evolution Research Group, University of Alcalá, Alcalá de Henares, 28805, Madrid, Spain
- Larissa Sayuri Moreira Sugai: Terrestrial Ecology Group, Departamento de Ecología, Universidad Autónoma de Madrid, C/ Darwin, 2, Ciudad Universitaria de Cantoblanco, Facultad de Ciencias, Edificio de Biología, 28049, Madrid, Spain; Centro de Investigación en Biodiversidad y Cambio Global, Universidad Autónoma de Madrid, C/ Darwin 2, Ciudad Universitaria de Cantoblanco, 28049, Madrid, Spain; K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, 159 Sapsucker Woods Road, Ithaca, NY, 14850, USA
- Diego Llusia: Terrestrial Ecology Group, Departamento de Ecología, Universidad Autónoma de Madrid, C/ Darwin, 2, Ciudad Universitaria de Cantoblanco, Facultad de Ciencias, Edificio de Biología, 28049, Madrid, Spain; Centro de Investigación en Biodiversidad y Cambio Global, Universidad Autónoma de Madrid, C/ Darwin 2, Ciudad Universitaria de Cantoblanco, 28049, Madrid, Spain; Laboratório de Herpetologia e Comportamento Animal, Departamento de Ecologia, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Campus Samambaia, CEP 74001-970, Goiânia, Goiás, Brazil
9. Lauha P, Somervuo P, Lehikoinen P, Geres L, Richter T, Seibold S, Ovaskainen O. Domain-specific neural networks improve automated bird sound recognition already with small amount of local data. Methods Ecol Evol 2022. DOI: 10.1111/2041-210x.14003.
Affiliation(s)
- Patrik Lauha: Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Panu Somervuo: Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Petteri Lehikoinen: Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Lisa Geres: Berchtesgaden National Park, Berchtesgaden, Germany; Goethe University Frankfurt, Faculty of Biological Sciences, Institute for Ecology, Evolution and Diversity, Conservation Biology, Frankfurt am Main, Germany
- Tobias Richter: Berchtesgaden National Park, Berchtesgaden, Germany; TUM School of Life Sciences, Ecosystem Dynamics and Forest Management, Technical University of Munich, Freising, Germany
- Sebastian Seibold: Berchtesgaden National Park, Berchtesgaden, Germany; TUM School of Life Sciences, Ecosystem Dynamics and Forest Management, Technical University of Munich, Freising, Germany
- Otso Ovaskainen: Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland; Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland; Department of Biology, Centre for Biodiversity Dynamics, Norwegian University of Science and Technology, Trondheim, Norway
10. Stowell D. Computational bioacoustics with deep learning: a review and roadmap. PeerJ 2022; 10:e13152. PMID: 35341043; PMCID: PMC8944344; DOI: 10.7717/peerj.13152.
Abstract
Animal vocalisations and natural soundscapes are fascinating objects of study, and contain valuable evidence about animal behaviours, populations and ecosystems. They are studied in bioacoustics and ecoacoustics, with signal processing and analysis an important component. Computational bioacoustics has accelerated in recent decades due to the growth of affordable digital sound recording devices, and to huge progress in informatics such as big data, signal processing and machine learning. Methods are inherited from the wider field of deep learning, including speech and image processing. However, the tasks, demands and data characteristics are often different from those addressed in speech or music analysis. There remain unsolved problems, and tasks for which evidence is surely present in many acoustic signals, but not yet realised. In this paper I perform a review of the state of the art in deep learning for computational bioacoustics, aiming to clarify key concepts and identify and analyse knowledge gaps. Based on this, I offer a subjective but principled roadmap for computational bioacoustics with deep learning: topics that the community should aim to address, in order to make the most of future developments in AI and informatics, and to use audio data in answering zoological and ecological questions.
Affiliation(s)
- Dan Stowell: Department of Cognitive Science and Artificial Intelligence, Tilburg University, Tilburg, The Netherlands; Naturalis Biodiversity Center, Leiden, The Netherlands