1. Odom KJ, Araya-Salas M, Morano JL, Ligon RA, Leighton GM, Taff CC, Dalziell AH, Billings AC, Germain RR, Pardo M, de Andrade LG, Hedwig D, Keen SC, Shiu Y, Charif RA, Webster MS, Rice AN. Comparative bioacoustics: a roadmap for quantifying and comparing animal sounds across diverse taxa. Biol Rev Camb Philos Soc 2021; 96:1135-1159. PMID: 33652499. DOI: 10.1111/brv.12695.
Abstract
Animals produce a wide array of sounds with highly variable acoustic structures. It is possible to understand the causes and consequences of this variation across taxa with phylogenetic comparative analyses. Acoustic and evolutionary analyses are rapidly increasing in sophistication, such that choosing appropriate acoustic and evolutionary approaches is increasingly difficult, yet the choice of analysis can have profound effects on output and evolutionary inferences. Here, we identify and address some of the challenges for this growing field by providing a roadmap for quantifying and comparing sound in a phylogenetic context for researchers with a broad range of scientific backgrounds. Sound, as a continuous, multidimensional trait, can be particularly challenging to measure because it can be hard to identify variables that can be compared across taxa, and it is no small feat to process and analyse the resulting high-dimensional acoustic data using approaches that are appropriate for subsequent evolutionary analysis. Additionally, terminological inconsistencies and the role of learning in the development of acoustic traits need to be considered. Phylogenetic comparative analyses also have their own sets of caveats to consider. We provide a set of recommendations for delimiting acoustic signals into discrete, comparable acoustic units. We also present a three-stage workflow for extracting relevant acoustic data, including options for multivariate analyses and dimensionality reduction that is compatible with phylogenetic comparative analysis. We then summarize available phylogenetic comparative approaches and how they have been used in comparative bioacoustics, and address the limitations of comparative analyses with behavioural data. Lastly, we recommend how to apply these methods to acoustic data across a range of study systems.
In this way, we provide an integrated framework to aid in quantitative analysis of cross-taxa variation in animal sounds for comparative phylogenetic analysis. In addition, we advocate the standardization of acoustic terminology across disciplines and taxa, adoption of automated methods for acoustic feature extraction, and establishment of strong data archival practices for acoustic recordings and data analyses. Combining such practices with our proposed workflow will greatly advance the reproducibility, biological interpretation, and longevity of comparative bioacoustic studies.
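The workflow this abstract describes (extract comparable acoustic variables, standardize them, then reduce dimensionality before phylogenetic comparative analysis) can be illustrated with a minimal, self-contained sketch. This is not the authors' pipeline: the two call variables, the per-species values, and the closed-form two-variable PCA are all invented for illustration.

```python
import math

def zscore(xs):
    """Standardize one acoustic variable across species (mean 0, sd 1)."""
    m = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    return [(x - m) / sd for x in xs]

def pca_2d(x, y):
    """First principal component of two already-standardized variables,
    using the closed form for a 2x2 covariance matrix."""
    n = len(x)
    cxx = sum(a * a for a in x) / (n - 1)
    cyy = sum(b * b for b in y) / (n - 1)
    cxy = sum(a * b for a, b in zip(x, y)) / (n - 1)
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)   # angle of the first PC
    w = (math.cos(theta), math.sin(theta))         # loading vector
    scores = [a * w[0] + b * w[1] for a, b in zip(x, y)]
    return w, scores

# hypothetical per-species means for two call measurements
peak_freq_khz = [1.2, 2.4, 3.1, 4.8, 0.9]   # peak frequency
duration_s    = [0.8, 0.5, 0.4, 0.2, 1.1]   # call duration
pc1_loadings, pc1_scores = pca_2d(zscore(peak_freq_khz), zscore(duration_s))
```

The per-species PC1 scores could then serve as a single continuous trait in a downstream phylogenetic comparative method.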
Affiliation(s)
- Karan J Odom: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.
- Marcelo Araya-Salas: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Sede del Sur, Universidad de Costa Rica, Golfito 60701, Costa Rica
- Janelle L Morano: Macaulay Library, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Natural Resources and the Environment, Cornell University, Ithaca, NY 14853, U.S.A.
- Russell A Ligon: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.
- Gavin M Leighton: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Department of Biology, SUNY Buffalo State, Buffalo, NY 14222, U.S.A.
- Conor C Taff: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, U.S.A.
- Anastasia H Dalziell: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Centre for Sustainable Ecosystem Solutions, University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia
- Alexis C Billings: Division of Biological Sciences, University of Montana, Missoula, MT 59812, U.S.A.; Department of Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, CA 94709, U.S.A.
- Ryan R Germain: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen DK-2100, Denmark
- Michael Pardo: Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, CO 80523, U.S.A.
- Luciana Guimarães de Andrade: Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, U.S.A.; Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
- Daniela Hedwig: Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
- Sara C Keen: Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.; Department of Geological Sciences, Stanford University, Stanford, CA 94305, U.S.A.
- Yu Shiu: Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
- Russell A Charif: Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
- Michael S Webster: Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, U.S.A.; Macaulay Library, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
- Aaron N Rice: Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, U.S.A.
2. Malfante M, Mars JI, Dalla Mura M, Gervaise C. Automatic fish sounds classification. J Acoust Soc Am 2018; 143:2834. PMID: 29857733. DOI: 10.1121/1.5036628.
Abstract
The work presented in this paper focuses on the use of acoustic systems for passive acoustic monitoring of fish populations as an indicator of ocean vitality. To this end, various indicators can be used to monitor marine areas, such as the geographical and temporal evolution of fish populations. A discriminative model is built using supervised machine learning (random forests and support vector machines). Each acquisition is represented in a feature space in which the patterns belonging to different semantic classes are as separable as possible. The set of features proposed for describing the acquisitions comes from an extensive state of the art in various domains in which classification of acoustic signals is performed, including speech, music, and environmental acoustics. Furthermore, this study proposes to extract features from three representations of the data (time, frequency, and cepstral domains). The proposed classification scheme is tested on real fish sounds recorded in several areas and achieves 96.9% correct classification, compared to 72.5% when using reference state-of-the-art features as descriptors. The classification scheme is also validated on continuous underwater recordings, illustrating that it can be used to both detect and classify fish sounds in operational scenarios.
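As a toy illustration of drawing features from more than one signal representation (the study uses time, frequency, and cepstral domains; the cepstral part is omitted here), the sketch below computes RMS energy and zero-crossing rate in the time domain and a spectral centroid in the frequency domain. The synthetic tone, sample rate, and feature set are invented; a real system would feed many such descriptors to a random-forest or SVM classifier.

```python
import math, cmath

def dft_mag(frame):
    """Magnitude spectrum of one frame (naive DFT; fine for a sketch)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) for k in range(n // 2)]

def features(frame, sr):
    """Minimal time- and frequency-domain descriptors of one frame."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))   # time domain
    zcr = sum(frame[i - 1] * frame[i] < 0
              for i in range(1, len(frame))) / len(frame)     # time domain
    mag = dft_mag(frame)
    freqs = [k * sr / len(frame) for k in range(len(mag))]
    # magnitude-weighted centroid; rectangular-window leakage pulls it
    # above the tone frequency, which is expected for this crude sketch
    centroid = sum(f * m for f, m in zip(freqs, mag)) / (sum(mag) or 1)
    return {"rms": rms, "zcr": zcr, "centroid_hz": centroid}

# synthetic 200 Hz tone at an 8 kHz sample rate stands in for a fish call frame
sr = 8000
frame = [math.sin(2 * math.pi * 200 * t / sr) for t in range(256)]
f = features(frame, sr)
```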
Affiliation(s)
- Marielle Malfante: Institute of Engineering, University Grenoble Alpes, CNRS, Grenoble INP, GIPSA-Lab, 38000 Grenoble, France
- Jérôme I Mars: Institute of Engineering, University Grenoble Alpes, CNRS, Grenoble INP, GIPSA-Lab, 38000 Grenoble, France
- Mauro Dalla Mura: Institute of Engineering, University Grenoble Alpes, CNRS, Grenoble INP, GIPSA-Lab, 38000 Grenoble, France
3. A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds. Appl Sci (Basel) 2016. DOI: 10.3390/app6050143.
4. Comazzi C, Mattiello S, Friard O, Filacorda S, Gamba M. Acoustic monitoring of golden jackals in Europe: setting the frame for future analyses. Bioacoustics 2016. DOI: 10.1080/09524622.2016.1152564.
Affiliation(s)
- Carlo Comazzi: Università degli Studi di Milano, Dipartimento di Scienze Veterinarie e Sanità Pubblica, via Celoria 10, Milano, Italy; Università di Torino, Dipartimento di Scienze della Vita e Biologia dei Sistemi, via Accademia Albertina 13, Torino, Italy
- Silvana Mattiello: Università degli Studi di Milano, Dipartimento di Scienze Veterinarie e Sanità Pubblica, via Celoria 10, Milano, Italy
- Olivier Friard: Università di Torino, Dipartimento di Scienze della Vita e Biologia dei Sistemi, via Accademia Albertina 13, Torino, Italy
- Stefano Filacorda: Università degli Studi di Udine, Dipartimento di Scienze AgroAlimentari, Ambientali e Animali, via Sondrio 2/a, Udine, Italy
- Marco Gamba: Università di Torino, Dipartimento di Scienze della Vita e Biologia dei Sistemi, via Accademia Albertina 13, Torino, Italy
5. Kershenbaum A, Blumstein DT, Roch MA, Akçay Ç, Backus G, Bee MA, Bohn K, Cao Y, Carter G, Cäsar C, Coen M, DeRuiter SL, Doyle L, Edelman S, Ferrer-i-Cancho R, Freeberg TM, Garland EC, Gustison M, Harley HE, Huetz C, Hughes M, Bruno JH, Ilany A, Jin DZ, Johnson M, Ju C, Karnowski J, Lohr B, Manser MB, McCowan B, Mercado E, Narins PM, Piel A, Rice M, Salmi R, Sasahara K, Sayigh L, Shiu Y, Taylor C, Vallejo EE, Waller S, Zamora-Gutierrez V. Acoustic sequences in non-human animals: a tutorial review and prospectus. Biol Rev Camb Philos Soc 2016; 91:13-52. PMID: 25428267. PMCID: PMC4444413. DOI: 10.1111/brv.12160.
Abstract
Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often, however, researchers have only begun to characterise - let alone understand - the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled 'Analysing vocal sequences in animals'. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area.
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality.
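One of the review's core tasks, analysing the structure of a sequence once its distinct units are identified, can be sketched with a first-order Markov model estimated from a string of unit labels. The bout below is hypothetical:

```python
from collections import Counter

def transition_probs(units):
    """Estimate first-order transition probabilities between acoustic units."""
    pairs = Counter(zip(units, units[1:]))   # counts of adjacent unit pairs
    totals = Counter(units[:-1])             # how often each unit starts a pair
    return {(a, b): c / totals[a] for (a, b), c in pairs.items()}

# hypothetical unit labels from one segmented song bout
bout = list("ABABCABABC")
p = transition_probs(bout)
```

Here p[("A", "B")] is the probability that unit A is followed by unit B; richer sequence analyses (n-grams, entropy rates, hidden states) build on the same counts.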
Affiliation(s)
- Arik Kershenbaum: National Institute for Mathematical and Biological Synthesis, 1122 Volunteer Blvd., Suite 106, University of Tennessee, Knoxville, TN 37996-3410, USA; Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Daniel T. Blumstein: Department of Ecology and Evolutionary Biology, University of California Los Angeles, 621 Charles E. Young Drive South, Los Angeles, CA 90095-1606, USA
- Marie A. Roch: Department of Computer Science, San Diego State University, 5500 Campanile Dr, San Diego, CA 92182, USA
- Çağlar Akçay: Lab of Ornithology, Cornell University, 159 Sapsucker Woods Rd, Ithaca, NY 14850, USA
- Gregory Backus: Department of Biomathematics, North Carolina State University, Raleigh, NC 27607, USA
- Mark A. Bee: Department of Ecology, Evolution and Behavior, University of Minnesota, 100 Ecology Building, 1987 Upper Buford Cir, Falcon Heights, MN 55108, USA
- Kirsten Bohn: Integrated Science, Florida International University, Modesto Maidique Campus, 11200 SW 8th Street, AHC-4, 351, Miami, FL 33199, USA
- Yan Cao: Department of Mathematical Sciences, University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX 75080, USA
- Gerald Carter: Biological Sciences Graduate Program, University of Maryland, College Park, MD 20742, USA
- Cristiane Cäsar: Department of Psychology & Neuroscience, University of St. Andrews, St Mary's Quad, South Street, St Andrews, Fife, KY16 9JP, UK
- Michael Coen: Department of Biostatistics and Medical Informatics, University of Wisconsin, K6/446 Clinical Sciences Center, 600 Highland Avenue, Madison, WI 53792-4675, USA
- Stacy L. DeRuiter: School of Mathematics and Statistics, University of St. Andrews, St Andrews, KY16 9SS, UK
- Laurance Doyle: Carl Sagan Center for the Study of Life in the Universe, SETI Institute, 189 Bernardo Ave, Suite 100, Mountain View, CA 94043, USA
- Shimon Edelman: Department of Psychology, Cornell University, 211 Uris Hall, Ithaca, NY 14853-7601, USA
- Ramon Ferrer-i-Cancho: Department of Computer Science, Universitat Politecnica de Catalunya (Catalonia), Calle Jordi Girona 31, 08034 Barcelona, Spain
- Todd M. Freeberg: Department of Psychology, University of Tennessee, Austin Peay Building, Knoxville, Tennessee 37996, USA
- Ellen C. Garland: National Marine Mammal Laboratory, AFSC/NOAA, 7600 Sand Point Way N.E., Seattle, Washington 98115, USA
- Morgan Gustison: Department of Psychology, University of Michigan, 530 Church St, Ann Arbor, MI 48109, USA
- Heidi E. Harley: Division of Social Sciences, New College of Florida, 5800 Bay Shore Rd, Sarasota, FL 34243, USA
- Chloé Huetz: CNPS, CNRS UMR 8195, Université Paris-Sud, Batiments 440-447, Rue Claude Bernard, 91405 Orsay, France
- Melissa Hughes: Department of Biology, College of Charleston, 66 George St, Charleston, SC 29424, USA
- Julia Hyland Bruno: Department of Psychology, Hunter College and the Graduate Center, The City University of New York, 365 Fifth Avenue, New York, NY 10016, USA
- Amiyaal Ilany: National Institute for Mathematical and Biological Synthesis, 1122 Volunteer Blvd., Suite 106, University of Tennessee, Knoxville, TN 37996-3410, USA
- Dezhe Z. Jin: Department of Physics, Pennsylvania State University, 104 Davey Lab, University Park, PA 16802-6300, USA
- Michael Johnson: Department of Electrical and Computer Engineering, Marquette University, 1515 W. Wisconsin Ave., Milwaukee, WI 53233, USA
- Chenghui Ju: Department of Biology, Queens College, The City University of New York, 65-30 Kissena Blvd., Flushing, New York 11367, USA
- Jeremy Karnowski: Department of Cognitive Science, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0515, USA
- Bernard Lohr: Department of Biological Sciences, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
- Marta B. Manser: Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- Brenda McCowan: Department of Veterinary Medicine, University of California Davis, 1 Peter J Shields Ave, Davis, CA 95616, USA
- Eduardo Mercado: Department of Psychology; Evolution, Ecology, & Behavior, University at Buffalo, The State University of New York, Park Hall Room 204, Buffalo, NY 14260-4110, USA
- Peter M. Narins: Department of Integrative Biology & Physiology, University of California Los Angeles, 612 Charles E. Young Drive East, Los Angeles, CA 90095-7246, USA
- Alex Piel: Division of Biological Anthropology, University of Cambridge, Pembroke Street, Cambridge, CB2 3QG, UK
- Megan Rice: Department of Psychology, California State University San Marcos, 333 S. Twin Oaks Valley Rd., San Marcos, CA 92096-0001, USA
- Roberta Salmi: Department of Anthropology, University of Georgia at Athens, 355 S Jackson St, Athens, GA 30602, USA
- Kazutoshi Sasahara: Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Laela Sayigh: Biology Department, Woods Hole Oceanographic Institution, 86 Water St, Woods Hole, MA 02543, USA
- Yu Shiu: Lab of Ornithology, Cornell University, 159 Sapsucker Woods Rd, Ithaca, NY 14850, USA
- Charles Taylor: Department of Ecology and Evolutionary Biology, University of California Los Angeles, 621 Charles E. Young Drive South, Los Angeles, CA 90095-1606, USA
- Edgar E. Vallejo: Department of Computer Science, Monterrey Institute of Technology, Ave. Eugenio Garza Sada 2501 Sur Col. Tecnológico C.P. 64849, Monterrey, Nuevo León, Mexico
- Sara Waller: Department of Philosophy, Montana State University, 2-155 Wilson Hall, Bozeman, Montana 59717, USA
- Veronica Zamora-Gutierrez: Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK; Centre for Biodiversity and Environmental Research, University College London, London WC1H 0AG, UK
6. Comparative Analysis of the Vocal Repertoire of Eulemur: A Dynamic Time Warping Approach. Int J Primatol 2015. DOI: 10.1007/s10764-015-9861-1.
7. Scheifele PM, Johnson MT, Fry M, Hamel B, Laclede K. Vocal classification of vocalizations of a pair of Asian small-clawed otters to determine stress. J Acoust Soc Am 2015; 138:EL105-EL109. PMID: 26233050. DOI: 10.1121/1.4922768.
Abstract
Asian small-clawed otters (Aonyx cinerea) are a small, protected but threatened freshwater species. They are gregarious, live in monogamous pairs for their lifetimes, and communicate via scent and acoustic vocalizations. This study utilized a hidden Markov model (HMM) to classify stress versus non-stress calls from a sibling pair under professional care. Vocalizations were expertly annotated by keepers into seven contextual categories. Four of these - aggression, separation anxiety, pain, and prefeeding - were identified as stressful contexts, and three - feeding, training, and play - were identified as non-stressful contexts. The vocalizations were segmented, manually categorized into broad call types, and analyzed to determine signal-to-noise ratios. From this information, vocalizations from the most common contextual categories were used to implement HMM-based automatic classification experiments, covering individual identification, stress versus non-stress classification, and individual context classification. Results indicate that both individual identity and stress versus non-stress calls were distinguishable, with accuracies above 90%, but that individual contexts within the stress category were not easily separable.
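A minimal sketch of HMM-based call classification in the spirit of this study: one HMM per class, and a call is assigned to the class whose model gives the higher likelihood under the forward algorithm. The two-state models, the quantized pitch observations, and all parameter values below are invented, not fitted otter data.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM
    (forward algorithm, kept stable in log space via log-sum-exp)."""
    def lse(xs):
        m = max(xs)
        return m + math.log(sum(math.exp(x - m) for x in xs))
    n = len(start)
    alpha = [math.log(start[i]) + math.log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [lse([alpha[i] + math.log(trans[i][j]) for i in range(n)])
                 + math.log(emit[j][o]) for j in range(n)]
    return lse(alpha)

# toy 2-state HMMs over quantized call features (0 = low pitch, 1 = high pitch)
stress = dict(start=[0.5, 0.5],
              trans=[[0.7, 0.3], [0.3, 0.7]],
              emit=[[0.2, 0.8], [0.1, 0.9]])   # mostly high-pitch emissions
calm   = dict(start=[0.5, 0.5],
              trans=[[0.7, 0.3], [0.3, 0.7]],
              emit=[[0.8, 0.2], [0.9, 0.1]])   # mostly low-pitch emissions

call = [1, 1, 0, 1, 1]                          # a mostly high-pitch call
label = ("stress" if forward_loglik(call, **stress) > forward_loglik(call, **calm)
         else "non-stress")
```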
Affiliation(s)
- Peter M Scheifele: FETCHLAB, Department of Audiology, University of Cincinnati, 3202 Eden Avenue, Cincinnati, Ohio 45267, USA
- Michael T Johnson: Electrical and Computer Engineering Department, Marquette University, 1515 West Wisconsin Avenue, Milwaukee, Wisconsin 53233, USA
- Michelle Fry: Newport Aquarium, 1 Aquarium Way, Newport, Kentucky 41071, USA
- Benjamin Hamel: Electrical and Computer Engineering Department, Marquette University, 1515 West Wisconsin Avenue, Milwaukee, Wisconsin 53233, USA
- Kathryn Laclede: FETCHLAB, Department of Audiology, University of Cincinnati, 3202 Eden Avenue, Cincinnati, Ohio 45267, USA
8. Ranjard L, Withers SJ, Brunton DH, Ross HA, Parsons S. Integration over song classification replicates: song variant analysis in the hihi. J Acoust Soc Am 2015; 137:2542-2551. PMID: 25994687. DOI: 10.1121/1.4919329.
Abstract
Human expert analyses are commonly used in bioacoustic studies and can potentially limit the reproducibility of these results. In this paper, a machine learning method is presented to statistically classify avian vocalizations. Automated approaches were applied to isolate bird songs from long field recordings, assess song similarities, and classify songs into distinct variants. Because no positive controls were available to assess the true classification of variants, multiple replicates of automatic classification of song variants were analyzed to investigate clustering uncertainty. The automatic classifications were more similar to the expert classifications than expected by chance. Application of these methods demonstrated the presence of discrete song variants in an island population of the New Zealand hihi (Notiomystis cincta). The geographic patterns of song variation were then revealed by integrating over classification replicates. Because this automated approach considers variation in song variant classification, it reduces potential human bias and facilitates the reproducibility of the results.
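The idea of integrating over classification replicates can be sketched with a pairwise co-assignment matrix: the fraction of replicates in which two songs fall in the same variant cluster. The replicate labelings below are hypothetical, not the study's data.

```python
from itertools import combinations

def coassignment(replicates):
    """Fraction of replicates in which each pair of songs lands in the same
    cluster; averaging over replicates expresses clustering uncertainty."""
    songs = sorted(replicates[0])
    return {(a, b): sum(r[a] == r[b] for r in replicates) / len(replicates)
            for a, b in combinations(songs, 2)}

# three hypothetical replicate classifications of four songs into variants
reps = [{"s1": 0, "s2": 0, "s3": 1, "s4": 1},
        {"s1": 0, "s2": 0, "s3": 1, "s4": 0},
        {"s1": 1, "s2": 1, "s3": 0, "s4": 0}]
c = coassignment(reps)
```

Pairs with co-assignment near 1.0 are robust variant groupings; intermediate values flag songs whose cluster membership is uncertain across replicates.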
Affiliation(s)
- Louis Ranjard: Bioinformatics Institute, The University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand
- Sarah J Withers: School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand
- Dianne H Brunton: The Institute of Natural and Mathematical Sciences, Massey University, Albany Campus, Private Bag 102 904, North Shore Mail Centre, Auckland 0745, New Zealand
- Howard A Ross: School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand
- Stuart Parsons: School of Biological Sciences, The University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand
9. Modeling the utility of binaural cues for underwater sound localization. Hear Res 2014; 312:103-13. PMID: 24727491. DOI: 10.1016/j.heares.2014.03.011.
Abstract
The binaural cues used by terrestrial animals for sound localization in azimuth may not always suffice for accurate sound localization underwater. The purpose of this research was to examine the theoretical limits of interaural timing and level differences available underwater using computational and physical models. A paired-hydrophone system was used to record sounds transmitted underwater and recordings were analyzed using neural networks calibrated to reflect the auditory capabilities of terrestrial mammals. Estimates of source direction based on temporal differences were most accurate for frequencies between 0.5 and 1.75 kHz, with greater resolution toward the midline (2°), and lower resolution toward the periphery (9°). Level cues also changed systematically with source azimuth, even at lower frequencies than expected from theoretical calculations, suggesting that binaural mechanical coupling (e.g., through bone conduction) might, in principle, facilitate underwater sound localization. Overall, the relatively limited ability of the model to estimate source position using temporal and level difference cues underwater suggests that animals such as whales may use additional cues to accurately localize conspecifics and predators at long distances.
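The core geometry is that, for a fixed receiver separation, interaural time differences shrink in proportion to the higher sound speed underwater (roughly 1500 m/s versus roughly 343 m/s in air). A sketch with an invented receiver separation, using the simple path-length model rather than this paper's neural-network pipeline:

```python
import math

def itd_seconds(separation_m, azimuth_deg, c):
    """Interaural time difference for a plane wave from the given azimuth,
    simple path-length model: dt = d * sin(theta) / c."""
    return separation_m * math.sin(math.radians(azimuth_deg)) / c

d = 0.2                               # hypothetical receiver separation, metres
air = itd_seconds(d, 45, 343.0)       # speed of sound in air, ~343 m/s
water = itd_seconds(d, 45, 1500.0)    # speed of sound in seawater, ~1500 m/s
```

The ratio air/water equals 1500/343 (about 4.4), i.e. underwater timing cues are over four times smaller for the same geometry, which is why timing-only localization degrades underwater.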
10. Ji A, Johnson MT, Walsh EJ, McGee J, Armstrong DL. Discrimination of individual tigers (Panthera tigris) from long distance roars. J Acoust Soc Am 2013; 133:1762-1769. PMID: 23464045. DOI: 10.1121/1.4789936.
Abstract
This paper investigates the extent of tiger (Panthera tigris) vocal individuality through both qualitative and quantitative approaches, using long distance roars from six individual tigers at Omaha's Henry Doorly Zoo (Omaha, NE). The framework for comparison across individuals includes statistical and discriminant function analysis across whole-vocalization measures, and statistical pattern classification using a hidden Markov model (HMM) with frame-based spectral features comprising Greenwood frequency cepstral coefficients. Individual discrimination accuracy is evaluated as a function of spectral model complexity, represented by the number of mixtures in the underlying Gaussian mixture model (GMM), and temporal model complexity, represented by the number of sequential states in the HMM. Results indicate that the temporal pattern of the vocalization is the most significant factor in accurate discrimination. Overall baseline discrimination accuracy for this data set is about 70% using high-level features without complex spectral or temporal models. Accuracy increases to about 80% when more complex spectral models (multiple-mixture GMMs) are incorporated, and to a final accuracy of 90% when more detailed temporal models (10-state HMMs) are used. Classification accuracy is stable across a relatively wide range of configurations in terms of spectral and temporal model resolution.
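Greenwood frequency cepstral coefficients replace the mel scale with the Greenwood cochlear-position map, f = A * (10^(a*x) - k). The sketch below spaces filterbank centre frequencies uniformly on that scale; the constants A and a are illustrative placeholders, not values fitted to tiger hearing.

```python
import math

def greenwood_hz(x, A, a, k=1.0):
    """Greenwood map from normalized cochlear position x in [0, 1] to Hz:
    f = A * (10**(a*x) - k). A, a, k are species-specific constants."""
    return A * (10 ** (a * x) - k)

def filterbank_centers(fmin, fmax, n, A, a, k=1.0):
    """Centre frequencies spaced uniformly on the Greenwood scale, the
    warping underlying Greenwood frequency cepstral coefficients."""
    def pos(f):  # inverse Greenwood map: frequency -> cochlear position
        return math.log10(f / A + k) / a
    lo, hi = pos(fmin), pos(fmax)
    return [greenwood_hz(lo + i * (hi - lo) / (n - 1), A, a, k)
            for i in range(n)]

# illustrative constants only (not fitted tiger parameters)
centers = filterbank_centers(50.0, 4000.0, 8, A=160.0, a=2.1)
```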
Affiliation(s)
- An Ji: Department of Electrical and Computer Engineering, Marquette University, 1515 West Wisconsin Avenue, Milwaukee, Wisconsin 53233, USA
11. Gingras B, Fitch WT. A three-parameter model for classifying anurans into four genera based on advertisement calls. J Acoust Soc Am 2013; 133:547-559. PMID: 23297926. DOI: 10.1121/1.4768878.
Abstract
The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.
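The paper's three-parameter logistic regression can be sketched with a plain gradient-descent fit. For simplicity this toy version is binary (two "genera") rather than the study's four-genus model, and the standardized feature values below are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Plain per-sample gradient descent for logistic regression
    (one bias term plus one weight per feature)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            w[0] -= lr * err
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * err * xj
    return w

def predict(w, xi):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) >= 0.5

# toy standardized [dominant frequency, RMS-energy CV, spectral flux] per call;
# the two labels stand in for two genera (all values invented)
X = [[-1.0, 0.2, -0.5], [-0.8, 0.1, -0.6], [-1.2, 0.3, -0.4],
     [ 1.1, -0.2, 0.6], [ 0.9, -0.1, 0.5], [ 1.3, -0.3, 0.7]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
acc = sum(predict(w, xi) == yi for xi, yi in zip(X, y)) / len(y)
```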
Affiliation(s)
- Bruno Gingras: Department of Cognitive Biology, University of Vienna, Althanstrasse 14, Vienna A-1090, Austria
12. Blumstein DT, Mennill DJ, Clemins P, Girod L, Yao K, Patricelli G, Deppe JL, Krakauer AH, Clark C, Cortopassi KA, Hanser SF, McCowan B, Ali AM, Kirschel ANG. Acoustic monitoring in terrestrial environments using microphone arrays: applications, technological considerations and prospectus. J Appl Ecol 2011. DOI: 10.1111/j.1365-2664.2011.01993.x.
13. Giret N, Roy P, Albert A, Pachet F, Kreutzer M, Bovet D. Finding good acoustic features for parrot vocalizations: the feature generation approach. J Acoust Soc Am 2011; 129:1089-99. PMID: 21361465. DOI: 10.1121/1.3531953.
Abstract
A crucial step in understanding the vocal behavior of birds is the ability to classify the calls in a repertoire into meaningful types. Methods developed to this end are limited either by human subjectivity or by methodological issues. The present study investigated whether a feature generation system could categorize the vocalizations of a bird species automatically and effectively. This procedure was applied to vocalizations of African gray parrots, known for their capacity to reproduce almost any sound in their environment. Outcomes of the feature generation approach agreed well with the much more labor-intensive classification performed by a human expert from spectrographic representations, while clearly outperforming other automated methods. The method brings significant improvements in precision over commonly used bioacoustical analyses and, as such, enlarges the scope of automated, acoustics-based sound classification.
Affiliation(s)
- Nicolas Giret
- Laboratoire d'Ethologie et Cognition Comparées, Université Paris Ouest Nanterre La Défense, 200 avenue de la République, 92000 Nanterre, France.
14. Bioacoustic distances between the begging calls of brood parasites and their host species: a comparison of metrics and techniques. Behav Ecol Sociobiol 2010. [DOI: 10.1007/s00265-010-1065-2]
15. Adi K, Johnson MT, Osiejuk TS. Acoustic censusing using automatic vocalization classification and identity recognition. J Acoust Soc Am 2010; 127:874-883. [PMID: 20136210] [DOI: 10.1121/1.3273887]
Abstract
This paper presents an advanced method to acoustically assess animal abundance. The framework combines supervised classification (song-type and individual identity recognition), unsupervised classification (individual identity clustering), and the mark-recapture model of abundance estimation. The underlying algorithm is based on clustering with hidden Markov models (HMMs) and Gaussian mixture models (GMMs), similar to methods used in the speech recognition community for tasks such as speaker identification and clustering. Initial experiments on a Norwegian ortolan bunting (Emberiza hortulana) data set show the feasibility and effectiveness of the approach. Individually distinct acoustic features have been observed in a wide range of animal species, and this, combined with the widespread success of speaker identification and verification methods for human speech, suggests that robust automatic identification of individuals from their vocalizations is attainable. Only a few studies, however, have attempted to use individual acoustic distinctiveness to directly assess population density and structure. The approach introduced here offers a direct mechanism for using individual vocal variability to create simpler and more accurate population assessment tools for vocally active species.
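The mark-recapture component can be sketched independently of the acoustic front end. Assuming the classifiers yield individual identities for two recording sessions, a Chapman-corrected Lincoln-Petersen estimator (a standard two-session choice, not necessarily the exact model used in the paper) turns the identity counts into an abundance estimate:

```python
def chapman_estimate(n_first, n_second, n_both):
    """Chapman's bias-corrected Lincoln-Petersen abundance estimator.

    n_first  -- individuals acoustically identified in session 1
    n_second -- individuals identified in session 2
    n_both   -- individuals "recaptured", i.e. identified in both sessions
    """
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1

# 40 birds identified in the first session, 35 in the second, 20 in both:
print(round(chapman_estimate(40, 35, 20), 1))  # prints 69.3
```

The quality of the estimate hinges on the recognizer's identity decisions: misclassified individuals inflate or deflate the recapture count directly.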
Affiliation(s)
- Kuntoro Adi
- Santa Dharma University, Mrican, Yogyakarta 55002, Indonesia
16. Mouy X, Bahoura M, Simard Y. Automatic recognition of fin and blue whale calls for real-time monitoring in the St. Lawrence. J Acoust Soc Am 2009; 126:2918-2928. [PMID: 20000904] [DOI: 10.1121/1.3257588]
Abstract
Monitoring blue and fin whales summering in the St. Lawrence Estuary with passive acoustics requires call recognition algorithms that can cope with the heavy shipping noise of the St. Lawrence Seaway and with multipath propagation characteristics that generate overlapping copies of the calls. In this paper, the performance of three time-frequency methods aiming at such automatic detection and classification is tested on more than 2000 calls and compared at several levels of signal-to-noise ratio using typical recordings collected in this area. For all methods, image processing techniques are used to reduce the noise in the spectrogram. The first approach consists of matching the spectrogram with binary time-frequency templates of the calls (coincidence of spectrograms). The second approach is based on the extraction of the frequency contours of the calls and their classification using the dynamic time warping (DTW) and vector quantization (VQ) algorithms. The coincidence of spectrograms was the fastest method and performed better for blue whale A and B calls. VQ detected more 20 Hz fin whale calls but with a higher false alarm rate. DTW and VQ outperformed it on the more variable blue whale D calls.
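Dynamic time warping, used here to compare extracted frequency contours, aligns two sequences of possibly different lengths by minimizing cumulative distance over all monotonic alignments. A minimal sketch (the absolute-difference local cost and the symmetric step pattern are common defaults, not necessarily those of the paper):

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two 1-D frequency contours."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local cost between contour points
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```

A contour stretched in time but unchanged in frequency has zero DTW distance to the original, which is what makes the measure well suited to the duration-variable D calls.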
Affiliation(s)
- Xavier Mouy
- Marine Sciences Institute, University of Quebec at Rimouski, 310 Allee des Ursulines, Rimouski, Quebec G5L-3A1, Canada.
17. Brown JC, Smaragdis P. Hidden Markov and Gaussian mixture models for automatic call classification. J Acoust Soc Am 2009; 125:EL221-EL224. [PMID: 19507925] [DOI: 10.1121/1.3124659]
Abstract
Automatic methods of classification of animal sounds offer many advantages, including speed and consistency in processing massive quantities of data. Calculations have been carried out on a set of 75 calls of Northern Resident killer whales, previously classified perceptually (human classification) into seven call types, using hidden Markov models (HMMs) and Gaussian mixture models (GMMs). Neither of these methods has been used previously for classification of marine mammal call types. With cepstral coefficients as features, both HMMs and GMMs give over 90% agreement with the perceptual classification, with the HMMs exceeding 95% in some cases.
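Cepstral coefficients, the features used here, summarize the spectral envelope of a frame as the low-order terms of the inverse transform of the log magnitude spectrum. A minimal real-cepstrum sketch; the Hann window and the coefficient count are illustrative choices, not the paper's exact front end:

```python
import numpy as np

def cepstral_coeffs(frame, n_coeffs=12):
    """Real cepstrum of one frame: inverse FFT of the log magnitude spectrum."""
    windowed = frame * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)  # offset avoids log(0)
    cepstrum = np.fft.irfft(log_mag)
    return cepstrum[:n_coeffs]  # low-order terms capture the spectral envelope

# 12 coefficients for a 512-sample frame of a synthetic whistle:
frame = np.sin(2 * np.pi * 40 * np.arange(512) / 512)
features = cepstral_coeffs(frame)
```

Each HMM or GMM is then trained on sequences of such vectors, one model per call type, and a call is assigned to the model with the highest likelihood.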
Affiliation(s)
- Judith C Brown
- Physics Department, Wellesley College, Wellesley, Massachusetts 02481, USA.
18. Ren Y, Johnson MT, Tao J. Perceptually motivated wavelet packet transform for bioacoustic signal enhancement. J Acoust Soc Am 2008; 124:316-327. [PMID: 18646979] [DOI: 10.1121/1.2932070]
Abstract
A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim-Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms other approaches for a wide range of noise conditions.
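Of the baselines compared, spectral subtraction is the simplest: subtract an estimate of the noise magnitude spectrum from each frame and resynthesize with the noisy phase. A minimal single-frame sketch (the spectral floor value is an illustrative assumption used to avoid negative magnitudes):

```python
import numpy as np

def spectral_subtract(frame, noise_mag, floor=0.01):
    """Magnitude spectral subtraction on one frame, keeping the noisy phase."""
    spec = np.fft.rfft(frame)
    mag = np.abs(spec)
    clean_mag = np.maximum(mag - noise_mag, floor * mag)  # floor prevents negatives
    return np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), n=len(frame))
```

The Greenwood-scaled wavelet packet method proposed in the paper replaces this fixed linear-frequency analysis with a species-specific perceptual frequency decomposition before the coefficient estimation step.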
Affiliation(s)
- Yao Ren
- Speech and Signal Processing Laboratory, Marquette University, P.O. Box 1881, Milwaukee, Wisconsin 53233-1881, USA.
19. A new perspective on acoustic individual recognition in animals with limited call sharing or changing repertoires. Anim Behav 2008. [DOI: 10.1016/j.anbehav.2007.11.003]
20. Tao J, Johnson MT, Osiejuk TS. Acoustic model adaptation for ortolan bunting (Emberiza hortulana L.) song-type classification. J Acoust Soc Am 2008; 123:1582-1590. [PMID: 18345846] [DOI: 10.1121/1.2837487]
Abstract
Automatic systems for vocalization classification often require fairly large amounts of data on which to train models. However, collecting and transcribing animal vocalization data is difficult and time-consuming, making large data sets expensive to create. One natural solution to this problem is the use of acoustic adaptation methods. Such methods, common in human speech recognition systems, create initial models trained on speaker-independent data, then use small amounts of adaptation data to build individual-specific models. Since, as in human speech, individual vocal variability is a significant source of variation in bioacoustic data, acoustic model adaptation is naturally suited to classification in this domain as well. To demonstrate and evaluate the effectiveness of this approach, this paper presents the application of maximum likelihood linear regression adaptation to ortolan bunting (Emberiza hortulana L.) song-type classification. Classification accuracies for the adapted system are computed as a function of the amount of adaptation data and compared to caller-independent and caller-dependent systems. The experimental results indicate that given the same amount of data, supervised adaptation significantly outperforms both caller-independent and caller-dependent systems.
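MLLR adapts a caller-independent model by estimating one affine transform shared across all Gaussian means, so a small amount of adaptation data can shift the whole model. The full algorithm weights the regression by state occupancies and covariances; the sketch below substitutes a plain least-squares fit as a simplified illustration of the shared-transform idea:

```python
import numpy as np

def mllr_adapt(si_means, target_means):
    """Fit a global transform W so that [mean, 1] @ W approximates the
    adaptation-data means, then apply it to every caller-independent mean."""
    ext = np.hstack([si_means, np.ones((len(si_means), 1))])  # extended means [mu, 1]
    w, *_ = np.linalg.lstsq(ext, target_means, rcond=None)
    return ext @ w

# If the new caller's means are an exact affine shift of the SI means,
# the estimated transform recovers them:
si = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [1.0, 3.0]])
target = 2.0 * si + 1.0
adapted = mllr_adapt(si, target)
```

Because only one transform is estimated, far fewer adaptation examples are needed than would be required to retrain every Gaussian, which is the point of the paper's comparison against caller-dependent training.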
Affiliation(s)
- Jidong Tao
- Speech and Signal Processing Laboratory, Marquette University, PO Box 1881, Milwaukee, Wisconsin 53233-1881, USA.
21. Roch MA, Soldevilla MS, Burtenshaw JC, Henderson EE, Hildebrand JA. Gaussian mixture model classification of odontocetes in the Southern California Bight and the Gulf of California. J Acoust Soc Am 2007; 121:1737-48. [PMID: 17407910] [DOI: 10.1121/1.2400663]
Abstract
A method for the automatic classification of free-ranging delphinid vocalizations is presented. The vocalizations of short-beaked and long-beaked common (Delphinus delphis and Delphinus capensis), Pacific white-sided (Lagenorhynchus obliquidens), and bottlenose (Tursiops truncatus) dolphins were recorded in a pelagic environment of the Southern California Bight and the Gulf of California over a period of 4 years. Cepstral feature vectors are extracted from call data which contain simultaneous overlapping whistles, burst-pulses, and clicks from a single species. These features are grouped into multisecond segments. A portion of the data is used to train Gaussian mixture models of varying orders for each species. The remaining call data are used to test the performance of the models. Species are predicted based upon probabilistic measures of model similarity, with test segment groups having durations between 1 and 25 s. For this data set, Gaussian mixture models with 256 mixture components and segments of at least 10 s of call data produced the best classification results. The classifier predicts the species of groups with 67%-75% accuracy, depending upon the partitioning of the training and test data.
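The segment-level decision rule, averaging frame scores over a multisecond window and picking the best-scoring species model, can be sketched with single diagonal Gaussians standing in for the paper's 256-component mixtures (an illustrative simplification; the toy models and feature values below are assumptions):

```python
import numpy as np

def segment_score(frames, mean, var):
    """Mean per-frame log-likelihood of a feature segment under a diagonal Gaussian."""
    ll = -0.5 * (np.log(2.0 * np.pi * var) + (frames - mean) ** 2 / var)
    return ll.sum(axis=1).mean()

def classify_segment(frames, models):
    """Label the segment with the species whose model scores it highest."""
    return max(models, key=lambda sp: segment_score(frames, *models[sp]))

# Two toy species models in a 2-D cepstral feature space:
models = {
    "D. delphis":   (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    "T. truncatus": (np.array([5.0, 5.0]), np.array([1.0, 1.0])),
}
segment = np.full((50, 2), 4.8)  # 50 feature frames near the bottlenose model
```

Pooling frames over a segment before deciding is what lets the classifier ride out individual noisy frames; the paper's finding that 10 s segments work best reflects exactly this trade-off.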
Affiliation(s)
- Marie A Roch
- Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, California 92182-7720, USA.