1
Wu X, Zhou S, Chen M, Zhao Y, Wang Y, Zhao X, Li D, Pu H. Combined spectral and speech features for pig speech recognition. PLoS One 2022; 17:e0276778. [PMID: 36454724 PMCID: PMC9714723 DOI: 10.1371/journal.pone.0276778] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/21/2022] [Accepted: 10/13/2022] [Indexed: 12/03/2022] Open
Abstract
Pig vocalizations are an important indicator of the animal's condition: they can reflect states such as hunger, pain or emotional state, and they directly indicate the growth and health status of the pig. Existing speech recognition methods usually start from spectral features. Using spectrograms to classify different sounds works well, but relying on a single type of feature input may not be the best approach for such tasks. On this assumption, and in order to assess the condition of pigs more accurately and take timely measures to protect their health, this paper proposes a pig sound classification method that combines signal spectrum features and speech features. Spectrograms visualize the characteristics of the sound over different time periods. The spectrogram features and the time-domain features of the input audio complement each other and are passed into a pre-designed parallel network structure. The best-performing network model and classifier were then selected and combined. The method achieved an accuracy of 93.39% on the pig speech classification task, with an AUC of 0.99163, demonstrating the superiority of the method. By recognizing pig sounds, this study contributes to both computer vision and acoustics. In addition, a dataset of 4,000 pig sounds in four categories is established in this paper to provide a basis for later research.
Affiliation(s)
- Xuan Wu
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Silong Zhou
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Mingwei Chen
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Yihang Zhao
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Yifei Wang
  - Department of Economics, University of Calgary, Calgary, AB, Canada
- Xianmeng Zhao
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Danyang Li
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
- Haibo Pu
  - College of Information Engineering, Sichuan Agricultural University, Ya’an, Sichuan, China
2
Monedero Í, Barbancho J, Márquez R, Beltrán JF. Cyber-Physical System for Environmental Monitoring Based on Deep Learning. SENSORS 2021; 21:s21113655. [PMID: 34073979 PMCID: PMC8197376 DOI: 10.3390/s21113655] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/30/2021] [Revised: 05/06/2021] [Accepted: 05/13/2021] [Indexed: 11/18/2022]
Abstract
Cyber-physical systems (CPS) constitute a promising paradigm that could fit various applications. Monitoring based on the Internet of Things (IoT) has become a research area with new challenges in which to extract valuable information. This paper proposes a deep learning sound classification system for execution over CPS. This system is based on convolutional neural networks (CNNs) and is focused on the different types of vocalization of two species of anurans. CNNs, in conjunction with the use of mel-spectrograms for sounds, are shown to be an adequate tool for the classification of environmental sounds. The classification results obtained are excellent (97.53% overall accuracy) and can be considered a very promising use of the system for classifying other biological acoustic targets as well as analyzing biodiversity indices in the natural environment. The paper concludes by observing that the execution of this type of CNN, which requires only low-cost and reduced computing resources, is feasible for monitoring extensive natural areas. The use of CPS enables flexible and dynamic configuration and deployment of new CNN updates over remote IoT nodes.
Affiliation(s)
- Íñigo Monedero
  - Tecnología Electrónica, Escuela Politécnica Superior, Universidad de Sevilla, Calle Virgen de África 7, 41012 Sevilla, Spain
- Julio Barbancho
  - Tecnología Electrónica, Escuela Politécnica Superior, Universidad de Sevilla, Calle Virgen de África 7, 41012 Sevilla, Spain
  - Correspondence: ; Tel.: +34-955-55-28-38
- Rafael Márquez
  - Fonoteca Zoológica, Departamento de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales (CSIC), Calle José Gutiérrez Abascal, 2, 28006 Madrid, Spain
- Juan F. Beltrán
  - Departamento de Zoología, Facultad de Biología, Universidad de Sevilla, Avenida de la Reina Mercedes, s/n, 41012 Sevilla, Spain
3
García S, Larios DF, Barbancho J, Personal E, Mora-Merchán JM, León C. Heterogeneous LoRa-Based Wireless Multimedia Sensor Network Multiprocessor Platform for Environmental Monitoring. SENSORS 2019; 19:s19163446. [PMID: 31394731 PMCID: PMC6720635 DOI: 10.3390/s19163446] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/12/2019] [Revised: 08/02/2019] [Accepted: 08/05/2019] [Indexed: 11/21/2022]
Abstract
The acquisition of data in protected natural environments is restricted to actions that do not stress the life-forms present in that environment. Researchers therefore face two conflicting requirements: autonomous and robust systems that minimize physical interaction with the sensors once installed, and systems complex enough to capture and process higher volumes of data. Against this background, this paper analyses the current state of the art of wireless multimedia sensor networks, identifying the limitations and needs of these solutions. To improve the trade-off between autonomy and computational capability, this paper proposes a heterogeneous multiprocessor sensor platform, consisting of an ultra-low power microcontroller and a high-performance processor, which transfers control between processors as needed. This architecture allows the shutdown of idle systems and fail-safe remote reprogramming. The sensor equipment can be adapted to the needs of the project. The deployed equipment incorporates, in addition to sensors for environmental and meteorological variables, a microphone input and two cameras (visible and thermal) to capture multimedia data. In addition to the hardware description, the paper provides a brief description of how long-range (LoRa) radio can be used for sending large messages (such as an image or a new firmware), an economic analysis of the platform, and a study of the energy consumption of the platform under different use cases.
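The abstract mentions sending large messages, such as an image or a firmware update, over LoRa, but does not describe the scheme used. As a rough illustration only, the sketch below shows one generic way to split a message into small packets and rebuild it on the receiving side. The 4-byte header layout and the names `fragment` and `reassemble` are hypothetical and are not taken from the paper.

```python
import struct

# Hypothetical 4-byte header: fragment index and fragment count (big-endian).
HEADER = struct.Struct(">HH")


def fragment(message: bytes, max_payload: int) -> list:
    """Split message into packets of at most max_payload bytes each."""
    chunk = max_payload - HEADER.size
    if chunk <= 0:
        raise ValueError("max_payload must exceed the header size")
    total = max(1, -(-len(message) // chunk))  # ceiling division
    if total > 0xFFFF:
        raise ValueError("message needs more fragments than the header can count")
    return [
        HEADER.pack(i, total) + message[i * chunk:(i + 1) * chunk]
        for i in range(total)
    ]


def reassemble(packets):
    """Rebuild the message, or return None while fragments are missing."""
    parts = {}
    total = None
    for packet in packets:
        index, total = HEADER.unpack_from(packet)
        parts[index] = packet[HEADER.size:]
    if total is None or any(i not in parts for i in range(total)):
        return None
    return b"".join(parts[i] for i in range(total))
```

With an assumed packet limit of 51 bytes, a 2,560-byte message is split into 55 fragments, and `reassemble` accepts them in any order. A real deployment would also need acknowledgement and retransmission, which this sketch leaves out.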
Affiliation(s)
- Sebastián García
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
- Diego F Larios
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
- Julio Barbancho
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
- Enrique Personal
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
- Javier M Mora-Merchán
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
- Carlos León
  - Department of Electronic Technology, Escuela Politécnica Superior, University of Seville, 41011 Seville, Spain
4
Improving Classification Algorithms by Considering Score Series in Wireless Acoustic Sensor Networks. SENSORS 2018; 18:s18082465. [PMID: 30061506 PMCID: PMC6111609 DOI: 10.3390/s18082465] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/11/2018] [Revised: 07/22/2018] [Accepted: 07/27/2018] [Indexed: 11/17/2022]
Abstract
The reduction in size, power consumption and price of many sensor devices has enabled the deployment of many sensor networks that can be used to monitor and control several aspects of various habitats. More specifically, the analysis of sounds has attracted a huge interest in urban and wildlife environments where the classification of the different signals has become a major issue. Various algorithms have been described for this purpose, a number of which frame the sound and classify these frames, while others take advantage of the sequential information embedded in a sound signal. In the paper, a new algorithm is proposed that, while maintaining the frame-classification advantages, adds a new phase that considers and classifies the score series derived after frame labelling. These score series are represented using cepstral coefficients and classified using standard machine-learning classifiers. The proposed algorithm has been applied to a dataset of anuran calls and its results compared to the performance obtained in previous experiments on sensor networks. The main outcome of our research is that the consideration of score series strongly outperforms other algorithms and attains outstanding performance despite the noisy background commonly encountered in this kind of application.
5
Luque A, Gómez-Bellido J, Carrasco A, Barbancho J. Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks. SENSORS 2018; 18:s18061803. [PMID: 29865290 PMCID: PMC6022039 DOI: 10.3390/s18061803] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/14/2018] [Revised: 05/31/2018] [Accepted: 06/01/2018] [Indexed: 02/05/2023]
Abstract
The analysis and classification of the sounds produced by certain animal species, notably anurans, have revealed these amphibians to be a potentially strong indicator of temperature fluctuations and therefore of the existence of climate change. Environmental monitoring systems using Wireless Sensor Networks are therefore of interest to obtain indicators of global warming. For the automatic classification of the sounds recorded on such systems, the proper representation of the sound spectrum is essential since it contains the information required for cataloguing anuran calls. The present paper focuses on this process of feature extraction by exploring three alternatives: the standardized MPEG-7, the Filter Bank Energy (FBE), and the Mel Frequency Cepstral Coefficients (MFCC). Moreover, various values for every option in the extraction of spectrum features have been considered. Throughout the paper, it is shown that representing the frame spectrum with pure FBE offers slightly worse results than using the MPEG-7 features. This performance can easily be increased, however, by rescaling the FBE in a double dimension: vertically, by taking the logarithm of the energies; and, horizontally, by applying mel scaling in the filter banks. On the other hand, representing the spectrum in the cepstral domain, as in MFCC, has shown additional marginal improvements in classification performance.
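The rescaling steps named in the abstract (mel spacing of the filter banks, the logarithm of the energies, and a final move to the cepstral domain) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. It assumes a power spectrum sampled uniformly from 0 Hz to the Nyquist frequency, triangular filters, and the common 2595·log10(1 + f/700) mel formula. All function names and default parameter values are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale convention (assumed, not confirmed by the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_edges(n_filters, f_max_hz):
    """n_filters + 2 edge frequencies, equally spaced on the mel scale."""
    m_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(m_max * i / (n_filters + 1)) for i in range(n_filters + 2)]


def filter_bank_energies(power_spectrum, sample_rate, n_filters):
    """Energies of one frame under triangular, mel-spaced filters."""
    n_bins = len(power_spectrum)
    if n_bins < 2:
        raise ValueError("need at least two spectrum bins")
    f_max = sample_rate / 2.0
    edges = mel_filter_edges(n_filters, f_max)
    energies = []
    for k in range(n_filters):
        lo, mid, hi = edges[k], edges[k + 1], edges[k + 2]
        energy = 0.0
        for b, power in enumerate(power_spectrum):
            f = f_max * b / (n_bins - 1)  # bin centre frequency in Hz
            if lo < f <= mid:
                energy += power * (f - lo) / (mid - lo)  # rising slope
            elif mid < f < hi:
                energy += power * (hi - f) / (hi - mid)  # falling slope
        energies.append(energy)
    return energies


def log_energies(energies, floor=1e-10):
    """Vertical rescaling: logarithm of the energies, floored to avoid log(0)."""
    return [math.log(max(e, floor)) for e in energies]


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, used to move log energies to the cepstral domain."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
        for c in range(n_coeffs)
    ]


def mfcc_from_power_spectrum(power_spectrum, sample_rate, n_filters=20, n_coeffs=13):
    fbe = filter_bank_energies(power_spectrum, sample_rate, n_filters)
    return dct2(log_energies(fbe), n_coeffs)
```

Each stage can be used on its own, which mirrors the comparison in the abstract: `filter_bank_energies` alone gives mel-spaced FBE, `log_energies` adds the vertical rescaling, and `dct2` yields MFCC-style coefficients.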
Affiliation(s)
- Amalia Luque
  - Ingeniería del Diseño, Escuela Politécnica Superior, Universidad de Sevilla, 41004 Sevilla, Spain
- Jesús Gómez-Bellido
  - Ingeniería del Diseño, Escuela Politécnica Superior, Universidad de Sevilla, 41004 Sevilla, Spain
- Alejandro Carrasco
  - Tecnología Electrónica, Escuela Ingeniería Informática, Universidad de Sevilla, 41004 Sevilla, Spain
- Julio Barbancho
  - Tecnología Electrónica, Escuela Politécnica Superior, Universidad de Sevilla, 41004 Sevilla, Spain
6
Luque A, Romero-Lemos J, Carrasco A, Gonzalez-Abril L. Temporally-aware algorithms for the classification of anuran sounds. PeerJ 2018; 6:e4732. [PMID: 29740517 PMCID: PMC5937479 DOI: 10.7717/peerj.4732] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/19/2017] [Accepted: 04/18/2018] [Indexed: 11/20/2022] Open
Abstract
Several authors have shown that the sounds of anurans can be used as an indicator of climate change. Hence, the recording, storage and further processing of a huge number of anuran sounds, distributed over time and space, are required in order to obtain this indicator. Furthermore, it is desirable to have algorithms and tools for the automatic classification of the different classes of sounds. In this paper, six classification methods are proposed, all based on the data-mining domain, which strive to take advantage of the temporal character of the sounds. The definition and comparison of these classification methods is undertaken using several approaches. The main conclusions of this paper are that: (i) the sliding window method attained the best results in the experiments presented, and even outperformed the hidden Markov models usually employed in similar applications; (ii) noteworthy overall classification performance has been obtained, which is an especially striking result considering that the sounds analysed were affected by a highly noisy background; (iii) the instance selection for the determination of the sounds in the training dataset offers better results than cross-validation techniques; and (iv) the temporally-aware classifiers have revealed that they can obtain better performance than their non-temporally-aware counterparts.
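The abstract does not spell out how the sliding window method is built. One common reading, shown here purely as an assumption, is that each frame's feature vector is extended with the features of its neighbouring frames so that an ordinary, non-temporal classifier sees some temporal context. The function name, the `half_width` parameter and the edge-padding rule below are hypothetical.

```python
def sliding_window_features(frames, half_width):
    """Concatenate each frame's features with those of its neighbours.

    frames: list of per-frame feature vectors (lists of numbers).
    half_width: number of neighbouring frames taken on each side.
    Frames near the start or end reuse the edge frame as padding (assumed rule).
    """
    n = len(frames)
    windowed = []
    for t in range(n):
        window = []
        for offset in range(-half_width, half_width + 1):
            index = min(max(t + offset, 0), n - 1)  # clamp to valid range
            window.extend(frames[index])
        windowed.append(window)
    return windowed
```

For three one-dimensional frames `[[1], [2], [3]]` and `half_width=1`, the result is `[[1, 1, 2], [1, 2, 3], [2, 3, 3]]`. The widened vectors can then be passed to any frame-level classifier.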
Affiliation(s)
- Amalia Luque
  - Departamento de Ingeniería del Diseño, Universidad de Sevilla, Sevilla, Spain
- Javier Romero-Lemos
  - Departamento de Ingeniería del Diseño, Universidad de Sevilla, Sevilla, Spain
- Alejandro Carrasco
  - Departamento de Tecnología Electrónica, Universidad de Sevilla, Sevilla, Spain