1
Li M, Erickson IM. It's Not Only What You Say, But Also How You Say It: Machine Learning Approach to Estimate Trust from Conversation. Human Factors 2024; 66:1724-1741. [PMID: 37116009] [PMCID: PMC11044523] [DOI: 10.1177/00187208231166624]
Abstract
OBJECTIVE The objective of this study was to estimate trust from conversations using both lexical and acoustic data.
BACKGROUND As NASA moves to long-duration space exploration operations, the increasing need for cooperation between humans and virtual agents requires real-time trust estimation by virtual agents. Measuring trust through conversation is a novel and unintrusive approach.
METHOD A 2 (reliability) × 2 (cycles) × 3 (events) within-subject study with habitat system maintenance was designed to elicit various levels of trust in a conversational agent. Participants had trust-related conversations with the conversational agent at the end of each decision-making task. To estimate trust, subjective trust ratings were predicted using machine learning models trained on three types of conversational features (lexical, acoustic, and combined). After training, model explanation was performed using variable importance and partial dependence plots.
RESULTS A random forest algorithm trained on the combined lexical and acoustic features predicted trust in the conversational agent most accurately (adjusted R² = 0.71). The most important predictors were a combination of lexical and acoustic cues: average sentiment considering valence shifters, the mean of formants, and Mel-frequency cepstral coefficients (MFCC). These conversational features were identified as partial mediators predicting people's trust.
CONCLUSION Precise trust estimation from conversation requires both lexical and acoustic cues.
APPLICATION These results show the possibility of using conversational data to measure trust, and potentially other dynamic mental states, unobtrusively and dynamically.
Affiliation(s)
- Mengyao Li
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Isabel M Erickson
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA
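A minimal sketch of the modelling step described above: a random forest regressor over combined lexical and acoustic features, scored with adjusted R². The feature names and synthetic data are illustrative assumptions, not the authors' materials.

```python
# Hedged sketch: predict subjective trust ratings from combined lexical +
# acoustic conversational features with a random forest. All data below is
# synthetic; feature names mirror the cues named in the abstract.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300
sentiment = rng.uniform(-1, 1, n)        # lexical: average sentiment with valence shifters
formant_mean = rng.normal(500, 50, n)    # acoustic: mean formant frequency (Hz)
mfcc_means = rng.normal(0, 1, (n, 13))   # acoustic: per-conversation MFCC means
X = np.column_stack([sentiment, formant_mean, mfcc_means])
# Synthetic trust ratings driven by the cues plus noise
y = 3 + 2 * sentiment + 0.01 * (formant_mean - 500) + mfcc_means[:, 0] + rng.normal(0, 0.5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

r2 = r2_score(y_te, model.predict(X_te))
n_te, p = X_te.shape
r2_adj = 1 - (1 - r2) * (n_te - 1) / (n_te - p - 1)   # adjusted R^2
print(f"adjusted R^2 = {r2_adj:.2f}")
print("top importances:", np.argsort(model.feature_importances_)[::-1][:3])
```

Variable importance here plays the role of the paper's model-explanation step; partial dependence plots could be added with sklearn.inspection.PartialDependenceDisplay.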
2
Humayun MA, Shuja J, Abas PE. A review of social background profiling of speakers from speech accents. PeerJ Comput Sci 2024; 10:e1984. [PMID: 38660189] [PMCID: PMC11042007] [DOI: 10.7717/peerj-cs.1984]
Abstract
Social background profiling of speakers is heavily used in areas such as speech forensics and in tuning speech recognition systems for improved accuracy. This article surveys recent research on speaker background profiling in terms of accent classification, describing the datasets, speech features, and classification models used for these tasks and comparing the performance measures achieved by the different methods. This analysis provides insights into the strengths and weaknesses of the different approaches to accent classification. Research gaps are then identified, serving as a useful resource for researchers looking to advance the field.
Affiliation(s)
- Mohammad Ali Humayun
- Department of Computer Science, Information Technology University, Lahore, Pakistan
- Junaid Shuja
- Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia
3
Deng M, Chen J, Wu Y, Ma S, Li H, Yang Z, Shen Y. Using voice recognition to measure trust during interactions with automated vehicles. Applied Ergonomics 2024; 116:104184. [PMID: 38048717] [DOI: 10.1016/j.apergo.2023.104184]
Abstract
Trust in automated vehicle systems (AVs) can affect the experience and safety of drivers and passengers. This work investigates the use of speech to measure drivers' trust in AVs. Seventy-five participants were randomly assigned to a high-trust group (an AV with 100% correctness, no crashes, and four system messages with visual-auditory take-over requests, TORs) or a low-trust group (an AV with 60% correctness, a 40% crash rate, and two system messages with visual-only TORs). Voice interaction tasks were used to collect speech during the driving process. The results revealed that these settings successfully induced trust and distrust states. The speech features extracted from the two trust groups were used to train a back-propagation neural network, which was evaluated on its ability to predict the trust classification accurately. The highest classification accuracy was 90.80%. This study proposes a method for accurately measuring trust in automated vehicles using voice recognition.
Affiliation(s)
- Miaomiao Deng
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, China
- Jiaqi Chen
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, China
- Yue Wu
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, China
- Shu Ma
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, China
- Hongting Li
- Institute of Applied Psychology, College of Education, Zhejiang University of Technology, Hangzhou, China
- Zhen Yang
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, China
- Yi Shen
- Department of Mathematics, Zhejiang Sci-Tech University, Hangzhou, China
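As a rough illustration of the classification step, and not the authors' exact network or feature set, the sketch below trains a back-propagation neural network (scikit-learn's MLPClassifier) on synthetic speech-feature vectors with binary high/low-trust labels.

```python
# Hedged sketch: binary trust classification from speech features with a
# back-propagation neural network. Features and labels are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 400
X = rng.normal(0, 1, (n, 20))   # stand-in for extracted speech features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n) > 0).astype(int)  # 1 = high trust

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
scaler = StandardScaler().fit(X_tr)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=1)
clf.fit(scaler.transform(X_tr), y_tr)
print(f"trust classification accuracy: {clf.score(scaler.transform(X_te), y_te):.2%}")
```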
4
Pulatov I, Oteniyazov R, Makhmudov F, Cho YI. Enhancing Speech Emotion Recognition Using Dual Feature Extraction Encoders. Sensors (Basel) 2023; 23:6640. [PMID: 37514933] [PMCID: PMC10383041] [DOI: 10.3390/s23146640]
Abstract
Understanding and identifying emotional cues in human speech is a crucial aspect of human-computer communication, and extracting relevant emotional characteristics from speech forms a significant part of this process. The objective of this study was to design a framework for speech emotion recognition based on spectrograms and semantic feature transcribers, addressing notable shortcomings of existing methods to improve recognition accuracy. Two strategies were used to obtain useful attributes for emotion detection. First, a fully convolutional neural network model was used to encode speech spectrograms. Second, a Mel-frequency cepstral coefficient feature extraction approach was adopted and integrated with Speech2Vec for semantic feature encoding. The two sets of features were processed separately before being fed into a long short-term memory network and a fully connected layer for further representation, with the aim of improving the model's ability to accurately recognize and interpret emotion from human speech. The proposed system was evaluated on two databases, RAVDESS and EMO-DB, and outperformed established models, achieving an accuracy of 94.8% on the RAVDESS dataset and 94.0% on the EMO-DB dataset.
Affiliation(s)
- Ilkhomjon Pulatov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
- Rashid Oteniyazov
- Department of Telecommunication Engineering, Nukus Branch of Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Nukus 230100, Uzbekistan
- Fazliddin Makhmudov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
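The dual-branch idea can be sketched as below. Layer sizes, the 39-dimensional MFCC input, and the eight output classes are assumptions for illustration; the Speech2Vec semantic branch is approximated here by a plain recurrent encoder over MFCC frames rather than the pretrained embedding the paper uses.

```python
# Hedged PyTorch sketch of a dual-encoder SER model: a CNN branch over
# spectrograms and a recurrent branch over MFCC-based frame features,
# fused and classified by a fully connected layer.
import torch
import torch.nn as nn

class DualEncoderSER(nn.Module):
    def __init__(self, n_mfcc=39, n_classes=8):
        super().__init__()
        # Branch 1: convolutional encoder over (1, freq, time) spectrograms
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten(),
        )
        # Branch 2: recurrent encoder over per-frame MFCC sequences
        self.lstm = nn.LSTM(n_mfcc, 64, batch_first=True)
        self.fc = nn.Linear(32 + 64, n_classes)

    def forward(self, spec, mfcc_seq):
        a = self.cnn(spec)                # (batch, 32)
        _, (h, _) = self.lstm(mfcc_seq)   # h: (1, batch, 64)
        return self.fc(torch.cat([a, h[-1]], dim=1))

model = DualEncoderSER()
spec = torch.randn(4, 1, 128, 200)        # fake batch of spectrograms
mfcc = torch.randn(4, 200, 39)            # fake per-frame MFCC sequences
print(model(spec, mfcc).shape)            # torch.Size([4, 8])
```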
5
Kumar MR, Vekkot S, Lalitha S, Gupta D, Govindraj VJ, Shaukat K, Alotaibi YA, Zakariah M. Dementia Detection from Speech Using Machine Learning and Deep Learning Architectures. Sensors (Basel) 2022; 22:9311. [PMID: 36502013] [PMCID: PMC9740675] [DOI: 10.3390/s22239311]
Abstract
Dementia affects the patient's memory and leads to language impairment. Research has demonstrated that speech and language deterioration is often a clear indication of dementia and plays a crucial role in the recognition process. Even though earlier studies have used speech features to recognize subjects suffering from dementia, they are often used along with other linguistic features obtained from transcriptions. This study explores significant standalone speech features for recognizing dementia. The primary contribution of this work is to identify a compact set of speech features that aid the dementia recognition process. The secondary contribution is to leverage machine learning (ML) and deep learning (DL) models for the recognition task. Speech samples from the Pitt corpus in DementiaBank are utilized for the present study. A critical speech feature set of prosodic, voice-quality, and cepstral features is proposed for the task. The experimental results demonstrate the superiority of machine learning (87.6%) over deep learning (85%) models for recognizing dementia using the compact speech feature combination, along with lower time and memory consumption. The results obtained using the proposed approach are promising compared with existing works on dementia recognition from speech.
Affiliation(s)
- M. Rupesh Kumar
- Department of Electronics & Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
- Susmitha Vekkot
- Department of Electronics & Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
- S. Lalitha
- Department of Electronics & Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
- Deepa Gupta
- Department of Computer Science & Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
- Varasiddhi Jayasuryaa Govindraj
- Department of Electronics & Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
- Kamran Shaukat
- School of Information and Physical Sciences, The University of Newcastle, Newcastle 2300, Australia
- Yousef Ajami Alotaibi
- Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11451, Saudi Arabia
- Mohammed Zakariah
- Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11451, Saudi Arabia
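A hedged sketch of the kind of compact feature extraction the study describes, combining prosodic, voice-quality, and cepstral cues. The librosa calls are standard, but the exact feature set and the file name are assumptions, not the paper's recipe.

```python
# Sketch: reduce one utterance to a fixed-length vector of prosodic,
# voice-quality, and cepstral summaries, ready for an ML classifier.
import numpy as np
import librosa

def compact_features(path):
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # cepstral
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)        # prosodic: pitch track
    rms = librosa.feature.rms(y=y)                       # prosodic: energy
    zcr = librosa.feature.zero_crossing_rate(y)          # voice-quality proxy
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [np.nanmean(f0), np.nanstd(f0), rms.mean(), zcr.mean()],
    ])

# vec = compact_features("utterance.wav")  # hypothetical file; then fit an SVM/RF
```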
6
Early recognition of a caller's emotion in out-of-hospital cardiac arrest dispatching: An artificial intelligence approach. Resuscitation 2021; 167:144-150. [PMID: 34461203] [DOI: 10.1016/j.resuscitation.2021.08.032]
Abstract
AIM This study aimed to develop an AI model for detecting a caller's emotional state during out-of-hospital cardiac arrest calls by processing audio recordings of dispatch communications.
METHODS Audio recordings of 337 out-of-hospital cardiac arrest calls from March-April 2011 were retrieved. The callers' emotional state was classified based on emotional content and cooperative scores. Mel-frequency cepstral coefficients were used to extract essential information from the voice signals. A support vector machine was utilised for the automatic judgement, and repeated random sub-sampling cross-validation (RRS-CV) was applied to evaluate robustness. The results from the artificial intelligence classifier were compared with the consensus of expert reviewers.
RESULTS The audio recordings were classified into five emotional content and cooperative score levels. The proposed model had an average positive predictive value of 72.97%, a negative predictive value of 93.47%, a sensitivity of 38.76%, and a specificity of 98.29%. If only the first 10 seconds of the recordings were considered, it had an average positive predictive value of 84.62%, a negative predictive value of 93.57%, a sensitivity of 52.38%, and a specificity of 98.64%. The model maintained favourable performance for emotionally stable cases.
CONCLUSION Artificial intelligence models can facilitate the judgement of callers' emotional states during dispatch conversations. This model has the potential to be utilised in practice by pre-screening emotionally stable callers, allowing dispatchers to focus on cases judged to be emotionally unstable. Further research and validation are required to improve the model's performance and make it suitable for the general population.
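The evaluation scheme named in the abstract, repeated random sub-sampling cross-validation around an SVM on MFCC features, can be sketched as follows; scikit-learn's ShuffleSplit implements exactly that resampling, though the data here is synthetic rather than call-recording features.

```python
# Hedged sketch: SVM on per-call MFCC summaries, judged by repeated
# random sub-sampling cross-validation (RRS-CV).
import numpy as np
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(0, 1, (337, 13))   # stand-in for per-call MFCC summaries
y = rng.integers(0, 2, 337)       # stand-in emotional-state labels

rrs = ShuffleSplit(n_splits=10, test_size=0.2, random_state=2)  # RRS-CV
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=rrs)
print(f"mean accuracy over 10 random sub-samples: {scores.mean():.2%}")
```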
7
Abstract
Recognizing emotions in human speech has always been an exciting challenge for scientists. In our work, a feature vector is obtained by dividing each sentence into an emotionally loaded part and a part that carries only informational load, and this parameterization is applied effectively. The expressiveness of human speech is shaped by the emotion it conveys: prosodic characteristics such as pitch, timbre, loudness, and vocal tone differentiate utterances and allow speech to be categorized into several emotions. We supplement these with a new classification feature, namely the division of a sentence into an emotionally loaded part and a purely informational part; the speech sample therefore changes when subjected to different emotional environments. Since a speaker's emotional states can be identified on the Mel scale, MFCCs are one way to study the emotional aspects of a speaker's utterances. In this work, we implement a model that identifies several emotional states from MFCCs for two datasets, classifies emotions on the basis of MFCC features, and compares the results. The classification model is based on dataset minimization, achieved by taking the mean of the features, which improves the classification accuracy of different machine learning algorithms. In addition to the static analysis of the speaker's tonal portrait used in MFCC, we propose a new method for the dynamic analysis of a phrase, treated as a new linguistic-emotional entity pronounced by the same speaker. By ranking the Mel-scale features by importance, we can parameterize the vector coordinates to be processed by a parameterized KNN method. Speech recognition is a multi-level pattern-recognition task: acoustic signals are analyzed and structured into a hierarchy of structural elements, words, phrases, and sentences, and each level of this hierarchy can provide temporal constraints, such as possible word sequences or known pronunciation types, that reduce the number of recognition errors at lower levels. Analyzing voice and speech dynamics is appropriate for improving the quality of machine perception and generation of human speech and is within the capabilities of artificial intelligence. The resulting emotion recognition can be widely applied in e-learning platforms, vehicle on-board systems, medicine, and other domains.
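A minimal sketch of the dataset-minimization idea described above: reduce each utterance to the mean of its MFCC frames, then classify emotions with a distance-weighted KNN. The frame matrices are random stand-ins for real MFCCs.

```python
# Hedged sketch: mean-of-MFCC vectors + parameterized (distance-weighted) KNN.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
# Pretend each utterance is a (frames x 13) MFCC matrix of varying length
utterances = [rng.normal(0, 1, (rng.integers(50, 200), 13)) for _ in range(200)]
labels = rng.integers(0, 4, 200)                      # four stand-in emotion classes

X = np.array([u.mean(axis=0) for u in utterances])    # one mean vector per utterance
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=3)
knn = KNeighborsClassifier(n_neighbors=5, weights="distance").fit(X_tr, y_tr)
print(f"KNN accuracy: {knn.score(X_te, y_te):.2%}")
```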
9
Pramod Reddy A, V V. Recognition of human emotion with spectral features using multi-layer perceptron. International Journal of Knowledge-Based and Intelligent Engineering Systems 2020. [DOI: 10.3233/kes-200044]
Abstract
For emotion recognition, the features extracted from prevalent speech samples of the Berlin emotional database are pitch, intensity, log energy, formants, and mel-frequency cepstral coefficients (MFCC) as base features, with power spectral density added as a function of frequency. Seven emotions are considered in this study: anger, neutral, happiness, boredom, disgust, fear, and sadness. Temporal and spectral features are used to build the automatic emotion recognition (AER) model. The extracted features are analyzed using a support vector machine (SVM) and a multilayer perceptron (MLP), a class of feed-forward ANN classifiers, to classify the different emotional states. We observed 91% accuracy for the anger and boredom classes using SVM and more than 96% using the ANN, with an overall accuracy of 87.17% for SVM and 94% for the ANN.
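One way to realize the abstract's "power spectral density as an added function of frequency" is to summarize a Welch PSD estimate into band energies appended to the base features; scipy's estimator is one reasonable choice, not necessarily the authors'.

```python
# Hedged sketch: PSD band energies as extra spectral features.
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(4)
signal = rng.normal(0, 1, 16000)                 # 1 s stand-in for a speech sample
freqs, psd = welch(signal, fs=16000, nperseg=512)

bands = [(0, 500), (500, 1000), (1000, 2000), (2000, 4000)]  # Hz
band_energy = [psd[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]
print([f"{e:.3e}" for e in band_energy])         # append to pitch/energy/MFCC features
```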
10
Yang N, Dey N, Sherratt RS, Shi F. Recognize basic emotional states in speech by machine learning techniques using mel-frequency cepstral coefficient features. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-179963]
Abstract
Speech Emotion Recognition (SER) has been widely used in many fields, such as the smart home assistants commonly found on the market. Smart home assistants that could detect the user's emotion would improve communication between the user and the assistant, enabling the assistant to offer more productive feedback. Thus, the aim of this work is to analyze emotional states in speech and propose a suitable algorithm, weighing performance against complexity, for deployment in smart home devices. Four emotional speech sets were selected from the Berlin Emotional Database (EMO-DB) as experimental data, and 26 MFCC features were extracted from each type of emotional speech to identify the emotions of happiness, anger, sadness, and neutrality. Speaker-independent SER experiments were then conducted using the Back Propagation Neural Network (BPNN), Extreme Learning Machine (ELM), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM). Considering both recognition accuracy and processing time, the SVM performed best among the four methods, making it a good candidate for deployment in smart home devices. The SVM achieved an overall accuracy of 92.4% while imposing low computational requirements for training and testing. We conclude that the MFCC features and the SVM classification models used in the speaker-independent experiments are highly effective for the automatic prediction of emotion.
Affiliation(s)
- Ningning Yang
- First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Nilanjan Dey
- Department of Information Technology, Techno India College of Technology, West Bengal, India
- Fuqian Shi
- First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
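Speaker independence, the key property of the experiments above, means utterances from one speaker never appear in both the training and test sets. A sketch with grouped splitting, with 26-dimensional random vectors standing in for the paper's 26 MFCC features:

```python
# Hedged sketch: speaker-independent SVM evaluation via grouped splitting.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(0, 1, (400, 26))      # stand-in for 26 MFCC features per utterance
y = rng.integers(0, 4, 400)          # happy / angry / sad / neutral stand-ins
speakers = rng.integers(0, 10, 400)  # speaker ID per utterance

gss = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=5)
train_idx, test_idx = next(gss.split(X, y, groups=speakers))   # no speaker overlap
clf = SVC(kernel="rbf").fit(X[train_idx], y[train_idx])
print(f"speaker-independent accuracy: {clf.score(X[test_idx], y[test_idx]):.2%}")
```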
11
Reduction of the Multipath Propagation Effect in a Hydroacoustic Channel Using Filtration in Cepstrum. Sensors (Basel) 2020; 20:751. [PMID: 32013243] [PMCID: PMC7038370] [DOI: 10.3390/s20030751]
Abstract
During data transmission in a hydroacoustic channel, one of the problems is the multipath propagation effect, which degrades transmission parameters and sometimes prevents transmission entirely. We have therefore attempted to develop a method, based on the recorded hydroacoustic signal, that allows the original (generated) signal to be recreated by eliminating the multipath effect. In our method, we use cepstral analysis to eliminate replicas of the generated signal. The method was tested in simulation and in measurements in a real environment. Additionally, the influence of the method on data transmission in the hydroacoustic channel was tested. The results confirmed the usefulness of the developed method, which improved the quality of data transmission by reducing the multipath propagation effect.
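The core idea, that an echo shows up as a peak at its delay lag in the cepstrum, which can then be notched out (liftered), is easy to demonstrate on a toy signal. This sketch works on the real cepstrum of the magnitude spectrum only, so it is a simplification: full time-domain reconstruction would need the complex cepstrum and phase handling, which the paper's method addresses and this toy does not.

```python
# Hedged sketch: detect and notch a multipath replica in the cepstrum.
import numpy as np

rng = np.random.default_rng(6)
fs = 8000
clean = rng.normal(0, 1, fs)                  # broadband stand-in signal
delay = 400                                   # echo delay in samples
received = clean.copy()
received[delay:] += 0.5 * clean[:-delay]      # one multipath replica

log_mag = np.log(np.abs(np.fft.fft(received)) + 1e-12)
cepstrum = np.fft.ifft(log_mag).real

lag = int(np.argmax(cepstrum[100:fs // 2])) + 100   # echo peak in quefrency
print(f"detected echo lag: {lag} samples (true: {delay})")

lifted = cepstrum.copy()
for k in (lag, fs - lag):                     # notch the peak and its mirror
    lifted[k - 2:k + 3] = 0.0
smoothed_log_mag = np.fft.fft(lifted).real    # log spectrum, dominant echo ripple removed
```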
12
Classifying Heart Sounds Using Images of Motifs, MFCC and Temporal Features. J Med Syst 2019; 43:168. [DOI: 10.1007/s10916-019-1286-5]
13
Zerari N, Abdelhamid S, Bouzgou H, Raymond C. Bidirectional deep architecture for Arabic speech recognition. Open Computer Science 2019. [DOI: 10.1515/comp-2019-0004]
Abstract
Nowadays, real-life constraints necessitate controlling modern machines through human intervention by means of the sensory organs. The voice is one of the human faculties that can control and monitor modern interfaces. In this context, automatic speech recognition is principally used to convert natural voice into computer text and to perform actions based on instructions given by a human. In this paper, we propose a general framework for Arabic speech recognition that uses a Long Short-Term Memory (LSTM) network and a neural network (multi-layer perceptron, MLP) classifier to cope with the non-uniform sequence lengths of speech utterances produced by two feature extraction techniques: (1) Mel-frequency cepstral coefficients (MFCC, static and dynamic features) and (2) filter bank (FB) coefficients. The neural architecture recognizes isolated Arabic speech via classification. The proposed system first extracts pertinent features from the natural speech signal using MFCC (static and dynamic features) and FB coefficients. Next, the extracted features are padded to deal with the non-uniform sequence lengths. Then, a deep architecture represented by a recurrent LSTM or GRU (gated recurrent unit) encodes the sequence of MFCC/FB features as a fixed-size vector, which is passed to a multi-layer perceptron network (MLP) to perform the classification (recognition). The proposed system is assessed using two different databases: the first concerns spoken digit recognition, where a comparison with related work in the literature is performed, whereas the second contains spoken TV commands. The obtained results show the superiority of the proposed approach.
Affiliation(s)
- Naima Zerari
- Laboratory of Automation and Manufacturing, Department of Industrial Engineering, University of Batna 2 Mostefa Ben Boulaid, Batna, 05000, Algeria
- Samir Abdelhamid
- Laboratory of Automation and Manufacturing, Department of Industrial Engineering, University of Batna 2 Mostefa Ben Boulaid, Batna, 05000, Algeria
- Hassen Bouzgou
- Department of Industrial Engineering, University of Batna 2 Mostefa Ben Boulaid, Batna, 05000, Algeria
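The abstract outlines the pipeline: pad variable-length MFCC sequences, encode each with a bidirectional LSTM into a fixed-size vector, then classify with an MLP. A sketch, with illustrative dimensions rather than the paper's exact setup:

```python
# Hedged PyTorch sketch: padded MFCC sequences -> bidirectional LSTM
# encoder -> fixed-size vector -> MLP classifier for isolated words.
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

class LSTMWordClassifier(nn.Module):
    def __init__(self, n_feats=13, hidden=64, n_words=10):
        super().__init__()
        self.encoder = nn.LSTM(n_feats, hidden, batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(nn.Linear(2 * hidden, 64), nn.ReLU(), nn.Linear(64, n_words))

    def forward(self, x):
        _, (h, _) = self.encoder(x)               # h: (2, batch, hidden)
        fixed = torch.cat([h[0], h[1]], dim=1)    # fixed-size utterance vector
        return self.mlp(fixed)

# Variable-length fake MFCC sequences, padded to a common length
seqs = [torch.randn(torch.randint(40, 120, (1,)).item(), 13) for _ in range(4)]
batch = pad_sequence(seqs, batch_first=True)      # (4, max_len, 13)
print(LSTMWordClassifier()(batch).shape)          # torch.Size([4, 10])
```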
14
Nogueira DM, Ferreira CA, Jorge AM. Classifying Heart Sounds Using Images of MFCC and Temporal Features. Progress in Artificial Intelligence 2017. [DOI: 10.1007/978-3-319-65340-2_16]