1. Huang Z, Siniscalchi SM, Lee CH. Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation. Pattern Recognit Lett 2017. [DOI: 10.1016/j.patrec.2017.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
2. Chaudhary G, Srivastava S, Bhardwaj S. Feature Extraction Methods for Speaker Recognition: A Review. Int J Pattern Recognit Artif Intell 2017. [DOI: 10.1142/s0218001417500410] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
This paper presents the main research paradigms in feature extraction methods for speaker recognition (SR), which is widely used for person identification in security and protection applications. Speaker recognition systems (SRS) have been widely researched for many decades. The basic concept behind feature extraction methods is derived from the biological model of the human auditory/vocal tract system. This work provides a classification-oriented review of feature extraction methods for SR from the last 55 years that have proven successful and have become stepping stones for further research. Broadly, the review is divided into feature extraction methods with and without noise compensation techniques. Methods without noise compensation are categorized by: high/low level of feature extraction; type of transform; speech production/auditory system; type of feature extraction technique; time variability; and speech processing techniques. Methods with noise compensation are classified into noise-screened features, feature normalization methods, and feature compensation methods. This classification-oriented review is intended to give readers a clear basis for choosing among the different techniques and to support future research in this field.
Affiliation(s)
- Gopal Chaudhary, Division of Instrumentation and Control Engineering, Netaji Subhas Institute of Technology, University of Delhi, New Delhi, India
- Smriti Srivastava, Division of Instrumentation and Control Engineering, Netaji Subhas Institute of Technology, University of Delhi, New Delhi, India
3. Huang Z, Siniscalchi SM, Lee CH. A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.09.018] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 11/28/2022]
4. Tsao Y, Lu X, Dixon P, Hu TY, Matsuda S, Hori C. Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation. Comput Speech Lang 2014. [DOI: 10.1016/j.csl.2013.11.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/26/2022]
5. Siniscalchi SM, Li J, Lee CH. Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems. IEEE Trans Audio Speech Lang Process 2013. [DOI: 10.1109/tasl.2013.2270370] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 11/06/2022]
7. Gaussian Selection Using Self-Organizing Map for Automatic Speech Recognition. Advances in Self-Organizing Maps 2011. [DOI: 10.1007/978-3-642-21566-7_22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/29/2022]
8. Yamagishi J, Nose T, Zen H, Ling ZH, Toda T, Tokuda K, King S, Renals S. Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis. IEEE Trans Audio Speech Lang Process 2009. [DOI: 10.1109/tasl.2009.2016394] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]
9. Yamagishi J, Kobayashi T, Nakano Y, Ogata K, Isogai J. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm. IEEE Trans Audio Speech Lang Process 2009. [DOI: 10.1109/tasl.2008.2006647] [Citation(s) in RCA: 187] [Impact Index Per Article: 11.7] [Indexed: 11/09/2022]
10. Janev M, Pekar D, Jakovljevic N, Delic V. Eigenvalues Driven Gaussian Selection in continuous speech recognition using HMMs with full covariance matrices. Appl Intell 2008. [DOI: 10.1007/s10489-008-0152-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
11. Tian Y, Zhou JL, Lin H, Jiang H. Tree-Based Covariance Modeling of Hidden Markov Models. IEEE Trans Audio Speech Lang Process 2006. [DOI: 10.1109/tsa.2005.863210] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
12. Tsao Y, Lee SM, Lee LS. Segmental eigenvoice with delicate eigenspace for improved speaker adaptation. IEEE Trans Speech Audio Process 2005. [DOI: 10.1109/tsa.2005.845819] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
13. Xiang B, Berger T. Efficient text-independent speaker verification with structural Gaussian mixture models and neural network. IEEE Trans Speech Audio Process 2003. [DOI: 10.1109/tsa.2003.815822] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.0] [Indexed: 11/07/2022]