2
|
Ramos-Garcia RI, Muth ER, Gowdy JN, Hoover AW. Improving the recognition of eating gestures using intergesture sequential dependencies. IEEE J Biomed Health Inform 2014; 19:825-31. [PMID: 24919205 DOI: 10.1109/jbhi.2014.2329137] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/08/2022]
Abstract
This paper considers the problem of recognizing eating gestures by tracking wrist motion. Eating gestures are activities commonly undertaken during the consumption of a meal, such as sipping a drink of liquid or using utensils to cut food. Each of these gestures causes a pattern of wrist motion that can be tracked to automatically identify the activity. Previous works have studied this problem at the level of a single gesture. In this paper, we demonstrate that individual gestures have sequential dependence. To study this, three types of classifiers were built: 1) a K-nearest neighbor (KNN) classifier that uses no sequential context, 2) a hidden Markov model (HMM) that captures the sequential context of subgesture motions, and 3) HMMs that model intergesture sequential dependencies. We built first-order to sixth-order HMMs to evaluate the usefulness of increasing amounts of sequential dependence to aid recognition. On a dataset of 25 meals, we found that the baseline accuracies for the KNN and the subgesture HMM classifiers were 75.8% and 84.3%, respectively. Using HMMs that model intergesture sequential dependencies, we were able to increase accuracy to as much as 96.5%. These results demonstrate that sequential dependencies exist between eating gestures and that they can be exploited to improve recognition accuracy.
|
4
|
Rabiner LR, Pan KC, Soong FK. On the Performance of Isolated Word Speech Recognizers Using Vector Quantization and Temporal Energy Contours. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1984.tb00035.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
|
5
|
Rabiner LR, Wilpon JG. Application of Clustering Techniques to Speaker-Trained Isolated Word Recognition. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1979.tb02964.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
|
6
|
Rabiner LR, Levinson SE, Sondhi MM. On the Application of Vector Quantization and Hidden Markov Models to Speaker-Independent, Isolated Word Recognition. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1983.tb03115.x] [Citation(s) in RCA: 176] [Impact Index Per Article: 16.0] [Indexed: 11/11/2022]
|
7
|
Wilpon JG. A Study on the Ability to Automatically Recognize Telephone-Quality Speech From Large Customer Populations. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1985.tb00441.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]
|
8
|
Rabiner LR, Wilpon JG, Juang BH. A Segmental k-Means Training Procedure for Connected Word Recognition. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1986.tb00368.x] [Citation(s) in RCA: 146] [Impact Index Per Article: 13.3] [Indexed: 11/06/2022]
|
9
|
Wilpon JG, Rabiner LR, Martin T. An Improved Word-Detection Algorithm for Telephone-Quality Speech Incorporating Both Syntactic and Semantic Constraints. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1984.tb00016.x] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 11/10/2022]
|
10
|
Rabiner LR, Juang BH, Levinson SE, Sondhi MM. Recognition of Isolated Digits Using Hidden Markov Models With Continuous Mixture Densities. ACTA ACUST UNITED AC 2013. [DOI: 10.1002/j.1538-7305.1985.tb00272.x] [Citation(s) in RCA: 135] [Impact Index Per Article: 12.3] [Indexed: 11/08/2022]
|
11
|
Dermatas E, Kokkinakis G. Algorithm for clustering continuous density HMM by recognition error. ACTA ACUST UNITED AC 1996. [DOI: 10.1109/89.496219] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
|
13
|
Gopalakrishnan M, Sridhar V, Krishnamurthy H. Some applications of clustering in the design of neural networks. Pattern Recognit Lett 1995. [DOI: 10.1016/0167-8655(94)00064-a] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
|
15
|
Sy BK, Horowitz DM. A statistical causal model for the assessment of dysarthric speech and the utility of computer-based speech recognition. IEEE Trans Biomed Eng 1993; 40:1282-98. [PMID: 8125504 DOI: 10.1109/10.250584] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023]
Abstract
The evaluation of the degree of speech impairment and the utility of computer recognition of impaired speech are separately and independently performed. Particular attention is paid to the question concerning whether or not there is a relationship between naive listeners' subjective judgements of impaired speech and the performance of a laboratory version of a speech recognition system. It is a difficult task to relate a speech impairment rating with speech recognition accuracy. Towards this end, a statistical causal model is proposed. This model is very appealing in its structure to support inference, and thus can be applied to perform various assessments such as the success of automatic recognition of dysarthric speech. The application of this model is illustrated with a case study of a dysarthric speaker compared against a normal speaker serving as a control.
Affiliation(s)
- B K Sy
- Department of Computer Science, Queens College, Flushing, NY 11367
|
17
|
Furui S. Unsupervised speaker adaptation based on hierarchical spectral clustering. ACTA ACUST UNITED AC 1989. [DOI: 10.1109/29.45538] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
|
18
|
Larar J, Schroeter J, Sondhi M. Vector quantization of the articulatory space. ACTA ACUST UNITED AC 1988. [DOI: 10.1109/29.9026] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
|
20
|
Rabiner L, Wilpon J. Some performance benchmarks for isolated word speech recognition systems. COMPUT SPEECH LANG 1987. [DOI: 10.1016/0885-2308(87)90016-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 10/26/2022]
|
21
|
Biing-Hwang Juang, Rabiner L, Wilpon J. On the use of bandpass liftering in speech recognition. ACTA ACUST UNITED AC 1987. [DOI: 10.1109/tassp.1987.1165237] [Citation(s) in RCA: 122] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
|
23
|
Choukri K, Chollet G. Adaptation of automatic speech recognizers to new speakers using canonical correlation analysis techniques. COMPUT SPEECH LANG 1986. [DOI: 10.1016/s0885-2308(86)80017-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 10/22/2022]
|
24
|
Bocchieri E, Doddington G. Frame-specific statistical features for speaker independent speech recognition. ACTA ACUST UNITED AC 1986. [DOI: 10.1109/tassp.1986.1164911] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
|
25
|
Biing-Hwang Juang, Rabiner L. Mixture autoregressive hidden Markov models for speech signals. ACTA ACUST UNITED AC 1985. [DOI: 10.1109/tassp.1985.1164727] [Citation(s) in RCA: 178] [Impact Index Per Article: 4.6] [Indexed: 11/08/2022]
|
26
|
Rabiner L, Levinson S. A speaker-independent, syntax-directed, connected word recognition system based on hidden Markov models and level building. ACTA ACUST UNITED AC 1985. [DOI: 10.1109/tassp.1985.1164586] [Citation(s) in RCA: 49] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
|
27
|
Wilpon J, Rabiner L. A modified K-means clustering algorithm for use in isolated word recognition. ACTA ACUST UNITED AC 1985. [DOI: 10.1109/tassp.1985.1164581] [Citation(s) in RCA: 117] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
|
28
|
Dautrich B, Rabiner L, Martin T. On the effects of varying filter bank parameters on isolated word recognition. ACTA ACUST UNITED AC 1983. [DOI: 10.1109/tassp.1983.1164172] [Citation(s) in RCA: 75] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]
|
29
|
Abstract
This paper provides a review of the state of the art in speech recognition by machine. Although many speech recognition systems have been demonstrated and several commercial products are currently being used in selected environments, computer speech recognition still has very limited capabilities when compared with human performance. Typical component subsystems at the acoustic, phonetic, syntactic, and semantic level are described. The major problems in developing recognition systems and their performance under various conditions are presented.
|
30
|
Paliwal KK, Rao PV. Application of k-Nearest-Neighbor Decision Rule in Vowel Recognition. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 1983; 5:229-231. [PMID: 21869107 DOI: 10.1109/tpami.1983.4767378] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 05/31/2023]
Abstract
The k-nearest-neighbor decision rule is known to provide a useful nonparametric procedure for pattern classification. This rule is applied here to a vowel recognition problem and the effect of the number (k) of nearest neighbors, the size of the training set and the type of the distance measure on vowel recognition performance is studied. It is shown that the vowel recognition performance remains approximately constant for all the values of k. The recognition performance initially improves with the size of the training set and then converges to an asymptotic value. Selection of a better distance measure leads to a significant improvement in vowel recognition performance.
Affiliation(s)
- K K Paliwal
- Speech and Digital Systems Group, Tata Institute of Fundamental Research, Bombay 400005, India
|
31
|
Brown M, Rabiner L. An adaptive, ordered, graph search technique for dynamic time warping for isolated word recognition. ACTA ACUST UNITED AC 1982. [DOI: 10.1109/tassp.1982.1163916] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
|
32
|
Flanagan JL. Talking with computers: synthesis and recognition of speech by machines. IEEE Trans Biomed Eng 1982; 29:223-32. [PMID: 7068162 DOI: 10.1109/tbme.1982.325030] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.2] [Indexed: 01/23/2023]
|
33
|
De Souza P, Thomson P. LPC distance measures and statistical tests with particular reference to the likelihood ratio. ACTA ACUST UNITED AC 1982. [DOI: 10.1109/tassp.1982.1163867] [Citation(s) in RCA: 37] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
|
34
|
Lamel L, Rabiner L, Rosenberg A, Wilpon J. An improved endpoint detector for isolated word recognition. ACTA ACUST UNITED AC 1981. [DOI: 10.1109/tassp.1981.1163642] [Citation(s) in RCA: 180] [Impact Index Per Article: 4.2] [Indexed: 11/05/2022]
|
35
|
Myers C, Rabiner L. Connected digit recognition using a level-building DTW algorithm. ACTA ACUST UNITED AC 1981. [DOI: 10.1109/tassp.1981.1163586] [Citation(s) in RCA: 65] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
|
36
|
Myers C, Rabiner L. A level building dynamic time warping algorithm for connected word recognition. ACTA ACUST UNITED AC 1981. [DOI: 10.1109/tassp.1981.1163527] [Citation(s) in RCA: 137] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]
|
37
|
Myers C, Rabiner L, Rosenberg A. Performance tradeoffs in dynamic time warping algorithms for isolated word recognition. ACTA ACUST UNITED AC 1980. [DOI: 10.1109/tassp.1980.1163491] [Citation(s) in RCA: 345] [Impact Index Per Article: 7.8] [Indexed: 11/08/2022]
|
39
|
Rabiner L, Wilpon J. Speaker-independent isolated word recognition for a moderate size (54 word) vocabulary. ACTA ACUST UNITED AC 1979. [DOI: 10.1109/tassp.1979.1163323] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
|