1. Zhou Z, Yang Y, Liu J, Zeng J, Wang X, Liu H. Electrotactile Perception Properties and Its Applications: A Review. IEEE Transactions on Haptics 2022;15:464-478. PMID: 35476571. DOI: 10.1109/toh.2022.3170723.
Abstract
With the growing demands of human-machine interaction, haptic feedback is becoming increasingly critical. However, the high cost, large size, and low efficiency of current haptic systems severely hinder further development. As a portable and efficient technology, cutaneous electrotactile stimulation shows promise in addressing these issues. This paper presents a review of, and insight into, cutaneous electrotactile perception and its applications. Research results on perceptual properties and evaluation methods are summarized and discussed to clarify the effects of electrotactile stimulation on humans. Electrotactile applications are presented by category to survey the methods and progress in fields such as prosthesis control, sensory substitution, sensory restoration, and sensorimotor restoration. The state of the art has demonstrated the superiority of electrotactile feedback in efficiency and flexibility. However, the many interacting factors and the limitations of current evaluation methods make precise electrotactile control challenging. Groundbreaking innovation in electrotactile theory is expected to overcome challenges such as precise perception control, increasing information capacity, reducing comprehension burden, and lowering implementation costs.

2. Scarborough R, Keating P, Mattys SL, Cho T, Alwan A. Optical phonetics and visual perception of lexical and phrasal stress in English. Language and Speech 2009;52:135-175. PMID: 19624028. DOI: 10.1177/0023830909103165.
Abstract
In a study of optical cues to the visual perception of stress, three American English talkers spoke words that differed in lexical stress and sentences that differed in phrasal stress, while video and movements of the face were recorded. The production of stressed and unstressed syllables from these utterances was analyzed along many measures of facial movement, which were generally larger and faster in the stressed condition. In a visual perception experiment, 16 perceivers identified the location of stress in forced-choice judgments of video clips of these utterances (without audio). Phrasal stress was better perceived than lexical stress. The relation of the visual intelligibility of the prosody of these utterances to the optical characteristics of their production was analyzed to determine which cues are associated with successful visual perception. While most optical measures were correlated with perception performance, chin measures, especially Chin Opening Displacement, contributed the most to correct perception independently of the other measures. Thus, our results indicate that the information for visual stress perception is mainly associated with mouth opening movements.

3. Yuan H, Reed CM, Durlach NI. Tactual display of consonant voicing as a supplement to lipreading. The Journal of the Acoustical Society of America 2005;118:1003-1015. PMID: 16158656. DOI: 10.1121/1.1945787.
Abstract
This research concerns the development and evaluation of a tactual display of consonant voicing to supplement the information available through lipreading for persons with profound hearing impairment. The voicing cue selected is based on the envelope onset asynchrony derived from two different filtered bands (a low-pass band and a high-pass band) of speech. The amplitude envelope of each of the two bands was used to modulate a different carrier frequency, which in turn was delivered to one of two fingers via a tactual stimulating device. Perceptual evaluations of speech reception through this tactual display included pairwise discrimination of consonants contrasting in voicing and identification of a set of 16 consonants under conditions of the tactual cue alone (T), lipreading alone (L), and the combined condition (L + T). The tactual display was highly effective for discriminating voicing at the segmental level and provided a substantial benefit to lipreading on the consonant-identification task. No such benefit of the tactual cue was observed, however, for lipreading of words in sentences, perhaps due to difficulties in integrating the tactual and visual cues and to insufficient training on the more difficult task of connected-speech reception.
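The two-band envelope onset asynchrony cue described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the leaky-integrator envelope, the fixed onset threshold, and the sign convention are all assumptions.

```python
def envelope(signal, decay=0.99):
    """Rectify-and-smooth amplitude envelope (fast attack, slow decay)."""
    env, out = 0.0, []
    for s in signal:
        env = max(abs(s), decay * env)
        out.append(env)
    return out

def onset_index(env, threshold=0.1):
    """Index of the first sample whose envelope reaches the threshold."""
    for i, e in enumerate(env):
        if e >= threshold:
            return i
    return None

def envelope_onset_asynchrony(low_band, high_band, threshold=0.1):
    """Onset asynchrony in samples: high-band onset minus low-band onset.
    A leading low band (positive value) suggests early voicing energy;
    a leading high band (negative value) suggests a voiceless burst."""
    t_lo = onset_index(envelope(low_band), threshold)
    t_hi = onset_index(envelope(high_band), threshold)
    if t_lo is None or t_hi is None:
        return None
    return t_hi - t_lo
```

In the study, each band's envelope then modulated a separate carrier delivered to a different finger; only the asynchrony measurement itself is sketched here.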
Affiliation(s)
- Hanfeng Yuan
- Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.

4. Lansing CR, McConkie GW. Attention to facial regions in segmental and prosodic visual speech perception tasks. Journal of Speech, Language, and Hearing Research 1999;42:526-539. PMID: 10391620. DOI: 10.1044/jslhr.4203.526.
Abstract
Two experiments were conducted to test the hypothesis that visual information related to segmental versus prosodic aspects of speech is distributed differently on the face of the talker. In the first experiment, eye gaze was monitored for 12 observers with normal hearing. Participants made decisions about segmental and prosodic categories for utterances presented without sound. Observers spent more time looking at, and directed more gazes toward, the upper part of the talker's face when making decisions about intonation patterns than about the words being spoken. The second experiment tested the Gaze Direction Assumption underlying Experiment 1: that people direct their gaze to the stimulus region containing the information required for their task. In this experiment, 18 observers with normal hearing made decisions about segmental and prosodic categories under conditions in which face motion was restricted to selected areas of the face. The results indicate that information in the upper part of the talker's face is more critical for intonation pattern decisions than for decisions about word segments or primary sentence stress, thus supporting the Gaze Direction Assumption. Visual speech perception proficiency requires learning where to direct visual attention for cues related to different aspects of speech.
Affiliation(s)
- C R Lansing
- University of Illinois at Urbana-Champaign, Champaign 61821, USA.

5. Auer ET, Bernstein LE, Coulter DC. Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: an initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals. The Journal of the Acoustical Society of America 1998;104:2477-2489. PMID: 10491709. DOI: 10.1121/1.423909.
Abstract
Four experiments were performed to evaluate a new wearable vibrotactile speech perception aid that extracts fundamental frequency (F0) and displays it as either a single-channel temporal or an eight-channel spatio-temporal stimulus. Specifically, we investigated the perception of intonation (i.e., question versus statement) and emphatic stress (i.e., stress on the first, second, or third word) under Visual-Alone (VA), Visual-Tactile (VT), and Tactile-Alone (TA) conditions and compared performance with the temporal and spatio-temporal vibrotactile displays. Subjects were adults with normal hearing in experiments I-III and adults with severe to profound hearing impairments in experiment IV. Both versions of the aid successfully conveyed intonation. Vibrotactile stress information was also successfully conveyed, but it did not enhance performance in the VT conditions beyond that in the VA conditions. In experiment III, which involved only intonation identification, a reliable advantage for the spatio-temporal display was obtained. Differences between subject groups were obtained for intonation identification, with more accurate VT performance by those with normal hearing. Possible effects of long-term hearing status are discussed.
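An eight-channel spatio-temporal display of this kind maps each voiced frame's F0 estimate to one of eight vibrator positions. The sketch below is a hypothetical mapping, assuming a log-spaced 80-320 Hz talker range; the paper's actual channel assignment and range are not reproduced here.

```python
import math

def f0_to_channel(f0_hz, f0_min=80.0, f0_max=320.0, n_channels=8):
    """Map an F0 estimate (Hz) to a vibrator channel index 0..n_channels-1,
    log-spaced over an assumed talker range; unvoiced frames (f0 <= 0)
    activate no channel."""
    if f0_hz <= 0:
        return None
    f0 = min(max(f0_hz, f0_min), f0_max)  # clamp to the display range
    frac = math.log(f0 / f0_min) / math.log(f0_max / f0_min)
    return min(int(frac * n_channels), n_channels - 1)
```

A rising intonation contour then appears as a spatial sweep across channels, whereas the single-channel temporal version of such an aid must code the same contour in one vibrator's pulse timing alone.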
Affiliation(s)
- E T Auer
- Spoken Language Processes Laboratory, House Ear Institute, Los Angeles, California 90057, USA.

6. Grant KW, Walden BE. Spectral distribution of prosodic information. Journal of Speech and Hearing Research 1996;39:228-238. PMID: 8729913. DOI: 10.1044/jshr.3902.228.
Abstract
Prosodic speech cues for rhythm, stress, and intonation are related primarily to variations in intensity, duration, and fundamental frequency. Because these cues make use of temporal properties of the speech waveform, they are likely to be represented broadly across the speech spectrum. To determine the relative importance of different frequency regions for the recognition of prosodic cues, identification of four prosodic features (syllable number, syllabic stress, sentence intonation, and phrase boundary location) was evaluated under six filter conditions spanning the range 200-6100 Hz. Each filter condition had an equal articulation index (AI) weight (AI = 0.01; p(C) for isolated words ≈ 0.40). Results obtained with normally hearing subjects showed an interaction between filter condition and the identification of specific prosodic features. For example, information from high-frequency regions of speech was particularly useful for identifying syllable number and stress, whereas information from low-frequency regions helped in identifying intonation patterns. In spite of these spectral differences, listeners overall performed remarkably well in identifying prosodic patterns, although individual differences were apparent; some subjects achieved equivalent performance across all six filter conditions. These results are discussed in relation to auditory and auditory-visual speech recognition.
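The idea of filter bands carrying equal articulation-index weight can be illustrated by partitioning a frequency axis so that each band accumulates the same share of an importance function. The uniform importance density in the example is a placeholder assumption, not the actual AI importance function used in the study.

```python
def equal_importance_edges(freqs, importance, n_bands):
    """Return n_bands + 1 band edges over freqs such that each band
    accumulates (approximately) total importance / n_bands."""
    total = sum(importance)
    target = total / n_bands
    edges, acc = [freqs[0]], 0.0
    for f, w in zip(freqs, importance):
        acc += w
        if acc >= target and len(edges) < n_bands:
            edges.append(f)  # close the current band at this frequency
            acc = 0.0
    edges.append(freqs[-1])
    return edges
```

With a uniform importance density over 200-6100 Hz this yields six equal-width bands; a realistic AI density, peaked in the mid frequencies, would instead produce narrow mid-frequency bands and wide edge bands.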
Affiliation(s)
- K W Grant
- Walter Reed Army Medical Center, Washington, DC, USA.

7. Rönnberg J, Samuelsson S, Lyxell B, Arlinger S. Lipreading with auditory low-frequency information: contextual constraints. Scandinavian Audiology 1996;25:127-132. PMID: 8738638. DOI: 10.3109/01050399609047994.
Abstract
This experimental study investigated potential relations among three variables in 60 normal-hearing subjects: (1) an audiovisual speech signal (i.e., low-frequency supplemented lipreading as opposed to pure lipreading), (2) typical, as opposed to atypical, sentences in a particular script (e.g., in a restaurant), and (3) the presence or absence of additional context within the script. All three variables showed significant main effects, but no interactions were observed. The general facilitatory effect of the audiovisual signal is in line with previous research, but this effect was relatively weak compared with the main effect of typicality, which relies on cognitive activation of scripts. In a separate analysis, typicality was also the only variable that interacted significantly with speechreading skill: typical sentences were perceived more easily by skilled than by less skilled speechreaders. The clinical implications of cognitive factors for hearing-aid fitting procedures, the construction of speech materials, and the selection of individuals for rehabilitation are discussed.
Affiliation(s)
- J Rönnberg
- Department of Education and Psychology, Linköping University, Sweden.

8. Waldstein RS, Boothroyd A. Speechreading supplemented by single-channel and multichannel tactile displays of voice fundamental frequency. Journal of Speech and Hearing Research 1995;38:690-705. PMID: 7674660. DOI: 10.1044/jshr.3803.690.
Abstract
The benefits of two tactile codes of voice fundamental frequency (F0) were evaluated as supplements to the speechreading of sentences in two short-term training studies, each using 12 adults with normal hearing. In Experiment 1, a multichannel spatiotemporal display of F0, known as Portapitch, was used to stimulate the index finger. In an attempt to improve on past performance with this display, the coding scheme was modified to better cover the F0 range of the talker in the training materials. For Experiment 2, to engage kinesthetic/proprioceptive pathways, a novel single-channel positional display was built, in which F0 was coded as the vertical displacement of a small finger-rest. Input to both displays consisted of synthesized replicas of the F0 contours of the sentences, prepared and perfected off-line. Training with the two tactile F0 displays included auditory presentation of the synthesized F0 contours in conjunction with the tactile patterns on alternate trials. Speechreading enhancement by the two tactile F0 displays was compared to the enhancement provided when auditory F0 information was available in conjunction with the tactile patterns, by auditory presentation of a sinusoidal indication of the presence or absence of voicing, and by a single-channel tactile display of the speech waveform presented to the index finger. Despite the modified coding strategy, the multichannel Portapitch provided a mean tactile speechreading enhancement of 7 percentage points, which was no greater than that found in previous studies. The novel positional F0 display provided only a 4 percentage point enhancement. Neither F0 display was better than the simple single-channel tactile transform of the full speech waveform, which gave a 7 percentage point enhancement effect. Auditory speechreading enhancement effects were 17 percentage points with the voicing indicator and approximately 35 percentage points when the auditory F0 contour was provided in conjunction with the tactile displays. The findings are consistent with the hypothesis that subjects were not taking full advantage of the F0 variation information available in the outputs of the two experimental tactile displays.
Affiliation(s)
- R S Waldstein
- Center for Research in Speech and Hearing Sciences, Graduate School, City University of New York, USA.

10. Mathijssen RW, Leliveld WH. Proposed use of a digital signal processor in an experimental tactile hearing aid for the profoundly deaf: preliminary communication. Journal of Medical Engineering & Technology 1989;13:84-86. PMID: 2733017. DOI: 10.3109/03091908909030201.
Abstract
An experimental system for a tactile hearing aid using a digital signal processor (DSP) is being developed. The system can be used to test and evaluate not only familiar techniques for a tactile hearing aid, such as energy-level display and filterbank analysis, but also novel techniques. It is being developed especially to try out new recognition strategies, because the currently available strategies are not satisfactory. The intended end result of the experiments is a portable tactile hearing aid that can recognize certain environmental sounds (alarm sounds) and certain features of the speech signal (such as pitch, the voiced/voiceless distinction, or even complete phonemes), thereby providing good support for lipreading.
Affiliation(s)
- R W Mathijssen
- Eindhoven University of Technology, Division of Medical Electrical Engineering, The Netherlands.