1. Pörschmann C, Arend JM. Phoneme dependence of horizontal asymmetries in voice directivity. JASA Express Letters 2024; 4:025205. PMID: 38350076. DOI: 10.1121/10.0024878.
Abstract
Human voice directivity shows horizontal asymmetries caused by the shape of the lips or the position of the teeth and tongue during vocalization. This study presents and analyzes the asymmetries of voice directivity datasets for 23 different phonemes. The asymmetries were determined from datasets obtained in previous measurements with 13 subjects in a surrounding spherical microphone array. The results show that asymmetries are inherent to human voice production and that they differ between phoneme groups, with the strongest effects for [s], [l], and the nasals [m], [n], and [ŋ]. The smallest asymmetries were found for the plosives.
Affiliation(s)
- Christoph Pörschmann: Institute of Computer and Communication Technology, TH Köln-University of Applied Sciences, Betzdorfer Str. 2, D-50679 Cologne, Germany
- Johannes M Arend: Audio Communication Group, Technische Universität Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
2. Ferguson SH, Morgan SD, Hunter EJ. Within-talker and within-session stability of acoustic characteristics of conversational and clear speaking styles. The Journal of the Acoustical Society of America 2024; 155:44-55. PMID: 38174965. PMCID: PMC10990565. DOI: 10.1121/10.0024241.
Abstract
In speech production research, talkers often perform a speech task several times per recording session with different speaking styles or in different environments. For example, Lombard speech studies typically have talkers speak in several different noise conditions. However, it is unknown to what degree simple repetition of a speech task affects speech acoustic characteristics, or whether repetition effects might offset or exaggerate effects of speaking style or environment. The present study assessed speech acoustic changes over four within-session repetitions of a speech production task set performed with two speaking styles recorded in separate sessions: conversational and clear speech. In each style, ten talkers performed a set of three speech tasks four times. Speaking rate, median fundamental frequency, fundamental frequency range, and mid-frequency spectral energy for read sentences were measured and compared across test blocks both within-session and between the two styles. Results indicate that statistically significant changes can occur from one repetition of a speech task to the next, even with a brief practice set and especially in the conversational style. While these changes were smaller than speaking style differences, these findings support using a complete speech set for training while talkers acclimate to the task and to the laboratory environment.
Affiliation(s)
- Sarah Hargus Ferguson: Department of Communication Sciences and Disorders, University of Utah, Salt Lake City, Utah 84112, USA
- Shae D Morgan: Department of Communication Sciences and Disorders, University of Utah, Salt Lake City, Utah 84112, USA
- Eric J Hunter: Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan 48824, USA
3. Kharlamov V, Brenner D, Tucker BV. Examining the effect of high-frequency information on the classification of conversationally produced English fricatives. The Journal of the Acoustical Society of America 2023; 154:1896-1902. PMID: 37756577. DOI: 10.1121/10.0021067.
Abstract
This study examines the role of frequencies above 8 kHz in the classification of conversational speech fricatives [f, v, θ, ð, s, z, ʃ, ʒ, h] in random forest modeling. Prior research has mostly focused on spectral measures for fricative categorization using frequency information below 8 kHz. The contribution of higher frequencies has received only limited attention, especially for non-laboratory speech. In the present study, we use a corpus of sociolinguistic interview recordings from Western Canadian English sampled at 44.1 and 16 kHz. For both sampling rates, we analyze spectral measures obtained using Fourier analysis and the multitaper method, and we also compare models without and with amplitudinal measures. Results show that while frequency information above 8 kHz does not improve classification accuracy in random forest analyses, inclusion of such frequencies can affect the relative importance of specific measures. This includes a decreased contribution of center of gravity and an increased contribution of spectral standard deviation for the higher sampling rate. We also find no major differences in classification accuracy between Fourier and multitaper measures. The inclusion of power measures improves model accuracy but does not change the overall importance of spectral measures.
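For readers who want a concrete picture of the kind of modeling described in this abstract (spectral measures fed into a random forest, with feature importance compared across sampling rates), a minimal sketch is given below. It is not the authors' code: the `tokens` container, the exact feature set, and all parameter choices are assumptions for illustration only.

```python
# Illustrative sketch (not the authors' pipeline): compute simple spectral moments
# for fricative tokens and classify them with a random forest.
import numpy as np
from scipy.signal import periodogram
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def spectral_moments(x, fs):
    """Center of gravity, spectral SD, and overall power from a Fourier spectrum."""
    f, pxx = periodogram(x, fs=fs, window="hamming")
    p = pxx / (np.sum(pxx) + 1e-12)                 # normalize to a probability mass
    cog = np.sum(f * p)                             # spectral center of gravity (Hz)
    sd = np.sqrt(np.sum(((f - cog) ** 2) * p))      # spectral standard deviation (Hz)
    power_db = 10 * np.log10(np.sum(pxx) + 1e-12)   # overall power (dB, arbitrary reference)
    return [cog, sd, power_db]

# `tokens` is a hypothetical list of (waveform, fs, label) tuples, e.g. label "s" or "f".
def classify_fricatives(tokens):
    X = np.array([spectral_moments(x, fs) for x, fs, _ in tokens])
    y = np.array([label for _, _, label in tokens])
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    scores = cross_val_score(clf, X, y, cv=5)       # cross-validated accuracy
    clf.fit(X, y)
    return scores.mean(), clf.feature_importances_  # accuracy and relative measure importance
```

Running the same sketch on tokens resampled to 16 kHz versus 44.1 kHz would mimic the paper's comparison of frequency content below and above 8 kHz, under the stated assumptions.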
Affiliation(s)
- Viktor Kharlamov: Department of Languages, Linguistics, and Comparative Literature, Florida Atlantic University, Boca Raton, Florida 33431, USA
- Benjamin V Tucker: Department of Communication Sciences and Disorders, Northern Arizona University, Flagstaff, Arizona 86011, USA
4. Monson BB, Ananthanarayana RM, Trine A, Delaram V, Stecker GC, Buss E. Differential benefits of unmasking extended high-frequency content of target or background speech. The Journal of the Acoustical Society of America 2023; 154:454-462. PMID: 37489913. PMCID: PMC10371353. DOI: 10.1121/10.0020175.
Abstract
Current evidence supports the contribution of extended high frequencies (EHFs; >8 kHz) to speech recognition, especially for speech-in-speech scenarios. However, it is unclear whether the benefit of EHFs is due to phonetic information in the EHF band, EHF cues to access phonetic information at lower frequencies, talker segregation cues, or some other mechanism. This study investigated the mechanisms of benefit derived from a mismatch in EHF content between target and masker talkers for speech-in-speech recognition. EHF mismatches were generated using full band (FB) speech and speech low-pass filtered at 8 kHz. Four filtering combinations with independently filtered target and masker speech were used to create two EHF-matched and two EHF-mismatched conditions for one- and two-talker maskers. Performance was best with the FB target and the low-pass masker in both one- and two-talker masker conditions, but the effect was larger for the two-talker masker. No benefit of an EHF mismatch was observed for the low-pass filtered target. A word-by-word analysis indicated higher recognition odds with increasing EHF energy level in the target word. These findings suggest that the audibility of target EHFs provides target phonetic information or target segregation and selective attention cues, but that the audibility of masker EHFs does not confer any segregation benefit.
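The EHF-matched and EHF-mismatched conditions described here come down to low-pass filtering target and masker speech independently at 8 kHz before mixing. A minimal sketch of that manipulation follows; the filter order, zero-phase filtering choice, and function names are assumptions, not the stimulus-generation code used in the study.

```python
# Illustrative sketch: create full-band (FB) and 8-kHz low-pass (LP) versions of
# target and masker speech to form EHF-matched and EHF-mismatched mixtures.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def lowpass_8k(x, fs, order=8):
    """Zero-phase low-pass at 8 kHz to remove extended high-frequency content."""
    sos = butter(order, 8000, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def make_conditions(target, masker, fs):
    """Four target/masker filtering combinations (levels assumed already equalized)."""
    t_fb, m_fb = target, masker
    t_lp, m_lp = lowpass_8k(target, fs), lowpass_8k(masker, fs)
    return {
        "FB-target / FB-masker": t_fb + m_fb,   # EHF-matched
        "LP-target / LP-masker": t_lp + m_lp,   # EHF-matched
        "FB-target / LP-masker": t_fb + m_lp,   # EHF-mismatched (target EHFs audible)
        "LP-target / FB-masker": t_lp + m_fb,   # EHF-mismatched (masker EHFs audible)
    }
```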
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- Rohit M Ananthanarayana: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- Allison Trine: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- Vahid Delaram: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- G Christopher Stecker: Spatial Hearing Laboratory, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- Emily Buss: Department of Otolaryngology/HNS, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
5. Baydur C, Pu B, Xu X. How to hide your voice: noise-cancelling bird photography blind. Environmental Science and Pollution Research International 2023; 30:68227-68240. PMID: 37119486. DOI: 10.1007/s11356-023-27119-6.
Abstract
Getting close to birds is a great challenge in wildlife photography. Bird photography blinds may be the most effective and least intrusive solution if properly designed. However, the acoustic design of such blinds has been overlooked so far. Here, we present noise-cancelling blinds that allow birds to be photographed at close range. First, we conducted a questionnaire at an eco-tourism centre in Yunnan, China, to determine birders' expectations of the indoor sound environment. We then identified variables to examine the impact of architectural and acoustic decisions on noise propagation. Finally, we examined the acoustic performance of the blinds with respect to the birds' hearing threshold. Numerical simulations were performed in the acoustics module of COMSOL Multiphysics. Our study demonstrates that photography blinds require a strong and thorough acoustic design for both human and bird well-being.
Affiliation(s)
- Caner Baydur: Landscape Architecture Department, College of Architecture and Urban Planning, Tongji University, 200092 Shanghai, China
- Baojing Pu: Landscape Architecture Department, College of Architecture and Urban Planning, Tongji University, 200092 Shanghai, China
- Xiaoqing Xu: Landscape Architecture Department, College of Architecture and Urban Planning, Tongji University, 200092 Shanghai, China
6. Monson BB, Trine A. Extending the High-Frequency Bandwidth and Predicting Speech-in-Noise Recognition: Building on the Work of Pat Stelmachowicz. Seminars in Hearing 2023; 44:S64-S74. PMID: 36970650. PMCID: PMC10033195. DOI: 10.1055/s-0043-1764133.
Abstract
Recent work has demonstrated that high-frequency (>6 kHz) and extended high-frequency (EHF; >8 kHz) hearing is valuable for speech-in-noise recognition. Several studies also indicate that EHF pure-tone thresholds predict speech-in-noise performance. These findings contradict the broadly accepted "speech bandwidth" that has historically been limited to below 8 kHz. This growing body of work is a tribute to the work of Pat Stelmachowicz, whose research was instrumental in revealing the limitations of the prior speech bandwidth work, particularly for female talkers and child listeners. Here, we provide a historical review that demonstrates how the work of Stelmachowicz and her colleagues paved the way for subsequent research to measure effects of extended bandwidths and EHF hearing. We also present a reanalysis of previous data collected in our lab, the results of which suggest that 16-kHz pure-tone thresholds are consistent predictors of speech-in-noise performance, regardless of whether EHF cues are present in the speech signal. Based on the work of Stelmachowicz, her colleagues, and those who have come afterward, we argue that it is time to retire the notion of a limited speech bandwidth for speech perception for both children and adults.
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois; Department of Biomedical and Translational Sciences, Carle Illinois College of Medicine, Urbana, Illinois; Neuroscience Program, University of Illinois Urbana-Champaign, Champaign, Illinois
- Allison Trine: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois
7. Pörschmann C, Arend JM. Investigating phoneme-dependencies of spherical voice directivity patterns II: Various groups of phonemes. The Journal of the Acoustical Society of America 2023; 153:179. PMID: 36732228. DOI: 10.1121/10.0016821.
Abstract
The substantial variation between articulated phonemes is a fundamental feature of human voice production. However, while the spectral and temporal aspects of the phonemes have been extensively studied, few have investigated the spatial aspects and analyzed phoneme-dependent differences in voice directivity. This paper extends our previous research focusing on the directivity patterns of selected vowels and fricatives [Pörschmann and Arend, J. Acoust. Soc. Am. 149(6), 4553-4564 (2021)] and examines different groups of phonemes, such as plosives, nasals, voiced alveolars, and additional fricatives. For this purpose, full-spherical voice directivity measurements were performed for 13 persons while they articulated the respective phonemes. The sound radiation was recorded simultaneously using a surrounding spherical microphone array with 32 microphones and then spatially upsampled to a dense sampling grid. Based on these upsampled datasets, the spherical voice directivity was studied, and phoneme-dependent variations were analyzed. The results show significant differences between the groups of phonemes. However, within three groups (plosives, nasals, and voiced alveolars), the differences are small, and the variations in the directivity index were statistically insignificant.
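The directivity index (DI) analysis referred to here compares on-axis radiation to the power average over the whole sphere. A worked sketch under simple assumptions is shown below; it is not the authors' processing chain, and in a real 32-channel array measurement the quadrature weights would need to match the specific sampling grid.

```python
# Illustrative sketch: directivity index from per-direction band levels on a sphere.
# p_sq[i] is the mean-square sound pressure measured in direction i for one frequency band,
# w[i] is the quadrature weight (solid angle) of that direction, and `front` indexes
# the on-axis (frontal) direction.
import numpy as np

def directivity_index(p_sq, w, front):
    p_sq = np.asarray(p_sq, dtype=float)
    w = np.asarray(w, dtype=float)
    p_avg = np.sum(w * p_sq) / np.sum(w)          # spherically averaged radiated power
    return 10.0 * np.log10(p_sq[front] / p_avg)   # DI in dB
```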
Affiliation(s)
- Christoph Pörschmann: Institute of Communications Engineering, TH Köln - University of Applied Sciences, Betzdorfer Str. 2, D-50679 Cologne, Germany
- Johannes M Arend: Audio Communication Group, Technical University of Berlin, Einsteinufer 17c, D-10587 Berlin, Germany
8. Monson BB, Buss E. On the use of the TIMIT, QuickSIN, NU-6, and other widely used bandlimited speech materials for speech perception experiments. The Journal of the Acoustical Society of America 2022; 152:1639. PMID: 36182310. PMCID: PMC9473723. DOI: 10.1121/10.0013993.
Abstract
The use of spectrally degraded speech signals deprives listeners of acoustic information that is useful for speech perception. Several popular speech corpora, recorded decades ago, have spectral degradations, including limited extended high-frequency (EHF) (>8 kHz) content. Although frequency content above 8 kHz is often assumed to play little or no role in speech perception, recent research suggests that EHF content in speech can have a significant beneficial impact on speech perception under a wide range of natural listening conditions. This paper provides an analysis of the spectral content of popular speech corpora used for speech perception research to highlight the potential shortcomings of using bandlimited speech materials. Two corpora analyzed here, the TIMIT and NU-6, have substantial low-frequency spectral degradation (<500 Hz) in addition to EHF degradation. We provide an overview of the phenomena potentially missed by using bandlimited speech signals, and the factors to consider when selecting stimuli that are sensitive to these effects.
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA
- Emily Buss: Department of Otolaryngology/HNS, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
9. Pörschmann C, Arend JM. Effects of hand postures on voice directivity. JASA Express Letters 2022; 2:035203. PMID: 36154631. DOI: 10.1121/10.0009748.
Abstract
While speaking, hand postures, such as holding a hand in front of the mouth or cupping the hands around the mouth, influence human voice directivity. This study presents and analyzes spherical voice directivity datasets of an articulated [a] with and without hand postures. The datasets were determined from measurements with 13 subjects in a surrounding spherical microphone array with 32 microphones and then upsampled to a higher spatial resolution. The results show that hand postures strongly impact voice directivity and affect the directivity index by up to 6 dB, which is more than the variation caused by phoneme-dependent differences.
Affiliation(s)
- Christoph Pörschmann: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Strasse 2, D-50679 Cologne, Germany
- Johannes M Arend: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Strasse 2, D-50679 Cologne, Germany
10. Lough M, Plack CJ. Extended high-frequency audiometry in research and clinical practice. The Journal of the Acoustical Society of America 2022; 151:1944. PMID: 35364938. DOI: 10.1121/10.0009766.
Abstract
Audiometric testing in research and in clinical settings rarely considers frequencies above 8 kHz. However, the sensitivity of young healthy ears extends to 20 kHz, and there is increasing evidence that testing in the extended high-frequency (EHF) region, above 8 kHz, might provide valuable additional information. Basal (EHF) cochlear regions are especially sensitive to the effects of aging, disease, ototoxic drugs, and possibly noise exposure. Hence, EHF loss may be an early warning of damage, useful for diagnosis and for monitoring hearing health. In certain environments, speech perception may rely on EHF information, and there is evidence for an association between EHF loss and speech perception difficulties, although this may not be causal: EHF loss may instead be a marker for sub-clinical damage at lower frequencies. If there is a causal relation, then amplification in the EHF range may be beneficial if the technical difficulties can be overcome. EHF audiometry in the clinic presents no particular difficulty, the biggest obstacle being a lack of specialist equipment. Currently, EHF audiometry has limited but increasing clinical application. With the development of international guidelines and standards, it is likely that EHF testing will become widespread in the future.
Affiliation(s)
- Melanie Lough: Manchester Centre for Audiology and Deafness, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
- Christopher J Plack: Manchester Centre for Audiology and Deafness, The University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
11. Hunter EJ, Berardi ML, Whitling S. A Semiautomated Protocol Towards Quantifying Vocal Effort in Relation to Vocal Performance During a Vocal Loading Task. Journal of Voice 2022: S0892-1997(22)00004-2. PMID: 35168867. PMCID: PMC9372227. DOI: 10.1016/j.jvoice.2022.01.003.
Abstract
To increase the reliability and comparability of vocal loading studies, this paper proposes a standardized approach with experiments that are (1) grounded in consistent definitions of terms related to vocal fatigue (vocal effort, vocal demand, and vocal demand response) and (2) designed to reduce uncertainty and increase repeatability. In this approach, a semi-automated vocal loading task, which also increases efficiency in collecting and preparing vocal samples for analysis, was used to answer the following research question: To what extent are vocal effort and vocal demand response sensitive to changes in vocal demand (i.e., noise only versus noise plus duration)? Results indicate that the proposed protocol consistently induced changes in both vocal effort and vocal demand response, indicating vocal fatigue. The efficacy of future vocal loading studies would be improved by adopting a more consistent methodology for quantifying vocal fatigue, thus increasing interstudy comparability of results and conclusions.
Affiliation(s)
- Eric J Hunter: Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
- Susanna Whitling: Department of Logopedics, Phoniatrics and Audiology, Lund University, Lund, Sweden
12. Effect of Masker Head Orientation, Listener Age, and Extended High-Frequency Sensitivity on Speech Recognition in Spatially Separated Speech. Ear and Hearing 2022; 43:90-100. PMID: 34260434. PMCID: PMC8712343. DOI: 10.1097/aud.0000000000001081.
Abstract
OBJECTIVES: Masked speech recognition is typically assessed as though the target and background talkers are all directly facing the listener. However, background speech in natural environments is often produced by talkers facing other directions, and talker head orientation affects the spectral content of speech, particularly at the extended high frequencies (EHFs; >8 kHz). This study investigated the effect of masker head orientation and listeners' EHF sensitivity on speech-in-speech recognition and spatial release from masking in children and adults.
DESIGN: Participants were 5- to 7-year-olds (n = 15) and adults (n = 34), all with normal hearing up to 8 kHz and a range of EHF hearing thresholds. Speech reception thresholds (SRTs) were measured for target sentences recorded from a microphone directly in front of the talker's mouth and presented from a loudspeaker directly in front of the listener, simulating a target directly in front of and facing the listener. The maskers were two streams of concatenated words recorded from a microphone located at either 0° or 60° azimuth, simulating masker talkers facing the listener or facing away from the listener, respectively. Maskers were presented in one of three spatial conditions: co-located with the target, symmetrically separated on either side of the target (+54° and -54° on the horizontal plane), or asymmetrically separated to the right of the target (both +54° on the horizontal plane).
RESULTS: Performance was poorer for the facing than for the nonfacing masker head orientation. This benefit of the nonfacing masker head orientation, or head orientation release from masking (HORM), was largest under the co-located condition, but it was also observed for the symmetric and asymmetric masker spatial separation conditions. SRTs were positively correlated with the mean 16-kHz threshold across ears in adults for the nonfacing conditions but not for the facing masker conditions. In adults with normal EHF thresholds, the HORM was comparable in magnitude to the benefit of a symmetric spatial separation of the target and maskers. Although children benefited from the nonfacing masker head orientation, their HORM was reduced compared to adults with normal EHF thresholds. Spatial release from masking was comparable across age groups for symmetric masker placement, but it was larger in adults than children for the asymmetric masker.
CONCLUSIONS: Masker head orientation affects speech-in-speech recognition in children and adults, particularly those with normal EHF thresholds. This is important because masker talkers do not all face the listener under most natural listening conditions, and assuming a midline orientation would tend to overestimate the effect of spatial separation. The benefits associated with EHF audibility for speech-in-speech recognition may warrant clinical evaluation of thresholds above 8 kHz.
13. Pörschmann C, Arend JM. Investigating phoneme-dependencies of spherical voice directivity patterns. The Journal of the Acoustical Society of America 2021; 149:4553. PMID: 34241454. DOI: 10.1121/10.0005401.
Abstract
Dynamic directivity is a specific characteristic of the human voice, showing time-dependent variations while speaking or singing. To study and model the articulation dependencies of the human voice and provide datasets that can be applied in virtual acoustic environments, full-spherical voice directivity measurements were carried out for 13 persons while they articulated eight phonemes. Since it is nearly impossible for subjects to repeat exactly the same articulation numerous times, the sound radiation was captured simultaneously using a surrounding spherical microphone array with 32 microphones and subsequently spatially upsampled to a dense sampling grid. Based on these dense directivity patterns, the spherical voice directivity was studied for different phonemes, and phoneme-dependent variations were analyzed. The differences between the phonemes can, to some extent, be explained by articulation-dependent properties, e.g., the mouth opening size. The directivity index, averaged across all subjects, varied by a maximum of 3 dB between any of the vowels or fricatives, and statistical analysis showed that these phoneme-dependent differences are significant.
Affiliation(s)
- Christoph Pörschmann: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Str. 2, 50679 Cologne, Germany
- Johannes M Arend: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Str. 2, 50679 Cologne, Germany
14. Extended high-frequency hearing and head orientation cues benefit children during speech-in-speech recognition. Hearing Research 2021; 406:108230. PMID: 33951577. DOI: 10.1016/j.heares.2021.108230.
Abstract
While the audible frequency range for humans spans approximately 20 Hz to 20 kHz, children display enhanced sensitivity relative to adults when detecting extended high frequencies (frequencies above 8 kHz; EHFs), as indicated by better pure tone thresholds. The impact that this increased hearing sensitivity to EHFs may have on children's speech recognition has not been established. One context in which EHF hearing may be particularly important for children is when recognizing speech in the presence of competing talkers. In the present study, we examined the extent to which school-age children (ages 5-17 years) with normal hearing were able to benefit from EHF cues when recognizing sentences in a two-talker speech masker. Two filtering conditions were tested: all stimuli were either full band or were low-pass filtered at 8 kHz to remove EHFs. Given that EHF energy emission in speech is highly dependent on head orientation of the talker (i.e., radiation becomes more directional with increasing frequency), two masker head angle conditions were tested: both co-located maskers were facing 45°, or both were facing 60° relative to the listener. The results demonstrated that regardless of age, children performed better when EHFs were present. In addition, a small change in masker head orientation also impacted performance, with better recognition at 60° compared to 45°. These findings suggest that EHF energy in the speech signal above 8 kHz is beneficial for children in complex listening situations. The magnitude of benefit from EHF cues and talker head orientation cues did not differ between children and adults. Therefore, while EHFs were beneficial for children as young as 5 years of age, children's generally better EHF hearing relative to adults did not provide any additional benefit.
15. Steffens H, van de Par S, Ewert SD. The role of early and late reflections on perception of source orientation. The Journal of the Acoustical Society of America 2021; 149:2255. PMID: 33940902. DOI: 10.1121/10.0003823.
Abstract
Sound radiation of most natural sources, like human speakers or musical instruments, typically exhibits a spatial directivity pattern. This directivity contributes to the perception of sound sources in rooms, affecting the spatial energy distribution of early reflections and late diffuse reverberation. Thus, for convincing sound field reproduction and acoustics simulation, source directivity has to be considered. Whereas perceptual effects of directivity, such as source-orientation-dependent coloration, appear relevant for the direct sound and individual early reflections, it is unclear how spectral and spatial cues interact for later reflections. Better knowledge of the perceptual relevance of source orientation cues might help to simplify the acoustics simulation. Here, it is assessed to what extent the directivity of a human speaker should be simulated for early reflections and diffuse reverberation. The computationally efficient hybrid approach to simulate and auralize binaural room impulse responses [Wendt et al., J. Audio Eng. Soc. 62, 11 (2014)] was extended to simulate source directivity. Two psychoacoustic experiments assessed the listeners' ability to distinguish between different virtual source orientations when the frequency-dependent spatial directivity pattern of the source was approximated by a direction-independent average filter for different higher reflection orders. The results indicate that it is sufficient to simulate effects of source directivity in the first-order reflections.
Affiliation(s)
- Henning Steffens: Medizinische Physik, Universität Oldenburg, Oldenburg 26111, Germany
- Steven van de Par: Acoustics Group and Cluster of Excellence Hearing4all, Universität Oldenburg, Oldenburg 26111, Germany
- Stephan D Ewert: Medizinische Physik and Cluster of Excellence Hearing4all, Universität Oldenburg, Oldenburg 26111, Germany
16. Frič M, Podzimková I. Comparison of sound radiation between classical and pop singers. Biomedical Signal Processing and Control 2021. DOI: 10.1016/j.bspc.2021.102426.
17. Leishman TW, Bellows SD, Pincock CM, Whiting JK. High-resolution spherical directivity of live speech from a multiple-capture transfer function method. The Journal of the Acoustical Society of America 2021; 149:1507. PMID: 33765812. PMCID: PMC8329840. DOI: 10.1121/10.0003363.
Abstract
Although human speech radiation has been a subject of considerable interest for decades, researchers have not previously measured its directivity over a complete sphere with high spatial and spectral resolution using live phonetically balanced passages. The research reported in this paper addresses this deficiency by employing a multiple-capture transfer function technique and spherical harmonic expansions. The work involved eight subjects and 2522 unique sampling positions over a 1.22 or 1.83 m sphere with 5° polar and azimuthal-angle increments. The paper explains the methods and directs readers to archived results for further exploration, modeling, and speech simulation in acoustical environments. Comparisons of the results to those of a KEMAR head-and-torso simulator, lower-resolution single-capture measurements, other authors' work, and basic symmetry expectations all substantiate their validity. The completeness and high resolution of the measurements offer insights into spherical speech directivity patterns that will aid researchers in the speech sciences, architectural acoustics, audio, and communications.
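The spherical harmonic expansions mentioned here amount to a least-squares fit of directivity values sampled on the sphere, which can then be evaluated on a denser grid. A minimal sketch using SciPy's sph_harm is shown below; the truncation order and the sampling-grid variables are placeholders, not the authors' published processing.

```python
# Illustrative sketch: least-squares spherical harmonic fit of a directivity pattern
# sampled at azimuth angles `phi` and colatitude angles `theta` (radians), with
# complex pressure values `p` for one frequency band.
import numpy as np
from scipy.special import sph_harm

def sh_basis(theta, phi, order):
    # SciPy's convention is sph_harm(m, n, azimuth, colatitude).
    cols = [sph_harm(m, n, phi, theta)
            for n in range(order + 1) for m in range(-n, n + 1)]
    return np.stack(cols, axis=-1)          # (num_points, num_coefficients)

def sh_fit(p, theta, phi, order):
    """Fit SH coefficients up to the given truncation order."""
    Y = sh_basis(theta, phi, order)
    coeffs, *_ = np.linalg.lstsq(Y, p, rcond=None)
    return coeffs

def sh_eval(coeffs, theta, phi, order):
    """Evaluate the fitted expansion on a (possibly denser) grid, i.e. spatial upsampling."""
    return sh_basis(theta, phi, order) @ coeffs
```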
Affiliation(s)
- Timothy W Leishman: Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, N284 Eyring Science Center, Provo, Utah 84602, USA
- Samuel D Bellows: Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, N284 Eyring Science Center, Provo, Utah 84602, USA
- Claire M Pincock: Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, N284 Eyring Science Center, Provo, Utah 84602, USA
- Jennifer K Whiting: Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, N284 Eyring Science Center, Provo, Utah 84602, USA
18. Trine A, Monson BB. Extended High Frequencies Provide Both Spectral and Temporal Information to Improve Speech-in-Speech Recognition. Trends in Hearing 2020; 24:2331216520980299. PMID: 33345755. PMCID: PMC7756042. DOI: 10.1177/2331216520980299.
Abstract
Several studies have demonstrated that extended high frequencies (EHFs; >8 kHz) in speech are not only audible but also have some utility for speech recognition, including for speech-in-speech recognition when maskers are facing away from the listener. However, the contribution of EHF spectral versus temporal information to speech recognition is unknown. Here, we show that access to EHF temporal information improved speech-in-speech recognition relative to speech bandlimited at 8 kHz but that additional access to EHF spectral detail provided an additional small but significant benefit. Results suggest that both EHF spectral structure and the temporal envelope contribute to the observed EHF benefit. Speech recognition performance was quite sensitive to masker head orientation, with a rotation of only 15° providing a highly significant benefit. An exploratory analysis indicated that pure-tone thresholds at EHFs are better predictors of speech recognition performance than low-frequency pure-tone thresholds.
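The spectral-versus-temporal distinction probed here can be thought of as keeping the EHF band's temporal envelope while discarding its spectral fine structure. One plausible way to construct such a stimulus (a Hilbert envelope of the >8 kHz band imposed on a noise carrier) is sketched below; this is an illustration of the general idea, not the stimulus processing actually used in the study.

```python
# Illustrative sketch: replace EHF (>8 kHz) fine structure with envelope-modulated noise,
# preserving EHF temporal information while removing EHF spectral detail.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def ehf_envelope_noise(x, fs, cutoff=8000.0, seed=0):
    sos_lo = butter(8, cutoff, btype="low", fs=fs, output="sos")
    sos_hi = butter(8, cutoff, btype="high", fs=fs, output="sos")
    low = sosfiltfilt(sos_lo, x)                     # keep the <8 kHz portion intact
    ehf = sosfiltfilt(sos_hi, x)                     # isolate the EHF band
    env = np.abs(hilbert(ehf))                       # EHF temporal envelope
    rng = np.random.default_rng(seed)
    carrier = sosfiltfilt(sos_hi, rng.standard_normal(len(x)))  # EHF-band noise carrier
    carrier *= env                                   # impose the EHF envelope
    carrier *= np.sqrt(np.sum(ehf**2) / (np.sum(carrier**2) + 1e-12))  # match EHF level
    return low + carrier
```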
Affiliation(s)
- Allison Trine: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, United States
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, United States; Neuroscience Program, University of Illinois at Urbana-Champaign, Champaign, United States
19. Pörschmann C, Lübeck T, Arend JM. Impact of face masks on voice radiation. The Journal of the Acoustical Society of America 2020; 148:3663. PMID: 33379881. PMCID: PMC7857507. DOI: 10.1121/10.0002853.
Abstract
With the COVID-19 pandemic, the wearing of face masks covering mouth and nose has become ubiquitous all around the world. This study investigates the impact of typical face masks on voice radiation. To analyze the transmission loss caused by masks and the influence of masks on directivity, this study measured the full-spherical voice directivity of a dummy head with a mouth simulator covered with six masks of different types, i.e., medical masks, filtering facepiece respirator masks, and cloth face coverings. The results show a significant frequency-dependent transmission loss, which varies depending on the mask, especially above 2 kHz. Furthermore, the two facepiece respirator masks also significantly affect speech directivity, as determined by the directivity index (DI). Compared to the measurements without a mask, the DI deviates by up to 7 dB at frequencies above 3 kHz. For all other masks, the deviations are below 2 dB in all third-octave frequency bands.
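The frequency-dependent transmission loss reported here is essentially the level difference, per third-octave band, between recordings of the same source signal with and without a mask. A minimal sketch under that assumption follows; the band edges and the simple magnitude-spectrum averaging are illustrative choices, not the paper's measurement pipeline.

```python
# Illustrative sketch: third-octave transmission loss from two recordings of the same
# mouth-simulator signal, one without and one with a mask.
import numpy as np

def third_octave_levels(x, fs, f_centers):
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    levels = []
    for fc in f_centers:
        lo, hi = fc / 2 ** (1 / 6), fc * 2 ** (1 / 6)   # third-octave band edges
        band = spec[(freqs >= lo) & (freqs < hi)]
        levels.append(10 * np.log10(np.sum(band) + 1e-12))
    return np.array(levels)

def transmission_loss(x_no_mask, x_mask, fs):
    f_centers = 1000.0 * 2.0 ** (np.arange(-12, 14) / 3.0)   # ~63 Hz to ~20 kHz
    return f_centers, (third_octave_levels(x_no_mask, fs, f_centers)
                       - third_octave_levels(x_mask, fs, f_centers))
```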
Affiliation(s)
- Christoph Pörschmann: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Straße 2, 50679 Cologne, Germany
- Tim Lübeck: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Straße 2, 50679 Cologne, Germany
- Johannes M Arend: Institute of Communications Engineering, TH Köln-University of Applied Sciences, Betzdorfer Straße 2, 50679 Cologne, Germany
20. Hunter LL, Monson BB, Moore DR, Dhar S, Wright BA, Munro KJ, Motlagh Zadeh L, Blankenship CM, Stiepan SM, Siegel JH. Extended high frequency hearing and speech perception implications in adults and children. Hearing Research 2020; 397:107922. PMID: 32111404. PMCID: PMC7431381. DOI: 10.1016/j.heares.2020.107922.
Abstract
Extended high frequencies (EHF), above 8 kHz, represent a region of the human hearing spectrum that is generally ignored by clinicians and researchers alike. This article is a compilation of contributions that, together, make the case for an essential role of EHF in both normal hearing and auditory dysfunction. We start with the fundamentals of biological and acoustic determinism: humans have EHF hearing for a purpose, for example, the detection of prey, predators, and mates. EHF hearing may also provide a boost to speech perception in challenging conditions, and its loss, conversely, might help explain difficulty with the same task. However, it could be that EHFs are a marker for damage in the conventional frequency region that is more related to speech perception difficulties. Measurement of EHF hearing in concert with otoacoustic emissions could provide an early warning of age-related hearing loss. In early life, when EHF hearing sensitivity is optimal, we can use it for enhanced phonetic identification during language learning, but we are also susceptible to diseases that can prematurely damage it. EHF audiometry techniques and standardization are reviewed, providing evidence that they are reliable to measure and provide important information for early detection, monitoring, and possible prevention of hearing loss in at-risk populations. To better understand the full contribution of EHF to human hearing, clinicians and researchers can contribute by including its measurement, along with measures of speech in noise and self-report of hearing difficulties and tinnitus, in clinical evaluations and studies.
Affiliation(s)
- Lisa L Hunter: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, USA; Department of Otolaryngology, University of Cincinnati, USA
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, USA; Neuroscience Program, University of Illinois at Urbana-Champaign, USA
- David R Moore: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, USA; Department of Otolaryngology, University of Cincinnati, USA; Manchester Centre for Audiology and Deafness, School of Health Sciences, University of Manchester, UK
- Sumitrajit Dhar: Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA; Knowles Hearing Center, Northwestern University, Evanston, IL, USA
- Beverly A Wright: Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA
- Kevin J Munro: Manchester Centre for Audiology and Deafness, School of Health Sciences, University of Manchester, UK
- Lina Motlagh Zadeh: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, USA
- Chelsea M Blankenship: Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, USA
- Samantha M Stiepan: Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA; Knowles Hearing Center, Northwestern University, Evanston, IL, USA
- Jonathan H Siegel: Roxelyn & Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL, USA; Knowles Hearing Center, Northwestern University, Evanston, IL, USA
21. Talkington WJ, Donai J, Kadner AS, Layne ML, Forino A, Wen S, Gao S, Gray MM, Ashraf AJ, Valencia GN, Smith BD, Khoo SK, Gray SJ, Lass N, Brefczynski-Lewis JA, Engdahl S, Graham D, Frum CA, Lewis JW. Electrophysiological Evidence of Early Cortical Sensitivity to Human Conspecific Mimic Voice as a Distinct Category of Natural Sound. Journal of Speech, Language, and Hearing Research 2020; 63:3539-3559. PMID: 32936717. PMCID: PMC8060013. DOI: 10.1044/2020_jslhr-20-00063.
Abstract
Purpose: From an anthropological perspective of hominin communication, the human auditory system likely evolved to enable special sensitivity to sounds produced by the vocal tracts of human conspecifics, whether attended or passively heard. While numerous electrophysiological studies have used stereotypical human-produced verbal (speech voice and singing voice) and nonverbal vocalizations to identify human voice-sensitive responses, controversy remains as to when (and where) processing of acoustic signal attributes characteristic of "human voiceness" per se initiates in the brain.
Method: To explore this, we used animal vocalizations and human-mimicked versions of those calls ("mimic voice") to examine late auditory evoked potential responses in humans.
Results: We revealed an N1b component (96-120 ms poststimulus) during a nonattending listening condition showing significantly greater magnitude in response to mimics, beginning as early as primary auditory cortices and preceding the time window reported in previous studies that revealed species-specific vocalization processing initiating in the range of 147-219 ms. During a sound discrimination task, a P600 (500-700 ms poststimulus) component showed specificity for accurate discrimination of human mimic voice. Distinct acoustic signal attributes and features of the stimuli were used in a classifier model, which could distinguish most human from animal voices comparably to behavioral data, though none of these single features could adequately distinguish human voiceness.
Conclusions: These results provide novel ideas for algorithms used in neuromimetic hearing aids, as well as direct electrophysiological support for a neurocognitive model of natural sound processing that informs both neurodevelopmental and anthropological models regarding the establishment of auditory communication systems in humans. Supplemental Material: https://doi.org/10.23641/asha.12903839
Affiliation(s)
- William J Talkington: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Jeremy Donai: Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Alexandra S Kadner: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Molly L Layne: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Andrew Forino: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Sijin Wen: Department of Biostatistics, West Virginia University, Morgantown
- Si Gao: Department of Biostatistics, West Virginia University, Morgantown
- Margeaux M Gray: Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Alexandria J Ashraf: Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Gabriela N Valencia: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Brandon D Smith: Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephanie K Khoo: Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephen J Gray: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Norman Lass: Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Susannah Engdahl: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- David Graham: Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown
- Chris A Frum: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- James W Lewis: Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
22. The Feasibility of a Neck-Surface Accelerometer for Estimating the Amount of Acoustic Output During Phonation Regardless of the Difference in the Mouth Configuration. Journal of Voice 2020; 36:297-308. DOI: 10.1016/j.jvoice.2020.06.002.
23. Monson BB, Caravello J. The maximum audible low-pass cutoff frequency for speech. The Journal of the Acoustical Society of America 2019; 146:EL496. PMID: 31893732. DOI: 10.1121/1.5140032.
Abstract
Speech energy beyond 8 kHz is often audible for listeners with normal hearing. Limits to audibility in this frequency range are not well described. This study assessed the maximum audible low-pass cutoff frequency for speech, relative to full-bandwidth speech. The mean audible cutoff frequency was approximately 13 kHz, with a small but significant effect of talker sex. Better pure tone thresholds at extended high frequencies correlated with higher audible cutoff frequency. These findings demonstrate that bandlimiting speech even at 13 kHz results in a detectable loss for the average normal-hearing listener, suggesting there is information regarding the speech signal beyond 13 kHz.
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820, USA
- Jacob Caravello: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820, USA
24. Monson BB, Rock J, Schulz A, Hoffman E, Buss E. Ecological cocktail party listening reveals the utility of extended high-frequency hearing. Hearing Research 2019; 381:107773. PMID: 31404807. DOI: 10.1016/j.heares.2019.107773.
Abstract
A fundamental principle of neuroscience is that each species' and individual's sensory systems are tailored to meet the demands placed upon them by their environments and experiences. What has driven the upper limit of the human frequency range of hearing? The traditional view is that sensitivity to the highest frequencies (i.e., beyond 8 kHz) facilitates localization of sounds in the environment. However, this has yet to be demonstrated for naturally occurring non-speech sounds. An alternative view is that, for social species such as humans, the biological relevance of conspecific vocalizations has driven the development and retention of auditory system features. Here, we provide evidence for the latter theory. We evaluated the contribution of extended high-frequency (EHF) hearing to common ecological speech perception tasks. We found that restricting access to EHFs reduced listeners' discrimination of talker head orientation by approximately 34%. Furthermore, access to EHFs significantly improved speech recognition under listening conditions in which the target talker's head was facing the listener while co-located background talkers faced away from the listener. Our findings raise the possibility that sensitivity to the highest audio frequencies fosters communication and socialization of the human species. These findings suggest that loss of sensitivity to the highest frequencies may lead to deficits in speech perception. Such EHF hearing loss typically goes undiagnosed, but is widespread among the middle-aged population.
Affiliation(s)
- Brian B Monson: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States; Neuroscience Program, University of Illinois at Urbana-Champaign, United States
- Jenna Rock: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
- Anneliese Schulz: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
- Elissa Hoffman: Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, United States
- Emily Buss: Department of Otolaryngology/Head and Neck Surgery, University of North Carolina at Chapel Hill, United States
25. Thaler L, De Vos HPJC, Kish D, Antoniou M, Baker CJ, Hornikx MCJ. Human Click-Based Echolocation of Distance: Superfine Acuity and Dynamic Clicking Behaviour. Journal of the Association for Research in Otolaryngology 2019; 20:499-510. PMID: 31286299. PMCID: PMC6797687. DOI: 10.1007/s10162-019-00728-0.
Abstract
Some people who are blind have trained themselves in echolocation using mouth clicks. Here, we provide the first report of psychophysical and clicking data during echolocation of distance from a group of 8 blind people with experience in mouth click-based echolocation (daily use for > 3 years). We found that experienced echolocators can detect changes in distance of 3 cm at a reference distance of 50 cm, and a change of 7 cm at a reference distance of 150 cm, regardless of object size (i.e. 28.5 cm vs. 80 cm diameter disk). Participants made mouth clicks that were more intense and they made more clicks for weaker reflectors (i.e. same object at farther distance, or smaller object at same distance), but number and intensity of clicks were adjusted independently from one another. The acuity we found is better than previous estimates based on samples of sighted participants without experience in echolocation or individual experienced participants (i.e. single blind echolocators tested) and highlights adaptation of the perceptual system in blind human echolocators. Further, the dynamic adaptive clicking behaviour we observed suggests that number and intensity of emissions serve separate functions to increase SNR. The data may serve as an inspiration for low-cost (i.e. non-array based) artificial ‘cognitive’ sonar and radar systems, i.e. signal design, adaptive pulse repetition rate and intensity. It will also be useful for instruction and guidance for new users of echolocation.
Affiliation(s)
- Lore Thaler: Department of Psychology, Durham University, Science Site, South Road, Durham, DH1 3LE, UK
- H P J C De Vos: Eindhoven University of Technology, Eindhoven, The Netherlands
- D Kish: World Access for the Blind, Placentia, CA, USA
- M Antoniou: Department of Electronic, Electrical and Systems Engineering, University of Birmingham, Birmingham, UK
- C J Baker: Department of Electronic, Electrical and Systems Engineering, University of Birmingham, Birmingham, UK
- M C J Hornikx: Eindhoven University of Technology, Eindhoven, The Netherlands
26. Weisser A, Buchholz JM. Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions. The Journal of the Acoustical Society of America 2019; 145:349. PMID: 30710956. DOI: 10.1121/1.5087567.
Abstract
Estimating the basic acoustic parameters of conversational speech in noisy real-world conditions has been an elusive task in hearing research. Nevertheless, these data are essential ingredients for speech intelligibility tests and fitting rules for hearing aids. Previous surveys did not provide a clear methodology for their acoustic measurements and setups, were opaque about their samples, or did not control for the distance between talker and listener, even though people are known to adapt their distance in noisy conversations. In the present study, conversations were elicited between pairs of people by asking them to play a collaborative game that required them to communicate. While performing this task, the subjects listened to binaural recordings of different everyday scenes, which were presented to them at their original sound pressure level (SPL) via highly open headphones. Their voices were recorded separately using calibrated headset microphones. The subjects were seated inside an anechoic chamber at 1 and 0.5 m distances. Precise estimates of realistic speech levels and signal-to-noise ratios (SNRs) were obtained for the different acoustic scenes, at broadband and third-octave levels. It is shown that with acoustic background noise above approximately 69 dB SPL at 1 m distance, or 75 dB SPL at 0.5 m, the average SNR can become negative. Interpolation between the two conditions suggests that, had the conversation partners been allowed to optimize their positions by moving closer to each other, the average SNR would only have become negative above 75 dB SPL. The implications of the results for speech tests and hearing aid fitting rules are discussed.
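The core quantities in this survey, speech level and SNR in dB SPL, reduce to RMS levels of calibrated speech and background recordings. A minimal sketch of that computation is given below; the calibration constant and variable names are placeholders, and the study's own analysis (for example its third-octave breakdown and handling of speech pauses) is more involved.

```python
# Illustrative sketch: broadband speech level, noise level, and SNR from calibrated
# recordings. `cal_db` is the dB SPL value corresponding to a digital RMS of 1.0
# (a hypothetical calibration constant obtained from a reference-tone recording).
import numpy as np

def level_db_spl(x, cal_db):
    rms = np.sqrt(np.mean(np.square(x)))
    return cal_db + 20.0 * np.log10(rms + 1e-12)

def speech_snr(speech, noise, cal_db):
    l_speech = level_db_spl(speech, cal_db)
    l_noise = level_db_spl(noise, cal_db)
    return l_speech, l_noise, l_speech - l_noise   # SNR in dB
```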
Collapse
Affiliation(s)
- Adam Weisser
- Department of Linguistics-Audiology Section, Macquarie University, Australian Hearing Hub-Level 3, 16 University Avenue, New South Wales 2109, Australia
| | - Jörg M Buchholz
- Department of Linguistics-Audiology Section, Macquarie University, Australian Hearing Hub-Level 3, 16 University Avenue, New South Wales 2109, Australia
| |
Collapse
|
27
|
Schwartz JC, Whyte AT, Al-Nuaimi M, Donai JJ. Effects of signal bandwidth and noise on individual speaker identification. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:EL447. [PMID: 30522302 DOI: 10.1121/1.5078770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Accepted: 10/28/2018] [Indexed: 06/09/2023]
Abstract
Two experiments were conducted to evaluate the effects of increasing spectral bandwidth from 3 to 10 kHz on individual speaker identification in noisy conditions (+5, 0, and -5 dB signal-to-noise ratio). Experiment 1 used h(Vowel)d (hVd) signals, while experiment 2 used sentences from the Rainbow Passage. Both experiments showed significant improvements in individual speaker identification in the 10 kHz bandwidth condition (6% for hVds; 10% for sentences). These results are consistent with the machine recognition literature demonstrating that substantial speaker-specific information is present in the speech signal above approximately 3-4 kHz. High-frequency cues to speaker identity warrant further study.
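Stimuli of the kind described (band-limited speech mixed with noise at a fixed SNR) can be prepared roughly as sketched below. The cutoff frequencies and SNRs follow the abstract; the Butterworth filter order, the use of white noise, and the RMS-based scaling are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandlimit(signal, fs, cutoff_hz, order=8):
    """Low-pass the signal at cutoff_hz (zero-phase Butterworth)."""
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech/noise RMS ratio equals snr_db, then mix."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    noise = noise[: len(speech)] * (rms(speech) / rms(noise[: len(speech)]))
    return speech + noise * 10 ** (-snr_db / 20.0)

if __name__ == "__main__":
    fs = 44100
    speech = np.random.randn(fs)    # stand-in for a recorded hVd token or sentence
    noise = np.random.randn(fs)     # stand-in for the masking noise
    for bw in (3000, 10000):
        for snr in (5, 0, -5):
            stimulus = mix_at_snr(bandlimit(speech, fs, bw), noise, snr)
```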
Collapse
Affiliation(s)
- Jeremy C Schwartz
- Department of Communication Sciences and Disorders, West Virginia University, Morgantown, West Virginia 26506, USA
| | - Ashtyn T Whyte
- Department of Communication Sciences and Disorders, West Virginia University, Morgantown, West Virginia 26506, USA
| | - Mohanad Al-Nuaimi
- Department of Mechanical and Aerospace Engineering, West Virginia University, Morgantown, West Virginia 26506, USA
| | - Jeremy J Donai
- Department of Communication Sciences and Disorders, West Virginia University, Morgantown, West Virginia 26506, USA
| |
Collapse
|
28
|
Holt LL, Tierney AT, Guerra G, Laffere A, Dick F. Dimension-selective attention as a possible driver of dynamic, context-dependent re-weighting in speech processing. Hear Res 2018; 366:50-64. [PMID: 30131109 PMCID: PMC6107307 DOI: 10.1016/j.heares.2018.06.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/18/2018] [Revised: 06/10/2018] [Accepted: 06/19/2018] [Indexed: 12/24/2022]
Abstract
The contribution of acoustic dimensions to an auditory percept is dynamically adjusted and reweighted based on prior experience of how informative these dimensions are in the long-term and short-term environment. This is especially evident in speech perception, where listeners differentially weight information across multiple acoustic dimensions and use this information selectively to update expectations about future sounds. The dynamic and selective adjustment of how acoustic input dimensions contribute to perception has made it tempting to conceive of this as a form of non-spatial auditory selective attention. Here, we review several human speech perception phenomena that might be consistent with auditory selective attention although, as yet, the literature does not definitively support a mechanistic tie. We relate these human perceptual phenomena to illustrative nonhuman animal neurobiological findings that offer informative guideposts for how to test mechanistic connections. We next present a novel empirical approach that can serve as a methodological bridge from human research to animal neurobiological studies. Finally, we describe four preliminary results that demonstrate its utility in advancing understanding of human non-spatial, dimension-based auditory selective attention.
Collapse
Affiliation(s)
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, 15213, USA; Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.
| | - Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
| | - Giada Guerra
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
| | - Aeron Laffere
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK
| | - Frederic Dick
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK; Department of Experimental Psychology, University College London, London, WC1H 0AP, UK
| |
Collapse
|
29
|
Kocon P, Monson BB. Horizontal directivity patterns differ between vowels extracted from running speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:EL7. [PMID: 30075666 PMCID: PMC6033614 DOI: 10.1121/1.5044508] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Directivity patterns for vocalizations radiating from the human mouth have been examined regularly, but phoneme-specific changes in radiation have rarely been identified. This study reports half-plane horizontal directivity up to 20 kHz with 15° angular resolution for /ɑ/, /e/, /i/, /o/, and /u/ extracted from running speech, compared with long-term averaged speech. An effect of vowel category on the directivity index was observed, with /ɑ/ being most directional. Angle-dependent third-octave band weighting functions, useful for simulating real-world listening conditions, highlighted disparities in directivity between running speech and individual vowels. These findings point to rapidly changing dynamic directivity patterns during speech.
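A simplified horizontal-plane directivity index can be computed from measurements like these (15° steps over a half plane, mirrored by left/right symmetry). The sketch below is an approximation under those assumptions; a full spherical directivity index would weight each measurement by solid angle, and the toy pattern is invented for illustration.

```python
import numpy as np

def horizontal_di(levels_db, angles_deg):
    """Directivity index (dB) in the horizontal plane: on-axis intensity
    relative to the intensity averaged over all measured azimuths.
    Assumes half-plane measurements can be mirrored left/right."""
    intensities = 10 ** (np.asarray(levels_db, dtype=float) / 10.0)
    on_axis = intensities[np.argmin(np.abs(np.asarray(angles_deg)))]
    return 10.0 * np.log10(on_axis / intensities.mean())

if __name__ == "__main__":
    angles = np.arange(0, 181, 15)     # 0 deg = front of the mouth, 180 deg = behind
    levels = -0.04 * angles            # toy pattern: ~7 dB front-to-back drop
    print(f"DI ~ {horizontal_di(levels, angles):.1f} dB")
```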
Collapse
Affiliation(s)
- Paulina Kocon
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820, USA
| | - Brian B Monson
- Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, 901 South Sixth Street, Champaign, Illinois 61820, USA
| |
Collapse
|
30
|
Švec JG, Granqvist S. Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:441-461. [PMID: 29450495 DOI: 10.1044/2017_jslhr-s-17-0095] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/09/2017] [Accepted: 08/16/2017] [Indexed: 06/08/2023]
Abstract
PURPOSE: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to improve their accuracy and reproducibility. METHOD: Basic information is put together from standards, from the technical, voice, and speech literature, and from the practical experience of the authors, and is explained for nontechnical readers. RESULTS: Topics reviewed include the variation of SPL with distance, sound level meters and their accuracy, frequency and time weightings, and background noise. Several calibration procedures for SPL measurements are described for stand-mounted and head-mounted microphones. CONCLUSIONS: SPL of voice and speech should be reported together with the mouth-to-microphone distance so that the levels can be related to vocal power. Sound level measurement settings (i.e., frequency weighting and time weighting/averaging) should always be specified. Sound level meters of a specified accuracy class should be used to assure measurement accuracy. Head-mounted microphones placed close to the mouth improve the signal-to-noise ratio and, when calibrated, can be used for voice SPL measurements. Background noise levels should be reported in addition to the sound levels of voice and speech.
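The distance dependence emphasized in the tutorial can be captured with the free-field inverse-square relation; the sketch below converts a level measured at one mouth-to-microphone distance to the level expected at another. This is a minimal illustration assuming ideal free-field spherical spreading, which is only an approximation very close to the mouth or in reflective rooms; the distances in the example are arbitrary.

```python
import math

def spl_at_distance(spl_measured, d_measured_m, d_target_m):
    """Convert an SPL measured at d_measured_m to the level expected at
    d_target_m, assuming free-field inverse-square (spherical) spreading."""
    return spl_measured - 20.0 * math.log10(d_target_m / d_measured_m)

if __name__ == "__main__":
    # e.g. 75 dB SPL at a 5 cm head-mounted microphone referred to a 30 cm distance
    print(f"{spl_at_distance(75.0, 0.05, 0.30):.1f} dB SPL at 30 cm")
```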
Collapse
Affiliation(s)
- Jan G Švec
- Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
| | - Svante Granqvist
- Department of Basic Science and Biomedicine, School of Technology and Health, Royal Institute of Technology, Stockholm, Sweden
- Division of Speech and Language Pathology, Department of Clinical Science, Intervention and Technology, Karolinska Institutet, Stockholm, Sweden
| |
Collapse
|
31
|
Thaler L, De Vos R, Kish D, Antoniou M, Baker C, Hornikx M. Human echolocators adjust loudness and number of clicks for detection of reflectors at various azimuth angles. Proc Biol Sci 2018; 285:20172735. [PMID: 29491173 PMCID: PMC5832709 DOI: 10.1098/rspb.2017.2735] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/07/2017] [Accepted: 02/06/2018] [Indexed: 11/15/2022] Open
Abstract
Bats have been shown to adjust their emissions to situational demands. Here we report similar findings for human echolocation. We asked eight blind expert echolocators to detect reflectors positioned at various azimuth angles. The same 17.5 cm diameter circular reflector placed at 100 cm distance at 0°, 45° or 90° with respect to straight ahead was detected with 100% accuracy, but performance dropped to approximately 80% when it was placed at 135° (i.e. somewhat behind) and to chance level (50%) when placed at 180° (i.e. right behind). This can be explained by poorer target ensonification owing to the beam pattern of human mouth clicks. Importantly, analyses of sound recordings show that echolocators increased the loudness and number of clicks for reflectors at larger azimuth angles. Echolocators were able to reliably detect reflectors when level differences between echo and emission were as low as -27 dB, which is much lower than expected based on previous work. Increasing the intensity and number of clicks improves the signal-to-noise ratio and in this way compensates for weaker target reflections. Our results are, to our knowledge, the first to show that human echolocation experts adjust their emissions to improve sensory sampling. An implication of our findings is that human echolocators accumulate information from multiple samples.
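An echo-to-emission level difference like the -27 dB reported here can be estimated from a recording by comparing RMS levels in a click window and in a window placed at the two-way travel time. The sketch below is illustrative only: the window length, speed of sound, and synthetic click/echo signal are assumptions, not the authors' exact analysis.

```python
import numpy as np

def echo_re_click_db(x, fs, click_onset_s, distance_m, win_s=0.005, c=343.0):
    """Level of the echo window relative to the click window (dB), with the
    echo window placed at the two-way travel time after the click onset."""
    rms = lambda seg: np.sqrt(np.mean(seg ** 2))
    n0 = int(click_onset_s * fs)
    n_echo = n0 + int(2.0 * distance_m / c * fs)
    n_win = int(win_s * fs)
    return 20.0 * np.log10(rms(x[n_echo:n_echo + n_win]) / rms(x[n0:n0 + n_win]))

if __name__ == "__main__":
    fs, d = 96000, 1.0                                   # synthetic recording, reflector at 1 m
    x = np.zeros(int(0.05 * fs))
    x[int(0.010 * fs):int(0.012 * fs)] = 1.0             # click at 10 ms
    t_echo = 0.010 + 2 * d / 343.0
    x[int(t_echo * fs):int((t_echo + 0.002) * fs)] = 0.045   # echo ~27 dB weaker
    print(f"echo re click ~ {echo_re_click_db(x, fs, 0.010, d):.1f} dB")
```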
Collapse
Affiliation(s)
- L Thaler
- Department of Psychology, Durham University, Science Site, South Road, Durham DH1 3LE, UK
| | - R De Vos
- Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
| | - D Kish
- World Access for the Blind, Placentia 92870, CA, USA
| | - M Antoniou
- Department of Electronic Electrical and Systems Engineering, School of Engineering, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| | - C Baker
- Department of Electronic Electrical and Systems Engineering, School of Engineering, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
| | - M Hornikx
- Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
| |
Collapse
|
32
|
Schloneger MJ, Hunter EJ. Assessments of Voice Use and Voice Quality Among College/University Singing Students Ages 18-24 Through Ambulatory Monitoring With a Full Accelerometer Signal. J Voice 2017; 31:124.e21-124.e30. [PMID: 26897545 PMCID: PMC4988942 DOI: 10.1016/j.jvoice.2015.12.018] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2015] [Accepted: 12/29/2015] [Indexed: 10/22/2022]
Abstract
The multiple social and performance demands placed on college/university singers could put their still-developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hertz) of voice use among such students. Nevertheless, no studies to date have acquired acoustic voice quality measures simultaneously with these vocal dose measures to allow direct comparison over the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and estimated voice quality of college/university singing students (18-24 years old, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays, measuring voice from an unprocessed accelerometer signal recorded at the neck. From this signal, traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose were analyzed. Additional acoustic measures included perceived pitch, pitch strength, long-term average spectrum slope, alpha ratio, sound pressure level (dB) in the 1-3 kHz band, and harmonic-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity, and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between nonsinging, solo singing, and choral singing.
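The traditional vocal dose metrics named above can be computed from frame-wise voicing and fundamental-frequency estimates. The sketch below follows the standard definitions for phonation percentage, time dose, and cycle dose; the frame length and the synthetic input are illustrative assumptions, and the distance dose, which additionally requires a vibration-amplitude estimate derived from SPL, is omitted.

```python
import numpy as np

def vocal_doses(f0_hz, voiced, frame_s=0.02):
    """Phonation percentage (%), time dose (s), and cycle dose (cycles)
    from frame-wise F0 (Hz) and a boolean voicing decision per frame."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    voiced = np.asarray(voiced, dtype=bool)
    phonation_pct = 100.0 * voiced.mean()
    time_dose = voiced.sum() * frame_s
    cycle_dose = np.sum(f0_hz[voiced] * frame_s)
    return phonation_pct, time_dose, cycle_dose

if __name__ == "__main__":
    # Toy example: 10 s of 20 ms frames, voiced ~30% of the time near 220 Hz.
    n = 500
    voiced = np.random.rand(n) < 0.3
    f0 = np.where(voiced, 220.0, 0.0)
    print(vocal_doses(f0, voiced))
```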
Collapse
Affiliation(s)
| | - Eric J Hunter
- Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
| |
Collapse
|
33
|
Blandin R, Arnela M, Laboissière R, Pelorson X, Guasch O, Van Hirtum A, Laval X. Effects of higher order propagation modes in vocal tract like geometries. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:832-843. [PMID: 25698017 DOI: 10.1121/1.4906166] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
In this paper, a multimodal theory accounting for higher order acoustical propagation modes is presented as an extension to the classical plane wave theory. This theoretical development is validated against finite element simulations and against experiments on vocal tract replicas fabricated with a 3D printer. Simplified vocal tract geometries of increasing complexity are used to investigate the influence of some geometrical parameters on the acoustical properties of the vocal tract. It is shown that the higher order modes can produce additional resonances and anti-resonances and can also strongly affect the radiated sound. These effects appear to depend on the eccentricity and the cross-sectional shape of the geometries. Finally, the comparison between the simulations and the experiments points out the importance of taking visco-thermal losses into account to improve the accuracy of the predicted resonance bandwidths.
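The frequency above which higher order modes begin to propagate in a duct-like geometry can be estimated from the classic cutoff formula. The sketch below does this for a circular cross-section, where the first non-planar mode cuts on at the Bessel-derivative zero 1.8412; this is only a rough guide for realistic, non-circular vocal tract shapes, and the example cross-sectional areas are illustrative values.

```python
import math

def first_cutoff_circular(area_cm2, c=350.0):
    """Cutoff frequency (Hz) of the first higher-order mode in a circular
    duct of the given cross-sectional area: f_c = 1.8412 * c / (2*pi*a)."""
    radius_m = math.sqrt(area_cm2 * 1e-4 / math.pi)
    return 1.8412 * c / (2.0 * math.pi * radius_m)

if __name__ == "__main__":
    for area in (2.0, 4.0, 8.0):   # representative vocal tract cross-sections, cm^2
        print(f"{area:4.1f} cm^2 -> first cutoff ~ {first_cutoff_circular(area) / 1000:.1f} kHz")
```

Larger cross-sections cut on at lower frequencies, which is consistent with higher order mode effects appearing within the audible speech band for the wider parts of the tract.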
Collapse
Affiliation(s)
- Rémi Blandin
- GIPSA-Lab, Unité Mixte de Recherche au Centre National de la Recherche Scientifique 5216, Grenoble Campus, Saint-Martin-d'Hères, F-38402, France
| | - Marc Arnela
- Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, C/Quatre Camins 2, E-08022 Barcelona, Catalonia, Spain
| | - Rafael Laboissière
- PACS Team, INSERM Unit 1028: Cognition and Brain Dynamics, Lyon Neurosciences Research Centre, EPU-ISTIL, Claude Bernard University, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
| | - Xavier Pelorson
- GIPSA-Lab, Unité Mixte de Recherche au Centre National de la Recherche Scientifique 5216, Grenoble Campus, Saint-Martin-d'Hères, F-38402, France
| | - Oriol Guasch
- Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, C/Quatre Camins 2, E-08022 Barcelona, Catalonia, Spain
| | - Annemie Van Hirtum
- GIPSA-Lab, Unité Mixte de Recherche au Centre National de la Recherche Scientifique 5216, Grenoble Campus, St Martin dHeres, F-38402, France
| | - Xavier Laval
- GIPSA-Lab, Unité Mixte de Recherche au Centre National de la Recherche Scientifique 5216, Grenoble Campus, Saint-Martin-d'Hères, F-38402, France
| |
Collapse
|
34
|
Vitela AD, Monson BB, Lotto AJ. Phoneme categorization relying solely on high-frequency energy. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 137:EL65-70. [PMID: 25618101 PMCID: PMC4272376 DOI: 10.1121/1.4903917] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Speech perception studies generally focus on the acoustic information present in the frequency regions below 6 kHz. Recent evidence suggests that there is perceptually relevant information in the higher frequencies, including information affecting speech intelligibility. This experiment examined whether listeners are able to accurately identify a subset of vowels and consonants in CV-context when only high-frequency (above 5 kHz) acoustic information is available (through high-pass filtering and masking of lower frequency energy). The findings reveal that listeners are capable of extracting information from these higher frequency regions to accurately identify certain consonants and vowels.
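Stimuli in which only energy above 5 kHz is audible can be approximated by high-pass filtering the speech and adding a low-pass masking noise below the cutoff, as described above. In the sketch below, the Butterworth filter order, the masker level, and the use of white noise are assumptions for illustration, not the authors' exact stimulus recipe.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def hfe_only(speech, fs, cutoff_hz=5000, mask_level=0.05, order=8):
    """High-pass speech above cutoff_hz and add low-pass noise below the
    cutoff to mask any residual low-frequency energy."""
    hp = butter(order, cutoff_hz, btype="high", fs=fs, output="sos")
    lp = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    masker = sosfiltfilt(lp, np.random.randn(len(speech))) * mask_level
    return sosfiltfilt(hp, speech) + masker

if __name__ == "__main__":
    fs = 44100
    speech = np.random.randn(2 * fs)   # stand-in for a recorded CV token
    stimulus = hfe_only(speech, fs)
```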
Collapse
Affiliation(s)
- A Davi Vitela
- Department of Psychology, University of Nevada, Las Vegas, 4505 South Maryland Parkway, Las Vegas, Nevada 89154
| | - Brian B Monson
- Department of Pediatric Newborn Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, Massachusetts 02115
| | - Andrew J Lotto
- Speech, Language, and Hearing Sciences, University of Arizona, 1131 East Second Street, Tucson, Arizona 85721
| |
Collapse
|
35
|
Monson BB, Hunter EJ, Lotto AJ, Story BH. The perceptual significance of high-frequency energy in the human voice. Front Psychol 2014; 5:587. [PMID: 24982643 PMCID: PMC4059169 DOI: 10.3389/fpsyg.2014.00587] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2014] [Accepted: 05/26/2014] [Indexed: 11/25/2022] Open
Abstract
While human vocalizations generate acoustical energy at frequencies up to (and beyond) 20 kHz, the energy at frequencies above about 5 kHz has traditionally been neglected in speech perception research. The intent of this paper is to review (1) the historical reasons for this research trend and (2) the work that continues to elucidate the perceptual significance of high-frequency energy (HFE) in speech and singing. The historical and physical factors reveal that, while HFE was believed to be unnecessary and/or impractical for applications of interest, it was never shown to be perceptually insignificant. Rather, the main causes for focus on low-frequency energy appear to be because the low-frequency portion of the speech spectrum was seen to be sufficient (from a perceptual standpoint), or the difficulty of HFE research was too great to be justifiable (from a technological standpoint). The advancement of technology continues to overcome concerns stemming from the latter reason. Likewise, advances in our understanding of the perceptual effects of HFE now cast doubt on the first cause. Emerging evidence indicates that HFE plays a more significant role than previously believed, and should thus be considered in speech and voice perception research, especially in research involving children and the hearing impaired.
Collapse
Affiliation(s)
- Brian B. Monson
- Department of Pediatric Newborn Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
| | - Eric J. Hunter
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI, USA
| | - Andrew J. Lotto
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
| | - Brad H. Story
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
| |
Collapse
|
36
|
Monson BB, Lotto AJ, Story BH. Analysis of high-frequency energy in long-term average spectra of singing, speech, and voiceless fricatives. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 132:1754-64. [PMID: 22978902 PMCID: PMC3460988 DOI: 10.1121/1.4742724] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/30/2011] [Revised: 07/04/2012] [Accepted: 07/16/2012] [Indexed: 05/04/2023]
Abstract
The human singing and speech spectrum includes energy above 5 kHz. To begin an in-depth exploration of this high-frequency energy (HFE), a database of anechoic high-fidelity recordings of singers and talkers was created and analyzed. Third-octave band analysis from the long-term average spectra showed that production level (soft vs normal vs loud), production mode (singing vs speech), and phoneme (for voiceless fricatives) all significantly affected HFE characteristics. Specifically, increased production level caused an increase in absolute HFE level, but a decrease in relative HFE level. Singing exhibited higher levels of HFE than speech in the soft and normal conditions, but not in the loud condition. Third-octave band levels distinguished phoneme class of voiceless fricatives. Female HFE levels were significantly greater than male levels only above 11 kHz. This information is pertinent to various areas of acoustics, including vocal tract modeling, voice synthesis, augmentative hearing technology (hearing aids and cochlear implants), and training/therapy for singing and speech.
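Third-octave band levels of the kind analyzed here can be obtained from a long-term average spectrum by summing the power spectral density within band edges. The sketch below uses Welch's method and nominal base-2 third-octave edges, which is a simplification of the exact standardized band definitions; the FFT length, frequency range, and synthetic input are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

def third_octave_levels(x, fs, f_lo=100.0, f_hi=16000.0):
    """Approximate third-octave band levels (dB re full scale) from the
    long-term average spectrum of signal x."""
    f, psd = welch(x, fs, nperseg=4096)
    df = f[1] - f[0]
    centers, levels = [], []
    fc = f_lo
    while fc <= f_hi:
        band = (f >= fc / 2 ** (1 / 6)) & (f < fc * 2 ** (1 / 6))
        if band.any():
            centers.append(fc)
            levels.append(10.0 * np.log10(psd[band].sum() * df))
        fc *= 2 ** (1 / 3)
    return np.array(centers), np.array(levels)

if __name__ == "__main__":
    fs = 44100
    x = np.random.randn(10 * fs)    # stand-in for a sung or spoken recording
    print(third_octave_levels(x, fs))
```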
Collapse
Affiliation(s)
- Brian B Monson
- National Center for Voice and Speech, University of Utah, 136 S. Main Street, Suite 320, Salt Lake City, Utah 84101, USA.
| | | | | |
Collapse
|