1
|
Does Vocal Fatigue Negatively Affect Low Vocal Range in Professional, Female Opera Singers? A Survey Study and Single-Subject Pilot Study. J Voice 2024; 38:688-696. [PMID: 35045947 DOI: 10.1016/j.jvoice.2021.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/16/2021] [Revised: 12/05/2021] [Accepted: 12/06/2021] [Indexed: 11/19/2022]
Abstract
OBJECTIVE 1. To survey how vocal fatigue manifests itself in the vocal range of a sample of professional, female opera singers. 2. To assess laryngeal videostroboscopic changes of one professional, female opera singer before and after extended operatic singing. METHODS Survey study: 296 professional, female opera singers were recruited to participate in an anonymous research survey querying the temporary impact of vocal fatigue in professional, female opera singers. 46.3% of participants described themselves as singing mainstage roles at large, A-level opera houses. Singers were asked to report where in their vocal range they experienced the effects of vocal fatigue and could choose more than one response. Single-subject study: One professional, female opera singer (the author) underwent two laryngeal videostroboscopic exams pre and post vocal loading. The exams were evaluated and compared independently by two blinded laryngologists. RESULTS The results of the survey found that 42.9% of the total responses from professional, female opera singers indicated a temporary impact on the lower middle range (≈C4-F4) as a result of vocal fatigue. 36.5% of participants experienced a temporary impact on their lowest range (≈below C4) and 19.6% reported a temporary impact on their higher range due to vocal fatigue. The results of the single-subject study showed reduced glottal closure pattern in the postloading, lower middle range, head voice condition. CONCLUSIONS A large proportion (64.9%) of the professional, female opera singers surveyed reported increased difficulty navigating their lower middle range and/or lowest range after extended operatic singing. These results support the single-subject study, which found that after vocal loading, there was a decrease in glottal competence while singing in head voice in the lower middle range.
|
2
|
Aerosol Dispersion During Different Phonatory Tasks in Amateur Singers. J Voice 2024; 38:731-740. [PMID: 34963518 DOI: 10.1016/j.jvoice.2021.11.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/04/2021] [Revised: 11/18/2021] [Accepted: 11/18/2021] [Indexed: 11/28/2022]
Abstract
INTRODUCTION Due to increased aerosol generation during singing, choir rehearsals were widely prohibited in the course of the COVID-19 pandemic. Most studies on aerosol generation and dispersion focus on professional singers. However, it has not been clarified whether these data are also representative of amateur singers. METHODS Nine non-professional singers (four male, five female) were asked to perform five tasks: speaking (T+), singing a text softly (MT-) and loudly (MT+), singing on the vowel [ə] (M+), and singing with an N95 mask (MT+N95). Before performing the tasks, the singers were asked to inhale 0.5 L of vapor produced by an e-cigarette containing only the basic liquid. The spread of the exhaled vapor was recorded in all three dimensions by high-definition cameras and the impulse dispersion was detected as a function of time. RESULTS Regarding the median dispersion to the front, all tasks showed comparable distances from 0.69 m to 0.82 m at the end of the tasks. However, the maximum aerosol dispersion varied more widely among subjects and tasks. In the M+ task in particular, a single subject reached a maximum distance of 1.96 m to the front. Although singing with an N95 mask resulted in a slightly increased median dispersion to the front, the maximum dispersion was decreased from 1.47 m (MT+) to 1.04 m (MT+N95). CONCLUSION The maximum dispersion distance to the front of 1.96 m at the end of the M+ task and 1.47 m at the end of the MT+ task showed higher values in comparison to professional singers. Differences in phonation, articulation and mouth opening could lead to greater impulse dispersion. Singing in loud phonation with an N95 mask reduced the maximum impulse dispersion to the front to 1.04 m. Taking all results into consideration, a slightly larger safety distance should be necessary for non-professional singers.
|
3
|
Articulatory and acoustic differences between lyric and dramatic singing in Western classical music. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2024; 155:2659-2669. [PMID: 38634661 DOI: 10.1121/10.0025751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 03/27/2024] [Indexed: 04/19/2024]
Abstract
Within the realm of voice classification, singers could be sub-categorized by the weight of their repertoire, the so-called "singer's Fach." However, the opposite pole terms "lyric" and "dramatic" singing are not yet well defined by their acoustic and articulatory characteristics. Nine professional singers of different singers' Fach were asked to sing a diatonic scale on the vowel /a/, first in what the singers considered as lyric and second in what they considered as dramatic. Image recording was performed using real time magnetic resonance imaging (MRI) with 25 frames/s, and the audio signal was recorded via an optical microphone system. Analysis was performed with regard to sound pressure level (SPL), vibrato amplitude, and frequency and resonance frequencies as well as articulatory settings of the vocal tract. The analysis revealed three primary differences between dramatic and lyric singing: Dramatic singing was associated with greater SPL and greater vibrato amplitude and frequency as well as lower resonance frequencies. The higher SPL is an indication of voice source changes, and the lower resonance frequencies are probably caused by the lower larynx position. However, all these strategies showed a considerable individual variability. The singers' Fach might contribute to perceptual differences even for the same singer with regard to the respective repertoire.
|
4
|
The Effect of Singers' Masks on the Impulse Dispersion of Aerosols During Singing. J Voice 2024; 38:247.e1-247.e10. [PMID: 34610881 DOI: 10.1016/j.jvoice.2021.08.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/20/2021] [Revised: 08/24/2021] [Accepted: 08/26/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND During the COVID-19 pandemic, singing activities were restricted due to several super-spreading events that were observed during rehearsals and vocal performances. However, it has not been clarified how aerosol dispersion, which has been assumed to be the leading transmission factor, could be reduced by masks that are specially designed for singers. MATERIAL AND METHODS Twelve professional singers (10 of the Bavarian Radio-Chorus and two freelancers, seven females and five males) were asked to sing the melody of the Ode to Joy from Beethoven's 9th symphony, "Freude schöner Götterfunken, Tochter aus Elysium", in D major without masks and afterwards with five different singers' masks, all distinctive in their material and proportions. Every task was conducted after inhaling the basic liquid from an e-cigarette. The aerosol dispersion was recorded by three high-definition video cameras during and after the task. The cloud was segmented and the dispersion was analyzed for all three spatial dimensions. Further, the subjects were asked to rate the practicability of wearing the tested masks during singing activities using a questionnaire. RESULTS Concerning the median distances of dispersion, all masks were able to decrease the impulse dispersion of the aerosols to the front. In contrast, the dispersion to the sides and to the top was increased. The evaluation revealed that most of the subjects would reject performing a concert with any of the masks. CONCLUSION Although the results indicate that the tested masks could reduce the radius of expulsion for virus-laden aerosol particles, further improvements are necessary to enable practical implementation in professional singing.
|
5
|
Dynamic changes of vocal tract dimensions with sound pressure level during messa di voce. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2023; 154:3595-3603. [PMID: 38038612 DOI: 10.1121/10.0022582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 11/14/2023] [Indexed: 12/02/2023]
Abstract
The messa di voce (MdV), which consists of a continuous crescendo and subsequent decrescendo on one pitch is one of the more difficult exercises of the technical repertoire of Western classical singing. With rising lung pressure, regulatory adjustments both on the level of the glottis and the vocal tract are required to keep the pitch stable. The dynamic changes of vocal tract dimensions with the bidirectional variation of sound pressure level (SPL) during MdV were analyzed by two-dimensional real-time magnetic resonance imaging (25 frames/s) and synchronous audio recordings in 12 professional singer subjects. Close associations in the respective articulatory kinetics were found between SPL and lip opening, jaw opening, pharynx width, uvula elevation, and vertical larynx position. However, changes in vocal tract dimensions during plateaus of SPL suggest that perceived loudness could have been varied beyond the dimension of SPL. Further multimodal investigation, including the analysis of sound spectra, is needed for a better understanding of the role of vocal tract resonances in the control of vocal loudness in human phonation.
|
6
|
Relationship between epilarynx tube shape and the radiated sound pressure level during phonation is gender specific. LOGOP PHONIATR VOCO 2023; 48:44-56. [PMID: 34644212 DOI: 10.1080/14015439.2021.1988143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
OBJECTIVE/HYPOTHESIS The aim of the study was to measure the morphology of the epilaryngeal tube during sustained phonation as a function of loudness variation and to compare subjects of different genders. STUDY DESIGN This is a prospective study. METHODS Five female and five male classically trained singers were recorded by magnetic resonance imaging with simultaneous audio recordings while sustaining phonation at three different loudness conditions. Three-dimensional subsections of the vocal tract were segmented on multi-image-based cross-sections. Different volume and area measures were determined and their relation to sound pressure level and loudness condition was analyzed. RESULTS Male singers tended to narrow the epilaryngeal tube when increasing sound pressure level whereas female singers did not. CONCLUSION Strategies of vocal tract adjustments during loudness variation in classical singing appear to be gender specific.
|
7
|
Influence of Loudness on Vocal Stability in the Male Passaggio. J Voice 2023; 37:296.e1-296.e8. [PMID: 33455852 DOI: 10.1016/j.jvoice.2020.12.044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2020] [Revised: 12/11/2020] [Accepted: 12/15/2020] [Indexed: 11/20/2022]
Abstract
INTRODUCTION Vocal registers and the frequency region where registration events occur, the passaggio, have been a focus of scientific research for almost 200 years. In professional tenors, it has been shown before that singing across the passaggio avoiding a register shift and therefore using their stage voice above the passaggio (SVaP) is associated with greater vocal stability than a register change to the falsetto. However, it remains unclear how much different loudness conditions contribute to this vocal stability. MATERIAL AND METHODS Six professional tenors were asked to perform four pitch glides from A3 to A4 (220-440 Hz) on the vowel [i:]. These glides included (1) the passaggio from modal register to falsetto. The following glides into SVaP were performed under different loudness conditions, (2) mezzoforte (average loudness), (3) pianissimo (as quietly as possible), and (4) fortissimo (the loudest possible). During phonation, high speed videoendoscopy (HSV), electroglottography, and audio signals were recorded simultaneously. The glottal area waveform was derived based on the HSV material. RESULTS Modal to falsetto transitions were associated with relatively low sound pressure level and a rise of open quotients (OQ) for the falsetto. Transitions to SVaP showed a clear dependence on the intended loudness. The OQs were lower the louder the task was. There was no clear evidence that transitions with softer voice showed greater stability of vocal fold oscillation patterns than louder tasks. CONCLUSIONS The vocal fold oscillation patterns show differences among various loudness conditions within the tenors' passaggio but no clear differences with regard to oscillatory stability.
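The open quotient (OQ) referred to in this abstract is commonly understood as the fraction of a glottal cycle during which the glottis is open. The following is only a minimal sketch of that idea, assuming one cycle of glottal area waveform samples and a hypothetical `threshold` that separates open from closed frames; the function name, the threshold, and the sample values are illustrative and not taken from the study.

```python
def open_quotient(gaw_cycle, threshold=0.0):
    """Return the fraction of samples in one glottal cycle with area above threshold.

    gaw_cycle: glottal area values for exactly one oscillation cycle (assumed input).
    threshold: hypothetical area value above which the glottis counts as open.
    """
    if not gaw_cycle:
        raise ValueError("gaw_cycle must contain at least one sample")
    open_samples = sum(1 for area in gaw_cycle if area > threshold)
    return open_samples / len(gaw_cycle)


# Hypothetical cycle of 10 frames in which the glottis is open in 6 frames.
example_cycle = [0, 0, 1, 3, 5, 4, 2, 1, 0, 0]
print(open_quotient(example_cycle))
```

With real high-speed recordings the cycle boundaries and the open/closed threshold would have to be chosen carefully, which this sketch does not attempt.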
|
8
|
Impact of Instructed Laryngeal Manipulation on Acoustic Measures of Voice-Preliminary Results. J Voice 2023; 37:143.e1-143.e11. [PMID: 33288382 DOI: 10.1016/j.jvoice.2020.11.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/17/2020] [Revised: 11/10/2020] [Accepted: 11/11/2020] [Indexed: 01/11/2023]
Abstract
BACKGROUND Control of laryngeal muscles is required to manipulate pitch, volume, and voice quality. False vocal fold activity (FVFA) refers to the constriction and release of constriction of the false vocal folds. True vocal fold mass (TVFM) represents the cross-sectional thickness of the vocal folds. Larynx height (LH) refers to the vertical position of the larynx in the neck. To date, studies of voice control have examined the effects of these parameters separately. No study has investigated the impact of instructed systematic manipulation of these parameters on acoustic voice measures in vocally healthy trained subjects. AIMS This study examined the effects of systematically manipulating FVFA, TVFM, and LH on several acoustic voice measures. METHOD Twelve vocally trained speakers were instructed to use specific techniques to achieve experimental conditions of constriction and release of constriction of FVFA, thicker and thinner TVFM, and normal and low LH. Each condition was implemented in combination with manipulating the other parameters. Voice recordings of sustained vowel /a/ and Rainbow Passage were obtained for all laryngeal manipulation conditions and underwent acoustic analyses for fundamental frequency (F0), signal typing, harmonics-to-noise ratio (HNR), cepstral peak prominence (CPP), and vocal relative intensity. RESULTS Constricted FVFA caused more aperiodicity in the signals, lower CPP, and lower vocal relative intensity than release of constriction. Thicker TVFM resulted in significantly higher CPP and vocal relative intensity than thinner TVFM. Modifying TVFM did not affect F0 and HNR. Low LH had significantly lower F0 but did not impact on HNR, CPP, and intensity. CONCLUSIONS The effects of systematic manipulation of each laryngeal parameter resulted in independent acoustic effects without measurable interaction. Release of constriction of FVFA, thicker TVFM, and low LH were configurations that resulted in more optimal acoustic signals.
|
9
|
High-Resolution Three-Dimensional Hybrid MRI + Low Dose CT Vocal Tract Modeling: A Cadaveric Pilot Study. J Voice 2022. [DOI: 10.1016/j.jvoice.2022.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
|
10
|
Characterizing respiratory aerosol emissions during sustained phonation. JOURNAL OF EXPOSURE SCIENCE & ENVIRONMENTAL EPIDEMIOLOGY 2022; 32:689-696. [PMID: 35351959 PMCID: PMC8963400 DOI: 10.1038/s41370-022-00430-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2021] [Revised: 03/11/2022] [Accepted: 03/15/2022] [Indexed: 06/10/2023]
Abstract
OBJECTIVE To elucidate the role of phonation frequency (i.e., pitch) and intensity of speech on respiratory aerosol emissions during sustained phonations. METHODS Respiratory aerosol emissions are measured in 40 (24 males and 16 females) healthy, non-trained singers phonating the phoneme /a/ at seven specific frequencies at varying vocal intensity levels. RESULTS Increasing frequency of phonation was positively correlated with particle production (r = 0.28, p < 0.001). Particle production rate was also positively correlated (r = 0.37, p < 0.001) with the vocal intensity of phonation, confirming previously reported findings. The primary mode (particle diameter ~0.6 μm) and width of the particle number size distribution were independent of frequency and vocal intensity. Regression models of the particle production rate using frequency, vocal intensity, and the individual subject as predictor variables only produced goodness of fit of adjusted R² = 40% (p < 0.001). Finally, it is proposed that superemitters be defined as statistical outliers, which resulted in the identification of one superemitter in the sample of 40 participants. SIGNIFICANCE The results suggest there remain unexplored effects (e.g., biomechanical, environmental, behavioral, etc.) that contribute to the high variability in respiratory particle production rates, which ranged from 0.2 particles/s to 142 particles/s across all trials. This is evidenced as well by changes in the distribution of participant particle production that transitions to a more bimodal distribution (second mode at particle diameter ~2 μm) at higher frequencies and vocal intensity levels.
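The abstract proposes defining superemitters as statistical outliers but does not state which outlier rule was used. As one possible illustration only, the sketch below applies Tukey's upper fence (third quartile plus 1.5 times the interquartile range) to a list of per-participant particle production rates. The rule, the function name `find_upper_outliers`, and the sample values are all assumptions, not the study's method or data.

```python
import statistics


def find_upper_outliers(rates, k=1.5):
    """Return values above Tukey's upper fence, Q3 + k * IQR.

    This rule is an assumed stand-in; the abstract does not name its criterion.
    """
    q1, _median, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return [rate for rate in rates if rate > upper_fence]


# Hypothetical particle production rates in particles/s (not data from the study).
example_rates = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(find_upper_outliers(example_rates))
```

Only the upper fence is checked because a superemitter is, by definition, an unusually high emitter.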
|
11
|
Sub-millisecond 2D MRI of the vocal fold oscillation using single-point imaging with rapid encoding. MAGNETIC RESONANCE MATERIALS IN PHYSICS BIOLOGY AND MEDICINE 2021; 35:301-310. [PMID: 34542771 PMCID: PMC8995286 DOI: 10.1007/s10334-021-00959-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2021] [Revised: 08/06/2021] [Accepted: 09/06/2021] [Indexed: 10/24/2022]
Abstract
OBJECTIVE The slow spatial encoding of MRI has precluded its application to rapid physiologic motion in the past. The purpose of this study is to introduce a new fast acquisition method and to demonstrate the feasibility of encoding rapid two-dimensional motion of human vocal folds with sub-millisecond resolution. METHOD In our previous work, we achieved high temporal resolution by applying a rapidly switched phase encoding gradient along the direction of motion. In this work, we extend phase encoding to the second image direction by using single-point imaging with rapid encoding (SPIRE) to image the two-dimensional vocal fold oscillation in the coronal view. Image data were gated using electroglottography (EGG) and motion corrected. An iterative reconstruction with a total variation (TV) constraint was used and the sequence was also simulated using a motion phantom. RESULTS Dynamic images of the vocal folds during phonation at pitches of 150 and 165 Hz were acquired in two volunteers and the periodic motion of the vocal folds at a temporal resolution of about 600 µs was shown. The simulations emphasize the necessity of SPIRE for two-dimensional motion encoding. DISCUSSION SPIRE is a new MRI method to image rapidly oscillating structures and for the first time provides dynamic images of vocal fold oscillations in the coronal plane.
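To put the reported figures in relation, the short sketch below estimates how many time frames fit into one vocal fold oscillation cycle at the stated pitches (150 and 165 Hz), taking the approximate 600 µs temporal resolution at face value. The function name is illustrative, and the calculation is simple arithmetic rather than part of the published method.

```python
def frames_per_cycle(pitch_hz, temporal_resolution_s):
    """Number of time frames covering one oscillation period at the given pitch."""
    period_s = 1.0 / pitch_hz
    return period_s / temporal_resolution_s


# Pitches and temporal resolution as quoted in the abstract.
for pitch in (150, 165):
    print(pitch, "Hz:", round(frames_per_cycle(pitch, 600e-6), 1), "frames per cycle")
```

Under these numbers each cycle is resolved into roughly ten to eleven frames.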
|
12
|
Characterizing Vocal Tract Dimensions in the Vocal Modes Using Magnetic Resonance Imaging. J Voice 2021; 35:804.e27-804.e42. [DOI: 10.1016/j.jvoice.2020.01.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/04/2019] [Revised: 01/15/2020] [Accepted: 01/16/2020] [Indexed: 11/25/2022]
|
13
|
Dark tone quality and vocal tract shaping in soprano song production: Insights from real-time MRI. JASA EXPRESS LETTERS 2021; 1:075202. [PMID: 34291230 PMCID: PMC8273971 DOI: 10.1121/10.0005109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Accepted: 05/10/2021] [Indexed: 06/13/2023]
Abstract
Tone quality termed "dark" is an aesthetically important property of Western classical voice performance and has been associated with lowered formant frequencies, lowered larynx, and widened pharynx. The present study uses real-time magnetic resonance imaging with synchronous audio recordings to investigate dark tone quality in four professionally trained sopranos with enhanced ecological validity and a relatively complete view of the vocal tract. Findings differ from traditional accounts, indicating that labial narrowing may be the primary driver of dark tone quality across performers, while many other aspects of vocal tract shaping are shown to differ significantly in a performer-specific way.
|
14
|
|
15
|
Vocal fold oscillation pattern changes related to loudness in patients with vocal fold mass lesions. J Otolaryngol Head Neck Surg 2020; 49:80. [PMID: 33228812 PMCID: PMC7686765 DOI: 10.1186/s40463-020-00481-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/19/2020] [Accepted: 11/17/2020] [Indexed: 11/10/2022] Open
Abstract
Introduction Vocal fold mass lesions can affect vocal fold oscillation patterns and therefore voice production. It has been previously observed that perturbation values from audio signals were lower with increased loudness. However, how much the oscillation patterns change with gradual alteration of loudness is not yet fully understood. Material and methods Eight patients with vocal fold mass lesions were asked to perform a glide from minimum to maximum loudness on the vowel /i/ at an ƒo of 125 Hz for male or 250 Hz for female voices. During phonation the subjects were simultaneously recorded with transnasal high speed videoendoscopy (HSV, 20,000 fps), electroglottography (EGG), and an audio recording. Based on the HSV material the Glottal Area Waveform (GAW) was segmented and GAW parameters were computed. Results The greatest vocal fold irregularities were observed at different values between minimum and maximum sound pressure level. There was a relevant discrepancy between the HSV and EGG derived open quotients. Furthermore, the EGG derived sample entropy and GAW values also evidenced different behavior. Conclusions The amount of vocal fold irregularity changes with varying loudness. Therefore, any evaluation of the voice should be performed under different loudness conditions. The discrepancy between EGG and GAW values appears to be much stronger in patients with vocal fold mass lesions than in those with normal physiological conditions. Level of evidence 4.
|
16
|
Realistic Dynamic Numerical Phantom for MRI of the Upper Vocal Tract. J Imaging 2020; 6:86. [PMID: 34460743 PMCID: PMC8320850 DOI: 10.3390/jimaging6090086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2020] [Revised: 08/08/2020] [Accepted: 08/24/2020] [Indexed: 11/16/2022] Open
Abstract
Dynamic and real-time MRI (rtMRI) of human speech is an active field of research, with interest from both the linguistics and clinical communities. At present, different research groups are investigating a range of rtMRI acquisition and reconstruction approaches to visualise the speech organs. Similar to other moving organs, it is difficult to create a physical phantom of the speech organs to optimise these approaches; therefore, the optimisation requires extensive scanner access and imaging of volunteers. As previously demonstrated in cardiac imaging, realistic numerical phantoms can be useful tools for optimising rtMRI approaches and reduce reliance on scanner access and imaging volunteers. However, currently, no such speech rtMRI phantom exists. In this work, a numerical phantom for optimising speech rtMRI approaches was developed and tested on different reconstruction schemes. The novel phantom comprised a dynamic image series and corresponding k-space data of a single mid-sagittal slice with a temporal resolution of 30 frames per second (fps). The phantom was developed based on images of a volunteer acquired at a frame rate of 10 fps. The creation of the numerical phantom involved the following steps: image acquisition, image enhancement, segmentation, mask optimisation, through-time and spatial interpolation and finally the derived k-space phantom. The phantom was used to: (1) test different k-space sampling schemes (Cartesian, radial and spiral); (2) create lower frame rate acquisitions by simulating segmented k-space acquisitions; (3) simulate parallel imaging reconstructions (SENSE and GRAPPA). This demonstrated how such a numerical phantom could be used to optimise images and test multiple sampling strategies without extensive scanner access.
|
17
|
Amplitude Effects of Vocal Tract Resonance Adjustments When Singing Louder. J Voice 2020; 36:292.e11-292.e22. [PMID: 32624371 DOI: 10.1016/j.jvoice.2020.05.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/25/2020] [Revised: 05/20/2020] [Accepted: 05/26/2020] [Indexed: 10/23/2022]
Abstract
The literature on vocal pedagogy contains suggestions to increase the mouth opening when singing louder. It is known that sopranos tend to sing loud high notes with a wider mouth opening, which raises the frequency of the first resonance of the vocal tract (fR1) to tune it close to the fundamental. Our experiment with classically trained male singers revealed that they also tended to raise the fR1 with the dynamics at pitches where the formant tuning does not seem relevant. The analysis by synthesis showed that such behaviour may contribute to the strengthening of the singer's formant by several dB and to a rise in the centre of spectral gravity. The contribution of the fR1 raising to the overall sound level was less consistent. Changing the extent of the mouth opening with the dynamics may create several simultaneous semantic cues that signal how prominent the produced sound is and how great the physical effort by the singer is. Reducing the mouth opening when singing piano may also be important, as it helps singers to produce a quieter sound by increasing the distance between the fR1 and higher resonances, which lowers the transfer function of the vocal tract at the relevant spectral regions.
|
18
|
|
19
|
Multidimensional Timbre Spaces of Cochlear Implant Vocoded and Non-vocoded Synthetic Female Singing Voices. Front Neurosci 2020; 14:307. [PMID: 32372904 PMCID: PMC7179674 DOI: 10.3389/fnins.2020.00307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2019] [Accepted: 03/16/2020] [Indexed: 12/04/2022] Open
Abstract
Many post-lingually deafened cochlear implant (CI) users report that they no longer enjoy listening to music, which could possibly contribute to a perceived reduction in quality of life. One aspect of music perception, vocal timbre perception, may be difficult for CI users because they may not be able to use the same timbral cues available to normal hearing listeners. Vocal tract resonance frequencies have been shown to provide perceptual cues to voice categories such as baritone, tenor, mezzo-soprano, and soprano, while changes in glottal source spectral slope are believed to be related to perception of vocal quality dimensions such as fluty vs. brassy. As a first step toward understanding vocal timbre perception in CI users, we employed an 8-channel noise-band vocoder to test how vocoding can alter the timbral perception of female synthetic sung vowels across pitches. Non-vocoded and vocoded stimuli were synthesized with vibrato using 3 excitation source spectral slopes and 3 vocal tract transfer functions (mezzo-soprano, intermediate, soprano) at the pitches C4, B4, and F5. Six multi-dimensional scaling experiments were conducted: C4 not vocoded, C4 vocoded, B4 not vocoded, B4 vocoded, F5 not vocoded, and F5 vocoded. At the pitch C4, for both non-vocoded and vocoded conditions, dimension 1 grouped stimuli according to voice category and was most strongly predicted by spectral centroid from 0 to 2 kHz. While dimension 2 grouped stimuli according to excitation source spectral slope, it was organized slightly differently and predicted by different acoustic parameters in the non-vocoded and vocoded conditions. For pitches B4 and F5 spectral centroid from 0 to 2 kHz most strongly predicted dimension 1. However, while dimension 1 separated all 3 voice categories in the vocoded condition, dimension 1 only separated the soprano stimuli from the intermediate and mezzo-soprano stimuli in the non-vocoded condition. 
While it is unclear how these results predict timbre perception in CI listeners, in general, these results suggest that perhaps some aspects of vocal timbre may remain.
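The acoustic predictor named in this abstract, the spectral centroid from 0 to 2 kHz, can be read as the magnitude-weighted mean frequency of the spectrum within that band. The sketch below shows one way to compute it from a list of frequency bins and magnitudes. The function name, the use of magnitude rather than power weighting, and the sample spectrum are assumptions for illustration and are not drawn from the paper.

```python
def band_spectral_centroid(freqs_hz, magnitudes, f_lo=0.0, f_hi=2000.0):
    """Magnitude-weighted mean frequency of the bins lying within [f_lo, f_hi]."""
    in_band = [(f, m) for f, m in zip(freqs_hz, magnitudes) if f_lo <= f <= f_hi]
    total_magnitude = sum(m for _f, m in in_band)
    if total_magnitude == 0:
        raise ValueError("no spectral energy inside the requested band")
    return sum(f * m for f, m in in_band) / total_magnitude


# Hypothetical spectrum: the 2500 Hz bin lies outside the band and is ignored.
example_freqs = [500, 1000, 1500, 2500]
example_mags = [1.0, 2.0, 1.0, 5.0]
print(band_spectral_centroid(example_freqs, example_mags))
```

Restricting the band matters here: the strong component at 2500 Hz would otherwise pull the centroid well above 1 kHz.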
|
20
|
Automatic vocal tract landmark localization from midsagittal MRI data. Sci Rep 2020; 10:1468. [PMID: 32001739 PMCID: PMC6992757 DOI: 10.1038/s41598-020-58103-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/03/2019] [Accepted: 01/09/2020] [Indexed: 11/29/2022] Open
Abstract
The various speech sounds of a language are obtained by varying the shape and position of the articulators surrounding the vocal tract. Analyzing their variations is crucial for understanding speech production, diagnosing speech disorders and planning therapy. Identifying key anatomical landmarks of these structures on medical images is a pre-requisite for any quantitative analysis and the rising amount of data generated in the field calls for an automatic solution. The challenge lies in the high inter- and intra-speaker variability, the mutual interaction between the articulators and the moderate quality of the images. This study addresses this issue for the first time and tackles it by means of Deep Learning. It proposes a dedicated network architecture named Flat-net, whose performance is evaluated and compared with eleven state-of-the-art methods from the literature. The dataset contains midsagittal anatomical Magnetic Resonance Images for 9 speakers sustaining 62 articulations with 21 annotated anatomical landmarks per image. Results show that the Flat-net approach outperforms the former methods, leading to an overall Root Mean Square Error of 3.6 pixels/0.36 cm obtained in a leave-one-out procedure over the speakers. The implementation codes are also shared publicly on GitHub.
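The reported error of 3.6 pixels corresponding to 0.36 cm implies a pixel spacing of about 0.1 cm. The sketch below shows how a root mean square error over landmark positions might be computed and converted to centimetres under that spacing. It assumes the error is taken over Euclidean distances between predicted and annotated points; the function name and the sample coordinates are hypothetical, and the paper's exact definition may differ.

```python
import math

# Pixel spacing implied by the abstract's figures (0.36 cm / 3.6 pixels).
PIXEL_SPACING_CM = 0.1


def landmark_rmse(predicted, annotated):
    """Root mean square of Euclidean distances between paired (x, y) landmarks."""
    if not predicted or len(predicted) != len(annotated):
        raise ValueError("landmark lists must be non-empty and of equal length")
    squared_distances = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, annotated)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical landmarks in pixel coordinates (not data from the study).
example_predicted = [(0, 0), (3, 4)]
example_annotated = [(0, 0), (0, 0)]
rmse_px = landmark_rmse(example_predicted, example_annotated)
print(rmse_px, "pixels =", rmse_px * PIXEL_SPACING_CM, "cm")
```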
|
21
|
Influence of Voice Focus Adjustments on Oral-Nasal Balance in Speech and Song. Folia Phoniatr Logop 2019; 72:351-362. [DOI: 10.1159/000501908] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/17/2019] [Accepted: 07/04/2019] [Indexed: 11/19/2022]
|
22
|
Sexual Dimorphism in Laryngeal Volumetric Measurements Using Magnetic Resonance Imaging. Ear, Nose & Throat Journal 2019; 99:132-136. [PMID: 31018691 DOI: 10.1177/0145561319840568] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
Abstract
The objective of this study is to investigate the dimensional and volumetric measurements of the thyroarytenoid (TA) muscle in men and women using magnetic resonance imaging (MRI). The hypothesis is that there is a gender-related difference in these measurements. A retrospective chart review of 76 patients who underwent MRI of the neck at the American University of Beirut Medical Center was conducted. The dimensions and volume of the right and left TA muscle were measured on axial and coronal short tau inversion recovery images. Male and female groups were compared with respect to demographic data and MRI findings using parametric and nonparametric tests. The mean length of the thyroarytenoid muscle in males was larger than that in females on the right (males 2.44 [0.29] cm vs females 1.70 [0.22] cm) and on the left (males 2.50 [0.28] cm vs females 1.72 [0.24] cm), reaching statistical significance (P < .001). The mean width of the thyroarytenoid muscle in males was larger than that in females on the right (males 0.68 [0.13] cm vs females 0.59 [0.11] cm) and on the left (males 0.68 [0.12] cm vs females 0.57 [0.12] cm), reaching statistical significance (P < .001). The mean height of the thyroarytenoid muscle in males was larger than that in females on the right (males 1.05 [0.21] cm vs females 0.95 [0.12] cm) and on the left (males 1.05 [0.21] cm vs females 0.95 [0.12] cm), reaching statistical significance (P < .01 on the right and P < .05 on the left). The volume of the thyroarytenoid muscle in males was larger than that in females on the right (males 0.86 [0.25] mL vs females 0.48 [0.15] mL) and on the left (males 0.89 [0.27] mL vs females 0.48 [0.17] mL), reaching statistical significance (P < .001). The results of this investigation clearly indicate a significant difference in these measurements between men and women.
|
23
|
|
24
|
The Vocal Tract Organ: A New Musical Instrument Using 3-D Printed Vocal Tracts. J Voice 2018; 32:660-667. [DOI: 10.1016/j.jvoice.2017.09.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/26/2017] [Revised: 09/18/2017] [Accepted: 09/20/2017] [Indexed: 10/18/2022]
|
25
|
Abstract
At first glance, the monkey brain looks like a smaller version of the human brain. Indeed, the anatomical and functional architecture of the cortical auditory system in monkeys is very similar to that of humans, with dual pathways segregated into a ventral and a dorsal processing stream. Yet, monkeys do not speak. Repeated attempts to pin this inability on one particular cause have failed. A closer look at the necessary components of language, according to Darwin, reveals that all of them got a significant boost during evolution from nonhuman to human primates. The vocal-articulatory system, in particular, has developed into the most sophisticated of all human sensorimotor systems with about a dozen effectors that, in combination with each other, result in an auditory communication system like no other. This sensorimotor network possesses all the ingredients of an internal model system that permits the emergence of sequence processing, as required for phonology and syntax in modern languages.
|
26
|
Vocal Tract Morphology in Inhaling Singing: Characteristics During Vowel Production-A Case Study in a Professional Singer. J Voice 2017; 32:643.e17-643.e23. [PMID: 28886973 DOI: 10.1016/j.jvoice.2017.08.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/09/2017] [Revised: 08/01/2017] [Accepted: 08/01/2017] [Indexed: 11/16/2022]
Abstract
BACKGROUND A professional singer produced various vowels at a comfortable loudness and pitch using both inspiratory and expiratory phonation. The present study investigates the morphological differences and tries to find a link with the acoustical characteristics. OBJECTIVES/HYPOTHESIS We hypothesize that features consistently present across all vowels characterize inhaling phonation and that the formant frequencies reflect the morphological findings. STUDY DESIGN A prospective case study was carried out. METHODS A female singer uttered the vowels /a/, /e/, /i/, /o/, and /u/ in a supine position under magnetic resonance imaging, at a comfortable loudness and pitch, in both an inhaling and an exhaling manner. The exact same parameters as in previous reports were measured (1-3). Acoustical analysis was performed with Praat. RESULTS Wilcoxon directional testing demonstrates a statistically significant difference in (1) the distance between the lips, (2) the antero-posterior tongue diameter, (3) the distance between the lips and the tip of the tongue, (4) the distance between the epiglottis and the posterior pharyngeal wall, (5) the narrowing of the subglottic space, and (6) the oropharyngeal and the hypopharyngeal areas. Acoustical analysis reveals slightly more noise and irregularity during reverse phonation. The central frequency of F0 and F1 is identical, whereas that of F2 and F3 increases, and that of F4 varies. CONCLUSIONS A smaller mouth opening, a narrowing of the subglottic space, a larger supralaryngeal inlet, and a smaller antero-posterior tongue diameter can be considered morphological characteristics of reverse phonation. Acoustically, reverse phonation contains slightly more noise and perturbation. The formant frequency distribution concurs with a mouth narrowing and pharyngeal widening during inhaling.
|
27
|
Three-dimensional Vocal Tract Morphology Based on Multiple Magnetic Resonance Images Is Highly Reproducible During Sustained Phonation. J Voice 2017; 31:504.e11-504.e20. [DOI: 10.1016/j.jvoice.2016.11.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 09/16/2016] [Revised: 11/09/2016] [Accepted: 11/10/2016] [Indexed: 11/21/2022]
|
28
|
Test-retest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging. The Journal of the Acoustical Society of America 2017; 141:3323. [PMID: 28599561 PMCID: PMC5436977 DOI: 10.1121/1.4983081] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 05/06/2023]
Abstract
Static anatomical and real-time dynamic magnetic resonance imaging (RT-MRI) of the upper airway is a valuable method for studying speech production in research and clinical settings. The test-retest repeatability of quantitative imaging biomarkers is an important parameter, since it limits the effect sizes and intragroup differences that can be studied. Therefore, this study aims to present a framework for determining the test-retest repeatability of quantitative speech biomarkers from static MRI and RT-MRI, and apply the framework to healthy volunteers. Subjects (n = 8, 4 females, 4 males) are imaged in two scans on the same day, including static images and dynamic RT-MRI of speech tasks. The inter-study agreement is quantified using intraclass correlation coefficient (ICC) and mean within-subject standard deviation (σe). Inter-study agreement is strong to very strong for static measures (ICC: min/median/max 0.71/0.89/0.98, σe: 0.90/2.20/6.72 mm), poor to strong for dynamic RT-MRI measures of articulator motion range (ICC: 0.26/0.75/0.90, σe: 1.6/2.5/3.6 mm), and poor to very strong for velocities (ICC: 0.21/0.56/0.93, σe: 2.2/4.4/16.7 cm/s). In conclusion, this study characterizes repeatability of static and dynamic MRI-derived speech biomarkers using state-of-the-art imaging. The introduced framework can be used to guide future development of speech biomarkers. Test-retest MRI data are provided free for research use.
|
29
|
Abstract
Previous research involving preschool children and adults suggests that moving in synchrony with others can foster cooperation. Song provides a rich oscillatory framework that supports synchronous movement and may thus be considered a powerful agent of positive social relations. In the current study, we assessed this hypothesis in a group of primary-school-aged children with diverse ethnic and socioeconomic backgrounds. Children participated in one of three activity conditions: group singing, group art, or competitive games. They were then asked to play a prisoner’s dilemma game as a measure of cooperation. Results showed that children who engaged in group singing were more cooperative than children who engaged in group art or competitive games.
|