1
Funk R, Weirich M, Simpson AP. The Effect of Fundamental Frequency on Gender Perception in Prepubertal Children: Insights from the LoKiS Database. J Voice 2024:S0892-1997(24)00129-2. [PMID: 38704276 DOI: 10.1016/j.jvoice.2024.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/05/2024] [Accepted: 04/05/2024] [Indexed: 05/06/2024]
Abstract
This study examines the impact of fundamental frequency on gender perception in prepubertal children in the LoKiS database, a longitudinal project collecting and analyzing recordings of approximately 60 German primary school children aged 6 to 10 years. Spontaneous and content-controlled audio recordings were collected in two German primary schools. Three distinct listening experiments with over 100 listeners were conducted. In the first experiment, listeners judged the gender of the voices on a seven-point scale. The second experiment explored the relationships between perceptual attribute ratings and corresponding acoustic parameters associated with fundamental frequency. The third experiment utilized voice morphing techniques to investigate the influence of fundamental frequency on gender perception while controlling for other acoustic parameters. About one-third of the children received unambiguous gender attributions. The perceived gender difference between children assigned female at birth (AFAB) and assigned male at birth (AMAB) increased from first to third grade. The feminine-sounding children were perceived as significantly higher-pitched and more melodious. A strong correlation between perceived pitch and measured fundamental frequency was found. While the acoustic analysis revealed only a few significant differences between AFAB and AMAB children in general, the feminine-sounding children exhibited markedly higher values than the masculine-sounding ones. Differences in fundamental frequency and semitone range became stronger as AFAB and AMAB children got older. Linear mixed models confirmed a significant influence of fundamental frequency and semitone range on gender perception. Other interacting factors included the speech material used, as well as the gender of the listener. The influence of fundamental frequency was even more pronounced when controlling for other acoustic parameters.
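The semitone range named in this abstract is a logarithmic measure of how wide a speaker's fundamental frequency span is. As a minimal sketch, the conversion from hertz could look like the following; the function names, the 100 Hz reference, and the example values are illustrative assumptions and are not taken from the LoKiS study.

```python
import math


def semitone_range(f0_min_hz: float, f0_max_hz: float) -> float:
    """Return the span between two f0 values in semitones (12 per octave)."""
    if f0_min_hz <= 0 or f0_max_hz <= 0:
        raise ValueError("f0 values must be positive")
    return 12.0 * math.log2(f0_max_hz / f0_min_hz)


def hz_to_semitones(f0_hz: float, ref_hz: float = 100.0) -> float:
    """Express f0 in semitones relative to a reference (100 Hz is an assumed choice)."""
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


if __name__ == "__main__":
    # Hypothetical speaker whose f0 moves between 220 Hz and 330 Hz.
    print(round(semitone_range(220.0, 330.0), 2))
```

Because the scale is logarithmic, the same semitone range corresponds to a larger span in hertz for a higher-pitched voice, which is one reason to compare children's ranges in semitones rather than in hertz.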
2
Haas E, Ziegler W, Schölderle T. Age Estimation and Gender Attribution in Typically Developing Children and Children With Dysarthria. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2024; 33:1236-1253. [PMID: 38416062 DOI: 10.1044/2023_ajslp-23-00246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
PURPOSE The purposes of this study were (a) to investigate adult listeners' perceptions of age and gender in typically developing children and children with dysarthria and (b) to identify predictors of their estimates among auditory-perceptual parameters and an acoustic measure of vocal pitch (F0). We aimed to evaluate the influence of dysarthria on the listeners' impressions of age and gender against the background of typical developmental processes. METHOD In a listening experiment, adult listeners completed age and gender estimates of 144 typically developing children (3-9 years of age) and 25 children with dysarthria (5-9 years of age). The Bogenhausen Dysarthria Scales for Childhood Dysarthria (BoDyS-KiD) were applied to record speech samples and to complete auditory-perceptual judgments covering all speech subsystems. Furthermore, each child's mean F0 was determined from samples of four BoDyS-KiD sentences. RESULTS Age estimates for the typically developing children showed a regression to the mean, whereas children with dysarthria were systematically underestimated in their age. The estimates of all children were predicted by developmental speech features; for the children with dysarthria, specific dysarthria symptoms had an additional effect. We found a significantly higher accuracy of gender attribution in the typically developing children than in the children with dysarthria. The prediction accuracy of the listeners' gender attribution in the preadolescent children by the included speech characteristics was limited. CONCLUSIONS Children with dysarthria are more difficult to estimate for their age and gender than their typically developing peers. Dysarthria thus alters the auditory-perceptual impression of indexical speech features in children, which must be considered another facet of the communication disorder associated with childhood dysarthria.
Affiliation(s)
- Elisabet Haas
  - Clinical Neuropsychology Research Group, Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Germany
- Wolfram Ziegler
  - Clinical Neuropsychology Research Group, Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Germany
- Theresa Schölderle
  - Clinical Neuropsychology Research Group, Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Germany
3
Li Z, Zhang D, Chen H, Liu Y, Wang HC. Voice Pitch Shaping and Genderization: New Needs of Cosmetic Phonoplastic Surgery. Aesthetic Plast Surg 2024:10.1007/s00266-024-03919-0. [PMID: 38565723 DOI: 10.1007/s00266-024-03919-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 02/08/2024] [Indexed: 04/04/2024]
Abstract
Voices can convey content and emotion, as well as essential gender and social information about an individual. Closely related to gender identification and sexual attraction, voices also positively affect many psychological factors of individuals. Surgeries have evolved from treating congenital diseases to fulfilling an individual's aesthetic needs for voice. Voice shaping is emerging as the next cosmetic surgery hotspot after skincare and appearance and body shaping. This paper summarizes the development of voice pitch shaping and genderization procedures driven by cosmetic need. LEVEL OF EVIDENCE IV.
Affiliation(s)
- Zhijin Li
  - Department of Plastic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Dingyue Zhang
  - Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Hongsai Chen
  - Department of Otorhinolaryngology, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ying Liu
  - Department of Plastic and Reconstructive Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, No. 639 of Zhizaoju Road, District Huangpu, Shanghai, 200011, China
- Hayson Chenyu Wang
  - Department of Plastic and Reconstructive Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, No. 639 of Zhizaoju Road, District Huangpu, Shanghai, 200011, China
4
Anikin A, Barreda S, Reby D. A practical guide to calculating vocal tract length and scale-invariant formant patterns. Behav Res Methods 2023:10.3758/s13428-023-02288-x. [PMID: 38158551 DOI: 10.3758/s13428-023-02288-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/02/2023] [Indexed: 01/03/2024]
Abstract
Formants (vocal tract resonances) are increasingly analyzed not only by phoneticians in speech but also by behavioral scientists studying diverse phenomena such as acoustic size exaggeration and articulatory abilities of non-human animals. This often involves estimating vocal tract length acoustically and producing scale-invariant representations of formant patterns. We present a theoretical framework and practical tools for carrying out this work, including open-source software solutions included in R packages soundgen and phonTools. Automatic formant measurement with linear predictive coding is error-prone, but formant_app provides an integrated environment for formant annotation and correction with visual and auditory feedback. Once measured, formants can be normalized using a single recording (intrinsic methods) or multiple recordings from the same individual (extrinsic methods). Intrinsic speaker normalization can be as simple as taking formant ratios and calculating the geometric mean as a measure of overall scale. The regression method implemented in the function estimateVTL calculates the apparent vocal tract length assuming a single-tube model, while its residuals provide a scale-invariant vowel space based on how far each formant deviates from equal spacing (the schwa function). Extrinsic speaker normalization provides more accurate estimates of speaker- and vowel-specific scale factors by pooling information across recordings with simple averaging or mixed models, which we illustrate with example datasets and R code. The take-home messages are to record several calls or vowels per individual, measure at least three or four formants, check formant measurements manually, treat uncertain values as missing, and use the statistical tools best suited to each modeling context.
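The single-tube regression described in this abstract can be illustrated with a short sketch. It assumes a uniform tube closed at one end, so that the n-th formant is expected at (2n - 1)/2 times a constant formant spacing, and fits that spacing by least squares through the origin. This is a Python illustration of the general idea, not the code of `estimateVTL` in soundgen; the function names, the speed-of-sound constant, and the choice to express residuals as relative deviations are assumptions made here.

```python
from statistics import geometric_mean

# Assumed speed of sound in the vocal tract, in cm/s.
SPEED_OF_SOUND_CM_S = 35400.0


def formant_spacing(formants_hz):
    """Least-squares slope through the origin of F_n against (2n - 1) / 2."""
    if not formants_hz:
        raise ValueError("at least one formant is required")
    xs = [(2 * n - 1) / 2 for n in range(1, len(formants_hz) + 1)]
    numerator = sum(x * f for x, f in zip(xs, formants_hz))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def apparent_vtl_cm(formants_hz, speed=SPEED_OF_SOUND_CM_S):
    """Apparent vocal tract length: VTL = c / (2 * formant spacing)."""
    return speed / (2.0 * formant_spacing(formants_hz))


def schwa_deviations(formants_hz):
    """Relative deviation of each formant from the equally spaced pattern."""
    spacing = formant_spacing(formants_hz)
    return [
        f / (spacing * (2 * n - 1) / 2) - 1.0
        for n, f in enumerate(formants_hz, start=1)
    ]


def overall_scale(formants_hz):
    """Geometric mean of the formants as a simple intrinsic scale measure."""
    return geometric_mean(formants_hz)


if __name__ == "__main__":
    neutral_vowel = [500.0, 1500.0, 2500.0]  # hypothetical formant values in Hz
    print(apparent_vtl_cm(neutral_vowel))
    print(schwa_deviations(neutral_vowel))
```

With perfectly evenly spaced formants the deviations are all zero, so any non-zero pattern reflects vowel quality rather than overall scale. The sketch expects a complete list of formants; handling missing values, as the abstract recommends for uncertain measurements, would need extra logic.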
Affiliation(s)
- Andrey Anikin
  - Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, SE-221 00, Lund, Sweden
  - ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint Étienne, 42023, St-Étienne, France
- Santiago Barreda
  - Department of Linguistics, University of California, Davis, Davis, CA, USA
- David Reby
  - ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint Étienne, 42023, St-Étienne, France
  - Institut Universitaire de France, 75005, Paris, France
5
Houle N, Lerario MP, Levi SV. Spectral analysis of strident fricatives in cisgender and transfeminine speakers. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2023; 154:3089-3100. [PMID: 37962405 PMCID: PMC10651311 DOI: 10.1121/10.0022387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/24/2023] [Accepted: 10/10/2023] [Indexed: 11/15/2023]
Abstract
The spectral features of /s/ and /ʃ/ carry important sociophonetic information regarding a speaker's gender. Often, gender is misclassified as a binary of male or female, but this excludes people who may identify as transgender or nonbinary. In this study, we use a more expansive definition of gender to investigate the acoustics (duration and spectral moments) of /s/ and /ʃ/ across cisgender men, cisgender women, and transfeminine speakers in voiced and whispered speech and the relationship between spectral measures and transfeminine gender expression. We examined /s/ and /ʃ/ productions in words from 35 speakers (11 cisgender men, 17 cisgender women, 7 transfeminine speakers) and 34 speakers (11 cisgender men, 15 cisgender women, 8 transfeminine speakers), respectively. In general, /s/ and /ʃ/ center of gravity was highest in productions by cisgender women, followed by transfeminine speakers, and then cisgender men speakers. There were no other gender-related differences. Within transfeminine speakers, /s/ and /ʃ/ center of gravity and skewness were not related to the time proportion expressing their feminine spectrum gender or their Trans Women Voice Questionnaire scores. Taken together, the acoustics of /s/ and /ʃ/ may signal gender group identification but may not account for within-gender variation in transfeminine gender expression.
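The spectral moments mentioned in this abstract (center of gravity and skewness) treat a fricative's spectrum as a distribution over frequency. The following minimal sketch computes them from a list of frequency bins and matching spectral weights. Weighting by power and the function name are assumptions made for illustration; the study's actual analysis settings are not given in the abstract.

```python
def spectral_moments(freqs_hz, power):
    """Return (center of gravity, standard deviation, skewness) of a spectrum.

    freqs_hz: bin center frequencies in Hz.
    power: non-negative spectral weights, one per bin (assumed to be power).
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")

    # First moment: weighted mean frequency (center of gravity).
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total
    # Second and third central moments around the center of gravity.
    m2 = sum((f - cog) ** 2 * p for f, p in zip(freqs_hz, power)) / total
    m3 = sum((f - cog) ** 3 * p for f, p in zip(freqs_hz, power)) / total

    sd = m2 ** 0.5
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return cog, sd, skewness


if __name__ == "__main__":
    # Hypothetical three-bin spectrum with most energy in the lowest bin.
    print(spectral_moments([1000.0, 2000.0, 3000.0], [3.0, 1.0, 0.0]))
```

A higher center of gravity means energy is concentrated at higher frequencies, and positive skewness means the energy sits mostly below the mean with a tail toward higher frequencies.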
Affiliation(s)
- Nichole Houle
  - Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA
- Susannah V Levi
  - Department of Communicative Sciences and Disorders, New York University, New York, New York 10012, USA
6
Funk R, Simpson AP. The Acoustic and Perceptual Correlates of Gender in Children's Voices. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:3346-3363. [PMID: 37625149 DOI: 10.1044/2023_jslhr-22-00682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2023]
Abstract
PURPOSE This study investigates the perceptual and acoustic correlates of gender in prepubertal voices. The study is part of a longitudinal project analyzing recordings of circa 60 German primary school children from the first to fourth grades (6- to 10-year-olds). METHOD Spontaneous and content-controlled audio recordings were made of 62 first-grade children (29 girls, 33 boys; age: 6- to 7-year-olds) from two German primary schools. Information on gender conformity was also recorded. A total of 167 listeners judged the gender of the voices on a 7-point scale. The results of the listening experiments and gender conformity ratings were related to a range of typical acoustic parameters. RESULTS Measures of self-reported gender conformity differ significantly between the boys and the girls. Sixteen of the 62 children show unambiguous gender attributions in the listening experiment. A hierarchical cluster analysis including gender perception, gender conformity, and acoustic parameters shows four different types of speakers. Two multiple regression models revealed a significant main effect of fundamental frequency on the gender perception ratings of the listening experiment across and within gender. Significant correlations were found between the center of gravity and skewness of the sibilants and gender conformity, especially for the male speakers. CONCLUSIONS Fundamental frequency plays an important role in influencing perceptual judgments, whereas sibilant spectra are correlated with gender conformity. In further listening experiments, we will examine in more detail the role of individual acoustic parameters and analyze how the vocal expression of gender and gender conformity in individual children develops before reaching puberty.
Affiliation(s)
- Riccarda Funk
  - Institute for German Linguistics, Friedrich-Schiller University, Jena, Germany
- Adrian P Simpson
  - Institute for German Linguistics, Friedrich-Schiller University, Jena, Germany
7
Horses cross-modally recognize women and men. Sci Rep 2023; 13:3864. [PMID: 36890162 PMCID: PMC9995451 DOI: 10.1038/s41598-023-30830-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Accepted: 03/02/2023] [Indexed: 03/10/2023] Open
Abstract
Several studies have shown that horses have the ability to cross-modally recognize humans by associating their voice with their physical appearance. However, it remains unclear whether horses are able to differentiate humans according to different criteria, such as the fact that they are women or men. Horses might recognize some human characteristics, such as sex, and use these characteristics to classify them into different categories. The aim of this study was to explore whether domesticated horses are able to cross-modally recognize women and men according to visual and auditory cues, using a preferential looking paradigm. We simultaneously presented two videos of women and men's faces, while playing a recording of a human voice belonging to one of these two categories through a loudspeaker. The results showed that the horses looked significantly more towards the congruent video than towards the incongruent video, suggesting that they are able to associate women's voices with women's faces and men's voices with men's faces. Further investigation is necessary to determine the mechanism underlying this recognition, as it might be interesting to determine which characteristics horses use to categorize humans. These results suggest a novel perspective that could allow us to better understand how horses perceive humans.
8
Marchand Knight J, Sares AG, Deroche MLD. Visual biases in evaluation of speakers' and singers' voice type by cis and trans listeners. Front Psychol 2023; 14:1046672. [PMID: 37205083 PMCID: PMC10187036 DOI: 10.3389/fpsyg.2023.1046672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 03/29/2023] [Indexed: 05/21/2023] Open
Abstract
Introduction A singer's or speaker's Fach (voice type) should be appraised based on acoustic cues characterizing their voice. Instead, in practice, it is often influenced by the individual's physical appearance. This is especially distressing for transgender people who may be excluded from formal singing because of perceived mismatch between their voice and appearance. To eventually break down these visual biases, we need a better understanding of the conditions under which they occur. Specifically, we hypothesized that trans listeners (not actors) would be better able to resist such biases, relative to cis listeners, precisely because they would be more aware of appearance-voice dissociations. Methods In an online study, 85 cisgender and 81 transgender participants were presented with 18 different actors singing or speaking short sentences. These actors covered six voice categories from high/bright (traditionally feminine) to low/dark (traditionally masculine) voices: namely soprano, mezzo-soprano (referred to henceforth as mezzo), contralto (referred to henceforth as alto), tenor, baritone, and bass. Every participant provided voice type ratings for (1) Audio-only (A) stimuli to get an unbiased estimate of a given actor's voice type, (2) Video-only (V) stimuli to get an estimate of the strength of the bias itself, and (3) combined Audio-Visual (AV) stimuli to see how much visual cues would affect the evaluation of the audio. Results Results demonstrated that visual biases are not subtle and hold across the entire scale, shifting voice appraisal by about a third of the distance between adjacent voice types (for example, a third of the bass-to-baritone distance). This shift was 30% smaller for trans than for cis listeners, confirming our main hypothesis. This pattern was largely similar whether actors sang or spoke, though singing overall led to more feminine/high/bright ratings.
Conclusion This study is one of the first demonstrations that transgender listeners are in fact better judges of a singer's or speaker's voice type because they are better able to separate the actors' voice from their appearance, a finding that opens exciting avenues to fight more generally against implicit (or sometimes explicit) biases in voice appraisal.
9
Munson B, Lackas N, Koeppe K. Individual Differences in the Development of Gendered Speech in Preschool Children: Evidence From a Longitudinal Study. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:1311-1330. [PMID: 35240039 PMCID: PMC9499347 DOI: 10.1044/2021_jslhr-21-00465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2021] [Revised: 11/07/2021] [Accepted: 12/13/2021] [Indexed: 06/14/2023]
Abstract
PURPOSE We evaluated whether naive listeners' ratings of the gender typicality of the speech of children assigned male at birth (AMAB) and children assigned female at birth (AFAB) were different at two time points: one at which children were 2.5-3.5 years old and one when they were 4.5-5.5 years old. We also examined whether measures of speech, language, and inhibitory control predicted developmental changes in these ratings. METHOD A group of adults (N = 80) rated single-word productions of 55 AMAB and 55 AFAB children on a continuous scale from "definitely a boy" to "definitely a girl." Children's productions were taken from a previous longitudinal study of phonological development and vocabulary growth. As part of that study, children completed a battery of standardized and nonstandardized tests at both time points. RESULTS Listener ratings for AMAB and AFAB children were significantly different at both time points. The difference was larger at the later time point, and this was due entirely to changes in the ratings of AMAB children's speech. A measure of language production and a measure of inhibitory control predicted developmental changes in these ratings, albeit only weakly, and not in a consistent direction. CONCLUSIONS The gender typicality of AMAB and AFAB children's speech is perceptibly different for children as young as 2.5 years old. Developmental changes in perceived gender typicality are driven by changes in the speech of AMAB children. The learning of gendered speech is not constrained or facilitated by overall speech and language skill.
Affiliation(s)
- Benjamin Munson
  - Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
- Natasha Lackas
  - Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
- Kiana Koeppe
  - Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis
10
Cartei V, Reby D, Garnham A, Oakhill J, Banerjee R. Peer audience effects on children's vocal masculinity and femininity. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200397. [PMID: 34775826 PMCID: PMC8591376 DOI: 10.1098/rstb.2020.0397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/05/2023] Open
Abstract
Existing evidence suggests that children from around the age of 8 years strategically alter their public image in accordance with known values and preferences of peers, through the self-descriptive information they convey. However, an important but neglected aspect of this 'self-presentation' is the medium through which such information is communicated: the voice itself. The present study explored peer audience effects on children's vocal productions. Fifty-six children (26 females, aged 8-10 years) were presented with vignettes where a fictional child, matched to the participant's age and sex, is trying to make friends with a group of same-sex peers with stereotypically masculine or feminine interests (rugby and ballet, respectively). Participants were asked to impersonate the child in that situation and, as the child, to read out loud masculine, feminine and gender-neutral self-descriptive statements to these hypothetical audiences. They also had to decide which of those self-descriptive statements would be most helpful for making friends. In line with previous research, boys and girls preferentially selected masculine or feminine self-descriptive statements depending on the audience interests. Crucially, acoustic analyses of fundamental frequency and formant frequency spacing revealed that children also spontaneously altered their vocal productions: they feminized their voices when speaking to members of the ballet club, while they masculinized their voices when speaking to members of the rugby club. Both sexes also feminized their voices when uttering feminine sentences, compared to when uttering masculine and gender-neutral sentences. Implications for the hitherto neglected role of acoustic qualities of children's vocal behaviour in peer interactions are discussed. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part II)'.
Affiliation(s)
- Valentina Cartei
  - School of Psychology, University of Sussex, Brighton, UK
  - Equipe Neuro-Ethologie Sensorielle, ENES/CRNL, CNRS UMR5292, INSERM UMR_S 1028, University of Lyon, Saint-Etienne, France
  - Psychology, University of Chichester, Chichester, UK
- David Reby
  - Equipe Neuro-Ethologie Sensorielle, ENES/CRNL, CNRS UMR5292, INSERM UMR_S 1028, University of Lyon, Saint-Etienne, France
- Alan Garnham
  - School of Psychology, University of Sussex, Brighton, UK
- Jane Oakhill
  - School of Psychology, University of Sussex, Brighton, UK
- Robin Banerjee
  - School of Psychology, University of Sussex, Brighton, UK
11
Anikin A, Pisanski K, Reby D. Static and dynamic formant scaling conveys body size and aggression. ROYAL SOCIETY OPEN SCIENCE 2022; 9:211496. [PMID: 35242348 PMCID: PMC8753157 DOI: 10.1098/rsos.211496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Accepted: 12/09/2021] [Indexed: 05/03/2023]
Abstract
When producing intimidating aggressive vocalizations, humans and other animals often extend their vocal tracts to lower their voice resonance frequencies (formants) and thus sound big. Is acoustic size exaggeration more effective when the vocal tract is extended before, or during, the vocalization, and how do listeners interpret within-call changes in apparent vocal tract length? We compared perceptual effects of static and dynamic formant scaling in aggressive human speech and nonverbal vocalizations. Acoustic manipulations corresponded to elongating or shortening the vocal tract either around (Experiment 1) or from (Experiment 2) its resting position. Gradual formant scaling that preserved average frequencies conveyed the impression of smaller size and greater aggression, regardless of the direction of change. Vocal tract shortening from the original length conveyed smaller size and less aggression, whereas vocal tract elongation conveyed larger size and more aggression, and these effects were stronger for static than for dynamic scaling. Listeners familiarized with the speaker's natural voice were less often 'fooled' by formant manipulations when judging speaker size, but paid more attention to formants when judging aggressive intent. Thus, within-call vocal tract scaling conveys emotion, but a better way to sound large and intimidating is to keep the vocal tract consistently extended.
Affiliation(s)
- Andrey Anikin
  - Division of Cognitive Science, Lund University, Lund, Sweden
  - ENES Sensory Neuro-Ethology lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023, St-Étienne, France
- Katarzyna Pisanski
  - ENES Sensory Neuro-Ethology lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023, St-Étienne, France
- David Reby
  - ENES Sensory Neuro-Ethology lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023, St-Étienne, France
12
Waters S, Kanber E, Lavan N, Belyk M, Carey D, Cartei V, Lally C, Miquel M, McGettigan C. Singers show enhanced performance and neural representation of vocal imitation. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200399. [PMID: 34719245 PMCID: PMC8558773 DOI: 10.1098/rstb.2020.0399] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 07/06/2021] [Indexed: 12/17/2022] Open
Abstract
Humans have a remarkable capacity to finely control the muscles of the larynx, via distinct patterns of cortical topography and innervation that may underpin our sophisticated vocal capabilities compared with non-human primates. Here, we investigated the behavioural and neural correlates of laryngeal control, and their relationship to vocal expertise, using an imitation task that required adjustments of larynx musculature during speech. Highly trained human singers and non-singer control participants modulated voice pitch and vocal tract length (VTL) to mimic auditory speech targets, while undergoing real-time anatomical scans of the vocal tract and functional scans of brain activity. Multivariate analyses of speech acoustics, larynx movements and brain activation data were used to quantify vocal modulation behaviour and to search for neural representations of the two modulated vocal parameters during the preparation and execution of speech. We found that singers showed more accurate task-relevant modulations of speech pitch and VTL (i.e. larynx height, as measured with vocal tract MRI) during speech imitation; this was accompanied by stronger representation of VTL within a region of the right somatosensory cortex. Our findings suggest a common neural basis for enhanced vocal control in speech and song. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.
Affiliation(s)
- Sheena Waters
  - Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
  - Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Charterhouse Square, London EC1M 6BQ, UK
- Elise Kanber
  - Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
  - Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Nadine Lavan
  - Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
  - Department of Biological and Experimental Psychology, Queen Mary University of London, Mile End Road, Bethnal Green, London E1 4NS, UK
- Michel Belyk
  - Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Daniel Carey
  - Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
  - Data & AI, Novartis Pharmaceuticals, Novartis Global Service Center, 203 Merrion Road, Dublin 4 D04 NN12, Ireland
- Valentina Cartei
  - Equipe de Neuro-Ethologie Sensorielle (ENES), Centre de Recherche en Neurosciences de Lyon, Université de Lyon/Saint-Etienne, 21 rue du Docteur Paul Michelon, 42100 Saint-Etienne, France
  - Department of Psychology, Institute of Education, Health and Social Sciences, University of Chichester, College Lane, Chichester, West Sussex PO19 6PE, UK
- Clare Lally
  - Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
  - Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Marc Miquel
  - Department of Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, UK
  - William Harvey Research Institute, Queen Mary University of London, London EC1M 6BQ, UK
- Carolyn McGettigan
  - Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
  - Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
13
Barreda S, Assmann PF. Perception of gender in children's voices. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 150:3949. [PMID: 34852594 DOI: 10.1121/10.0006785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2020] [Accepted: 09/30/2021] [Indexed: 06/13/2023]
Abstract
To investigate the perception of gender from children's voices, adult listeners were presented with /hVd/ syllables, in isolation and in sentence context, produced by children between 5 and 18 years. Half the listeners were informed of the age of the talker during trials, while the other half were not. Correct gender identifications increased with talker age; however, performance was above chance even for age groups where the cues most often associated with gender differentiation (i.e., average fundamental frequency and formant frequencies) were not consistently different between boys and girls. The results of acoustic models suggest that cues were used in an age-dependent manner, whether listeners were explicitly told the age of the talker or not. Overall, results are consistent with the hypothesis that talker age and gender are estimated jointly in the process of speech perception. Furthermore, results show that the gender of individual talkers can be identified accurately well before reliable anatomical differences arise in the vocal tracts of females and males. In general, results support the notion that the transmission of gender information from voice depends substantially on gender-dependent patterns of articulation, rather than following deterministically from anatomical differences between male and female talkers.
Affiliation(s)
- Santiago Barreda
  - Department of Linguistics, University of California, Davis, California 95616, USA
- Peter F Assmann
  - School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas 75080, USA
14
Cartei V, Oakhill J, Garnham A, Banerjee R, Reby D. "This Is What a Mechanic Sounds Like": Children's Vocal Control Reveals Implicit Occupational Stereotypes. Psychol Sci 2020; 31:957-967. [PMID: 32639857 PMCID: PMC7441328 DOI: 10.1177/0956797620929297] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/02/2022] Open
Abstract
In this study, we explored the use of variation in sex-related cues of the voice to investigate implicit occupational stereotyping in children. Eighty-two children between the ages of 5 and 10 years took part in an imitation task in which they were provided with descriptions of nine occupations (three traditionally male, three traditionally female, and three gender-neutral professions) and asked to give voices to them (e.g., “How would a mechanic say . . . ?”). Overall, children adapted their voices to conform to gender-stereotyped expectations by masculinizing (lowering voice pitch and resonance) and feminizing (raising voice pitch and resonance) their voices for the traditionally male and female occupations, respectively. The magnitude of these shifts increased with age, particularly in boys, and was not mediated by children’s explicit stereotyping of the same occupations. We conclude by proposing a simple tool based on voice pitch for assessing levels of implicit occupational-gender stereotyping in children.
Affiliation(s)
- David Reby
  - School of Psychology, University of Sussex; Equipe de Neuro-Ethologie Sensorielle (ENES), Université Jean Monnet
15
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Development of voice perception is dissociated across gender cues in school-age children. Sci Rep 2020; 10:5074. [PMID: 32193411 PMCID: PMC7081243 DOI: 10.1038/s41598-020-61732-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/01/2019] [Accepted: 02/27/2020] [Indexed: 11/11/2022] Open
Abstract
Children's ability to distinguish speakers' voices continues to develop throughout childhood, yet it remains unclear how children's sensitivity to voice cues, such as differences in speakers' gender, develops over time. This so-called voice gender is primarily characterized by speakers' mean fundamental frequency (F0), related to glottal pulse rate, and vocal-tract length (VTL), related to speakers' size. Here we show that children's acquisition of adult-like performance for discrimination, a lower-order perceptual task, and categorization, a higher-order cognitive task, differs across voice gender cues. Children's discrimination was adult-like around the age of 8 for VTL but still differed from adults at the age of 12 for F0. Children's perceptual weight attributed to F0 for gender categorization was adult-like around the age of 6 but around the age of 10 for VTL. Children's discrimination and weighting of F0 and VTL were only correlated for 4- to 6-year-olds. Hence, children's development of discrimination and weighting of voice gender cues are dissociated, i.e., adult-like performance for F0 and VTL is acquired at different rates and does not seem to be closely related. The different developmental patterns for auditory discrimination and categorization highlight the complexity of the relationship between perceptual and cognitive mechanisms of voice perception.
Affiliation(s)
- Leanne Nagels
  - Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
  - Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain
  - Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
  - CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université de Lyon, Lyon, France
- Deborah Vickers
  - Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks
  - Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
  - Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
  - Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
16
Cartei V, Banerjee R, Garnham A, Oakhill J, Roberts L, Anns S, Bond R, Reby D. Physiological and perceptual correlates of masculinity in children's voices. Horm Behav 2020; 117:104616. [PMID: 31644889 DOI: 10.1016/j.yhbeh.2019.104616] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/15/2019] [Revised: 10/06/2019] [Accepted: 10/12/2019] [Indexed: 11/19/2022]
Abstract
Low-frequency components (i.e., a low pitch (F0) and low formant spacing (ΔF)) signal high salivary testosterone and height in adult male voices and are associated with high masculinity attributions by unfamiliar listeners (in both men and women). However, the relation between the physiological, acoustic and perceptual dimensions of speakers' masculinity prior to puberty remains unknown. In this study, 110 pre-pubertal children (58 girls), aged 3 to 10, were recorded as they described a cartoon picture. A total of 315 adults (182 women) rated the children's perceived masculinity from the voice alone after listening to the speakers' audio recordings. On the basis of their voices alone, boys who had higher salivary testosterone levels were rated as more masculine, and the relation between testosterone and perceived masculinity was partially mediated by F0. The voices of taller boys were also rated as more masculine, but the relation between height and perceived masculinity was not mediated by the considered acoustic parameters, indicating that acoustic cues other than F0 and ΔF may signal stature. Both boys and girls who had lower F0 were also rated as more masculine, while ΔF did not affect ratings. These findings highlight the interdependence of physiological, acoustic and perceptual dimensions, and suggest that inter-individual variation in male voices, particularly F0, may advertise hormonal masculinity from a very early age.
Affiliation(s)
- Robin Banerjee
  - School of Psychology, University of Sussex, Brighton, UK
- Alan Garnham
  - School of Psychology, University of Sussex, Brighton, UK
- Jane Oakhill
  - Equipe Neuro-Ethologie Sensorielle, ENES/CRNL, CNRS UMR5292, INSERM UMR_S 1028, University of Lyon, Saint-Etienne, France
- Lucy Roberts
  - School of Psychology, University of Sussex, Brighton, UK
- Sophie Anns
  - School of Psychology, University of Sussex, Brighton, UK
- Rod Bond
  - School of Psychology, University of Sussex, Brighton, UK
- David Reby
  - School of Psychology, University of Sussex, Brighton, UK; Equipe Neuro-Ethologie Sensorielle, ENES/CRNL, CNRS UMR5292, INSERM UMR_S 1028, University of Lyon, Saint-Etienne, France