1. Shi K, Liu X, Yuan X, Shang H, Dai R, Wang H, Fu Y, Jiang N, He J. AADNet: Exploring EEG Spatiotemporal Information for Fast and Accurate Orientation and Timbre Detection of Auditory Attention Based on a Cue-Masked Paradigm. IEEE Trans Neural Syst Rehabil Eng 2025;33:1349-1359. PMID: 40168202. DOI: 10.1109/tnsre.2025.3555542.
Abstract
Auditory attention decoding from electroencephalogram (EEG) can infer which source the user is attending to in noisy environments. Decoding algorithms and experimental paradigm designs are crucial for developing this technology for practical applications. To simulate real-world scenarios, this study proposed a cue-masked auditory attention paradigm that avoids information leakage before the experiment. To obtain high decoding accuracy with low latency, an end-to-end deep learning model, AADNet, was proposed to exploit the spatiotemporal information in short time windows of EEG signals. The results showed that with a 0.5-second EEG window, AADNet achieved an average accuracy of 93.46% and 91.09% in decoding auditory orientation attention (OA) and timbre attention (TA), respectively. It significantly outperformed five previous methods and did not require knowledge of the original audio source. This work demonstrates that the orientation and timbre of auditory attention can be detected from EEG signals quickly and accurately. The results are promising for real-time multi-property auditory attention decoding, facilitating the application of neuro-steered hearing aids and other assistive listening devices.
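The 0.5-second decision window is the key latency figure in this abstract. As a minimal sketch of that windowing step only (not of AADNet itself, whose architecture is not described here), the following splits a multichannel EEG recording into non-overlapping 0.5-s windows ready for per-window classification; the channel count and sampling rate are hypothetical:

```python
import numpy as np

def segment_eeg(eeg, fs, win_s=0.5):
    """Split a (channels, samples) EEG array into non-overlapping windows.

    win_s=0.5 mirrors the 0.5-second decision window reported for AADNet;
    the decoding model itself is not reproduced here.
    """
    win = int(fs * win_s)
    n = eeg.shape[1] // win
    # result shape: (n_windows, channels, samples_per_window)
    return eeg[:, :n * win].reshape(eeg.shape[0], n, win).transpose(1, 0, 2)

# hypothetical example: 64-channel EEG, 128 Hz sampling rate, 10 s of data
eeg = np.random.randn(64, 1280)
windows = segment_eeg(eeg, fs=128)
print(windows.shape)  # (20, 64, 64)
```

Each of the 20 windows would then be fed to the classifier independently, which is what makes short-window accuracy the relevant latency metric.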
2. Bruder C, Larrouy-Maestri P. CoVox: A dataset of contrasting vocalizations. Behav Res Methods 2025;57:142. PMID: 40216652. PMCID: PMC11991967. DOI: 10.3758/s13428-025-02664-9.
Abstract
The human voice is remarkably versatile and can vary greatly in sound depending on how it is used. An increasing number of studies have addressed the differences and similarities between the singing and the speaking voice. However, finding stimulus material that is both controlled and ecologically valid is challenging, and most datasets lack variability in terms of vocal styles performed by the same voice. Here, we describe a curated stimulus set of vocalizations in which 22 female singers performed the same melody excerpts in three contrasting singing styles (as a lullaby, as a pop song, and as an opera aria) and spoke the text aloud in two speaking styles (as if speaking to an adult or to an infant). All productions were made with the songs' original lyrics, in Brazilian Portuguese, and with a /lu/ sound. This ecologically valid dataset of 1320 vocalizations was validated through a forced-choice lab experiment (N = 25 for each stimulus) in which lay listeners recognized the intended vocalization style with high accuracy (proportion of correct recognition above 69% for all styles). We also provide an acoustic characterization of the stimuli, depicting clear and contrasting acoustic profiles depending on the style of vocalization. All recordings are made freely available under a Creative Commons license and can be downloaded at https://osf.io/cgexn/ .
Affiliation(s)
- Camila Bruder
- Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, 60322, Frankfurt am Main, Germany
- Pauline Larrouy-Maestri
- Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, 60322, Frankfurt am Main, Germany
- Center for Language, Music, and Emotion (CLaME), New York, NY, USA
3. Valente D, Magnard C, Koutseff A, Patural H, Chauleur C, Reby D, Pisanski K. Vocal communication and perception of pain in childbirth vocalizations. Philos Trans R Soc Lond B Biol Sci 2025;380:20240009. PMID: 40176506. PMCID: PMC11966154. DOI: 10.1098/rstb.2024.0009.
Abstract
Nonlinear acoustic phenomena (NLP) likely facilitate the expression of distress in animal vocalizations, making calls perceptually rough and hard to ignore. Yet, their function in adult human vocal communication remains poorly understood. Here, to examine the production and perception of acoustic correlates of pain in spontaneous human nonverbal vocalizations, we take advantage of childbirth-a natural context in which labouring women typically produce a range of highly evocative loud vocalizations, including moans and screams-as they experience excruciating pain. We combine acoustic analyses of these real-life pain vocalizations with psychoacoustic experiments involving the playback of natural and synthetic calls to both naïve and expert listeners. We show that vocalizations become acoustically rougher, higher in fundamental frequency (pitch), less stable, louder and longer as labour progresses, paralleling a rise in women's self-assessed pain. In perception experiments, we show that both naïve listeners and obstetric professionals assign the highest pain ratings to vocalizations produced in the final expulsion phase of labour. Experiments with synthetic vocal stimuli confirm that listeners rely largely on nonlinear phenomena to assess pain. Our study confirms that nonlinear phenomena communicate intense, pain-induced distress in humans, consistent with their widespread function to signal distress and arousal in vertebrate vocal signals. This article is part of the theme issue 'Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions'.
Affiliation(s)
- Daria Valente
- Department of Life Sciences and Systems Biology, University of Turin, Torino 10123, Italy
- Cecile Magnard
- Lucie Hussel Hospital, Maternity Ward, Montée du Dr Chapuis, Vienne 38200, France
- Alexis Koutseff
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, Jean Monnet University of Saint Étienne, St-Étienne 42023, France
- Hugues Patural
- Department of Pediatrics, University Hospital Centre of Saint-Étienne, Saint-Étienne 42055, France
- Celine Chauleur
- Department of Gynecology and Obstetrics, University Hospital Centre of Saint-Étienne, Saint-Étienne 42055, France
- David Reby
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, Jean Monnet University of Saint Étienne, St-Étienne 42023, France
- Institut Universitaire de France, Paris, Île-de-France 75005, France
- Katarzyna Pisanski
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, Jean Monnet University of Saint Étienne, St-Étienne 42023, France
- Institute of Psychology, University of Wrocław, Wrocław 50-527, Poland
4. Bryant GA, Smaldino PE. The cultural evolution of distortion in music (and other norms of mixed appeal). Philos Trans R Soc Lond B Biol Sci 2025;380:20240014. PMID: 40176525. PMCID: PMC11966159. DOI: 10.1098/rstb.2024.0014.
Abstract
Music traditions worldwide are subject to remarkable diversity but the origins of this variation are not well understood. Musical behaviour is the product of a multicomponent collection of abilities, some possibly evolved for music but most derived from traits serving nonmusical functions. Cultural evolution has stitched together these systems, generating variable normative practices across cultures and musical genres. Here, we describe the cultural evolution of musical distortion, a noisy manipulation of instrumental and vocal timbre that emulates nonlinear phenomena (NLP) present in the vocal signals of many animals. We suggest that listeners' sensitivity to NLP has facilitated technological developments for altering musical instruments and singing with distortion, which continues to evolve culturally via the need for groups to both coordinate internally and differentiate themselves from other groups. To support this idea, we present an agent-based model of norm evolution illustrating possible dynamics of continuous traits such as timbral distortion in music, dependent on (i) a functional optimum, (ii) intra-group cohesion and inter-group differentiation and (iii) groupishness for assortment and social learning. This account illustrates how cultural transmission dynamics can lead to diversity in musical sounds and genres, and also provides a more general explanation for the emergence of subgroup-differentiating norms. This article is part of the theme issue 'Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions'.
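The model's three ingredients named in the abstract, a functional optimum, intra-group cohesion, and inter-group differentiation, can be sketched as a toy agent-based simulation. This is an illustrative sketch only, not the authors' published model: the payoff weights, group sizes, and hill-climbing update rule below are all assumptions.

```python
import random

# Toy agent-based sketch of a continuous norm (e.g., timbral distortion)
# evolving under (i) a functional optimum, (ii) in-group cohesion, and
# (iii) out-group differentiation. Parameter values are illustrative.

OPTIMUM = 0.5   # functionally best trait value
N = 100         # agents per group
STEPS = 200

def payoff(x, own_mean, other_mean):
    functional = -(x - OPTIMUM) ** 2        # pull toward the optimum
    cohesion = -(x - own_mean) ** 2         # match your own group
    contrast = 0.5 * (x - other_mean) ** 2  # differ from the other group
    return functional + cohesion + contrast

random.seed(1)
groups = [[random.uniform(0, 1) for _ in range(N)] for _ in range(2)]
for _ in range(STEPS):
    means = [sum(g) / N for g in groups]
    for gi, g in enumerate(groups):
        for i, x in enumerate(g):
            # small mutation, kept only if it improves the agent's payoff
            trial = x + random.gauss(0, 0.02)
            if payoff(trial, means[gi], means[1 - gi]) > payoff(x, means[gi], means[1 - gi]):
                g[i] = trial

# report the two group means after evolution
print([round(sum(g) / N, 2) for g in groups])
```

Because the contrast term rewards moving away from the other group while cohesion keeps each group clustered, the two group means can drift apart around the optimum, a minimal analogue of genre differentiation in a shared functional niche.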
Affiliation(s)
- Gregory A. Bryant
- Department of Communication, University of California, Los Angeles, CA 90095-1563, USA
- UCLA Center for Behavior, Evolution, and Culture, Los Angeles, CA, USA
- Paul E. Smaldino
- Cognitive and Information Sciences, University of California, Merced, CA, USA
- Santa Fe Institute, Santa Fe, NM, USA
5. Massenet M, Mathevon N, Anikin A, Briefer EF, Fitch WT, Reby D. Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions. Philos Trans R Soc Lond B Biol Sci 2025;380:20240002. PMID: 40176513. PMCID: PMC11966157. DOI: 10.1098/rstb.2024.0002.
Abstract
Nonlinear phenomena (NLP) are acoustic irregularities that are widespread in animal and human vocal repertoires, as well as in music. These phenomena have recently attracted considerable interest but, surprisingly, have never been the subject of a comprehensive review. NLP result from irregular sound production, contribute to perceptual harshness, and have long been considered nonadaptive vocal features or by-products of sound production characterizing pathological voices. This view is beginning to change: NLP are increasingly documented in nonverbal vocalizations of healthy humans, and an impressive variety of acoustic irregularities are found in the vocalizations of nonhuman vertebrates. Indeed, evidence is accumulating that NLP have evolved to serve specific functions such as attracting listeners' attention, signalling high arousal, or communicating aggression, size, dominance, distress and/or pain. This special issue presents a selection of theoretical and empirical studies showcasing novel concepts and analysis tools to address the following key questions: How are NLP in vertebrate vocalizations defined and classified? What are their biomechanical origins? What are their communicative functions? How and why did they evolve? We also discuss the broader significance and societal implications of research on NLP for non-invasively monitoring and improving human and animal welfare. This article is part of the theme issue 'Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions'.
Affiliation(s)
- Mathilde Massenet
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, 42100 Saint-Etienne, France
- Division of Cognitive Science, Lund University, 223 62 Lund, Sweden
- Nicolas Mathevon
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, 42100 Saint-Etienne, France
- Ecole Pratique des Hautes Etudes, University Paris-Sciences-Lettres, 75014 Paris, France
- Institut universitaire de France, 75231 Paris, France
- Andrey Anikin
- Division of Cognitive Science, Lund University, 223 62 Lund, Sweden
- Elodie F. Briefer
- Behavioural Ecology Group, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen Ø, Denmark
- W. Tecumseh Fitch
- Department of Behavioral and Cognitive Biology, University of Vienna, 1030 Vienna, Austria
- David Reby
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, 42100 Saint-Etienne, France
- Institut universitaire de France, 75231 Paris, France
6. Arnal LH, Gonçalves N. Rough is salient: a conserved vocal niche to hijack the brain's salience system. Philos Trans R Soc Lond B Biol Sci 2025;380:20240020. PMID: 40176527. PMCID: PMC11966164. DOI: 10.1098/rstb.2024.0020.
Abstract
The propensity to communicate extreme emotional states and arousal through salient, non-referential vocalizations is ubiquitous among mammals and beyond. Screams, whether intended to warn conspecifics or deter aggressors, require a rapid increase of air influx through vocal folds to induce nonlinear distortions of the signal. These distortions contain salient, temporally patterned acoustic features in a restricted range of the audible spectrum. These features may have a biological significance, triggering fast behavioural responses in the receivers. We present converging neurophysiological and behavioural evidence from humans and animals supporting the idea that the properties emerging from nonlinear vocal phenomena are ideally adapted to induce efficient sensory, emotional and behavioural responses. We argue that these fast temporal ('rough') modulations are unlikely to be an epiphenomenon of vocal production but rather the result of selective evolutionary pressure on vocal warning signals to promote efficient communication. In this view, rough features may have been selected and conserved as an acoustic trait to recruit ancestral sensory salience pathways and elicit optimal reactions in the receiver. By exploring the impact of rough vocalizations at the receiver's end, we review the perceptual, behavioural and neural factors that may have shaped these signals to evolve as powerful communication tools. This article is part of the theme issue 'Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions'.
Affiliation(s)
- Luc H. Arnal
- Université Paris Cité, Institut Pasteur, AP-HP, INSERM, CNRS, Fondation Pour l'Audition, Institut de l’Audition, IHU reConnect, Paris 75012, France
- Noémi Gonçalves
- Université Paris Cité, Institut Pasteur, AP-HP, INSERM, CNRS, Fondation Pour l'Audition, Institut de l’Audition, IHU reConnect, Paris 75012, France
7. Massenet M, Pisanski K, Reynaud K, Mathevon N, Reby D, Anikin A. Acoustic context and dynamics of nonlinear phenomena in mammalian calls: the case of puppy whines. Philos Trans R Soc Lond B Biol Sci 2025;380:20240022. PMID: 40176516. PMCID: PMC11966151. DOI: 10.1098/rstb.2024.0022.
Abstract
Nonlinear phenomena (NLP) are often associated with high arousal and function to grab attention and/or signal urgency in vocalizations such as distress calls. Although biomechanical models and in vivo/ex vivo experiments suggest that their occurrence reflects the destabilization of vocal fold vibration under intense subglottal pressure and muscle tension, comprehensive descriptions of the dynamics of NLP occurrence in natural vocal signals are critically lacking. Here, to fill this gap, we report the timing, type, extent and acoustic context of NLP in 12 011 whines produced by Beagle puppies (Canis familiaris) during a brief separation from their mothers. Within bouts of whines, we show that both the proportion of time vocalizing and the number of whines containing NLP, especially those with chaos, increase with time since separation, presumably reflecting heightened arousal. Within whines, we show that NLP are typically produced during the first half of the call, following the steepest rises in pitch (fundamental frequency, fo) and amplitude. While our study reinforces the notion that NLP arise in calls due to instabilities in vocal production during high arousal, it also provides novel and efficient analytical tools for quantifying nonlinear acoustics in ecologically relevant mammal vocal communication contexts. This article is part of the theme issue 'Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions'.
Affiliation(s)
- Mathilde Massenet
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Division of Cognitive Science, Lund University, Lund, Sweden
- Katarzyna Pisanski
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- DDL Dynamics of Language Laboratory, University of Lyon 2, Lyon, France
- Institute of Psychology, University of Wrocław, Wrocław, Poland
- Karine Reynaud
- École Nationale Vétérinaire d’Alfort, EnvA, Maisons-Alfort, France
- Physiologie de la Reproduction et des Comportements, CNRS, INRAE, Université de Tours, PRC, Nouzilly, France
- Nicolas Mathevon
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Institut universitaire de France, Paris, France
- Ecole Pratique des Hautes Etudes, University Paris-Sciences-Lettres, Paris, France
- David Reby
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Institut universitaire de France, Paris, France
- Andrey Anikin
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Division of Cognitive Science, Lund University, Lund, Sweden
8. Daunay V, Reby D, Bryant GA, Pisanski K. Production and perception of volitional laughter across social contexts. J Acoust Soc Am 2025;157:2774-2789. PMID: 40227885. DOI: 10.1121/10.0036388.
Abstract
Human nonverbal vocalizations such as laughter communicate emotion, motivation, and intent during social interactions. While differences between spontaneous and volitional laughs have been described, little is known about the communicative functions of volitional (voluntary) laughter-a complex signal used across diverse social contexts. Here, we examined whether the acoustic structure of volitional laughter encodes social contextual information recognizable by humans and computers. We asked men and women to produce volitional laughs in eight distinct social contexts ranging from positive (e.g., watching a comedy) to negative valence (e.g., embarrassment). Human listeners and machine classification algorithms accurately identified most laughter contexts above chance. However, confusion often arose within valence categories, and could be largely explained by shared acoustics. Although some acoustic features varied across social contexts, including fundamental frequency (perceived as voice pitch) and energy parameters (entropy variance, loudness, spectral centroid, and cepstral peak prominence), which also predicted listeners' recognition of laughter contexts, laughs evoked across different social contexts still often overlapped in acoustic and perceptual space. Thus, we show that volitional laughter can convey some reliable information about social context, but much of this is tied to valence, suggesting that volitional laughter is a graded rather than discrete vocal signal.
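The abstract above describes machine classification of laughter context from summary acoustic features (fundamental frequency, loudness, spectral centroid, and others). As a toy sketch of that idea only, the following assigns a laugh to the nearest class centroid in a small feature space; the feature values, the two context labels, and the nearest-centroid classifier are all illustrative assumptions, not the study's data or method:

```python
import numpy as np

def nearest_centroid(train_X, train_y, x):
    """Assign x to the class whose mean feature vector is closest."""
    labels = sorted(set(train_y))
    centroids = {c: train_X[train_y == c].mean(axis=0) for c in labels}
    return min(labels, key=lambda c: np.linalg.norm(x - centroids[c]))

# hypothetical feature vectors: [mean f0 (Hz), loudness, spectral centroid (Hz)]
X = np.array([[220.0, 0.8, 1500.0], [230.0, 0.9, 1600.0],   # "comedy" laughs
              [180.0, 0.4, 1100.0], [170.0, 0.3, 1000.0]])  # "embarrassment" laughs
y = np.array(["comedy", "comedy", "embarrassment", "embarrassment"])

print(nearest_centroid(X, y, np.array([225.0, 0.85, 1550.0])))  # comedy
```

The study's finding that confusions cluster within valence categories corresponds, in this sketch, to centroids of same-valence contexts lying close together in feature space.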
Affiliation(s)
- Virgile Daunay
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint-Étienne, 42023 Saint-Étienne, France
- DDL Dynamics of Language Lab, CNRS French National Centre for Scientific Research, University of Lyon 2, 69363 Lyon, France
- David Reby
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint-Étienne, 42023 Saint-Étienne, France
- Gregory A Bryant
- Department of Communication, Center for Behavior, Evolution, and Culture, University of California, Los Angeles, California 90095, USA
- Katarzyna Pisanski
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint-Étienne, 42023 Saint-Étienne, France
- DDL Dynamics of Language Lab, CNRS French National Centre for Scientific Research, University of Lyon 2, 69363 Lyon, France
9. Fogarty MJ. Dendritic alterations precede age-related dysphagia and nucleus ambiguus motor neuron death. J Physiol 2025;603:1299-1321. PMID: 39868939. PMCID: PMC11870054. DOI: 10.1113/jp287457.
Abstract
Motor neurons (MNs) within the nucleus ambiguus innervate the skeletal muscles of the larynx, pharynx and oesophagus, which are essential for swallow. Disordered swallow (dysphagia) is a serious problem in elderly humans, increasing the risk of aspiration, a key contributor to mortality. Despite this importance, very little is known about the pathophysiology of ageing dysphagia and the relative importance of frank muscle weakness compared to timing/activation abnormalities. In elderly humans and in aged Fischer 344 (F344) rats, a variety of motor pools exhibit weakness and atrophy (sarcopenia), contemporaneous with MN death. Synchronisation of swallow depends on the stability of MN dendrites, which integrate neural circuits. Dendritic derangement occurs in many neuromotor degenerative conditions prior to MN death. We hypothesised that behavioural weakness and death of nucleus ambiguus MNs would occur by age 24 months in F344 rats, preceded by swallow-respiration dyscoordination and dendritic arbour degeneration from 18 months, compared to controls at 6 months. Using pressure catheters to estimate laryngeal and diaphragm function during naturalistic water bolus applications, we show that swallow number and post-swallow apnoeas are altered from 18 months. Swallow pressure (weakness) and nucleus ambiguus MN numbers (evaluated via stereological assessment of Nissl staining) were reduced at 24 months. Dendritic lengths, surface areas and dendritic spines were reduced in nucleus ambiguus MNs from 18 months (evaluated by confocal imaging of Golgi-Cox-impregnated brainstem). These results show that synapse loss occurs prior to MN death and behavioural weakness. Strategies to preserve synapses may be of utility in ameliorating sarcopenia. KEY POINTS: Dysphagia is a major contributor to ageing morbidity and mortality, but the underlying pathophysiology is unexplored. Here, in Fischer 344 rats, we use pressure and timing evaluations of swallow-respiration, showing that timing impairments occur prior to frank pressure defects. In nucleus ambiguus motor neurons, dendritic defects were apparent with the onset of swallow-respiration dyscoordination, with frank motor neuron loss occurring subsequent to synapse loss. Our results show that synapse loss occurs prior to motor neuron death and behavioural impairments. Strategies to preserve synapses may be of utility in ameliorating sarcopenia.
Affiliation(s)
- Matthew J. Fogarty
- Department of Physiology & Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
10. Jiang Z, Long Y, Zhang X, Liu Y, Bai X. CNEV: A corpus of Chinese nonverbal emotional vocalizations with a database of emotion category, valence, arousal, and gender. Behav Res Methods 2025;57:62. PMID: 39838181. DOI: 10.3758/s13428-024-02595-x.
Abstract
Nonverbal emotional vocalizations play a crucial role in conveying emotions during human interactions. Validated corpora of these vocalizations have facilitated emotion-related research and found wide-ranging applications. However, existing corpora have lacked representation from diverse cultural backgrounds, which may limit the generalizability of the resulting theories. The present paper introduces the Chinese Nonverbal Emotional Vocalization (CNEV) corpus, the first nonverbal emotional vocalization corpus recorded and validated entirely by Mandarin speakers from China. The CNEV corpus contains 2415 vocalizations across five emotion categories: happiness, sadness, fear, anger, and neutrality. It also includes a database containing subjective evaluation data on emotion category, valence, arousal, and speaker gender, as well as the acoustic features of the vocalizations. Key conclusions drawn from statistical analyses of perceptual evaluations and acoustic analysis include the following: (1) the CNEV corpus exhibits adequate reliability and high validity; (2) perceptual evaluations reveal a tendency for individuals to associate anger with male voices and fear with female voices; (3) acoustic analysis indicates that males are more effective at expressing anger, while females excel in expressing fear; and (4) the observed perceptual patterns align with the acoustic analysis results, suggesting that the perceptual differences may stem not only from the subjective factors of perceivers but also from objective expressive differences in the vocalizations themselves. For academic research purposes, the CNEV corpus and database are freely available for download at https://osf.io/6gy4v/ .
Affiliation(s)
- Zhongqing Jiang
- College of Psychology, Liaoning Normal University, No. 850 Huanghe Road, Dalian, 116029, Liaoning, China
- Yanling Long
- College of Psychology, Liaoning Normal University, No. 850 Huanghe Road, Dalian, 116029, Liaoning, China
- Xi'e Zhang
- Xianyang Senior High School of Shaanxi Province, Xianyang, China
- Yangtao Liu
- College of Psychology, Liaoning Normal University, No. 850 Huanghe Road, Dalian, 116029, Liaoning, China
- Xue Bai
- College of Psychology, Liaoning Normal University, No. 850 Huanghe Road, Dalian, 116029, Liaoning, China
11. Kreiman J, Lee Y. Biological, linguistic, and individual factors govern voice quality. J Acoust Soc Am 2025;157:482-492. PMID: 39846773. DOI: 10.1121/10.0034848.
Abstract
Voice quality serves as a rich source of information about speakers, providing listeners with impressions of identity, emotional state, age, sex, reproductive fitness, and other biologically and socially salient characteristics. Understanding how this information is transmitted, accessed, and exploited requires knowledge of the psychoacoustic dimensions along which voices vary, an area that remains largely unexplored. Recent studies of English speakers have shown that two factors related to speaker size and arousal consistently emerge as the most important determinants of quality, regardless of who is speaking. The present findings extend this picture by demonstrating that in four languages that vary fundamental frequency (fo) and/or phonation type contrastively (Korean, Thai, Gujarati, and White Hmong), additional acoustic variability is systematically related to the phonology of the language spoken, and the amount of variability along each dimension is consistent across speaker groups. This study concludes that acoustic voice spaces are structured in a remarkably consistent way: first by biologically driven, evolutionarily grounded factors, second by learned linguistic factors, and finally by variations within a talker over utterances, possibly due to personal style, emotional state, social setting, or other dynamic factors. Implications for models of speaker recognition are also discussed.
Affiliation(s)
- Jody Kreiman
- Departments of Head and Neck Surgery and Linguistics, UCLA, Los Angeles, California 90095-1794, USA
- Yoonjeong Lee
- USC Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089-1455, USA
12. Ponsonnet M, Coupé C, Pellegrino F, Garcia Arasco A, Pisanski K. Vowel signatures in emotional interjections and nonlinguistic vocalizations expressing pain, disgust, and joy across languages. J Acoust Soc Am 2024;156:3118-3139. PMID: 39531311. DOI: 10.1121/10.0032454.
Abstract
In this comparative cross-linguistic study we test whether expressive interjections (words like ouch or yay) share similar vowel signatures across the world's languages, and whether these can be traced back to nonlinguistic vocalizations (like screams and cries) expressing the same emotions of pain, disgust, and joy. We analyze vowels in interjections from dictionaries of 131 languages (over 600 tokens) and compare these with nearly 500 vowels based on formant frequency measures from voice recordings of volitional nonlinguistic vocalizations. We show that across the globe, pain interjections feature a-like vowels and wide falling diphthongs ("ai" as in Ayyy! "aw" as in Ouch!), whereas disgust and joy interjections do not show robust vowel regularities that extend geographically. In nonlinguistic vocalizations, all emotions yield distinct vowel signatures: pain prompts open vowels such as [a], disgust schwa-like central vowels, and joy front vowels such as [i]. Our results show that pain is the only affective experience tested with a clear, robust vowel signature that is preserved between nonlinguistic vocalizations and interjections across languages. These results offer empirical evidence for iconicity in some expressive interjections. We consider potential mechanisms and origins, from evolutionary pressures and sound symbolism to colexification, proposing testable hypotheses for future research.
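The vowel signatures above rest on standard formant logic: vowel openness rises with the first formant (F1) and frontness with the second (F2), so pain-like open [a] vowels show high F1, joy-like front [i] vowels pair low F1 with high F2, and schwa-like central vowels fall in between. The toy classifier below illustrates only that logic; the threshold values are rough textbook approximations for adult speakers, not the study's measurements:

```python
# Toy illustration of the F1/F2 logic behind vowel signatures.
# Thresholds are rough, hypothetical approximations, not study values.

def vowel_guess(f1_hz, f2_hz):
    if f1_hz > 650:                   # open vowel, e.g. [a]
        return "open (a-like)"
    if f1_hz < 400 and f2_hz > 2000:  # close front vowel, e.g. [i]
        return "front (i-like)"
    return "central (schwa-like)"

print(vowel_guess(850, 1600))  # open (a-like): pain-type signature
print(vowel_guess(300, 2300))  # front (i-like): joy-type signature
print(vowel_guess(500, 1500))  # central (schwa-like): disgust-type signature
```

In the study itself, such judgments are made from measured formant frequencies in recordings and from dictionary transcriptions of interjections, not from fixed thresholds.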
Affiliation(s)
- Maïa Ponsonnet
- Dynamique Du Langage, CNRS et Université Lumière Lyon 2, Lyon, France
- School of Social Sciences, The University of Western Australia, Perth, Australia
- Christophe Coupé
- Department of Linguistics, The University of Hong Kong, Hong Kong SAR, China
- Katarzyna Pisanski
- Dynamique Du Langage, CNRS et Université Lumière Lyon 2, Lyon, France
- ENES Bioacoustics Research Laboratory, University Jean Monnet of Saint-Etienne, CRNL, CNRS, Saint-Etienne, France
- Institute of Psychology, University of Wrocław, Wrocław, Poland
Collapse
|
13
|
Pisanski K, Reby D, Oleszkiewicz A. Humans need auditory experience to produce typical volitional nonverbal vocalizations. COMMUNICATIONS PSYCHOLOGY 2024; 2:65. [PMID: 39242947 PMCID: PMC11332021 DOI: 10.1038/s44271-024-00104-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Accepted: 05/16/2024] [Indexed: 09/09/2024]
Abstract
Human nonverbal vocalizations such as screams and cries often reflect their evolved functions. Although the universality of these putatively primordial vocal signals and their phylogenetic roots in animal calls suggest a strong reflexive foundation, many of the emotional vocalizations that we humans produce are under our voluntary control. This suggests that, like speech, volitional vocalizations may require auditory input to develop typically. Here, we acoustically analyzed hundreds of volitional vocalizations produced by profoundly deaf adults and typically-hearing controls. We show that deaf adults produce unconventional and homogenous vocalizations of aggression and pain that are unusually high-pitched, unarticulated, and with extremely few harsh-sounding nonlinear phenomena compared to controls. In contrast, fear vocalizations of deaf adults are relatively acoustically typical. In four lab experiments involving a range of perception tasks with 444 participants, listeners were less accurate in identifying the intended emotions of vocalizations produced by deaf vocalizers than by controls, perceived their vocalizations as less authentic, and reliably detected deafness. Vocalizations of congenitally deaf adults with zero auditory experience were most atypical, suggesting additive effects of auditory deprivation. Vocal learning in humans may thus be required not only for speech, but also to acquire the full repertoire of volitional non-linguistic vocalizations.
Collapse
Affiliation(s)
- Katarzyna Pisanski
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint-Étienne, 42023, Saint-Étienne, France.
- CNRS French National Centre for Scientific Research, DDL Dynamics of Language Lab, University of Lyon 2, 69007, Lyon, France.
- Institute of Psychology, University of Wrocław, 50-527, Wrocław, Poland.
| | - David Reby
- ENES Bioacoustics Research Laboratory, CRNL Center for Research in Neuroscience in Lyon, University of Saint-Étienne, 42023, Saint-Étienne, France
- Institut Universitaire de France, Paris, France
| | - Anna Oleszkiewicz
- Institute of Psychology, University of Wrocław, 50-527, Wrocław, Poland.
- Department of Otorhinolaryngology, Smell and Taste Clinic, Carl Gustav Carus Medical School, Technische Universitaet Dresden, 01307, Dresden, Germany.
| |
Collapse
|
14
|
Fogarty MJ. Dendritic morphology of motor neurons and interneurons within the compact, semicompact, and loose formations of the rat nucleus ambiguus. Front Cell Neurosci 2024; 18:1409974. [PMID: 38933178 PMCID: PMC11199410 DOI: 10.3389/fncel.2024.1409974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2024] [Accepted: 05/27/2024] [Indexed: 06/28/2024] Open
Abstract
Introduction Motor neurons (MNs) within the nucleus ambiguus innervate the skeletal muscles of the larynx, pharynx, and oesophagus. These muscles are activated during vocalisation and swallowing and must be coordinated with several respiratory and other behaviours. Despite many studies evaluating the projections and orientation of MNs within the nucleus ambiguus, there is no quantitative information regarding the dendritic arbours of MNs residing in the compact and semicompact/loose formations of the nucleus ambiguus. Methods In female and male Fischer 344 rats, we evaluated MN number using Nissl staining, and MN and non-MN dendritic morphology using Golgi-Cox impregnation. Brightfield imaging of transverse Nissl-stained sections (15 μm) was used to stereologically assess the number of nucleus ambiguus MNs within the compact and semicompact/loose formations. Golgi-impregnated neurons within the nucleus ambiguus (sectioned transversely at 180 μm) were traced in 3D from pseudo-confocal images to determine dendritic arbourisation. Results We found a greater abundance of MNs within the compact than the semicompact/loose formation. Dendritic lengths, complexity, and convex hull surface areas were greatest in MNs of the semicompact/loose formation, with compact formation MNs being smaller. MNs from both regions were larger than non-MNs reconstructed within the nucleus ambiguus.
Collapse
Affiliation(s)
- Matthew J. Fogarty
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, United States
| |
Collapse
|
15
|
Sammler D. Signatures of speech and song: "Universal" links despite cultural diversity. SCIENCE ADVANCES 2024; 10:eadp9620. [PMID: 38748801 PMCID: PMC11326043 DOI: 10.1126/sciadv.adp9620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/26/2024] [Accepted: 04/30/2024] [Indexed: 07/13/2024]
Abstract
Equitable collaboration between culturally diverse scientists reveals that acoustic fingerprints of human speech and song share parallel relationships across the globe.
Collapse
Affiliation(s)
- Daniela Sammler
- Max Planck Institute for Empirical Aesthetics, Research Group Neurocognition of Music and Language, Grüneburgweg 14, D-60322 Frankfurt am Main, Germany
- Max Planck Institute for Human Cognitive and Brain Sciences, Department of Neuropsychology, Stephanstr. 1a, D-04103 Leipzig, Germany
| |
Collapse
|
16
|
Kamiloğlu RG, Sauter DA. Sounds like a fight: listeners can infer behavioural contexts from spontaneous nonverbal vocalisations. Cogn Emot 2024; 38:277-295. [PMID: 37997898 PMCID: PMC11057848 DOI: 10.1080/02699931.2023.2285854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Accepted: 11/13/2023] [Indexed: 11/25/2023]
Abstract
When we hear another person laugh or scream, can we tell the kind of situation they are in - for example, whether they are playing or fighting? Nonverbal expressions are theorised to vary systematically across behavioural contexts. Perceivers might be sensitive to these putative systematic mappings and thereby correctly infer contexts from others' vocalisations. Here, in two pre-registered experiments, we test the prediction that listeners can accurately deduce production contexts (e.g. being tickled, discovering threat) from spontaneous nonverbal vocalisations, like sighs and grunts. In Experiment 1, listeners (total n = 3120) matched 200 nonverbal vocalisations to one of 10 contexts using yes/no response options. Using signal detection analysis, we show that listeners were accurate at matching vocalisations to nine of the contexts. In Experiment 2, listeners (n = 337) categorised the production contexts by selecting from 10 response options in a forced-choice task. By analysing unbiased hit rates, we show that participants categorised all 10 contexts at better-than-chance levels. Together, these results demonstrate that perceivers can infer contexts from nonverbal vocalisations at rates that exceed that of random selection, suggesting that listeners are sensitive to systematic mappings between acoustic structures in vocalisations and behavioural contexts.
Collapse
Affiliation(s)
- Roza G. Kamiloğlu
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Disa A. Sauter
- Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands
| |
Collapse
|
17
|
Rutovskaya MV, Volodin IA, Naidenko SV, Erofeeva MN, Alekseeva GS, Zhuravleva PS, Volobueva KA, Kim MD, Volodina EV. Relationship between acoustic traits of protesting cries of domestic kittens (Felis catus) and their individual chances for survival. Behav Processes 2024; 216:105009. [PMID: 38395238 DOI: 10.1016/j.beproc.2024.105009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2023] [Revised: 02/17/2024] [Accepted: 02/18/2024] [Indexed: 02/25/2024]
Abstract
Domestic cat (Felis catus) mothers may rely on offspring cries to allocate resources in favour of individuals with greater chances of survival and sacrifice the weak ones when it is impossible to raise the entire large litter. Potential victims of this maternal strategy could enhance their chances of survival by producing vocalizations whose traits mimic those of higher-quality offspring. We compared acoustic traits of 4990 cries produced during blood sampling by 57 two-week-old captive feral kittens (28 males, 29 females); 47 of them survived to 90 days of age and 10 died of causes unrelated to trauma or aggression. No relationship was found between acoustic parameters and kitten survival; however, a positive relationship was found between survival and body weight. The cries carried moderate cues to individuality and lacked cues to sex. Body weight correlated positively with fundamental frequency and negatively with call rate, duration, peak frequency, and power quartiles. We discuss how dishonesty in acoustic cues to kitten quality could develop as an adaptation for misleading a mother away from allocating resources between weaker and stronger individuals, thus enhancing the survival chances of weaker littermates. Physical constraints, such as body weight, may prevent extensive development of these deceptive vocal traits.
Collapse
Affiliation(s)
- Marina V Rutovskaya
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Ilya A Volodin
- Department of Vertebrate Zoology, Faculty of Biology, Lomonosov Moscow State University, Vorobievy Gory, 1/12, Moscow 119234, Russia.
| | - Sergey V Naidenko
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Mariya N Erofeeva
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Galina S Alekseeva
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Polina S Zhuravleva
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Kseniya A Volobueva
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Mariya D Kim
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| | - Elena V Volodina
- Department of Behaviour and Behavioural Ecology of Mammals, A.N. Severtsov Institute of Ecology and Evolution, Russian Academy of Sciences, Leninsky prospect, 33, Moscow 119071, Russia
| |
Collapse
|
18
|
Kreiman J. Information conveyed by voice quality. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2024; 155:1264-1271. [PMID: 38345424 DOI: 10.1121/10.0024609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 01/09/2024] [Indexed: 02/15/2024]
Abstract
The problem of characterizing voice quality has long caused debate and frustration. The richness of the available descriptive vocabulary is overwhelming, but the density and complexity of the information voices convey lead some to conclude that language can never adequately specify what we hear. Others argue that terminology lacks an empirical basis, so that language-based scales are inadequate a priori. Efforts to provide meaningful instrumental characterizations have also had limited success. Such measures may capture sound patterns but cannot at present explain what characteristics, intentions, or identity listeners attribute to the speaker based on those patterns. However, some terms continually reappear across studies. These terms align with acoustic dimensions accounting for variance across speakers and languages and correlate with size and arousal across species. This suggests that labels for quality rest on a bedrock of biology: We have evolved to perceive voices in terms of size/arousal, and these factors structure both voice acoustics and descriptive language. Such linkages could help integrate studies of signals and their meaning, producing a truly interdisciplinary approach to the study of voice.
Collapse
Affiliation(s)
- Jody Kreiman
- Departments of Head and Neck Surgery and Linguistics, University of California, Los Angeles, Los Angeles, California 90095-1794, USA
| |
Collapse
|
19
|
Sorokowski P, Groyecka-Bernard A, Frackowiak T, Kobylarek A, Kupczyk P, Sorokowska A, Misiak M, Oleszkiewicz A, Bugaj K, Włodarczyk M, Pisanski K. Comparing accuracy in voice-based assessments of biological speaker traits across speech types. Sci Rep 2023; 13:22989. [PMID: 38151496 PMCID: PMC10752881 DOI: 10.1038/s41598-023-49596-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2023] [Accepted: 12/09/2023] [Indexed: 12/29/2023] Open
Abstract
Nonverbal acoustic parameters of the human voice provide cues to a vocaliser's sex, age, and body size that are relevant in human social and sexual communication, and also increasingly so for computer-based voice recognition and synthesis technologies. While studies have shown some capacity in human listeners to gauge these biological traits from unseen speakers, it remains unknown whether speech complexity improves accuracy. Here, in over 200 vocalisers and 1500 listeners of both sexes, we test whether voice-based assessments of sex, age, height and weight vary from isolated vowels and words, to sequences of vowels and words, to full sentences or paragraphs. We show that while listeners judge sex and especially age more accurately as speech complexity increases, accuracy remains high across speech types, even for a single vowel sound. In contrast, the actual heights and weights of vocalisers explain comparatively less variance in listeners' assessments of body size, which do not vary systematically by speech type. Our results thus show that while more complex speech can improve listeners' biological assessments, the gain is ecologically small, as listeners already show an impressive capacity to gauge speaker traits from extremely short bouts of standardised speech, likely owing to within-speaker stability in underlying nonverbal vocal parameters such as voice pitch. We discuss the methodological, technological, and social implications of these results.
Collapse
Affiliation(s)
- Piotr Sorokowski
- Institute of Psychology, University of Wrocław, Wrocław, Poland.
| | | | | | | | - Piotr Kupczyk
- Institute of Psychology, University of Wrocław, Wrocław, Poland
| | | | - Michał Misiak
- Institute of Psychology, University of Wrocław, Wrocław, Poland
- Being Human Lab, University of Wrocław, Wrocław, Poland
| | - Anna Oleszkiewicz
- Institute of Psychology, University of Wrocław, Wrocław, Poland
- Interdisciplinary Center Smell & Taste, Department of Otorhinolaryngology, Technische Universität Dresden, Dresden, Germany
| | - Katarzyna Bugaj
- Institute of Psychology, University of Wrocław, Wrocław, Poland
| | | | - Katarzyna Pisanski
- Institute of Psychology, University of Wrocław, Wrocław, Poland.
- Laboratoire Dynamique du Langage, CNRS/Centre National de La Recherche Scientifique, Université Lyon 2, Lyon, France.
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Saint-Etienne, Inserm, France.
| |
Collapse
|
20
|
Anikin A, Canessa-Pollard V, Pisanski K, Massenet M, Reby D. Beyond speech: Exploring diversity in the human voice. iScience 2023; 26:108204. [PMID: 37908309 PMCID: PMC10613903 DOI: 10.1016/j.isci.2023.108204] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Revised: 07/20/2023] [Accepted: 10/11/2023] [Indexed: 11/02/2023] Open
Abstract
Humans have evolved voluntary control over vocal production for speaking and singing, while preserving the phylogenetically older system of spontaneous nonverbal vocalizations such as laughs and screams. To test for systematic acoustic differences between these vocal domains, we analyzed a broad, cross-cultural corpus representing over 2 h of speech, singing, and nonverbal vocalizations. We show that, while speech is relatively low-pitched and tonal with mostly regular phonation, singing and especially nonverbal vocalizations vary enormously in pitch and often display harsh-sounding, irregular phonation owing to nonlinear phenomena. The evolution of complex supralaryngeal articulatory spectro-temporal modulation has been critical for speech, yet has not significantly constrained laryngeal source modulation. In contrast, articulation is very limited in nonverbal vocalizations, which predominantly contain minimally articulated open vowels and rapid temporal modulation in the roughness range. We infer that vocal source modulation works best for conveying affect, while vocal filter modulation mainly facilitates semantic communication.
Collapse
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
| | - Valentina Canessa-Pollard
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- Psychology, Institute of Psychology, Business and Human Sciences, University of Chichester, Chichester, West Sussex PO19 6PE, UK
| | - Katarzyna Pisanski
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- CNRS French National Centre for Scientific Research, DDL Dynamics of Language Lab, University of Lyon 2, 69007 Lyon, France
- Institute of Psychology, University of Wrocław, Dawida 1, 50-527 Wrocław, Poland
| | - Mathilde Massenet
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
| | - David Reby
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
| |
Collapse
|
21
|
Lockhart-Bouron M, Anikin A, Pisanski K, Corvin S, Cornec C, Papet L, Levréro F, Fauchon C, Patural H, Reby D, Mathevon N. Infant cries convey both stable and dynamic information about age and identity. COMMUNICATIONS PSYCHOLOGY 2023; 1:26. [PMID: 39242685 PMCID: PMC11332224 DOI: 10.1038/s44271-023-00022-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/23/2022] [Accepted: 08/31/2023] [Indexed: 09/09/2024]
Abstract
What information is encoded in the cries of human babies? While it is widely recognized that cries can encode distress levels, whether cries reliably encode the cause of crying remains disputed. Here, we collected 39,201 cries from 24 babies recorded in their homes longitudinally, from 15 days to 3.5 months of age, a database we share publicly for reuse. Based on the parental action that stopped the crying, which matched the parental evaluation of cry cause in 75% of cases, each cry was classified as caused by discomfort, hunger, or isolation. Our analyses show that baby cries provide reliable information about age and identity. Baby voices become more tonal and less shrill with age, while individual acoustic signatures drift throughout the first months of life. In contrast, neither machine learning algorithms nor trained adult listeners can reliably recognize the causes of crying.
Collapse
Affiliation(s)
- Marguerite Lockhart-Bouron
- Neonatal and Pediatric Intensive Care Unit, SAINBIOSE laboratory, Inserm, University Hospital of Saint-Etienne, University of Saint-Etienne, Saint-Etienne, France
| | - Andrey Anikin
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Division of Cognitive Science, Lund University, Lund, Sweden
| | - Katarzyna Pisanski
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Laboratoire Dynamique du Langage DDL, CNRS, University of Lyon 2, Lyon, France
| | - Siloé Corvin
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Central Integration of Pain-Neuropain Laboratory, CRNL, CNRS, Inserm, UCB Lyon 1, University of Saint-Etienne, Saint-Etienne, France
| | - Clément Cornec
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
| | - Léo Papet
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
| | - Florence Levréro
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Institut universitaire de France, Paris, France
| | - Camille Fauchon
- Central Integration of Pain-Neuropain Laboratory, CRNL, CNRS, Inserm, UCB Lyon 1, University of Saint-Etienne, Saint-Etienne, France
| | - Hugues Patural
- Neonatal and Pediatric Intensive Care Unit, SAINBIOSE laboratory, Inserm, University Hospital of Saint-Etienne, University of Saint-Etienne, Saint-Etienne, France
| | - David Reby
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France
- Institut universitaire de France, Paris, France
| | - Nicolas Mathevon
- ENES Bioacoustics Research Laboratory, CRNL, CNRS, Inserm, University of Saint-Etienne, Saint-Etienne, France.
- Institut universitaire de France, Paris, France.
- Ecole Pratique des Hautes Etudes, CHArt Lab, PSL Research University, Paris, France.
| |
Collapse
|
22
|
Schwartz JW, Gouzoules H. Humans read emotional arousal in monkey vocalizations: evidence for evolutionary continuities in communication. PeerJ 2022; 10:e14471. [PMID: 36518288 PMCID: PMC9744152 DOI: 10.7717/peerj.14471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Accepted: 11/06/2022] [Indexed: 12/05/2022] Open
Abstract
Humans and other mammalian species communicate emotions in ways that reflect evolutionary conservation and continuity, an observation first made by Darwin. One approach to testing this hypothesis has been to assess the capacity to perceive the emotional content of the vocalizations of other species. Using a binary forced choice task, we tested perception of the emotional intensity represented in coos and screams of infant and juvenile female rhesus macaques (Macaca mulatta) by 113 human listeners without, and 12 listeners with, experience (as researchers or care technicians) with this species. Each stimulus pair contained one high- and one low-arousal vocalization, as measured at the time of recording by stress hormone levels for coos and the degree of intensity of aggression for screams. For coos as well as screams, both inexperienced and experienced participants accurately identified the high-arousal vocalization at significantly above-chance rates. Experience was associated with significantly greater accuracy with scream stimuli but not coo stimuli, and with a tendency to indicate screams as reflecting greater emotional intensity than coos. Neither measures of empathy, human emotion recognition, nor attitudes toward animal welfare showed any relationship with responses. Participants were sensitive to the fundamental frequency, noisiness, and duration of vocalizations; some of these tendencies likely facilitated accurate perceptions, perhaps due to evolutionary homologies in the physiology of arousal and vocal production between humans and macaques. Overall, our findings support a view of evolutionary continuity in emotional vocal communication. We discuss hypotheses about how distinctive dimensions of human nonverbal communication, like the expansion of scream usage across a range of contexts, might influence perceptions of other species' vocalizations.
Collapse
Affiliation(s)
- Jay W. Schwartz
- Department of Psychology, Emory University, Atlanta, GA, United States
- Psychological Sciences Department, Western Oregon University, Monmouth, OR, United States
| | - Harold Gouzoules
- Department of Psychology, Emory University, Atlanta, GA, United States
| |
Collapse
|
23
|
Chen S, Han C, Wang S, Liu X, Wang B, Wei R, Lei X. Hearing the physical condition: The relationship between sexually dimorphic vocal traits and underlying physiology. Front Psychol 2022; 13:983688. [DOI: 10.3389/fpsyg.2022.983688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
A growing amount of research has shown associations between sexually dimorphic vocal traits and physiological conditions related to reproductive advantage. This paper presented a review of the literature on the relationship between sexually dimorphic vocal traits and sex hormones, body size, and physique. Those physiological conditions are important in reproductive success and mate selection. Regarding sex hormones, there are associations between sex-specific hormones and sexually dimorphic vocal traits; about body size, formant frequencies are more reliable predictors of human body size than pitch/fundamental frequency; with regard to the physique, there is a possible but still controversial association between human voice and strength and combat power, while pitch is more often used as a signal of aggressive intent in conflict. Future research should consider demographic, cross-cultural, cognitive interaction, and emotional motivation influences, in order to more accurately assess the relationship between voice and physiology. Moreover, neurological studies were recommended to gain a deeper understanding of the evolutionary origins and adaptive functions of voice modulation.
Collapse
|
24
|
Gouzoules H. When less is more in the evolution of language. Science 2022; 377:706-707. [PMID: 35951706 DOI: 10.1126/science.add6331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
Abstract
Did loss of vocal fold membranes typical of nonhuman primates enable human speech?
Collapse
|
25
|
Affiliation(s)
- Elisa Demuru
- Laboratoire Dynamique Du Langage, University of Lyon 2, CNRS UMR 5596, Lyon, France
- Équipe de Neuro-Éthologie Sensorielle, University of Lyon/Saint-Étienne, ENES/CRNL, CNRS UMR 5292, Inserm UMR S 1028, Saint-Étienne, France
| | - Cristina Giacoma
- Laboratoire Dynamique Du Langage, University of Lyon 2, CNRS UMR 5596, Lyon, France
- Department of Life Sciences and Systems Biology, University of Torino, Torino, Italy
| |
Collapse
|