1
Loutrari A, Alqadi A, Jiang C, Liu F. Exploring the role of singing, semantics, and amusia screening in speech-in-noise perception in musicians and non-musicians. Cogn Process 2024; 25:147-161. [PMID: 37851154 PMCID: PMC10827916 DOI: 10.1007/s10339-023-01165-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 09/26/2023] [Indexed: 10/19/2023]
Abstract
Sentence repetition has been the focus of extensive psycholinguistic research. The notion that music training can bolster speech perception in adverse auditory conditions has been met with mixed results. In this work, we sought to gauge the effect of babble noise on immediate repetition of spoken and sung phrases of varying semantic content (expository, narrative, and anomalous), initially in 100 English-speaking monolinguals with and without music training. The two cohorts also completed some non-musical cognitive tests and the Montreal Battery of Evaluation of Amusia (MBEA). When disregarding MBEA results, musicians were found to significantly outperform non-musicians in terms of overall repetition accuracy. Sung targets were recalled significantly better than spoken ones across groups in the presence of babble noise. Sung expository targets were recalled better than spoken expository ones, and semantically anomalous content was recalled more poorly in noise. Rerunning the analysis after eliminating thirteen participants who were diagnosed with amusia showed no significant group differences. This suggests that the notion of enhanced speech perception in musicians, in noise or otherwise, needs to be evaluated with caution. Musicianship aside, this study showed for the first time that sung targets presented in babble noise seem to be recalled better than spoken ones. We discuss the present design and the methodological approach of screening for amusia as factors which may partially account for some of the mixed results in the field.
Affiliation(s)
- Ariadne Loutrari
  - School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Reading, RG6 6AL, UK
  - Division of Psychology and Language Sciences, University College London, London, WC1N 1PF, UK
- Aseel Alqadi
  - School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Reading, RG6 6AL, UK
- Cunmei Jiang
  - Music College, Shanghai Normal University, Shanghai, 200234, China
- Fang Liu
  - School of Psychology and Clinical Language Sciences, University of Reading, Earley Gate, Reading, RG6 6AL, UK
2
Brown JA, Bidelman GM. Attention, Musicality, and Familiarity Shape Cortical Speech Tracking at the Musical Cocktail Party. bioRxiv 2023:2023.10.28.562773. [PMID: 37961204 PMCID: PMC10634879 DOI: 10.1101/2023.10.28.562773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
The "cocktail party problem" challenges our ability to understand speech in noisy environments, which often include background music. Here, we explored the role of background music in speech-in-noise listening. Participants listened to an audiobook in familiar and unfamiliar music while tracking keywords in either speech or song lyrics. We used EEG to measure neural tracking of the audiobook. When speech was masked by music, the modeled peak latency at 50 ms (P1TRF) was prolonged compared to unmasked speech. Additionally, P1TRF amplitude was larger in unfamiliar background music, suggesting improved speech tracking. We observed prolonged latencies at 100 ms (N1TRF) when speech was not the attended stimulus, though only in less musical listeners. Our results suggest that early neural representations of speech are enhanced with both attention and concurrent unfamiliar music, indicating that familiar music is more distracting. One's ability to perceptually filter "musical noise" at the cocktail party depends on objective musical abilities.
Affiliation(s)
- Jane A. Brown
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
  - Program in Neuroscience, Indiana University, Bloomington, IN, USA
  - Cognitive Science Program, Indiana University, Bloomington, IN, USA
3
Zhang H, Ma W, Ding H, Zhang Y. Sustainable Benefits of High Variability Phonetic Training in Mandarin-speaking Kindergarteners With Cochlear Implants: Evidence From Categorical Perception of Lexical Tones. Ear Hear 2023; 44:990-1006. [PMID: 36806578 DOI: 10.1097/aud.0000000000001341] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/22/2023]
Abstract
OBJECTIVES: Although pitch reception poses a great challenge for individuals with cochlear implants (CIs), formal auditory training (e.g., high variability phonetic training [HVPT]) has been shown to provide direct benefits in pitch-related perceptual performances such as lexical tone recognition for CI users. As lexical tones in spoken language are expressed with a multitude of distinct spectral, temporal, and intensity cues, it is important to determine the sources of training benefits for CI users. The purpose of the present study was to conduct a rigorous fine-scale evaluation with the categorical perception (CP) paradigm to control the acoustic parameters and test the efficacy and sustainability of HVPT for Mandarin-speaking pediatric CI recipients. The main hypothesis was that HVPT-induced perceptual learning would greatly enhance CI users' ability to extract the primary pitch contours from spoken words for lexical tone identification and discrimination. Furthermore, individual differences in immediate and long-term gains from training would likely be attributable to baseline performance and duration of CI use. DESIGN: Twenty-eight prelingually deaf Mandarin-speaking kindergarteners with CIs were tested. Half of them received five sessions of HVPT within a period of 3 weeks. The other half served as controls and did not receive the formal training. Two classical CP tasks on a tonal continuum from Mandarin tone 1 (high-flat in pitch) to tone 2 (mid-rising in pitch) with fixed acoustic features of duration and intensity were administered before (pretest), immediately after (posttest), and 10 weeks posttraining termination (follow-up test). Participants were instructed to either label a speech stimulus along the continuum (i.e., identification task) or determine whether a pair of stimuli separated by zero or two steps from the continuum was the same or different (i.e., discrimination task). Identification function measures (i.e., boundary position and boundary width) and discrimination function scores (i.e., between-category score, within-category score, and peakedness score) were assessed for each child participant across the three test sessions. RESULTS: Linear mixed-effects (LME) models showed significant training-induced enhancement in lexical tone categorization with significantly narrower boundary width and better between-category discrimination in the immediate posttest over pretest for the trainees. Furthermore, training-induced gains were reliably retained in the follow-up test 10 weeks after training. By contrast, no significant changes were found in the control group across sessions. Regression analysis confirmed that baseline performance (i.e., boundary width in the pretest session) and duration of CI use were significant predictors for the magnitude of training-induced benefits. CONCLUSIONS: The stringent CP tests with synthesized stimuli that excluded acoustic cues other than the pitch contour and were never used in training showed strong evidence for the efficacy of HVPT in yielding immediate and sustained improvement in lexical tone categorization for Mandarin-speaking children with CIs. The training results and individual differences have remarkable implications for developing personalized computer-based short-term HVPT protocols that may have sustainable long-term benefits for aural rehabilitation in this clinical population.
Affiliation(s)
- Hao Zhang
  - Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Wen Ma
  - Center for Clinical Neurolinguistics, School of Foreign Languages and Literature, Shandong University, Jinan, China
- Hongwei Ding
  - Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
- Yang Zhang
  - Department of Speech-Language-Hearing Sciences and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, Minnesota, USA
4
Smit EA, Milne AJ, Escudero P. Music Perception Abilities and Ambiguous Word Learning: Is There Cross-Domain Transfer in Nonmusicians? Front Psychol 2022; 13:801263. [PMID: 35401340 PMCID: PMC8984940 DOI: 10.3389/fpsyg.2022.801263] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/25/2021] [Accepted: 02/08/2022] [Indexed: 11/14/2022] Open
Abstract
Perception of music and speech is based on similar auditory skills, and it is often suggested that those with enhanced music perception skills may perceive and learn novel words more easily. The current study tested whether music perception abilities are associated with novel word learning in an ambiguous learning scenario. Using a cross-situational word learning (CSWL) task, nonmusician adults were exposed to word-object pairings between eight novel words and visual referents. Novel words were either non-minimal pairs differing in all sounds or minimal pairs differing in their initial consonant or vowel. In order to be successful in this task, learners need to be able to correctly encode the phonological details of the novel words and have sufficient auditory working memory to remember the correct word-object pairings. Using the Mistuning Perception Test (MPT) and the Melodic Discrimination Test (MDT), we measured learners’ pitch perception and auditory working memory. We predicted that those with higher MPT and MDT values would perform better in the CSWL task and in particular for novel words with high phonological overlap (i.e., minimal pairs). We found that higher musical perception skills led to higher accuracy for non-minimal pairs and minimal pairs differing in their initial consonant. Interestingly, this was not the case for vowel minimal pairs. We discuss the results in relation to theories of second language word learning such as the Second Language Perception model (L2LP).
Affiliation(s)
- Eline A. Smit
  - The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
  - ARC Centre of Excellence for the Dynamics of Language, Canberra, ACT, Australia
  - *Correspondence: Eline A. Smit
- Andrew J. Milne
  - The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
- Paola Escudero
  - The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
  - ARC Centre of Excellence for the Dynamics of Language, Canberra, ACT, Australia
5
Zhu J, Chen X, Chen F, Wiener S. Individuals With Congenital Amusia Show Degraded Speech Perception but Preserved Statistical Learning for Tone Languages. J Speech Lang Hear Res 2022; 65:53-69. [PMID: 34860571 DOI: 10.1044/2021_jslhr-21-00383] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
PURPOSE: Individuals with congenital amusia exhibit degraded speech perception. This study examined whether adult Chinese Mandarin listeners with amusia were still able to extract the statistical regularities of Mandarin speech sounds, despite their degraded speech perception. METHOD: Using the gating paradigm with monosyllabic syllable-tone words, we tested 19 Mandarin-speaking amusics and 19 musically intact controls. Listeners heard increasingly longer fragments of the acoustic signal across eight duration-blocked gates. The stimuli varied in syllable token frequency and syllable-tone co-occurrence probability. The correct syllable-tone word, correct syllable-only, correct tone-only, and correct syllable-incorrect tone responses were compared respectively between the two groups using mixed-effects models. RESULTS: Amusics were less accurate than controls in terms of the correct word, correct syllable-only, and correct tone-only responses. Amusics, however, showed consistent patterns of top-down processing, as indicated by more accurate responses to high-frequency syllables, high-probability tones, and tone errors all in manners similar to those of the control listeners. CONCLUSIONS: Amusics are able to learn syllable and tone statistical regularities from the language input. This extends previous work by showing that amusics can track phonological segment and pitch cues despite their degraded speech perception. The observed speech deficits in amusics are therefore not due to an abnormal statistical learning mechanism. These results support rehabilitation programs aimed at improving amusics' sensitivity to pitch.
Affiliation(s)
- Jiaqiang Zhu
  - College of Foreign Languages, Hunan University, Changsha, China
- Xiaoxiang Chen
  - College of Foreign Languages, Hunan University, Changsha, China
- Fei Chen
  - College of Foreign Languages, Hunan University, Changsha, China
- Seth Wiener
  - Department of Modern Languages, Carnegie Mellon University, Pittsburgh, PA
6
Chen S, Yang Y, Wayland R. Categorical Perception of Mandarin Pitch Directions by Cantonese-Speaking Musicians and Non-musicians. Front Psychol 2021; 12:713949. [PMID: 34721160 PMCID: PMC8551581 DOI: 10.3389/fpsyg.2021.713949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2021] [Accepted: 09/15/2021] [Indexed: 11/22/2022] Open
Abstract
Purpose: This study investigates whether Cantonese-speaking musicians show stronger categorical perception (CP) than Cantonese-speaking non-musicians in perceiving pitch directions generated based on Mandarin tones. It also examines whether musicians are more effective in processing stimuli and more sensitive to subtle differences caused by vowel quality. Methods: Cantonese-speaking musicians and non-musicians performed a categorical identification and a discrimination task on rising and falling continua of fundamental frequency generated based on Mandarin level, rising and falling tones on two vowels with nine duration values. Results: Cantonese-speaking musicians exhibited stronger CP of pitch contours than non-musicians in both the identification and discrimination tasks. Compared to non-musicians, musicians were also more sensitive to changes in stimulus duration and to the intrinsic F0 in pitch processing. Conclusion: CP was strengthened by musical experience; musicians benefited more from increased stimulus duration and were more efficient in pitch processing. Musicians might be able to better use the extra time to form an auditory representation with more acoustic details. Even with more efficiency in pitch processing, musicians' ability to detect subtle pitch changes caused by intrinsic F0 was not undermined, which is likely due to their superior ability to process temporal information. These results thus suggest that musicians may have a great advantage in learning the tones of a second language.
Affiliation(s)
- Si Chen
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, SAR China
  - Hong Kong Polytechnic University-Peking University Research Centre on Chinese Linguistics, Hong Kong, SAR China
- Yike Yang
  - Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, SAR China
- Ratree Wayland
  - Department of Linguistics, University of Florida, Gainesville, FL, United States
7
Tao R, Zhang K, Peng G. Music Does Not Facilitate Lexical Tone Normalization: A Speech-Specific Perceptual Process. Front Psychol 2021; 12:717110. [PMID: 34777097 PMCID: PMC8585521 DOI: 10.3389/fpsyg.2021.717110] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/30/2021] [Accepted: 09/30/2021] [Indexed: 11/13/2022] Open
Abstract
Listeners use the immediate context to efficiently normalize variable vocal streams into standard phonological units. However, researchers have debated whether non-speech contexts can also serve as valid cues for speech normalization. Supporters of the two sides proposed a general-auditory hypothesis and a speech-specific hypothesis to explain the underlying mechanisms. A possible confounding factor behind this inconsistency is the listeners' perceptual familiarity with the contexts, as the non-speech contexts were perceptually unfamiliar to listeners. In this study, we examined this confounding factor by recruiting a group of native Cantonese speakers with sufficient musical training experience and a control group with minimal musical training. Participants performed lexical tone judgment tasks in three contextual conditions, i.e., speech, non-speech, and music context conditions. Both groups were familiar with the speech context and not familiar with the non-speech context. The musician group was more familiar with the music context than the non-musician group. The results showed evidence of lexical tone normalization in the speech context but not in the non-speech or music contexts. More importantly, musicians did not outperform non-musicians in any contextual condition even though the musicians were experienced at pitch perception, indicating that there is no noticeable transfer in pitch perception from the music domain to the linguistic domain for tonal language speakers. The findings showed that even high familiarity with a non-linguistic context cannot elicit an effective lexical tone normalization process, supporting the speech-specific basis of the perceptual normalization process.
Affiliation(s)
- Gang Peng
  - Research Centre for Language, Cognition, and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China
8
Zhang C, Ho OY, Shao J, Ou J, Law SP. Dissociation of tone merger and congenital amusia in Hong Kong Cantonese. PLoS One 2021; 16:e0253982. [PMID: 34197546 PMCID: PMC8248700 DOI: 10.1371/journal.pone.0253982] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/19/2020] [Accepted: 06/17/2021] [Indexed: 12/03/2022] Open
Abstract
While the issue of individual variation has been widely studied in second language learning or processing, it is less well understood how perceptual and musical aptitude differences can explain individual variation in native speech processing. In the current study, we make use of tone merger in Hong Kong Cantonese, an ongoing sound change that concerns the merging of tones in perception, production or both in a portion of native speakers, to examine the possible relationship between tone merger and musical and pitch abilities. Although a previous study has reported the occurrence of tone merger independently of musical training, it has not been investigated before whether tone-merging individuals, especially those merging tones in perception, would have inferior musical perception and fine-grained pitch sensitivities, given the close relationship of speech and music. To this end, we tested three groups of tone-merging individuals with various tone perception and production profiles on musical perception and pitch threshold tasks, in comparison to a group of Cantonese speakers with congenital amusia, and another group of controls without tone merger or amusia. Additionally, the amusics were compared with tone-merging individuals on the details of their tone discrimination and production profiles. The results showed a clear dissociation of tone merger and amusia, with the tone-merging individuals exhibiting intact musical and pitch abilities; on the other hand, the amusics demonstrated widespread difficulties in tone discrimination yet intact tone production, in contrast to the highly selective confusion of a specific tone pair in production or discrimination in tone-merging individuals. These findings provide the first evidence that tone merger and amusia are distinct from each other, and further suggest that the cause of tone merger may lie elsewhere rather than being driven by musical or pitch deficits. We also discussed issues arising from the current findings regarding the neural mechanisms of tone merger and amusia.
Affiliation(s)
- Caicai Zhang
  - Research Centre for Language, Cognition, and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Oi-Yee Ho
  - Ear Institute, University College London, London, United Kingdom
- Jing Shao
  - Department of English Language and Literature, Hong Kong Baptist University, Hong Kong SAR, China
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jinghua Ou
  - Department of Linguistics, University of Chicago, Chicago, IL, United States of America
- Sam-Po Law
  - Unit of Human Communication, Development, and Information Sciences, The University of Hong Kong, Hong Kong SAR, China
9
Ma J, Zhu J, Yang Y, Chen F. The Development of Categorical Perception of Segments and Suprasegments in Mandarin-Speaking Preschoolers. Front Psychol 2021; 12:693366. [PMID: 34354636 PMCID: PMC8329735 DOI: 10.3389/fpsyg.2021.693366] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/10/2021] [Accepted: 05/27/2021] [Indexed: 11/13/2022] Open
Abstract
This study investigated the developmental trajectories of categorical perception (CP) of segments (i.e., stops) and suprasegments (i.e., lexical tones) in an attempt to examine the perceptual development of phonological categories and whether CP of suprasegments develops in parallel with that of segments. Forty-seven Mandarin-speaking monolingual preschoolers aged four to six years and fourteen adults completed both identification and discrimination tasks of the Tone 1-2 continuum and the /pa/-/pʰa/ continuum. Results revealed that children could perceive both lexical tones and aspiration of stops in a categorical manner by age four. The boundary position did not depend on age, with children having similar positions to adults regardless of speech continuum types. The boundary width, on the other hand, reached the adult-like level at age six for lexical tones, but not for stops. In addition, the within-category discrimination score did not differ significantly between children and adults for both continua. The between-category discrimination score improved with age and achieved the adult-like level at age five for lexical tones, but still not for stops even at age six. This suggests that the fine-grained perception of phonological categories is a protracted process. The improvement and varying timeline of the development of segments and suprasegments are discussed in relation to statistical learning of the regularities of speech sounds in ambient language, ongoing maturation of perceptual systems, the memory mechanism underlying perceptual learning, and the intrinsic nature of speech elements.
Affiliation(s)
- Junzhou Ma
  - School of Foreign Languages, Taizhou University, Taizhou, China
- Jiaqiang Zhu
  - School of Foreign Languages, Hunan University, Changsha, China
- Yuxiao Yang
  - Foreign Studies College, Hunan Normal University, Changsha, China
- Fei Chen
  - School of Foreign Languages, Hunan University, Changsha, China
10
Zhu J, Chen X, Chen F. Book Review: Speech Perception, Production and Acquisition: Multidisciplinary Approaches in Chinese Languages. Front Psychol 2021. [PMCID: PMC8141580 DOI: 10.3389/fpsyg.2021.621481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open