1. Zhu Y, Li C, Hendry C, Glass J, Canseco-Gonzalez E, Pitts MA, Dykstra AR. Isolating Neural Signatures of Conscious Speech Perception with a No-Report Sine-Wave Speech Paradigm. J Neurosci 2024; 44:e0145232023. PMID: 38191569; PMCID: PMC10883607; DOI: 10.1523/jneurosci.0145-23.2023.
Abstract
Identifying neural correlates of conscious perception is a fundamental endeavor of cognitive neuroscience. Most studies so far have focused on visual awareness along with trial-by-trial reports of task-relevant stimuli, which can confound neural measures of perceptual awareness with postperceptual processing. Here, we used a three-phase sine-wave speech paradigm that dissociated conscious speech perception from task relevance while recording EEG in humans of both sexes. Compared with tokens perceived as noise, physically identical sine-wave speech tokens that were perceived as speech elicited a left-lateralized, near-vertex negativity, which we interpret as a phonological version of a perceptual awareness negativity. This response appeared between 200 and 300 ms after token onset and was not present for frequency-flipped control tokens that were never perceived as speech. In contrast, the P3b elicited by task-irrelevant tokens did not significantly differ when the tokens were perceived as speech versus noise and was only enhanced for tokens that were both perceived as speech and relevant to the task. Our results extend the findings from previous studies on visual awareness and speech perception and suggest that correlates of conscious perception, across types of conscious content, are most likely to be found in midlatency negative-going brain responses in content-specific sensory areas.
Affiliation(s)
- Yunkai Zhu
- Department of Biomedical Engineering, University of Miami, Coral Gables, Florida 33143
- Charlotte Li
- Department of Psychology, Reed College, Portland, Oregon 97202
- Camille Hendry
- Department of Psychology, Reed College, Portland, Oregon 97202
- James Glass
- Department of Psychology, Reed College, Portland, Oregon 97202
- Michael A Pitts
- Department of Psychology, Reed College, Portland, Oregon 97202
- Andrew R Dykstra
- Department of Biomedical Engineering, University of Miami, Coral Gables, Florida 33143
2. Karunathilake IMD, Kulasingham JP, Simon JZ. Neural tracking measures of speech intelligibility: Manipulating intelligibility while keeping acoustics unchanged. Proc Natl Acad Sci U S A 2023; 120:e2309166120. PMID: 38032934; PMCID: PMC10710032; DOI: 10.1073/pnas.2309166120.
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle the effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise-vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (nondegraded) version of the speech. This intermediate priming, which generates a "pop-out" percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate temporal response functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. mTRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex, in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
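The temporal response function estimation underlying this kind of analysis amounts to time-lagged regularized regression from a stimulus feature (e.g., the speech envelope) onto a neural recording. A minimal sketch follows; this is not the authors' code, and the toy signal, lag count, and ridge parameter are illustrative assumptions:

```python
import numpy as np

def lag_matrix(stim, n_lags):
    """Time-lagged design matrix: column k holds stim delayed by k samples."""
    n = len(stim)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[:n - k]
    return X

def fit_trf(stim, resp, n_lags, lam=1.0):
    """Ridge-regression TRF: w = (X'X + lam*I)^-1 X'y."""
    X = lag_matrix(stim, n_lags)
    XtX = X.T @ X + lam * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ resp)

def predict(stim, w):
    """Predicted neural response from the estimated TRF."""
    return lag_matrix(stim, len(w)) @ w

# Toy check: a response generated by a known kernel is recovered.
rng = np.random.default_rng(0)
stim = rng.standard_normal(2000)
kernel = np.array([0.0, 0.5, 1.0, 0.5, 0.0])      # true TRF, peak at lag 2
resp = np.convolve(stim, kernel)[:2000] + 0.1 * rng.standard_normal(2000)
w = fit_trf(stim, resp, n_lags=8, lam=1e-3)
r = np.corrcoef(predict(stim, w), resp)[0, 1]      # prediction accuracy
```

In practice, packages such as the mTRF-Toolbox or Eelbrain add cross-validated regularization and multivariate (multi-feature) designs on top of this core estimator.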
Affiliation(s)
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742
- Department of Biology, University of Maryland, College Park, MD 20742
- Institute for Systems Research, University of Maryland, College Park, MD 20742
3. Mai G, Wang WSY. Distinct roles of delta- and theta-band neural tracking for sharpening and predictive coding of multi-level speech features during spoken language processing. Hum Brain Mapp 2023; 44:6149-6172. PMID: 37818940; PMCID: PMC10619373; DOI: 10.1002/hbm.26503.
Abstract
The brain tracks and encodes multi-level speech features during spoken language processing. It is evident that this speech tracking is dominant at low frequencies (<8 Hz) including delta and theta bands. Recent research has demonstrated distinctions between delta- and theta-band tracking but has not elucidated how they differentially encode speech across linguistic levels. Here, we hypothesised that delta-band tracking encodes prediction errors (enhanced processing of unexpected features) while theta-band tracking encodes neural sharpening (enhanced processing of expected features) when people perceive speech with different linguistic contents. EEG responses were recorded when normal-hearing participants attended to continuous auditory stimuli that contained different phonological/morphological and semantic contents: (1) real-words, (2) pseudo-words and (3) time-reversed speech. We employed multivariate temporal response functions to measure EEG reconstruction accuracies in response to acoustic (spectrogram), phonetic and phonemic features with the partialling procedure that singles out unique contributions of individual features. We found higher delta-band accuracies for pseudo-words than real-words and time-reversed speech, especially during encoding of phonetic features. Notably, individual time-lag analyses showed that significantly higher accuracies for pseudo-words than real-words started at early processing stages for phonetic encoding (<100 ms post-feature) and later stages for acoustic and phonemic encoding (>200 and 400 ms post-feature, respectively). Theta-band accuracies, on the other hand, were higher when stimuli had richer linguistic content (real-words > pseudo-words > time-reversed speech). Such effects also started at early stages (<100 ms post-feature) during encoding of all individual features or when all features were combined. 
We argue that these results indicate that delta-band tracking may play a role in predictive coding, leading to greater tracking of pseudo-words due to the presence of unexpected/unpredicted semantic information, while theta-band tracking encodes sharpened signals caused by more expected phonological/morphological and semantic contents. The early presence of these effects reflects rapid computations of sharpening and prediction errors. Moreover, by measuring changes in EEG alpha power, we did not find evidence that the observed effects can be solely explained by attentional demands or listening effort. Finally, we used directed information analyses to illustrate feedforward and feedback information transfers between prediction errors and sharpening across linguistic levels, showcasing how our results fit the hierarchical predictive coding framework. Together, we suggest distinct roles for delta- and theta-band neural tracking in sharpening and predictive coding of multi-level speech features during spoken language processing.
Affiliation(s)
- Guangting Mai
- Hearing Theme, National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, UK
- Academic Unit of Mental Health and Clinical Neurosciences, School of Medicine, The University of Nottingham, Nottingham, UK
- Division of Psychology and Language Sciences, Faculty of Brain Sciences, University College London, London, UK
- William S-Y Wang
- Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Language Engineering Laboratory, The Chinese University of Hong Kong, Hong Kong, China
4. Karunathilake ID, Kulasingham JP, Simon JZ. Neural Tracking Measures of Speech Intelligibility: Manipulating Intelligibility while Keeping Acoustics Unchanged. bioRxiv [Preprint] 2023:2023.05.18.541269. PMID: 37292644; PMCID: PMC10245672; DOI: 10.1101/2023.05.18.541269.
Abstract
Neural speech tracking has advanced our understanding of how our brains rapidly map an acoustic speech signal onto linguistic representations and ultimately meaning. It remains unclear, however, how speech intelligibility is related to the corresponding neural responses. Many studies addressing this question vary the level of intelligibility by manipulating the acoustic waveform, but this makes it difficult to cleanly disentangle effects of intelligibility from underlying acoustical confounds. Here, using magnetoencephalography (MEG) recordings, we study neural measures of speech intelligibility by manipulating intelligibility while keeping the acoustics strictly unchanged. Acoustically identical degraded speech stimuli (three-band noise vocoded, ~20 s duration) are presented twice, but the second presentation is preceded by the original (non-degraded) version of the speech. This intermediate priming, which generates a 'pop-out' percept, substantially improves the intelligibility of the second degraded speech passage. We investigate how intelligibility and acoustical structure affect acoustic and linguistic neural representations using multivariate Temporal Response Functions (mTRFs). As expected, behavioral results confirm that perceived speech clarity is improved by priming. TRF analysis reveals that auditory (speech envelope and envelope onset) neural representations are not affected by priming, but only by the acoustics of the stimuli (bottom-up driven). Critically, our findings suggest that segmentation of sounds into words emerges with better speech intelligibility, and most strongly at the later (~400 ms latency) word processing stage, in prefrontal cortex (PFC), in line with engagement of top-down mechanisms associated with priming. Taken together, our results show that word representations may provide some objective measures of speech comprehension.
Affiliation(s)
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA
- Department of Biology, University of Maryland, College Park, MD 20742, USA
- Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
5. Jiang J, Johnson JCS, Requena-Komuro MC, Benhamou E, Sivasathiaseelan H, Chokesuwattanaskul A, Nelson A, Nortley R, Weil RS, Volkmer A, Marshall CR, Bamiou DE, Warren JD, Hardy CJD. Comprehension of acoustically degraded speech in Alzheimer's disease and primary progressive aphasia. Brain 2023; 146:4065-4076. PMID: 37184986; PMCID: PMC10545509; DOI: 10.1093/brain/awad163.
Abstract
Successful communication in daily life depends on accurate decoding of speech signals that are acoustically degraded by challenging listening conditions. This process presents the brain with a demanding computational task that is vulnerable to neurodegenerative pathologies. However, despite recent intense interest in the link between hearing impairment and dementia, comprehension of acoustically degraded speech in these diseases has been little studied. Here we addressed this issue in a cohort of 19 patients with typical Alzheimer's disease and 30 patients representing the three canonical syndromes of primary progressive aphasia (non-fluent/agrammatic variant primary progressive aphasia; semantic variant primary progressive aphasia; logopenic variant primary progressive aphasia), compared to 25 healthy age-matched controls. As a paradigm for the acoustically degraded speech signals of daily life, we used noise-vocoding: synthetic division of the speech signal into frequency channels constituted from amplitude-modulated white noise, such that fewer channels convey less spectrotemporal detail thereby reducing intelligibility. We investigated the impact of noise-vocoding on recognition of spoken three-digit numbers and used psychometric modelling to ascertain the threshold number of noise-vocoding channels required for 50% intelligibility by each participant. Associations of noise-vocoded speech intelligibility threshold with general demographic, clinical and neuropsychological characteristics and regional grey matter volume (defined by voxel-based morphometry of patients' brain images) were also assessed. Mean noise-vocoded speech intelligibility threshold was significantly higher in all patient groups than in healthy controls, and significantly higher in Alzheimer's disease and logopenic variant primary progressive aphasia than in semantic variant primary progressive aphasia (all P < 0.05). In a receiver operating characteristic analysis, vocoded intelligibility threshold discriminated patients with Alzheimer's disease, non-fluent variant, and logopenic variant primary progressive aphasia from healthy controls very well. Further, this central hearing measure correlated with overall disease severity but not with peripheral hearing or clear speech perception. Neuroanatomically, after correcting for multiple voxel-wise comparisons in predefined regions of interest, impaired noise-vocoded speech comprehension across syndromes was significantly associated (P < 0.05) with atrophy of left planum temporale, angular gyrus and anterior cingulate gyrus: a cortical network that has previously been widely implicated in processing degraded speech signals. Our findings suggest that the comprehension of acoustically altered speech captures an auditory brain process relevant to daily hearing and communication in major dementia syndromes, with novel diagnostic and therapeutic implications.
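The noise-vocoding manipulation described here (each frequency band's fine structure replaced by its envelope imposed on white noise) can be sketched roughly as follows. This is not the study's stimulus pipeline: the FFT-based band filtering, band edges, and stand-in test tone are all illustrative assumptions.

```python
import numpy as np

def band_mask(n, fs, lo, hi):
    """Boolean mask selecting FFT bins between lo and hi Hz (both sidebands)."""
    freqs = np.abs(np.fft.fftfreq(n, 1.0 / fs))
    return (freqs >= lo) & (freqs < hi)

def envelope(x):
    """Amplitude envelope via the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

def noise_vocode(signal, fs, edges, rng):
    """Sum over bands of (band envelope) x (band-limited white-noise carrier).

    Real vocoders typically refilter each modulated band; omitted for brevity.
    """
    n = len(signal)
    out = np.zeros(n)
    noise = rng.standard_normal(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = band_mask(n, fs, lo, hi)
        band = np.real(np.fft.ifft(np.fft.fft(signal) * mask))
        carrier = np.real(np.fft.ifft(np.fft.fft(noise) * mask))
        out += envelope(band) * carrier
    return out

fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)   # stand-in for a speech recording
edges = [100, 500, 1500, 4000]       # 3 bands; edges are illustrative
voc = noise_vocode(tone, fs, edges, np.random.default_rng(1))
```

Fewer bands (shorter `edges`) preserve less spectrotemporal detail, which is the intelligibility manipulation the study exploits.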
Affiliation(s)
- Jessica Jiang
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Jeremy C S Johnson
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Maï-Carmen Requena-Komuro
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Kidney Cancer Program, UT Southwestern Medical Centre, Dallas, TX 75390, USA
- Elia Benhamou
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Harri Sivasathiaseelan
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Anthipa Chokesuwattanaskul
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Division of Neurology, Department of Internal Medicine, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok 10330, Thailand
- Annabel Nelson
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Ross Nortley
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Wexham Park Hospital, Frimley Health NHS Foundation Trust, Slough SL2 4HL, UK
- Rimona S Weil
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Anna Volkmer
- Division of Psychology and Language Sciences, University College London, London WC1H 0AP, UK
- Charles R Marshall
- Preventive Neurology Unit, Wolfson Institute of Population Health, Queen Mary University of London, London EC1M 6BQ, UK
- Doris-Eva Bamiou
- UCL Ear Institute and UCL/UCLH Biomedical Research Centre, National Institute of Health Research, University College London, London WC1X 8EE, UK
- Jason D Warren
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
- Chris J D Hardy
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, University College London, London WC1N 3AR, UK
6. Bernstein LE, Auer ET, Eberhardt SP. Modality-Specific Perceptual Learning of Vocoded Auditory versus Lipread Speech: Different Effects of Prior Information. Brain Sci 2023; 13:1008. PMID: 37508940; PMCID: PMC10377548; DOI: 10.3390/brainsci13071008.
Abstract
Traditionally, speech perception training paradigms have not adequately taken into account the possibility that there may be modality-specific requirements for perceptual learning with auditory-only (AO) versus visual-only (VO) speech stimuli. The study reported here investigated the hypothesis that there are modality-specific differences in how prior information is used by normal-hearing participants during vocoded versus VO speech training. Two different experiments, one with vocoded AO speech (Experiment 1) and one with VO, lipread, speech (Experiment 2), investigated the effects of giving different types of prior information to trainees on each trial during training. The training was for four ~20 min sessions, during which participants learned to label novel visual images using novel spoken words. Participants were assigned to different types of prior information during training: Word Group trainees saw a printed version of each training word (e.g., "tethon"), and Consonant Group trainees saw only its consonants (e.g., "t_th_n"). Additional groups received no prior information (i.e., Experiment 1, AO Group; Experiment 2, VO Group) or a spoken version of the stimulus in a different modality from the training stimuli (Experiment 1, Lipread Group; Experiment 2, Vocoder Group). That is, in each experiment, there was a group that received prior information in the modality of the training stimuli from the other experiment. In both experiments, the Word Groups had difficulty retaining the novel words they attempted to learn during training. However, when the training stimuli were vocoded, the Word Group improved their phoneme identification. When the training stimuli were visual speech, the Consonant Group improved their phoneme identification and their open-set sentence lipreading. The results are considered in light of theoretical accounts of perceptual learning in relationship to perceptual modality.
Affiliation(s)
- Lynne E Bernstein
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Edward T Auer
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Silvio P Eberhardt
- Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
7. Cope TE, Sohoglu E, Peterson KA, Jones PS, Rua C, Passamonti L, Sedley W, Post B, Coebergh J, Butler CR, Garrard P, Abdel-Aziz K, Husain M, Griffiths TD, Patterson K, Davis MH, Rowe JB. Temporal lobe perceptual predictions for speech are instantiated in motor cortex and reconciled by inferior frontal cortex. Cell Rep 2023; 42:112422. PMID: 37099422; DOI: 10.1016/j.celrep.2023.112422.
Abstract
Humans use predictions to improve speech perception, especially in noisy environments. Here we use 7-T functional MRI (fMRI) to decode brain representations of written phonological predictions and degraded speech signals in healthy humans and people with selective frontal neurodegeneration (non-fluent variant primary progressive aphasia [nfvPPA]). Multivariate analyses of item-specific patterns of neural activation indicate dissimilar representations of verified and violated predictions in left inferior frontal gyrus, suggestive of processing by distinct neural populations. In contrast, precentral gyrus represents a combination of phonological information and weighted prediction error. In the presence of intact temporal cortex, frontal neurodegeneration results in inflexible predictions. This manifests neurally as a failure to suppress incorrect predictions in anterior superior temporal gyrus and reduced stability of phonological representations in precentral gyrus. We propose a tripartite speech perception network in which inferior frontal gyrus supports prediction reconciliation in echoic memory, and precentral gyrus invokes a motor model to instantiate and refine perceptual predictions for speech.
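The multivariate comparison of item-specific activation patterns described above ("dissimilar representations of verified and violated predictions") is commonly computed as a correlation-distance matrix between pattern vectors. A minimal sketch, with purely synthetic "verified" and "violated" patterns standing in for the study's item-specific fMRI data:

```python
import numpy as np

def correlation_rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between
    activation patterns (rows = items/conditions, cols = voxels)."""
    return 1.0 - np.corrcoef(patterns)

# Synthetic illustration: a pattern similar to a reference vs. an unrelated one.
rng = np.random.default_rng(3)
base = rng.standard_normal(50)                    # reference pattern (50 "voxels")
verified = base + 0.1 * rng.standard_normal(50)   # near-copy: low dissimilarity
violated = rng.standard_normal(50)                # independent: high dissimilarity
rdm = correlation_rdm(np.vstack([base, verified, violated]))
```

Dissimilar responses across conditions, as in the paper's inferior frontal result, would appear as large off-diagonal entries between those conditions' patterns.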
Affiliation(s)
- Thomas E Cope
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; Cambridge University Hospitals NHS Trust, Cambridge CB2 0QQ, UK
- Ediz Sohoglu
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; School of Psychology, University of Sussex, Brighton BN1 9RH, UK
- Katie A Peterson
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Department of Radiology, University of Cambridge, Cambridge CB2 0QQ, UK
- P Simon Jones
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK
- Catarina Rua
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK
- Luca Passamonti
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK
- William Sedley
- Biosciences Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Brechtje Post
- Theoretical and Applied Linguistics, Faculty of Modern & Medieval Languages & Linguistics, University of Cambridge, Cambridge CB3 9DA, UK
- Jan Coebergh
- Ashford and St Peter's Hospital, Ashford TW15 3AA, UK; St George's Hospital, London SW17 0QT, UK
- Christopher R Butler
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK; Faculty of Medicine, Department of Brain Sciences, Imperial College London, London W12 0NN, UK
- Peter Garrard
- St George's Hospital, London SW17 0QT, UK; Molecular and Clinical Sciences Research Institute, St. George's, University of London, London SW17 0RE, UK
- Khaled Abdel-Aziz
- Ashford and St Peter's Hospital, Ashford TW15 3AA, UK; St George's Hospital, London SW17 0QT, UK
- Masud Husain
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK
- Timothy D Griffiths
- Biosciences Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Karalyn Patterson
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK
- Matthew H Davis
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK
- James B Rowe
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, UK; Cambridge University Hospitals NHS Trust, Cambridge CB2 0QQ, UK
8. Rimmele JM, Sun Y, Michalareas G, Ghitza O, Poeppel D. Dynamics of Functional Networks for Syllable and Word-Level Processing. Neurobiol Lang (Camb) 2023; 4:120-144. PMID: 37229144; PMCID: PMC10205074; DOI: 10.1162/nol_a_00089.
Abstract
Speech comprehension requires the ability to temporally segment the acoustic input for higher-level linguistic analysis. Oscillation-based approaches suggest that low-frequency auditory cortex oscillations track syllable-sized acoustic information and therefore emphasize the relevance of syllabic-level acoustic processing for speech segmentation. How syllabic processing interacts with higher levels of speech processing, beyond segmentation, including the anatomical and neurophysiological characteristics of the networks involved, is debated. In two MEG experiments, we investigate lexical and sublexical word-level processing and the interactions with (acoustic) syllable processing using a frequency-tagging paradigm. Participants listened to disyllabic words presented at a rate of 4 syllables/s. Lexical content (native language), sublexical syllable-to-syllable transitions (foreign language), or mere syllabic information (pseudo-words) were presented. Two conjectures were evaluated: (i) syllable-to-syllable transitions contribute to word-level processing; and (ii) processing of words activates brain areas that interact with acoustic syllable processing. We show that syllable-to-syllable transition information, compared to mere syllable information, activated a bilateral superior and middle temporal and inferior frontal network. Lexical content resulted, additionally, in increased neural activity. Evidence for an interaction of word- and acoustic syllable-level processing was inconclusive. Decreases in syllable tracking (cerebro-acoustic coherence) in auditory cortex and increases in cross-frequency coupling between right superior and middle temporal and frontal areas were found when lexical content was present compared to all other conditions, but not when conditions were compared separately. The data provide experimental insight into how subtle and sensitive a cue syllable-to-syllable transition information is for word-level processing.
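Cerebro-acoustic coherence of the kind measured above can be sketched as the magnitude of the trial-averaged, phase-normalized cross-spectrum between the stimulus envelope and a neural time series, evaluated at the tagged syllable rate. The 4 Hz toy signals below are illustrative, not the study's data:

```python
import numpy as np

def spectral_coherence(x_trials, y_trials, fs):
    """Magnitude of the trial-averaged normalized cross-spectrum.

    x_trials, y_trials: (n_trials, n_samples) arrays, e.g. the stimulus
    envelope and one MEG sensor. Returns (freqs, coherence in [0, 1]).
    """
    X = np.fft.rfft(x_trials, axis=1)
    Y = np.fft.rfft(y_trials, axis=1)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12          # keep phase information only
    coh = np.abs(cross.mean(axis=0))        # phase consistency across trials
    freqs = np.fft.rfftfreq(x_trials.shape[1], 1.0 / fs)
    return freqs, coh

# Toy check: a shared 4 Hz "syllable rate" yields high coherence at 4 Hz
# but not at an unrelated frequency.
fs, dur, n_trials = 100, 4, 20
t = np.arange(fs * dur) / fs
rng = np.random.default_rng(2)
stim = np.array([np.cos(2 * np.pi * 4 * t) for _ in range(n_trials)])
resp = 0.5 * stim + rng.standard_normal((n_trials, len(t)))
freqs, coh = spectral_coherence(stim, resp, fs)
f4 = np.argmin(np.abs(freqs - 4.0))   # tagged rate
f7 = np.argmin(np.abs(freqs - 7.0))   # control frequency
```

Reduced coherence at the syllable rate when lexical content is present, as reported above, would appear as a lower value at `f4` for that condition.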
Affiliation(s)
- Johanna M. Rimmele
- Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Max Planck NYU Center for Language, Music and Emotion, Frankfurt am Main, Germany; New York, NY, USA
- Yue Sun
- Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Georgios Michalareas
- Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Oded Ghitza
- Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- College of Biomedical Engineering & Hearing Research Center, Boston University, Boston, MA, USA
- David Poeppel
- Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Department of Psychology and Center for Neural Science, New York University, New York, NY, USA
- Max Planck NYU Center for Language, Music and Emotion, Frankfurt am Main, Germany; New York, NY, USA
- Ernst Strüngmann Institute for Neuroscience, Frankfurt am Main, Germany
9. Reuter T, Mazzei C, Lew-Williams C, Emberson L. Infants' lexical comprehension and lexical anticipation abilities are closely linked in early language development. Infancy 2023; 28:532-549. PMID: 36808682; DOI: 10.1111/infa.12534.
Abstract
Theories across cognitive domains propose that anticipating upcoming sensory input supports information processing. In line with this view, prior findings indicate that adults and children anticipate upcoming words during real-time language processing, via such processes as prediction and priming. However, it is unclear if anticipatory processes are strictly an outcome of prior language development or are more entwined with language learning and development. We operationalized this theoretical question as whether developmental emergence of comprehension of lexical items occurs before or concurrently with the anticipation of these lexical items. To this end, we tested infants of ages 12, 15, 18, and 24 months (N = 67) on their abilities to comprehend and anticipate familiar nouns. In an eye-tracking task, infants viewed pairs of images and heard sentences with either informative words (e.g., eat) that allowed them to anticipate an upcoming noun (e.g., cookie), or uninformative words (e.g., see). Findings indicated that infants' comprehension and anticipation abilities are closely linked over developmental time and within individuals. Importantly, we do not find evidence for lexical comprehension in the absence of lexical anticipation. Thus, anticipatory processes are present early in infants' second year, suggesting they are a part of language development rather than solely an outcome of it.
Affiliation(s)
- Tracy Reuter
- Department of Psychology, Princeton University, Princeton, New Jersey, USA
- Carolyn Mazzei
- Department of Psychology, Princeton University, Princeton, New Jersey, USA; Faculty of Education, Cambridge University, Cambridge, UK
- Casey Lew-Williams
- Department of Psychology, Princeton University, Princeton, New Jersey, USA
- Lauren Emberson
- Department of Psychology, Princeton University, Princeton, New Jersey, USA; Psychology Department, University of British Columbia, Vancouver, British Columbia, Canada
10
Benetti S, Ferrari A, Pavani F. Multimodal processing in face-to-face interactions: A bridging link between psycholinguistics and sensory neuroscience. Front Hum Neurosci 2023; 17:1108354. [PMID: 36816496 PMCID: PMC9932987 DOI: 10.3389/fnhum.2023.1108354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Accepted: 01/11/2023] [Indexed: 02/05/2023] Open
Abstract
In face-to-face communication, humans are faced with multiple layers of discontinuous multimodal signals, such as head, face, hand gestures, speech and non-speech sounds, which need to be interpreted as coherent and unified communicative actions. This implies a fundamental computational challenge: optimally binding only signals belonging to the same communicative action while segregating signals that are not connected by the communicative content. How do we achieve such an extraordinary feat, reliably, and efficiently? To address this question, we need to further move the study of human communication beyond speech-centred perspectives and promote a multimodal approach combined with interdisciplinary cooperation. Accordingly, we seek to reconcile two explanatory frameworks recently proposed in psycholinguistics and sensory neuroscience into a neurocognitive model of multimodal face-to-face communication. First, we introduce a psycholinguistic framework that characterises face-to-face communication at three parallel processing levels: multiplex signals, multimodal gestalts and multilevel predictions. Second, we consider the recent proposal of a lateral neural visual pathway specifically dedicated to the dynamic aspects of social perception and reconceive it from a multimodal perspective ("lateral processing pathway"). Third, we reconcile the two frameworks into a neurocognitive model that proposes how multiplex signals, multimodal gestalts, and multilevel predictions may be implemented along the lateral processing pathway. Finally, we advocate a multimodal and multidisciplinary research approach, combining state-of-the-art imaging techniques, computational modelling and artificial intelligence for future empirical testing of our model.
Affiliation(s)
- Stefania Benetti
- Centre for Mind/Brain Sciences, University of Trento, Trento, Italy; Interuniversity Research Centre “Cognition, Language, and Deafness”, CIRCLeS, Catania, Italy
- Ambra Ferrari
- Max Planck Institute for Psycholinguistics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands
- Francesco Pavani
- Centre for Mind/Brain Sciences, University of Trento, Trento, Italy; Interuniversity Research Centre “Cognition, Language, and Deafness”, CIRCLeS, Catania, Italy
11
Zoefel B, Gilbert RA, Davis MH. Intelligibility improves perception of timing changes in speech. PLoS One 2023; 18:e0279024. [PMID: 36634109 PMCID: PMC9836318 DOI: 10.1371/journal.pone.0279024] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2022] [Accepted: 11/28/2022] [Indexed: 01/13/2023] Open
Abstract
Auditory rhythms are ubiquitous in music, speech, and other everyday sounds. Yet, it is unclear how perceived rhythms arise from the repeating structure of sounds. For speech, it is unclear whether rhythm is solely derived from acoustic properties (e.g., rapid amplitude changes), or if it is also influenced by the linguistic units (syllables, words, etc.) that listeners extract from intelligible speech. Here, we present three experiments in which participants were asked to detect an irregularity in rhythmically spoken speech sequences. In each experiment, we reduce the number of possible stimulus properties that differ between intelligible and unintelligible speech sounds and show that these acoustically-matched intelligibility conditions nonetheless lead to differences in rhythm perception. In Experiment 1, we replicate a previous study showing that rhythm perception is improved for intelligible (16-channel vocoded) as compared to unintelligible (1-channel vocoded) speech, despite near-identical broadband amplitude modulations. In Experiment 2, we use spectrally-rotated 16-channel speech to show the effect of intelligibility cannot be explained by differences in spectral complexity. In Experiment 3, we compare rhythm perception for sine-wave speech signals when they are heard as non-speech (for naïve listeners), and subsequent to training, when identical sounds are perceived as speech. In all cases, detection of rhythmic regularity is enhanced when participants perceive the stimulus as speech compared to when they do not. Together, these findings demonstrate that intelligibility enhances the perception of timing changes in speech, which is hence linked to processes that extract abstract linguistic units from sound.
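The channel-vocoding manipulation contrasted here (1-channel vs. 16-channel noise-vocoded speech) can be sketched with a minimal noise vocoder: filter the signal into bands, extract each band envelope, and use it to modulate band-limited noise. This is an illustrative reconstruction, not the authors' stimulus code; the log-spaced band edges, filter order, and frequency range are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels, f_lo=100.0, f_hi=4000.0):
    """Replace each band's fine structure with noise, keeping the band envelope."""
    # Log-spaced band edges between f_lo and f_hi (f_hi must stay below fs/2)
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(len(signal), dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        env = np.abs(hilbert(band))                       # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += env * carrier                              # envelope-modulated noise band
    return out / (np.max(np.abs(out)) + 1e-12)            # peak-normalize
```

With 16 channels the summed envelopes preserve enough spectro-temporal detail for intelligibility; with 1 channel only the broadband amplitude modulation survives.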
Affiliation(s)
- Benedikt Zoefel
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Centre National de la Recherche Scientifique (CNRS), Centre de Recherche Cerveau et Cognition (CerCo), Toulouse, France
- Université de Toulouse III Paul Sabatier, Toulouse, France
- Rebecca A. Gilbert
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
12
Wang H, Chen R, Yan Y, McGettigan C, Rosen S, Adank P. Perceptual Learning of Noise-Vocoded Speech Under Divided Attention. Trends Hear 2023; 27:23312165231192297. [PMID: 37547940 PMCID: PMC10408355 DOI: 10.1177/23312165231192297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Revised: 07/13/2023] [Accepted: 07/19/2023] [Indexed: 08/08/2023] Open
Abstract
Speech perception performance for degraded speech can improve with practice or exposure. Such perceptual learning is thought to be reliant on attention, and theoretical accounts like the predictive coding framework suggest a key role for attention in supporting learning. However, it is unclear whether speech perceptual learning requires undivided attention. We evaluated the role of divided attention in speech perceptual learning in two online experiments (N = 336). Experiment 1 tested the reliance of perceptual learning on undivided attention. Participants completed a speech recognition task where they repeated forty noise-vocoded sentences in a between-group design. Participants performed the speech task alone or concurrently with a domain-general visual task (dual task) at one of three difficulty levels. We observed perceptual learning under divided attention for all four groups, moderated by dual-task difficulty. Listeners in easy and intermediate visual conditions improved as much as the single-task group. Those who completed the most challenging visual task showed faster learning and achieved similar ending performance compared to the single-task group. Experiment 2 tested whether learning relies on domain-specific or domain-general processes. Participants completed a single speech task or performed this task together with a dual task aiming to recruit domain-specific (lexical or phonological) or domain-general (visual) processes. All secondary task conditions produced patterns and amounts of learning comparable to the single speech task. Our results demonstrate that the impact of divided attention on perceptual learning is not strictly dependent on domain-general or domain-specific processes, and that speech perceptual learning persists under divided attention.
Affiliation(s)
- Han Wang
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Rongru Chen
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Yu Yan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Stuart Rosen
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Patti Adank
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
13
Murai SA, Riquimaroux H. Long-term changes in cortical representation through perceptual learning of spectrally degraded speech. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2023; 209:163-172. [PMID: 36464716 DOI: 10.1007/s00359-022-01593-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Revised: 11/07/2022] [Accepted: 11/08/2022] [Indexed: 12/07/2022]
Abstract
Listeners can adapt to acoustically degraded speech with perceptual training. Such long-term learning underlies the rehabilitation of patients with hearing aids or cochlear implants. Perceptual learning of acoustically degraded speech has been associated with the frontotemporal cortices. However, neural processes during and after long-term perceptual learning remain unclear. Here we conducted perceptual training of noise-vocoded speech sounds (NVSS), which are spectrally degraded signals, and measured cortical activity across seven days of training and at follow-up testing (approximately 1 year later) to investigate changes in neural activation patterns using functional magnetic resonance imaging. We demonstrated that young adult participants (n = 5) improved their performance across the seven experimental days, and the gains were maintained after 10 months or more. Representational similarity analysis showed that the neural activation patterns of NVSS relative to clear speech in the left posterior superior temporal sulcus (pSTS) differed significantly across the seven training days, accompanied by neural changes in frontal cortices. In addition, distinct activation patterns to NVSS in the frontotemporal cortices were also observed 10-13 months after the training. We therefore propose that perceptual training can induce plastic changes and long-term effects on neural representations of the trained degraded speech in the frontotemporal cortices. These behavioral improvements and neural changes induced by perceptual learning of degraded speech provide insights into the cortical mechanisms underlying adaptive processes in difficult listening situations and the long-term rehabilitation of auditory disorders.
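Representational similarity analysis of the kind used here compares condition-by-condition dissimilarity structure rather than raw activation maps. A minimal sketch (the correlation-distance metric, Spearman comparison, and simulated pattern matrices are illustrative assumptions, not the study's pipeline):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Conditions x voxels activation matrix -> condensed correlation-distance RDM."""
    return pdist(patterns, metric="correlation")

def compare_rdms(rdm_a, rdm_b):
    """Second-order similarity between two RDMs (Spearman rank correlation)."""
    rho, _ = spearmanr(rdm_a, rdm_b)
    return rho
```

Comparing RDMs from, say, day 1 and day 7 quantifies whether the representational geometry of the trained stimuli changed over training.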
Affiliation(s)
- Shota A Murai
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe, Kyoto 610-0321, Japan; International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo Institutes for Advanced Study, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroshi Riquimaroux
- Faculty of Life and Medical Sciences, Doshisha University, 1-3 Miyakodani, Tatara, Kyotanabe, Kyoto 610-0321, Japan
14
MacGregor LJ, Gilbert RA, Balewski Z, Mitchell DJ, Erzinçlioğlu SW, Rodd JM, Duncan J, Fedorenko E, Davis MH. Causal Contributions of the Domain-General (Multiple Demand) and the Language-Selective Brain Networks to Perceptual and Semantic Challenges in Speech Comprehension. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2022; 3:665-698. [PMID: 36742011 PMCID: PMC9893226 DOI: 10.1162/nol_a_00081] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Accepted: 09/07/2022] [Indexed: 06/18/2023]
Abstract
Listening to spoken language engages domain-general multiple demand (MD; frontoparietal) regions of the human brain, in addition to domain-selective (frontotemporal) language regions, particularly when comprehension is challenging. However, there is limited evidence that the MD network makes a functional contribution to core aspects of understanding language. In a behavioural study of volunteers (n = 19) with chronic brain lesions, but without aphasia, we assessed the causal role of these networks in perceiving, comprehending, and adapting to spoken sentences made more challenging by acoustic-degradation or lexico-semantic ambiguity. We measured perception of and adaptation to acoustically degraded (noise-vocoded) sentences with a word report task before and after training. Participants with greater damage to MD but not language regions required more vocoder channels to achieve 50% word report, indicating impaired perception. Perception improved following training, reflecting adaptation to acoustic degradation, but adaptation was unrelated to lesion location or extent. Comprehension of spoken sentences with semantically ambiguous words was measured with a sentence coherence judgement task. Accuracy was high and unaffected by lesion location or extent. Adaptation to semantic ambiguity was measured in a subsequent word association task, which showed that availability of lower-frequency meanings of ambiguous words increased following their comprehension (word-meaning priming). Word-meaning priming was reduced for participants with greater damage to language but not MD regions. Language and MD networks make dissociable contributions to challenging speech comprehension: Using recent experience to update word meaning preferences depends on language-selective regions, whereas the domain-general MD network plays a causal role in reporting words from degraded speech.
Affiliation(s)
- Lucy J. MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Rebecca A. Gilbert
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Zuzanna Balewski
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA
- Daniel J. Mitchell
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Jennifer M. Rodd
- Psychology and Language Sciences, University College London, London, UK
- John Duncan
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA
- Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
15
Lanzilotti C, Andéol G, Micheyl C, Scannella S. Cocktail party training induces increased speech intelligibility and decreased cortical activity in bilateral inferior frontal gyri. A functional near-infrared study. PLoS One 2022; 17:e0277801. [PMID: 36454948 PMCID: PMC9714910 DOI: 10.1371/journal.pone.0277801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 11/03/2022] [Indexed: 12/03/2022] Open
Abstract
The human brain networks responsible for selectively listening to a voice amid other talkers remain to be clarified. The present study aimed to investigate relationships between cortical activity and performance in a speech-in-speech task, before (Experiment I) and after training-induced improvements (Experiment II). In Experiment I, 74 participants performed a speech-in-speech task while their cortical activity was measured using a functional near-infrared spectroscopy (fNIRS) device. One target talker and one masker talker were simultaneously presented at three different target-to-masker ratios (TMRs): adverse, intermediate and favorable. Behavioral results show that performance increased monotonically with TMR in some participants, whereas for others it did not decline, or even improved, in the adverse-TMR condition. On the neural level, an extensive brain network including frontal (left prefrontal cortex, right dorsolateral prefrontal cortex and bilateral inferior frontal gyri) and temporal (bilateral auditory cortex) regions was more strongly recruited by the intermediate condition than by the other two. Additionally, bilateral frontal gyri and left auditory cortex activities were found to be positively correlated with behavioral performance in the adverse-TMR condition. In Experiment II, 27 participants, whose performance was the poorest in the adverse-TMR condition of Experiment I, were trained to improve performance in that condition. Results show significant performance improvements along with decreased activity in bilateral inferior frontal gyri, the right dorsolateral prefrontal cortex, the left inferior parietal cortex and the right auditory cortex in the adverse-TMR condition after training. Arguably, lower neural activity reflects higher efficiency in processing masker inhibition after speech-in-speech training. As speech-in-noise tasks also engage frontal and temporal regions, we suggest that, regardless of the type of masker (speech or noise), task complexity prompts the engagement of a similar brain network. Furthermore, the initial cognitive recruitment is reduced following training, leading to an economy of cognitive resources.
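Setting a target-to-masker ratio amounts to scaling the masker relative to the target at mixing time. A minimal sketch, assuming an RMS-based level definition (the study does not specify its level computation):

```python
import numpy as np

def mix_at_tmr(target, masker, tmr_db):
    """Scale the masker so the target-to-masker RMS ratio equals tmr_db (in dB), then mix."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    # Gain that places the masker tmr_db below (or above) the target level
    gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20.0))
    return target + gain * masker
```

Negative TMRs (e.g., -6 dB) give the adverse condition where the masker is louder than the target.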
Affiliation(s)
- Cosima Lanzilotti
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny sur Orge, France
- ISAE-SUPAERO, Université de Toulouse, Toulouse, France
- Thales SIX GTS France, Gennevilliers, France
- Guillaume Andéol
- Département Neuroscience et Sciences Cognitives, Institut de Recherche Biomédicale des Armées, Brétigny sur Orge, France
16
Schwarz J, Li KK, Sim JH, Zhang Y, Buchanan-Worster E, Post B, Gibson JL, McDougall K. Semantic Cues Modulate Children’s and Adults’ Processing of Audio-Visual Face Mask Speech. Front Psychol 2022; 13:879156. [PMID: 35928422 PMCID: PMC9343587 DOI: 10.3389/fpsyg.2022.879156] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Accepted: 05/25/2022] [Indexed: 12/03/2022] Open
Abstract
During the COVID-19 pandemic, questions have been raised about the impact of face masks on communication in classroom settings. However, it is unclear to what extent visual obstruction of the speaker’s mouth or changes to the acoustic signal lead to speech processing difficulties, and whether these effects can be mitigated by semantic predictability, i.e., the availability of contextual information. The present study investigated the acoustic and visual effects of face masks on speech intelligibility and processing speed under varying semantic predictability. Twenty-six children (aged 8-12) and twenty-six adults performed an internet-based cued shadowing task, in which they had to repeat aloud the last word of sentences presented in audio-visual format. The results showed that children and adults made more mistakes and responded more slowly when listening to face mask speech compared to speech produced without a face mask. Adults were only significantly affected by face mask speech when both the acoustic and the visual signal were degraded. While acoustic mask effects were similar for children, removal of visual speech cues through the face mask affected children to a lesser degree. However, high semantic predictability reduced audio-visual mask effects, leading to full compensation of the acoustically degraded mask speech in the adult group. Even though children did not fully compensate for face mask speech with high semantic predictability, overall, they still profited from semantic cues in all conditions. Therefore, in classroom settings, strategies that increase contextual information such as building on students’ prior knowledge, using keywords, and providing visual aids, are likely to help overcome any adverse face mask effects.
Affiliation(s)
- Julia Schwarz
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
- Katrina Kechun Li
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
- Jasper Hong Sim
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
- Yixin Zhang
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
- Elizabeth Buchanan-Worster
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Brechtje Post
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
- Kirsty McDougall
- Faculty of Modern and Medieval Languages and Linguistics, University of Cambridge, Cambridge, United Kingdom
17
Mankel K, Shrestha U, Tipirneni-Sajja A, Bidelman GM. Functional Plasticity Coupled With Structural Predispositions in Auditory Cortex Shape Successful Music Category Learning. Front Neurosci 2022; 16:897239. [PMID: 35837119 PMCID: PMC9274125 DOI: 10.3389/fnins.2022.897239] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Accepted: 05/25/2022] [Indexed: 11/23/2022] Open
Abstract
Categorizing sounds into meaningful groups helps listeners more efficiently process the auditory scene and is a foundational skill for speech perception and language development. Yet, how auditory categories develop in the brain through learning, particularly for non-speech sounds (e.g., music), is not well understood. Here, we asked musically naïve listeners to complete a brief (∼20 min) training session where they learned to identify sounds from a musical interval continuum (minor-major 3rds). We used multichannel EEG to track behaviorally relevant neuroplastic changes in the auditory event-related potentials (ERPs) pre- to post-training. To rule out mere exposure-induced changes, neural effects were evaluated against a control group of 14 non-musicians who did not undergo training. We also compared individual categorization performance with structural volumetrics of bilateral Heschl's gyrus (HG) from MRI to evaluate neuroanatomical substrates of learning. Behavioral performance revealed steeper (i.e., more categorical) identification functions in the posttest that correlated with better training accuracy. At the neural level, improvement in learners' behavioral identification was characterized by smaller P2 amplitudes at posttest, particularly over right hemisphere. Critically, learning-related changes in the ERPs were not observed in control listeners, ruling out mere exposure effects. Learners also showed smaller and thinner HG bilaterally, indicating superior categorization was associated with structural differences in primary auditory brain regions. Collectively, our data suggest successful auditory categorical learning of music sounds is characterized by short-term functional changes (i.e., greater post-training efficiency) in sensory coding processes superimposed on preexisting structural differences in bilateral auditory cortex.
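The "steeper identification function" measure of categorical learning can be sketched by fitting a logistic to the proportion of one category response across the stimulus continuum; a larger slope parameter indicates a more categorical (step-like) function. The two-parameter logistic and starting values below are illustrative assumptions, not the authors' exact fitting procedure:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Two-parameter logistic: x0 = category boundary, k = slope (steepness)."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def identification_slope(steps, p_responses):
    """Fit a logistic to identification proportions; return (boundary x0, slope k)."""
    (x0, k), _ = curve_fit(logistic, steps, p_responses,
                           p0=[np.mean(steps), 1.0], maxfev=5000)
    return x0, k
```

Comparing k pre- vs. post-training gives a per-listener index of how categorical identification became.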
Affiliation(s)
- Kelsey Mankel
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Utsav Shrestha
- Department of Biomedical Engineering, University of Memphis, Memphis, TN, United States
- Gavin M. Bidelman
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
18
Hauswald A, Keitel A, Chen Y, Rösch S, Weisz N. Degradation levels of continuous speech affect neural speech tracking and alpha power differently. Eur J Neurosci 2022; 55:3288-3302. [PMID: 32687616 PMCID: PMC9540197 DOI: 10.1111/ejn.14912] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2020] [Revised: 07/12/2020] [Accepted: 07/13/2020] [Indexed: 11/26/2022]
Abstract
Making sense of a poor auditory signal can pose a challenge. Previous attempts to quantify speech intelligibility in neural terms have usually focused on one of two measures, namely low-frequency speech-brain synchronization or alpha power modulations. However, reports have been mixed concerning the modulation of these measures, an issue aggravated by the fact that they have normally been studied separately. We present two MEG studies analyzing both measures. In study 1, participants listened to unimodal auditory speech with three different levels of degradation (original, 7-channel and 3-channel vocoding). Intelligibility declined with declining clarity, but speech was still intelligible to some extent even for the lowest clarity level (3-channel vocoding). Low-frequency (1-7 Hz) speech tracking suggested a U-shaped relationship with strongest effects for the medium-degraded speech (7-channel) in bilateral auditory and left frontal regions. To follow up on this finding, we implemented three additional vocoding levels (5-channel, 2-channel and 1-channel) in a second MEG study. Using this wider range of degradation, speech-brain synchronization showed a similar pattern to study 1, but further revealed that when speech becomes unintelligible, synchronization declines again. The relationship differed for alpha power, which decreased continuously across vocoding levels, reaching a floor effect at 5-channel vocoding. Models predicting subjective intelligibility from both measures combined outperformed models based on either measure alone. Our findings underline that speech tracking and alpha power are modulated differently by the degree of degradation of continuous speech but together contribute to subjective speech understanding.
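Low-frequency speech-brain synchronization of the kind measured here is commonly quantified as coherence between the speech envelope and the neural signal within a delta/theta band. A toy sketch on simulated data (the 1-7 Hz band, 2 s windows, and simulated signals are assumptions, not the study's MEG pipeline):

```python
import numpy as np
from scipy.signal import coherence

def speech_tracking(envelope, neural, fs, band=(1.0, 7.0)):
    """Mean envelope-neural coherence within a low-frequency band."""
    f, cxy = coherence(envelope, neural, fs=fs, nperseg=int(fs * 2))
    in_band = (f >= band[0]) & (f <= band[1])
    return float(cxy[in_band].mean())
```

A neural signal that follows the envelope yields high band-limited coherence; an unrelated signal stays near the chance floor set by the number of averaged windows.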
Affiliation(s)
- Anne Hauswald
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
- Anne Keitel
- Psychology, School of Social Sciences, University of Dundee, Dundee, UK
- Centre for Cognitive Neuroimaging, University of Glasgow, Glasgow, UK
- Ya-Ping Chen
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
- Sebastian Rösch
- Department of Otorhinolaryngology, Paracelsus Medical University, Salzburg, Austria
- Nathan Weisz
- Center of Cognitive Neuroscience, University of Salzburg, Salzburg, Austria
- Department of Psychology, University of Salzburg, Salzburg, Austria
19
Distracting Linguistic Information Impairs Neural Tracking of Attended Speech. CURRENT RESEARCH IN NEUROBIOLOGY 2022; 3:100043. [DOI: 10.1016/j.crneur.2022.100043] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 04/27/2022] [Accepted: 05/24/2022] [Indexed: 11/20/2022] Open
20
Cooke M, Scharenborg O, Meyer BT. The time course of adaptation to distorted speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:2636. [PMID: 35461479 DOI: 10.1121/10.0010235] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Accepted: 03/25/2022] [Indexed: 06/14/2023]
Abstract
When confronted with unfamiliar or novel forms of speech, listeners' word recognition performance is known to improve with exposure, but data are lacking on the fine-grained time course of adaptation. The current study aims to fill this gap by investigating the time course of adaptation to several different types of distorted speech. Keyword scores as a function of sentence position in a block of 30 sentences were measured in response to eight forms of distorted speech. Listeners recognised twice as many words in the final sentence compared to the initial sentence with around half of the gain appearing in the first three sentences, followed by gradual gains over the rest of the block. Rapid adaptation was apparent for most of the eight distortion types tested with differences mainly in the gradual phase. Adaptation to sine-wave speech improved if listeners had heard other types of distortion prior to exposure, but no similar facilitation occurred for the other types of distortion. Rapid adaptation is unlikely to be due to procedural learning since listeners had been familiarised with the task and sentence format through exposure to undistorted speech. The mechanisms that underlie rapid adaptation are currently unclear.
Affiliation(s)
- Martin Cooke
- Ikerbasque (Basque Science Foundation), Bilbao, Spain
- Bernd T Meyer
- Communication Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
21
Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022; 33:691-708. [PMID: 35253871 PMCID: PMC9890472 DOI: 10.1093/cercor/bhac094] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2021] [Revised: 02/11/2022] [Accepted: 02/12/2022] [Indexed: 02/04/2023] Open
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
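Envelope reconstruction from EEG is typically a backward (decoding) model: a regularized linear map from multichannel EEG to the stimulus envelope, scored as the correlation between reconstructed and actual envelopes on held-out data. A toy sketch with simulated signals (single half-split, fixed ridge parameter, and no time-lag expansion are simplifying assumptions relative to standard decoding pipelines):

```python
import numpy as np

def reconstruction_accuracy(eeg, envelope, lam=100.0):
    """Fit a ridge backward model on the first half; return Pearson r on the second half."""
    half = len(envelope) // 2
    Xtr, Xte = eeg[:half], eeg[half:]
    ytr, yte = envelope[:half], envelope[half:]
    # Ridge solution: w = (X'X + lam*I)^(-1) X'y
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(eeg.shape[1]), Xtr.T @ ytr)
    return np.corrcoef(Xte @ w, yte)[0, 1]
```

Higher held-out r for the pop-out condition would indicate, as in the study, that top-down information improved the quality of cortical envelope representations.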
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia.
- Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia
- Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia
- Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800, Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France
22
Hidalgo C, Mohamed I, Zielinski C, Schön D. The effect of speech degradation on the ability to track and predict turn structure in conversation. Cortex 2022; 151:105-115. [DOI: 10.1016/j.cortex.2022.01.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2021] [Revised: 11/15/2021] [Accepted: 01/20/2022] [Indexed: 11/03/2022]
23
Alderson-Day B, Moffatt J, Lima CF, Krishnan S, Fernyhough C, Scott SK, Denton S, Leong IYT, Oncel AD, Wu YL, Gurbuz Z, Evans S. Susceptibility to auditory hallucinations is associated with spontaneous but not directed modulation of top-down expectations for speech. Neurosci Conscious 2022; 2022:niac002. [PMID: 35145758 PMCID: PMC8824703 DOI: 10.1093/nc/niac002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Accepted: 01/13/2022] [Indexed: 11/29/2022] Open
Abstract
Auditory verbal hallucinations (AVHs), or hearing voices, occur in clinical and non-clinical populations, but their mechanisms remain unclear. Predictive processing models of psychosis have proposed that hallucinations arise from an over-weighting of prior expectations in perception. It is unknown, however, whether this reflects (i) a sensitivity to explicit modulation of prior knowledge or (ii) a pre-existing tendency to spontaneously use such knowledge in ambiguous contexts. Four experiments were conducted to examine this question in healthy participants listening to ambiguous speech stimuli. In Experiments 1a (n = 60) and 1b (n = 60), participants discriminated intelligible and unintelligible sine-wave speech before and after exposure to the original language templates (i.e. a modulation of expectation). No relationship was observed between top-down modulation and two common measures of hallucination-proneness. Experiment 2 (n = 99) confirmed this pattern with a different stimulus, sine-vocoded speech (SVS), that was designed to minimize ceiling effects in discrimination and more closely model previous top-down effects reported in psychosis. In Experiment 3 (n = 134), participants were exposed to SVS without prior knowledge that it contained speech (i.e. naïve listening). AVH-proneness significantly predicted both pre-exposure identification of speech and successful recall for words hidden in SVS, indicating that participants could actually decode the hidden signal spontaneously. Altogether, these findings support a pre-existing tendency to spontaneously draw upon prior knowledge in healthy people prone to AVH, rather than a sensitivity to temporary modulations of expectation. We propose a model of clinical and non-clinical hallucinations, across auditory and visual modalities, with testable predictions for future research.
Affiliation(s)
- Jamie Moffatt
- Department of Psychology, Durham University, Durham, UK
- Department of Psychology, University of Sussex, Brighton, UK
- César F Lima
- Centro de Investigação e Intervenção Social, Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal
- Saloni Krishnan
- Department of Psychology, Royal Holloway, University of London, London, UK
- Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, London, UK
- Sophie Denton
- Department of Psychology, Durham University, Durham, UK
- Alena D Oncel
- Department of Psychology, Durham University, Durham, UK
- Yu-Lin Wu
- Department of Psychology, Durham University, Durham, UK
- Zehra Gurbuz
- Department of Psychology, Durham University, Durham, UK
- Samuel Evans
- Department of Psychology, University of Westminster, London, UK
24
Moberly AC, Lewis JH, Vasil KJ, Ray C, Tamati TN. Bottom-Up Signal Quality Impacts the Role of Top-Down Cognitive-Linguistic Processing During Speech Recognition by Adults with Cochlear Implants. Otol Neurotol 2021; 42:S33-S41. [PMID: 34766942 PMCID: PMC8597903 DOI: 10.1097/mao.0000000000003377] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Hypotheses: Significant variability persists in speech recognition outcomes in adults with cochlear implants (CIs). Sensory ("bottom-up") and cognitive-linguistic ("top-down") processes help explain this variability. However, the interactions of these bottom-up and top-down factors remain unclear. One hypothesis was tested: top-down processes would contribute differentially to speech recognition, depending on the fidelity of bottom-up input. Background: Bottom-up spectro-temporal processing, assessed using a Spectral-Temporally Modulated Ripple Test (SMRT), is associated with CI speech recognition outcomes. Similarly, top-down cognitive-linguistic skills relate to outcomes, including working memory capacity, inhibition-concentration, speed of lexical access, and nonverbal reasoning. Methods: Fifty-one adult CI users were tested for word and sentence recognition, along with performance on the SMRT and a battery of cognitive-linguistic tests. The group was divided into "low-," "intermediate-," and "high-SMRT" groups, based on SMRT scores. Separate correlation analyses were performed for each subgroup between a composite score of cognitive-linguistic processing and speech recognition. Results: Associations of top-down composite scores with speech recognition were not significant for the low-SMRT group. In contrast, these associations were significant and of medium effect size (Spearman's rho = 0.44-0.46) for two sentence types for the intermediate-SMRT group. For the high-SMRT group, top-down scores were associated with both word and sentence recognition, with medium to large effect sizes (Spearman's rho = 0.45-0.58). Conclusions: Top-down processes contribute differentially to speech recognition in CI users based on the quality of bottom-up input. Findings have clinical implications for individualized treatment approaches relying on bottom-up device programming or top-down rehabilitation approaches.
Affiliation(s)
- Aaron C Moberly
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Jessica H Lewis
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Kara J Vasil
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Christin Ray
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Terrin N Tamati
- Department of Otolaryngology - Head & Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
25
Wang YC, Sohoglu E, Gilbert RA, Henson RN, Davis MH. Predictive Neural Computations Support Spoken Word Recognition: Evidence from MEG and Competitor Priming. J Neurosci 2021; 41:6919-6932. [PMID: 34210777 PMCID: PMC8360690 DOI: 10.1523/jneurosci.1685-20.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 05/22/2021] [Accepted: 05/25/2021] [Indexed: 11/24/2022] Open
Abstract
Human listeners achieve quick and effortless speech comprehension through computations of conditional probability using Bayes' rule. However, the neural implementation of Bayesian perceptual inference remains unclear. Competitive-selection accounts (e.g., TRACE) propose that word recognition is achieved through direct inhibitory connections between units representing candidate words that share segments (e.g., hygiene and hijack share /haidʒ/). Manipulations that increase lexical uncertainty should increase neural responses associated with word recognition when words cannot be uniquely identified. In contrast, predictive-selection accounts (e.g., Predictive-Coding) propose that spoken word recognition involves comparing heard and predicted speech sounds and using prediction error to update lexical representations. Increased lexical uncertainty in words, such as hygiene and hijack, will increase prediction error and hence neural activity only at later time points when different segments are predicted. We collected MEG data from male and female listeners to test these two Bayesian mechanisms and used a competitor priming manipulation to change the prior probability of specific words. Lexical decision responses showed delayed recognition of target words (hygiene) following presentation of a neighboring prime word (hijack) several minutes earlier. However, this effect was not observed with pseudoword primes (higent) or targets (hijure). Crucially, MEG responses in the STG showed greater neural responses for word-primed words after the point at which they were uniquely identified (after /haidʒ/ in hygiene) but not before, while similar changes were again absent for pseudowords.
These findings are consistent with accounts of spoken word recognition in which neural computations of prediction error play a central role.SIGNIFICANCE STATEMENT Effective speech perception is critical to daily life and involves computations that combine speech signals with prior knowledge of spoken words (i.e., Bayesian perceptual inference). This study specifies the neural mechanisms that support spoken word recognition by testing two distinct implementations of Bayes perceptual inference. Most established theories propose direct competition between lexical units such that inhibition of irrelevant candidates leads to selection of critical words. Our results instead support predictive-selection theories (e.g., Predictive-Coding): by comparing heard and predicted speech sounds, neural computations of prediction error can help listeners continuously update lexical probabilities, allowing for more rapid word identification.
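The Bayesian inference contrasted here can be illustrated with a toy segment-by-segment update over a two-word lexicon (the helper function and all numbers are ours, purely for illustration, not the study's model):

```python
# Posterior over candidate words given priors and segment likelihoods.
def posterior(priors, likelihoods):
    unnorm = {w: priors[w] * likelihoods[w] for w in priors}
    z = sum(unnorm.values())
    return {w: p / z for w, p in unnorm.items()}

priors = {"hygiene": 0.5, "hijack": 0.5}

# The shared onset /haidʒ/ fits both candidates equally...
p = posterior(priors, {"hygiene": 1.0, "hijack": 1.0})
# ...so uncertainty (and hence prediction error) persists until a
# later segment uniquely matches one word.
p = posterior(p, {"hygiene": 1.0, "hijack": 0.0})
```

Competitor priming then corresponds to manipulating the priors: a recently heard neighbor (hijack) raises its prior, so more disambiguating evidence arrives before the posterior settles on the target word.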
Affiliation(s)
- Yingcan Carol Wang
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Ediz Sohoglu
- School of Psychology, University of Sussex, Brighton, BN1 9RH, United Kingdom
- Rebecca A Gilbert
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Richard N Henson
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Matthew H Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
26
Klimovich-Gray A, Barrena A, Agirre E, Molinaro N. One Way or Another: Cortical Language Areas Flexibly Adapt Processing Strategies to Perceptual And Contextual Properties of Speech. Cereb Cortex 2021; 31:4092-4103. [PMID: 33825884 DOI: 10.1093/cercor/bhab071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2020] [Revised: 02/24/2021] [Accepted: 02/25/2021] [Indexed: 11/13/2022] Open
Abstract
Cortical circuits rely on the temporal regularities of speech to optimize signal parsing for sound-to-meaning mapping. Bottom-up speech analysis is accelerated by top-down predictions about upcoming words. In everyday communication, however, listeners are regularly presented with challenging input: fluctuations of speech rate or semantic content. In this study, we asked how reducing speech temporal regularity affects its processing: parsing, phonological analysis, and the ability to generate context-based predictions. To ensure that spoken sentences were natural and approximated the semantic constraints of spontaneous speech, we built a neural network to select stimuli from large corpora. We analyzed brain activity recorded with magnetoencephalography during sentence listening using evoked responses, speech-to-brain synchronization, and representational similarity analysis. For normal speech, theta-band (6.5-8 Hz) speech-to-brain synchronization was increased and the left fronto-temporal areas generated stronger contextual predictions. The reverse was true for temporally irregular speech: weaker theta synchronization and reduced top-down effects. Interestingly, delta-band (0.5 Hz) speech tracking was greater when contextual/semantic predictions were lower or when speech was temporally jittered. We conclude that speech temporal regularity is relevant for (theta) syllabic tracking and robust semantic predictions, while the joint support of temporal and contextual predictability reduces word- and phrase-level cortical tracking (delta).
Affiliation(s)
- Ander Barrena
- Computer Science Faculty, University of the Basque Country, Donostia, 20018, San Sebastian, Spain
- Eneko Agirre
- Computer Science Faculty, University of the Basque Country, Donostia, 20018, San Sebastian, Spain
- Nicola Molinaro
- BCBL, Basque Center on Cognition, Brain and Language, Donostia, 20009, San Sebastian, Spain; Ikerbasque, Basque Foundation for Science, 48009, Bilbao, Spain
27
Kocagoncu E, Klimovich-Gray A, Hughes LE, Rowe JB. Evidence and implications of abnormal predictive coding in dementia. Brain 2021; 144:3311-3321. [PMID: 34240109 PMCID: PMC8677549 DOI: 10.1093/brain/awab254] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2020] [Revised: 03/15/2021] [Accepted: 06/17/2021] [Indexed: 11/14/2022] Open
Abstract
The diversity of cognitive deficits and neuropathological processes associated with dementias has encouraged divergence in pathophysiological explanations of disease. Here, we review an alternative framework that emphasizes convergent critical features of cognitive pathophysiology. Rather than the loss of ‘memory centres’ or ‘language centres’, or singular neurotransmitter systems, cognitive deficits are interpreted in terms of aberrant predictive coding in hierarchical neural networks. This builds on advances in normative accounts of brain function, specifically the Bayesian integration of beliefs and sensory evidence in which hierarchical predictions and prediction errors underlie memory, perception, speech and behaviour. We describe how analogous impairments in predictive coding in parallel neurocognitive systems can generate diverse clinical phenomena, including the characteristics of dementias. The review presents evidence from behavioural and neurophysiological studies of perception, language, memory and decision-making. The reformulation of cognitive deficits in terms of predictive coding has several advantages. It brings diverse clinical phenomena into a common framework; it aligns cognitive and movement disorders; and it makes specific predictions on cognitive physiology that support translational and experimental medicine studies. The insights into complex human cognitive disorders from the predictive coding framework may therefore also inform future therapeutic strategies.
Affiliation(s)
- Ece Kocagoncu
- Cambridge Centre for Frontotemporal Dementia, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Laura E Hughes
- Cambridge Centre for Frontotemporal Dementia, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- James B Rowe
- Cambridge Centre for Frontotemporal Dementia, Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK; Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
28
Beach SD, Ozernov-Palchik O, May SC, Centanni TM, Gabrieli JDE, Pantazis D. Neural Decoding Reveals Concurrent Phonemic and Subphonemic Representations of Speech Across Tasks. NEUROBIOLOGY OF LANGUAGE (CAMBRIDGE, MASS.) 2021; 2:254-279. [PMID: 34396148 PMCID: PMC8360503 DOI: 10.1162/nol_a_00034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2020] [Accepted: 02/21/2021] [Indexed: 06/13/2023]
Abstract
Robust and efficient speech perception relies on the interpretation of acoustically variable phoneme realizations, yet prior neuroimaging studies are inconclusive regarding the degree to which subphonemic detail is maintained over time as categorical representations arise. It is also unknown whether this depends on the demands of the listening task. We addressed these questions by using neural decoding to quantify the (dis)similarity of brain response patterns evoked during two different tasks. We recorded magnetoencephalography (MEG) as adult participants heard isolated, randomized tokens from a /ba/-/da/ speech continuum. In the passive task, their attention was diverted. In the active task, they categorized each token as ba or da. We found that linear classifiers successfully decoded ba vs. da perception from the MEG data. Data from the left hemisphere were sufficient to decode the percept early in the trial, while the right hemisphere was necessary but not sufficient for decoding at later time points. We also decoded stimulus representations and found that they were maintained longer in the active task than in the passive task; however, these representations did not pattern more like discrete phonemes when an active categorical response was required. Instead, in both tasks, early phonemic patterns gave way to a representation of stimulus ambiguity that coincided in time with reliable percept decoding. Our results suggest that the categorization process does not require the loss of subphonemic detail, and that the neural representation of isolated speech sounds includes concurrent phonemic and subphonemic information.
Affiliation(s)
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
- Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sidney C. May
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, USA
- Tracy M. Centanni
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Texas Christian University, Fort Worth, TX, USA
- John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
29
Darriba Á, Van Ommen S, Hsu YF, Waszak F. Visual Predictions Operate on Different Timescales. J Cogn Neurosci 2021; 33:984-1002. [PMID: 34428794 DOI: 10.1162/jocn_a_01711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Humans live in a volatile environment, subject to changes occurring at different timescales. The ability to adjust internal predictions accordingly is critical for perception and action. We studied this ability with two EEG experiments in which participants were presented with sequences of four Gabor patches, simulating a rotation, and instructed to respond to the last stimulus (target) to indicate whether or not it continued the direction of the first three stimuli. Each experiment included a short-term learning phase in which the probabilities of these two options were very different (p = .2 vs. p = .8, Rules A and B, respectively), followed by a neutral test phase in which both probabilities were equal. In addition, in one of the experiments, prior to the short-term phase, participants performed a much longer long-term learning phase where the relative probabilities of the rules predicting targets were opposite to those of the short-term phase. Analyses of the RTs and P3 amplitudes showed that, in the neutral test phase, participants initially predicted targets according to the probabilities learned in the short-term phase. However, whereas participants not pre-exposed to the long-term learning phase gradually adjusted their predictions to the neutral probabilities, for those who performed the long-term phase, the short-term associations were spontaneously replaced by those learned in that phase. This indicates that the long-term associations remained intact whereas the short-term associations were learned, transiently used, and abandoned when the context changed. The spontaneous recovery suggests independent storage and control of long-term and short-term associations.
Affiliation(s)
- Florian Waszak
- Université de Paris, CNRS, France; Fondation Ophtalmologique Rothschild, Paris, France
30
Jiang J, Benhamou E, Waters S, Johnson JCS, Volkmer A, Weil RS, Marshall CR, Warren JD, Hardy CJD. Processing of Degraded Speech in Brain Disorders. Brain Sci 2021; 11:394. [PMID: 33804653 PMCID: PMC8003678 DOI: 10.3390/brainsci11030394] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 03/15/2021] [Accepted: 03/18/2021] [Indexed: 11/30/2022] Open
Abstract
The speech we hear every day is typically "degraded" by competing sounds and the idiosyncratic vocal characteristics of individual speakers. While the comprehension of "degraded" speech is normally automatic, it depends on dynamic and adaptive processing across distributed neural networks. This presents the brain with an immense computational challenge, making degraded speech processing vulnerable to a range of brain disorders. Therefore, it is likely to be a sensitive marker of neural circuit dysfunction and an index of retained neural plasticity. Considering experimental methods for studying degraded speech and factors that affect its processing in healthy individuals, we review the evidence for altered degraded speech processing in major neurodegenerative diseases, traumatic brain injury, and stroke. We develop a predictive coding framework for understanding deficits of degraded speech processing in these disorders, focussing on the "language-led dementias", the primary progressive aphasias. We conclude by considering prospects for using degraded speech as a probe of language network pathophysiology, a diagnostic tool, and a target for therapeutic intervention.
Affiliation(s)
- Jessica Jiang
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Elia Benhamou
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Sheena Waters
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jeremy C. S. Johnson
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Anna Volkmer
- Division of Psychology and Language Sciences, University College London, London WC1H 0AP, UK
- Rimona S. Weil
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Charles R. Marshall
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London EC1M 6BQ, UK
- Jason D. Warren
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chris J. D. Hardy
- Dementia Research Centre, Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
31
Tabas A, von Kriegstein K. Adjudicating Between Local and Global Architectures of Predictive Processing in the Subcortical Auditory Pathway. Front Neural Circuits 2021; 15:644743. [PMID: 33776657 PMCID: PMC7994860 DOI: 10.3389/fncir.2021.644743] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2020] [Accepted: 02/16/2021] [Indexed: 11/13/2022] Open
Abstract
Predictive processing, a leading theoretical framework for sensory processing, suggests that the brain constantly generates predictions about the sensory world and that perception emerges from the comparison between these predictions and the actual sensory input. This requires two distinct neural elements: generative units, which encode the model of the sensory world; and prediction error units, which compare these predictions against the sensory input. Although predictive processing is generally portrayed as a theory of cerebral cortex function, animal and human studies over the last decade have robustly shown the ubiquitous presence of prediction error responses in several nuclei of the auditory, somatosensory, and visual subcortical pathways. In the auditory modality, prediction error is typically elicited using so-called oddball paradigms, where sequences of repeated pure tones with the same pitch are substituted, at unpredictable intervals, by a tone of deviant frequency. Repeated sounds become predictable promptly and elicit decreasing prediction error; deviant tones break these predictions and elicit large prediction errors. The simplicity of the rules inducing predictability makes oddball paradigms agnostic about the origin of the predictions. Here, we introduce two possible models of the organizational topology of the predictive processing auditory network: (1) the global view, which assumes that predictions on the sensory input are generated at high-order levels of the cerebral cortex and transmitted in a cascade of generative models to the subcortical sensory pathways; and (2) the local view, which assumes that independent local models, computed using local information, are used to perform predictions at each processing stage. In the global view, information encoding is optimized globally but biases sensory representations along the entire brain according to the subjective views of the observer.
The local view results in diminished coding efficiency, but in return guarantees a robust encoding of the features of the sensory input at each processing stage. Although most experimental results to date are ambiguous in this respect, recent evidence favors the global model.
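The oddball logic described above can be sketched with a minimal leaky-expectation model (a hypothetical toy of ours, not either of the architectures under discussion): repeated standards build up an expectation and shrink the prediction error, while a deviant resets it.

```python
# Toy prediction-error trace for an oddball sequence.
def oddball_errors(seq, lr=0.5):
    expect = {}   # learned expectation per tone identity
    errors = []
    for tone in seq:
        p = expect.get(tone, 0.0)
        errors.append(1.0 - p)  # large when the tone is unexpected
        # decay all expectations, then strengthen the heard tone
        expect = {t: v * (1 - lr) for t, v in expect.items()}
        expect[tone] = p + lr * (1.0 - p)
    return errors

errs = oddball_errors(["std"] * 5 + ["dev"] + ["std"] * 3)
# errs shrinks over the repeated standards and spikes at the deviant
```

Because the rule driving predictability is this simple, the trace looks the same whether the expectation is maintained locally at each stage or inherited from a higher-level generative model, which is exactly why oddball paradigms cannot by themselves adjudicate between the two topologies.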
Affiliation(s)
- Alejandro Tabas
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein
- Chair of Cognitive and Clinical Neuroscience, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
32
Using fuzzy string matching for automated assessment of listener transcripts in speech intelligibility studies. Behav Res Methods 2021; 53:1945-1953. [PMID: 33694079 PMCID: PMC8516752 DOI: 10.3758/s13428-021-01542-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/11/2021] [Indexed: 11/08/2022]
Abstract
Many studies of speech perception assess the intelligibility of spoken sentence stimuli by means of transcription tasks ('type out what you hear'). The intelligibility of a given stimulus is then often expressed in terms of percentage of words correctly reported from the target sentence. Yet scoring the participants' raw responses for words correctly identified from the target sentence is a time-consuming task, and hence resource-intensive. Moreover, there is no consensus among speech scientists about what specific protocol to use for the human scoring, limiting the reliability of human scores. The present paper evaluates various forms of fuzzy string matching between participants' responses and target sentences, as automated metrics of listener transcript accuracy. We demonstrate that one particular metric, the token sort ratio, is a consistent, highly efficient, and accurate metric for automated assessment of listener transcripts, as evidenced by high correlations with human-generated scores (best correlation: r = 0.940) and a strong relationship to acoustic markers of speech intelligibility. Thus, fuzzy string matching provides a practical tool for assessment of listener transcript accuracy in large-scale speech intelligibility studies. See https://tokensortratio.netlify.app for an online implementation.
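The token sort ratio can be approximated with the standard library alone (a sketch of ours: dedicated fuzzy-matching libraries compute the normalized similarity from Levenshtein distance rather than difflib, so scores may differ slightly):

```python
from difflib import SequenceMatcher

def token_sort_ratio(response: str, target: str) -> float:
    """Sort the words of each string, rejoin, and score the
    normalized similarity of the results on a 0-100 scale."""
    a = " ".join(sorted(response.lower().split()))
    b = " ".join(sorted(target.lower().split()))
    return 100.0 * SequenceMatcher(None, a, b).ratio()

# Word-order differences no longer hurt the score:
score = token_sort_ratio("the dog chased cat the", "the cat chased the dog")
```

Sorting the tokens before comparison is what makes the metric forgiving of transcripts in which listeners report the right words in the wrong order, while still penalizing missing or misheard words.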
33
van Bree S, Sohoglu E, Davis MH, Zoefel B. Sustained neural rhythms reveal endogenous oscillations supporting speech perception. PLoS Biol 2021; 19:e3001142. [PMID: 33635855 PMCID: PMC7946281 DOI: 10.1371/journal.pbio.3001142] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2020] [Revised: 03/10/2021] [Accepted: 02/08/2021] [Indexed: 12/23/2022] Open
Abstract
Rhythmic sensory or electrical stimulation will produce rhythmic brain responses. These rhythmic responses are often interpreted as endogenous neural oscillations aligned (or "entrained") to the stimulus rhythm. However, stimulus-aligned brain responses can also be explained as a sequence of evoked responses, which only appear regular due to the rhythmicity of the stimulus, without necessarily involving underlying neural oscillations. To distinguish evoked responses from true oscillatory activity, we tested whether rhythmic stimulation produces oscillatory responses which continue after the end of the stimulus. Such sustained effects provide evidence for true involvement of neural oscillations. In Experiment 1, we found that rhythmic intelligible, but not unintelligible speech produces oscillatory responses in magnetoencephalography (MEG) which outlast the stimulus at parietal sensors. In Experiment 2, we found that transcranial alternating current stimulation (tACS) leads to rhythmic fluctuations in speech perception outcomes after the end of electrical stimulation. We further report that the phase relation between electroencephalography (EEG) responses and rhythmic intelligible speech can predict the tACS phase that leads to most accurate speech perception. Together, we provide fundamental results for several lines of research-including neural entrainment and tACS-and reveal endogenous neural oscillations as a key underlying principle for speech perception.
Affiliation(s)
- Sander van Bree
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Centre for Cognitive Neuroimaging, University of Glasgow, Glasgow, United Kingdom
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
- Ediz Sohoglu
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- School of Psychology, University of Sussex, Brighton, United Kingdom
- Matthew H. Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Benedikt Zoefel
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Centre de Recherche Cerveau et Cognition, CNRS, Toulouse, France
- Université Toulouse III Paul Sabatier, Toulouse, France
|
34
|
Kajiura M, Jeong H, Kawata NYS, Yu S, Kinoshita T, Kawashima R, Sugiura M. Brain activity predicts future learning success in intensive second language listening training. Brain Lang 2021; 212:104839. [PMID: 33271393 DOI: 10.1016/j.bandl.2020.104839] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2019] [Revised: 06/03/2020] [Accepted: 07/14/2020] [Indexed: 06/12/2023]
Abstract
This study explores the neural mechanisms underlying how prior knowledge gained from pre-listening transcript reading helps listeners comprehend fast-rate speech in a second language (L2), and how this applies to L2 learning. Top-down predictive processing based on prior knowledge may play an important role in L2 speech comprehension and in improving listening skill. By manipulating the pre-listening transcript condition (transcript reading [TR] vs. no transcript reading [NTR]) and language (first language [L1] vs. L2), we measured brain activity in L2 learners who performed fast-rate listening comprehension tasks during functional magnetic resonance imaging. We then examined whether TR_L2-specific brain activity could predict individual learning success after intensive listening training. The left angular and superior temporal gyri were key areas responsible for integrating prior knowledge with sensory input. Activity in these areas correlated significantly with gain scores on subsequent training, indicating that brain activity related to prior knowledge-sensory input integration predicts future learning success.
Affiliation(s)
- Mayumi Kajiura
- Division of Foreign Language Education, Aichi Shukutoku University, Nagoya, Japan
- Hyeonjeong Jeong
- Graduate School of International Cultural Studies, Tohoku University, Sendai, Japan; Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Natasha Y S Kawata
- Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Shaoyun Yu
- Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Toru Kinoshita
- Graduate School of Humanities, Nagoya University, Nagoya, Japan
- Ryuta Kawashima
- Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan
- Motoaki Sugiura
- Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan; International Research Institute for Disaster Science, Tohoku University, Sendai, Japan
|
35
|
Tabas A, Mihai G, Kiebel S, Trampel R, von Kriegstein K. Abstract rules drive adaptation in the subcortical sensory pathway. eLife 2020; 9:64501. [PMID: 33289479 PMCID: PMC7785290 DOI: 10.7554/elife.64501] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 12/03/2020] [Indexed: 01/19/2023] Open
Abstract
The subcortical sensory pathways are the fundamental channels for mapping the outside world to our minds. Sensory pathways efficiently transmit information by adapting neural responses to the local statistics of the sensory input. The long-standing mechanistic explanation for this adaptive behaviour is that neural activity decreases with increasing regularities in the local statistics of the stimuli. An alternative account is that neural coding is directly driven by expectations of the sensory input. Here, we used abstract rules to manipulate expectations independently of local stimulus statistics. The ultra-high-field functional-MRI data show that abstract expectations can drive the response amplitude to tones in the human auditory pathway. These results provide the first unambiguous evidence of abstract processing in a subcortical sensory pathway. They indicate that the neural representation of the outside world is altered by our prior beliefs even at initial points of the processing hierarchy.
Affiliation(s)
- Alejandro Tabas
- Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Max Planck Research Group Neural Mechanism of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Glad Mihai
- Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Max Planck Research Group Neural Mechanism of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Stefan Kiebel
- Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, Dresden, Germany
- Robert Trampel
- Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Katharina von Kriegstein
- Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Max Planck Research Group Neural Mechanism of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
|
36
|
Sohoglu E, Davis MH. Rapid computations of spectrotemporal prediction error support perception of degraded speech. eLife 2020; 9:e58077. [PMID: 33147138 PMCID: PMC7641582 DOI: 10.7554/elife.58077] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2020] [Accepted: 10/19/2020] [Indexed: 12/15/2022] Open
Abstract
Human speech perception can be described as Bayesian perceptual inference but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g. spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations from prior written text on the quality of neural representations; increased signal quality enhanced neural representations of speech that mismatched with prior expectations, but led to greater suppression of speech that matched prior expectations. This interaction is a unique neural signature of prediction error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive coding frameworks.
Affiliation(s)
- Ediz Sohoglu
- School of Psychology, University of Sussex, Brighton, United Kingdom
- Matthew H Davis
- MRC Cognition and Brain Sciences Unit, Cambridge, United Kingdom
|
37
|
Lupyan G, Abdel Rahman R, Boroditsky L, Clark A. Effects of Language on Visual Perception. Trends Cogn Sci 2020; 24:930-944. [PMID: 33012687 DOI: 10.1016/j.tics.2020.08.005] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2020] [Revised: 08/22/2020] [Accepted: 08/25/2020] [Indexed: 11/24/2022]
Abstract
Does language change what we perceive? Does speaking different languages cause us to perceive things differently? We review the behavioral and electrophysiological evidence for the influence of language on perception, with an emphasis on the visual modality. Effects of language on perception can be observed both in higher-level processes such as recognition and in lower-level processes such as discrimination and detection. A consistent finding is that language causes us to perceive in a more categorical way. Rather than being fringe or exotic, as they are sometimes portrayed, we discuss how effects of language on perception naturally arise from the interactive and predictive nature of perception.
Affiliation(s)
- Gary Lupyan
- University of Wisconsin-Madison, Madison, WI, USA
- Andy Clark
- University of Sussex, Brighton, UK; Macquarie University, Sydney, Australia
|
38
|
Responses to Visual Speech in Human Posterior Superior Temporal Gyrus Examined with iEEG Deconvolution. J Neurosci 2020; 40:6938-6948. [PMID: 32727820 PMCID: PMC7470920 DOI: 10.1523/jneurosci.0279-20.2020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Revised: 06/01/2020] [Accepted: 06/02/2020] [Indexed: 12/22/2022] Open
Abstract
Experimentalists studying multisensory integration compare neural responses to multisensory stimuli with responses to the component modalities presented in isolation. This procedure is problematic for multisensory speech perception since audiovisual speech and auditory-only speech are easily intelligible but visual-only speech is not. To overcome this confound, we developed intracranial electroencephalography (iEEG) deconvolution. Individual stimuli always contained both auditory and visual speech, but jittering the onset asynchrony between modalities allowed for the time course of the unisensory responses and the interaction between them to be independently estimated. We applied this procedure to electrodes implanted in human epilepsy patients (both male and female) over the posterior superior temporal gyrus (pSTG), a brain area known to be important for speech perception. iEEG deconvolution revealed sustained positive responses to visual-only speech and larger, phasic responses to auditory-only speech. Confirming results from scalp EEG, responses to audiovisual speech were weaker than responses to auditory-only speech, demonstrating a subadditive multisensory neural computation. Leveraging the spatial resolution of iEEG, we extended these results to show that subadditivity is most pronounced in more posterior aspects of the pSTG. Across electrodes, subadditivity correlated with visual responsiveness, supporting a model in which visual speech enhances the efficiency of auditory speech processing in pSTG. The ability to separate neural processes may make iEEG deconvolution useful for studying a variety of complex cognitive and perceptual tasks. SIGNIFICANCE STATEMENT: Understanding speech is one of the most important human abilities. Speech perception uses information from both the auditory and visual modalities. It has been difficult to study neural responses to visual speech because visual-only speech is difficult or impossible to comprehend, unlike auditory-only and audiovisual speech. We used intracranial electroencephalography deconvolution to overcome this obstacle. We found that visual speech evokes a positive response in the human posterior superior temporal gyrus, enhancing the efficiency of auditory speech processing.
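The jittered-onset deconvolution idea can be illustrated with a toy least-squares simulation (a hedged sketch under simplifying assumptions; the function name, kernel lengths, and onset times here are hypothetical, and the authors' actual iEEG analysis is more elaborate):

```python
import numpy as np

def deconvolve(signal, onsets_a, onsets_v, klen):
    """Jointly estimate one response kernel per modality by least squares.
    Each design-matrix column marks one post-onset lag of one modality;
    varied audio-visual asynchronies make the columns separable."""
    n = len(signal)
    X = np.zeros((n, 2 * klen))
    for base, onsets in ((0, onsets_a), (klen, onsets_v)):
        for t in onsets:
            for k in range(klen):
                if t + k < n:
                    X[t + k, base + k] += 1.0
    beta, *_ = np.linalg.lstsq(X, signal, rcond=None)
    return beta[:klen], beta[klen:]

# Simulate overlapping "auditory" and "visual" responses with jittered onsets
n, klen = 60, 3
true_a = np.array([1.0, 2.0, 1.0])   # phasic auditory-like kernel
true_v = np.array([0.5, 1.0, 0.5])   # weaker sustained visual-like kernel
onsets_a = [0, 10, 25, 40]
onsets_v = [1, 12, 26, 43]           # varied onset asynchrony per trial
signal = np.zeros(n)
for t in onsets_a:
    signal[t:t + klen] += true_a
for t in onsets_v:
    signal[t:t + klen] += true_v

est_a, est_v = deconvolve(signal, onsets_a, onsets_v, klen)
# In this noiseless toy case both kernels are recovered exactly
```

The key point is that even though every simulated trial mixes both responses, the jitter decorrelates the two sets of regressors, so the unisensory kernels can be estimated independently.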
|
39
|
Lenc T, Keller PE, Varlet M, Nozaradan S. Neural and Behavioral Evidence for Frequency-Selective Context Effects in Rhythm Processing in Humans. Cereb Cortex Commun 2020; 1:tgaa037. [PMID: 34296106 PMCID: PMC8152888 DOI: 10.1093/texcom/tgaa037] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2020] [Revised: 06/30/2020] [Accepted: 07/16/2020] [Indexed: 01/17/2023] Open
Abstract
When listening to music, people often perceive and move along with a periodic meter. However, the dynamics of mapping between meter perception and the acoustic cues to meter periodicities in the sensory input remain largely unknown. To capture these dynamics, we recorded the electroencephalography while nonmusician and musician participants listened to nonrepeating rhythmic sequences, where acoustic cues to meter frequencies either gradually decreased (from regular to degraded) or increased (from degraded to regular). The results revealed greater neural activity selectively elicited at meter frequencies when the sequence gradually changed from regular to degraded compared with the opposite. Importantly, this effect was unlikely to arise from overall gain, or low-level auditory processing, as revealed by physiological modeling. Moreover, the context effect was more pronounced in nonmusicians, who also demonstrated facilitated sensory-motor synchronization with the meter for sequences that started as regular. In contrast, musicians showed weaker effects of recent context in their neural responses and robust ability to move along with the meter irrespective of stimulus degradation. Together, our results demonstrate that brain activity elicited by rhythm does not only reflect passive tracking of stimulus features, but represents continuous integration of sensory input with recent context.
Affiliation(s)
- Tomas Lenc
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Peter E Keller
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Manuel Varlet
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- School of Psychology, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Sylvie Nozaradan
- MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Penrith, Sydney, NSW 2751, Australia
- Institute of Neuroscience (IONS), Université Catholique de Louvain (UCL), Brussels 1200, Belgium
- International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal QC H3C 3J7, Canada
|
40
|
Rotman T, Lavie L, Banai K. Rapid Perceptual Learning: A Potential Source of Individual Differences in Speech Perception Under Adverse Conditions? Trends Hear 2020; 24:2331216520930541. [PMID: 32552477 PMCID: PMC7303778 DOI: 10.1177/2331216520930541] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Challenging listening situations (e.g., when speech is rapid or noisy) result in substantial individual differences in speech perception. We propose that rapid auditory perceptual learning is one of the factors contributing to those individual differences. To explore this proposal, we assessed rapid perceptual learning of time-compressed speech in young adults with normal hearing and in older adults with age-related hearing loss. We also assessed the contribution of this learning as well as that of hearing and cognition (vocabulary, working memory, and selective attention) to the recognition of natural-fast speech (NFS; both groups) and speech in noise (younger adults). In young adults, rapid learning and vocabulary were significant predictors of NFS and speech in noise recognition. In older adults, hearing thresholds, vocabulary, and rapid learning were significant predictors of NFS recognition. In both groups, models that included learning fitted the speech data better than models that did not include learning. Therefore, under adverse conditions, rapid learning may be one of the skills listeners could employ to support speech recognition.
Affiliation(s)
- Tali Rotman
- Department of Communication Sciences and Disorders, University of Haifa
- Limor Lavie
- Department of Communication Sciences and Disorders, University of Haifa
- Karen Banai
- Department of Communication Sciences and Disorders, University of Haifa
|
41
|
Abstract
Listeners exposed to accented speech must adjust how they map between acoustic features and lexical representations such as phonetic categories. A robust form of this adaptive perceptual learning is learning to perceive synthetic speech where the connections between acoustic features and phonetic categories must be updated. Both implicit learning through mere exposure and explicit learning through directed feedback have previously been shown to produce this type of adaptive learning. The present study crosses implicit exposure and explicit feedback with the presence or absence of a written identification task. We show that simple exposure produces some learning, but explicit feedback produces substantially stronger learning, whereas requiring written identification did not measurably affect learning. These results suggest that explicit feedback guides learning of new mappings between acoustic patterns and known phonetic categories. We discuss mechanisms that may support learning via implicit exposure.
|
42
|
Casaponsa A, Sohoglu E, Moore DR, Füllgrabe C, Molloy K, Amitay S. Does training with amplitude modulated tones affect tone-vocoded speech perception? PLoS One 2019; 14:e0226288. [PMID: 31881550 PMCID: PMC6934405 DOI: 10.1371/journal.pone.0226288] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2019] [Accepted: 11/22/2019] [Indexed: 11/17/2022] Open
Abstract
Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally-degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials; frequency range: 4 Hz, 8 Hz, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
Affiliation(s)
- Aina Casaponsa
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Department of Linguistics and English Language, Lancaster University, Lancaster, England, United Kingdom
- Ediz Sohoglu
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- David R. Moore
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Christian Füllgrabe
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Katharine Molloy
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
- Sygal Amitay
- Medical Research Council Institute of Hearing Research, Nottingham, England, United Kingdom
|
43
|
Jenson D, Thornton D, Harkrider AW, Saltuklaroglu T. Influences of cognitive load on sensorimotor contributions to working memory: An EEG investigation of mu rhythm activity during speech discrimination. Neurobiol Learn Mem 2019; 166:107098. [DOI: 10.1016/j.nlm.2019.107098] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2019] [Revised: 09/11/2019] [Accepted: 10/09/2019] [Indexed: 11/16/2022]
|
44
|
Guediche S, Zhu Y, Minicucci D, Blumstein SE. Written sentence context effects on acoustic-phonetic perception: fMRI reveals cross-modal semantic-perceptual interactions. Brain Lang 2019; 199:104698. [PMID: 31586792 DOI: 10.1016/j.bandl.2019.104698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2018] [Revised: 09/15/2019] [Accepted: 09/18/2019] [Indexed: 06/10/2023]
Abstract
This study examines cross-modality effects of a semantically-biased written sentence context on the perception of an acoustically-ambiguous word target, identifying neural areas sensitive to interactions between sentential bias and phonetic ambiguity. Of interest is whether the locus or nature of the interactions resembles those previously demonstrated for auditory-only effects. fMRI results show significant interaction effects in right mid-middle temporal gyrus (RmMTG) and bilateral anterior superior temporal gyri (aSTG), regions along the ventral language comprehension stream that map sound onto meaning. These regions are more anterior than those previously identified for auditory-only effects; however, the same cross-over interaction pattern emerged, implying similar underlying computations at play. The findings suggest that the mechanisms that integrate information across modality and across sentence and phonetic levels of processing recruit amodal areas where reading and spoken lexical and semantic access converge. Taken together, the results support interactive accounts of speech and language processing.
Affiliation(s)
- Sara Guediche
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; BCBL - Basque Center on Cognition, Brain and Language, Donostia-San Sebastian, Spain
- Yuli Zhu
- Neuroscience Department, Brown University, United States
- Domenic Minicucci
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States
- Sheila E Blumstein
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University, United States; Brown Institute for Brain Science, Brown University, United States
|
45
|
Karas PJ, Magnotti JF, Metzger BA, Zhu LL, Smith KB, Yoshor D, Beauchamp MS. The visual speech head start improves perception and reduces superior temporal cortex responses to auditory speech. eLife 2019; 8:e48116. [PMID: 31393261 PMCID: PMC6687434 DOI: 10.7554/elife.48116] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2019] [Accepted: 07/17/2019] [Indexed: 12/30/2022] Open
Abstract
Visual information about speech content from the talker's mouth is often available before auditory information from the talker's voice. Here we examined perceptual and neural responses to words with and without this visual head start. For both types of words, perception was enhanced by viewing the talker's face, but the enhancement was significantly greater for words with a head start. Neural responses were measured from electrodes implanted over auditory association cortex in the posterior superior temporal gyrus (pSTG) of epileptic patients. The presence of visual speech suppressed responses to auditory speech, more so for words with a visual head start. We suggest that the head start inhibits representations of incompatible auditory phonemes, increasing perceptual accuracy and decreasing total neural responses. Together with previous work showing visual cortex modulation (Ozker et al., 2018b) these results from pSTG demonstrate that multisensory interactions are a powerful modulator of activity throughout the speech perception network.
Affiliation(s)
- Patrick J Karas
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- John F Magnotti
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Brian A Metzger
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Lin L Zhu
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Kristen B Smith
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
|
46
|
Hierarchical contributions of linguistic knowledge to talker identification: Phonological versus lexical familiarity. Atten Percept Psychophys 2019; 81:1088-1107. [PMID: 31218598 DOI: 10.3758/s13414-019-01778-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Listeners identify talkers more accurately when listening to their native language compared to an unfamiliar, foreign language. This language-familiarity effect in talker identification has been shown to arise from familiarity with both the sound patterns (phonetics and phonology) and the linguistic content (words) of one's native language. However, it has been unknown whether these two sources of information contribute independently to talker identification abilities, particularly whether hearing familiar words can facilitate talker identification in the absence of familiar phonetics. To isolate the contribution of lexical familiarity, we conducted three experiments that tested listeners' ability to identify talkers saying familiar words, but with unfamiliar phonetics. In two experiments, listeners identified talkers from recordings of their native language (English), an unfamiliar foreign language (Mandarin Chinese), or "hybrid" speech stimuli (sentences spoken in Mandarin, but which can be convincingly coerced to sound like English when presented with subtitles that prime plausible English-language lexical interpretations based on the Mandarin phonetics). In a third experiment, we explored natural variation in lexical-phonetic congruence as listeners identified talkers with varying degrees of a Mandarin accent. Priming listeners to hear English speech did not improve their ability to identify talkers speaking Mandarin, even after additional training, and talker identification accuracy decreased as talkers' phonetics became increasingly dissimilar to American English. Together, these experiments indicate that unfamiliar sound patterns preclude talker identification benefits otherwise afforded by familiar words. These results suggest that linguistic representations contribute hierarchically to talker identification; the facilitatory effect of familiar words requires the availability of familiar phonological forms.
|
47
|
Babel M, McAuliffe M, Norton C, Senior B, Vaughn C. The Goldilocks Zone of Perceptual Learning. Phonetica 2019; 76:179-200. [PMID: 31112962 DOI: 10.1159/000494929] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/03/2017] [Accepted: 10/29/2018] [Indexed: 06/09/2023]
Abstract
BACKGROUND/AIMS: Lexically guided perceptual learning in speech is the updating of linguistic categories based on novel input disambiguated by the structure provided in a recognized lexical item. We test the range of variation that allows for perceptual learning by presenting listeners with items that vary from subtle within-category variation to fully remapped cross-category variation.
METHODS: Experiment 1 uses a lexically guided perceptual learning paradigm with words containing noncanonical /s/ realizations from s/ʃ continua that correspond to "typical," "ambiguous," "atypical," and "remapped" steps. Perceptual learning is tested in an s/ʃ categorization task. Experiment 2 addresses listener sensitivity to variation in the exposure items using AX discrimination tasks.
RESULTS: Listeners in experiment 1 showed perceptual learning with the maximally ambiguous tokens. Performance of listeners in experiment 2 suggests that tokens which showed the most perceptual learning were not perceptually salient on their own.
CONCLUSION: These results demonstrate that perceptual learning is enhanced with maximally ambiguous stimuli. Excessively atypical pronunciations show attenuated perceptual learning, while typical pronunciations show no evidence for perceptual learning. AX discrimination illustrates that the maximally ambiguous stimuli are not perceptually unique. Together, these results suggest that perceptual learning relies on an interplay between confidence in phonetic and lexical predictions and category typicality.
Affiliation(s)
- Molly Babel
- University of British Columbia, Vancouver, British Columbia, Canada
- Carolyn Norton
- University of British Columbia, Vancouver, British Columbia, Canada
- Brianne Senior
- University of British Columbia, Vancouver, British Columbia, Canada
|
48
|
Vaughn CR. Expectations about the source of a speaker's accent affect accent adaptation. J Acoust Soc Am 2019; 145:3218. [PMID: 31153344 DOI: 10.1121/1.5108831] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/28/2018] [Accepted: 04/30/2019] [Indexed: 06/09/2023]
Abstract
When encountering speakers whose accents differ from the listener's own, listeners initially show a processing cost, but that cost can be attenuated after short-term exposure. The extent to which processing foreign accents (L2 accents) and within-language accents (L1 accents) is similar is still an open question. This study considers whether listeners' expectations about the source of a speaker's accent (whether the speaker is purported to be an L1 or an L2 speaker) affect intelligibility. Prior work has indirectly manipulated expectations about a speaker's accent through photographs, but the present study primes listeners with a description of the speaker's accent itself. In experiment 1, native English listeners transcribed Spanish-accented English sentences in noise under three different conditions (speaker's accent: monolingual L1 Latinx English, L1-Spanish/L2-English, no information given). Results indicate that, by the end of the experiment, listeners given some information about the accent outperformed listeners given no information, and listeners told the speaker was L1-accented outperformed listeners told to expect L2-accented speech. Findings are interpreted in terms of listeners' expectations about task difficulty, and a follow-up experiment (experiment 2) found that priming listeners to expect that their ability to understand L2-accented speech can improve does in fact improve intelligibility.
Collapse
Affiliation(s)
- Charlotte R Vaughn
- Department of Linguistics, University of Oregon, 1290 University of Oregon, Eugene, Oregon 97403-1290, USA
|
49
|
Khoshkhoo S, Leonard MK, Mesgarani N, Chang EF. Neural correlates of sine-wave speech intelligibility in human frontal and temporal cortex. BRAIN AND LANGUAGE 2018; 187:83-91. [PMID: 29397190 PMCID: PMC6067983 DOI: 10.1016/j.bandl.2018.01.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/21/2017] [Revised: 12/06/2017] [Accepted: 01/20/2018] [Indexed: 05/09/2023]
Abstract
Auditory speech comprehension is the result of neural computations that occur in a broad network that includes the temporal lobe auditory cortex and the left inferior frontal cortex. It remains unclear how representations in this network differentially contribute to speech comprehension. Here, we recorded high-density direct cortical activity during a sine-wave speech (SWS) listening task to examine detailed neural speech representations when the exact same acoustic input is comprehended versus not comprehended. Listeners heard SWS sentences (pre-exposure), followed by clear versions of the same sentences, which revealed the content of the sounds (exposure), and then the same SWS sentences again (post-exposure). Across all three task phases, high-gamma neural activity in the superior temporal gyrus was similar, distinguishing different words based on bottom-up acoustic features. In contrast, frontal regions showed a more pronounced and sudden increase in activity only when the input was comprehended, which corresponded with stronger representational separability among spatiotemporal activity patterns evoked by different words. We observed this effect only in participants who were not able to comprehend the stimuli during the pre-exposure phase, indicating a relationship between frontal high-gamma activity and speech understanding. Together, these results demonstrate that both frontal and temporal cortical networks are involved in spoken language understanding, and that under certain listening conditions, frontal regions are involved in discriminating speech sounds.
Affiliation(s)
- Sattar Khoshkhoo
- School of Medicine, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States; Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States
- Nima Mesgarani
- Department of Electrical Engineering, Columbia University, Mudd Building, Room 1339, 500 W 120th St., New York, NY 10027, United States
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 505 Parnassus Ave., San Francisco, CA 94143, United States; Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States; Weill Institute for Neurosciences, University of California, San Francisco, 675 Nelson Rising Ln., Room 535, San Francisco, CA 94158, United States.
|
50
|
Balancing Prediction and Sensory Input in Speech Comprehension: The Spatiotemporal Dynamics of Word Recognition in Context. J Neurosci 2018; 39:519-527. [PMID: 30459221 DOI: 10.1523/jneurosci.3573-17.2018] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2017] [Revised: 10/17/2018] [Accepted: 10/18/2018] [Indexed: 11/21/2022] Open
Abstract
Spoken word recognition in context is remarkably fast and accurate, with recognition times of ∼200 ms, typically well before the end of the word. The neurocomputational mechanisms underlying these contextual effects are still poorly understood. This study combines source-localized electroencephalographic and magnetoencephalographic (EMEG) measures of real-time brain activity with multivariate representational similarity analysis to determine directly the timing and computational content of the processes evoked as spoken words are heard in context, and to evaluate the respective roles of bottom-up and predictive processing mechanisms in the integration of sensory and contextual constraints. Male and female human participants heard simple (modifier-noun) English phrases that varied in the degree of semantic constraint that the modifier (W1) exerted on the noun (W2), as in pairs, such as "yellow banana." We used gating tasks to generate estimates of the probabilistic predictions generated by these constraints as well as measures of their interaction with the bottom-up perceptual input for W2. Representation similarity analysis models of these measures were tested against electroencephalographic and magnetoencephalographic brain data across a bilateral fronto-temporo-parietal language network. Consistent with probabilistic predictive processing accounts, we found early activation of semantic constraints in frontal cortex (LBA45) as W1 was heard. The effects of these constraints (at 100 ms after W2 onset in left middle temporal gyrus and at 140 ms in left Heschl's gyrus) were only detectable, however, after the initial phonemes of W2 had been heard. 
Within an overall predictive processing framework, bottom-up sensory inputs are still required to achieve early and robust spoken word recognition in context.
SIGNIFICANCE STATEMENT Human listeners recognize spoken words in natural speech contexts with remarkable speed and accuracy, often identifying a word well before all of it has been heard. In this study, we investigate the brain systems that support this important capacity, using neuroimaging techniques that can track real-time brain activity during speech comprehension. This makes it possible to locate the brain areas that generate predictions about upcoming words and to show how these expectations are integrated with the evidence provided by the speech being heard. We use the timing and localization of these effects to provide the most specific account to date of how the brain achieves an optimal balance between prediction and sensory input in the interpretation of spoken language.
|