1
Zhang Y, Sarmukadam K, Wang Y, Behroozmand R. Effects of attentional instructions on the behavioral and neural mechanisms of speech auditory feedback control. Neuropsychologia 2024; 201:108944. [PMID: 38925511 PMCID: PMC11772217 DOI: 10.1016/j.neuropsychologia.2024.108944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 05/22/2024] [Accepted: 06/23/2024] [Indexed: 06/28/2024]
Abstract
The present study investigated how instructions for paying attention to auditory feedback may affect speech error detection and sensorimotor control. Electroencephalography (EEG) and speech signals were recorded from 21 neurologically intact adult subjects while they produced the speech vowel sound /a/ and received randomized ±100 cents pitch-shift alterations in their real-time auditory feedback. Subjects were instructed to pay attention to their auditory feedback and press a button to indicate whether they detected a pitch-shift stimulus during trials. Data for this group were compared with data from 22 matched subjects who completed the same speech task under the altered auditory feedback condition without attentional instructions. Results revealed a significantly smaller magnitude of speech compensations in the attentional-instruction vs. no-instruction group and a positive linear association between the magnitude of compensations and P2 event-related potential (ERP) amplitudes. In addition, we found that the amplitude of the P2 ERP component was significantly larger in the attentional-instruction vs. no-instruction group. Source localization analysis showed that this effect was accounted for by significantly stronger neural activities in the right hemisphere insula, precentral gyrus, postcentral gyrus, transverse temporal gyrus, and superior temporal gyrus in the attentional-instruction group. These findings suggest that attentional instructions may enhance speech auditory feedback error detection, and subsequently improve sensorimotor control via generating more stable speech outputs (i.e., smaller compensations) in response to pitch-shift alterations. Our data are informative for advancing theoretical models and motivating targeted interventions with a focus on the role of attentional instructions for improving treatment outcomes in patients with motor speech disorders.
Affiliation(s)
- Yilun Zhang
  - Speech Neuroscience Lab, Department of Speech, Language, and Hearing, Callier Center for Communication Disorders, School of Behavioral and Brain Sciences, The University of Texas at Dallas, 2811 N. Floyd Rd, Richardson, TX 75080, USA
- Kimaya Sarmukadam
  - Department of Communication Sciences and Disorders, Arnold School of Public Health, University of South Carolina, 915 Greene Street, Columbia, SC 29208, USA
- Yuan Wang
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 915 Greene Street, Columbia, SC 29208, USA
- Roozbeh Behroozmand
  - Speech Neuroscience Lab, Department of Speech, Language, and Hearing, Callier Center for Communication Disorders, School of Behavioral and Brain Sciences, The University of Texas at Dallas, 2811 N. Floyd Rd, Richardson, TX 75080, USA
2
Mathias SR, Knowles EEM, Mollon J, Rodrigue AL, Woolsey MK, Hernandez AM, Garret AS, Fox PT, Olvera RL, Peralta JM, Kumar S, Göring HHH, Duggirala R, Curran JE, Blangero J, Glahn DC. Cocktail-party listening and cognitive abilities show strong pleiotropy. Front Neurol 2023; 14:1071766. [PMID: 36970519 PMCID: PMC10035755 DOI: 10.3389/fneur.2023.1071766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Accepted: 02/21/2023] [Indexed: 03/11/2023] Open
Abstract
Introduction: The cocktail-party problem refers to the difficulty listeners face when trying to attend to relevant sounds that are mixed with irrelevant ones. Previous studies have shown that solving these problems relies on perceptual as well as cognitive processes. Previously, we showed that speech-reception thresholds (SRTs) on a cocktail-party listening task were influenced by genetic factors. Here, we estimated the degree to which these genetic factors overlapped with those influencing cognitive abilities.
Methods: We measured SRTs and hearing thresholds (HTs) in 493 listeners, who ranged in age from 18 to 91 years old. The same individuals completed a cognitive test battery comprising 18 measures of various cognitive domains. Individuals belonged to large extended pedigrees, which allowed us to use variance component models to estimate the narrow-sense heritability of each trait, followed by phenotypic and genetic correlations between pairs of traits.
Results: All traits were heritable. The phenotypic and genetic correlations between SRTs and HTs were modest, and only the phenotypic correlation was significant. By contrast, all genetic SRT-cognition correlations were strong and significantly different from 0. For some of these genetic correlations, the hypothesis of complete pleiotropy could not be rejected.
Discussion: Overall, the results suggest that there was substantial genetic overlap between SRTs and a wide range of cognitive abilities, including abilities without a major auditory or verbal component. The findings highlight the important, yet sometimes overlooked, contribution of higher-order processes to solving the cocktail-party problem, raising an important caveat for future studies aiming to identify specific genetic factors that influence cocktail-party listening.
Affiliation(s)
- Samuel R. Mathias
  - Department of Psychiatry, Boston Children's Hospital, Boston, MA, United States
  - Harvard Medical School, Boston, MA, United States
- Emma E. M. Knowles
  - Department of Psychiatry, Boston Children's Hospital, Boston, MA, United States
  - Harvard Medical School, Boston, MA, United States
- Josephine Mollon
  - Department of Psychiatry, Boston Children's Hospital, Boston, MA, United States
  - Harvard Medical School, Boston, MA, United States
- Amanda L. Rodrigue
  - Department of Psychiatry, Boston Children's Hospital, Boston, MA, United States
  - Harvard Medical School, Boston, MA, United States
- Mary K. Woolsey
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, TX, United States
- Alyssa M. Hernandez
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, TX, United States
- Amy S. Garret
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, TX, United States
- Peter T. Fox
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, TX, United States
  - South Texas Veterans Health Care System, San Antonio, TX, United States
- Rene L. Olvera
  - Research Imaging Institute, University of Texas Health Science Center, San Antonio, TX, United States
- Juan M. Peralta
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- Satish Kumar
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- Harald H. H. Göring
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- Ravi Duggirala
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- Joanne E. Curran
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- John Blangero
  - Department of Human Genetics, South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, United States
- David C. Glahn
  - Department of Psychiatry, Boston Children's Hospital, Boston, MA, United States
  - Harvard Medical School, Boston, MA, United States
3
Azaiez N, Loberg O, Hämäläinen JA, Leppänen PHT. Auditory P3a response to native and foreign speech in children with or without attentional deficit. Neuropsychologia 2023; 183:108506. [PMID: 36773807 DOI: 10.1016/j.neuropsychologia.2023.108506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 01/24/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023]
Abstract
The aim of this study was to investigate the attentional mechanism in speech processing of native and foreign language in children with and without attentional deficit. For this purpose, the P3a component, a cognitive neuromarker of attentional processes, was investigated in a two-sequence two-deviant oddball paradigm using Finnish and English speech items via the event-related potential (ERP) technique. The difference waves reflected the temporal brain dynamics of the P3a response in native and foreign language contexts. Cluster-based permutation tests evaluated the group differences over the P3a time window. A correlation analysis was conducted between the P3a response and the attention score (ATTEX) to evaluate whether the behavioral assessment reflected the neural activity. The source reconstruction method (CLARA) was used to investigate the neural origins of the attentional differences between groups and conditions. The ERP results showed a larger P3a response in the group of children with attentional problems (AP) compared to controls (CTR). The P3a response differed statistically between the two groups in native language processing, but not in foreign language processing. The ATTEX score correlated with the P3a amplitude in the native language contrasts. The correlation analyses hint at some hemispheric brain activity difference in the frontal area. The group-level CLARA reconstruction showed activation in the speech perception and attention networks over the frontal, parietal, and temporal areas. Differences in activations of these networks were found between the groups and conditions, with the AP group showing higher activity at the source level, which was the origin of the ERP enhancement observed at the scalp level.
Affiliation(s)
- Najla Azaiez
  - Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland
- Otto Loberg
  - Department of Psychology, Faculty of Science and Technology, Bournemouth University, United Kingdom
- Jarmo A Hämäläinen
  - Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland; Jyväskylä Center for Interdisciplinary Brain Research, Department of Psychology, University of Jyväskylä, Finland
- Paavo H T Leppänen
  - Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland; Jyväskylä Center for Interdisciplinary Brain Research, Department of Psychology, University of Jyväskylä, Finland
4
Makov S, Pinto D, Har-Shai Yahav P, Miller LM, Zion Golumbic E. "Unattended, distracting or irrelevant": Theoretical implications of terminological choices in auditory selective attention research. Cognition 2023; 231:105313. [PMID: 36344304 DOI: 10.1016/j.cognition.2022.105313] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/17/2022] [Revised: 09/30/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]
Abstract
For seventy years, auditory selective attention research has focused on studying the cognitive mechanisms of prioritizing the processing of a 'main' task-relevant stimulus, in the presence of 'other' stimuli. However, a closer look at this body of literature reveals deep empirical inconsistencies and theoretical confusion regarding the extent to which this 'other' stimulus is processed. We argue that many key debates regarding attention arise, at least in part, from inappropriate terminological choices for experimental variables that may not accurately map onto the cognitive constructs they are meant to describe. Here we critically review the more common or disruptive terminological ambiguities, differentiate between methodology-based and theory-derived terms, and unpack the theoretical assumptions underlying different terminological choices. Particularly, we offer an in-depth analysis of the terms 'unattended' and 'distractor' and demonstrate how their use can lead to conflicting theoretical inferences. We also offer a framework for thinking about terminology in a more productive and precise way, in the hope of fostering more productive debates and promoting more nuanced and accurate cognitive models of selective attention.
Affiliation(s)
- Shiri Makov
  - The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Danna Pinto
  - The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Paz Har-Shai Yahav
  - The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Lee M Miller
  - The Center for Mind and Brain, University of California, Davis, CA, United States of America; Department of Neurobiology, Physiology, & Behavior, University of California, Davis, CA, United States of America; Department of Otolaryngology / Head and Neck Surgery, University of California, Davis, CA, United States of America
- Elana Zion Golumbic
  - The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
5
Begau A, Arnau S, Klatt LI, Wascher E, Getzmann S. Using visual speech at the cocktail-party: CNV evidence for early speech extraction in younger and older adults. Hear Res 2022; 426:108636. [DOI: 10.1016/j.heares.2022.108636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 09/26/2022] [Accepted: 10/18/2022] [Indexed: 11/04/2022]
6
Brain activity during shadowing of audiovisual cocktail party speech, contributions of auditory-motor integration and selective attention. Sci Rep 2022; 12:18789. [PMID: 36335137 PMCID: PMC9637225 DOI: 10.1038/s41598-022-22041-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/18/2022] [Accepted: 10/07/2022] [Indexed: 11/06/2022] Open
Abstract
Selective listening to cocktail-party speech involves a network of auditory and inferior frontal cortical regions. However, cognitive and motor cortical regions are differentially activated depending on whether the task emphasizes semantic or phonological aspects of speech. Here we tested whether processing of cocktail-party speech differs when participants perform a shadowing (immediate speech repetition) task compared to an attentive listening task in the presence of irrelevant speech. Participants viewed audiovisual dialogues with concurrent distracting speech during functional imaging. Participants either attentively listened to the dialogue, overtly repeated (i.e., shadowed) attended speech, or performed visual or speech motor control tasks where they did not attend to speech and responses were not related to the speech input. Dialogues were presented with good or poor auditory and visual quality. As a novel result, we show that attentive processing of speech activated the same network of sensory and frontal regions during listening and shadowing. However, in the superior temporal gyrus (STG), peak activations during shadowing were posterior to those during listening, suggesting that an anterior-posterior distinction is present for motor vs. perceptual processing of speech already at the level of the auditory cortex. We also found that activations along the dorsal auditory processing stream were specifically associated with the shadowing task. These activations are likely to be due to complex interactions between perceptual, attention-dependent speech processing and motor speech generation that matches the heard speech. Our results suggest that interactions between perceptual and motor processing of speech rely on a distributed network of temporal and motor regions rather than any specific anatomical landmark as suggested by some previous studies.
7
Ylinen A, Wikman P, Leminen M, Alho K. Task-dependent cortical activations during selective attention to audiovisual speech. Brain Res 2022; 1775:147739. [PMID: 34843702 DOI: 10.1016/j.brainres.2021.147739] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/14/2021] [Revised: 10/21/2021] [Accepted: 11/21/2021] [Indexed: 11/28/2022]
Abstract
Selective listening to speech depends on widespread networks of the brain, but how the involvement of different neural systems in speech processing is affected by factors such as the task performed by a listener and speech intelligibility remains poorly understood. We used functional magnetic resonance imaging to systematically examine the effects that performing different tasks has on neural activations during selective attention to continuous audiovisual speech in the presence of task-irrelevant speech. Participants viewed audiovisual dialogues and attended either to the semantic or the phonological content of speech, or ignored speech altogether and performed a visual control task. The tasks were factorially combined with good and poor auditory and visual speech qualities. Selective attention to speech engaged superior temporal regions and the left inferior frontal gyrus regardless of the task. Frontoparietal regions implicated in selective auditory attention to simple sounds (e.g., tones, syllables) were not engaged by the semantic task, suggesting that this network may not be as crucial when attending to continuous speech. The medial orbitofrontal cortex, implicated in social cognition, was most activated by the semantic task. Activity levels during the phonological task in the left prefrontal, premotor, and secondary somatosensory regions had a distinct temporal profile as well as the highest overall activity, possibly relating to the role of the dorsal speech processing stream in sub-lexical processing. Our results demonstrate that the task type influences neural activations during selective attention to speech, and emphasize the importance of ecologically valid experimental designs.
Affiliation(s)
- Artturi Ylinen
  - Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Patrik Wikman
  - Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Department of Neuroscience, Georgetown University, Washington D.C., USA
- Miika Leminen
  - Analytics and Data Services, HUS Helsinki University Hospital, Helsinki, Finland
- Kimmo Alho
  - Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
8
Kiremitçi I, Yilmaz Ö, Çelik E, Shahdloo M, Huth AG, Çukur T. Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment. Cereb Cortex 2021; 31:4986-5005. [PMID: 34115102 PMCID: PMC8491717 DOI: 10.1093/cercor/bhab136] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/03/2020] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Humans are remarkably adept at listening to a desired speaker in a crowded environment, while filtering out nontarget speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear across what levels of speech features and how much attentional modulation occurs in each brain area during the cocktail-party task. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses while subjects either passively listened to single-speaker stories, or selectively attended to a male or a female speaker in temporally overlaid stories in separate experiments. Spectral, articulatory, and semantic models of the natural stories were constructed. Intrinsic selectivity profiles were identified via voxelwise models fit to passive listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that attention causes broad modulations at multiple levels of speech representations while growing stronger toward later stages of processing, and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights into attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multispeaker environments.
Affiliation(s)
- Ibrahim Kiremitçi
  - Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
  - National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Özgür Yilmaz
  - National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
  - Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
- Emin Çelik
  - Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
  - National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
- Mo Shahdloo
  - National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
  - Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX3 9DU, UK
- Alexander G Huth
  - Department of Neuroscience, The University of Texas at Austin, Austin, TX 78712, USA
  - Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA
  - Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
- Tolga Çukur
  - Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey
  - National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey
  - Department of Electrical and Electronics Engineering, Bilkent University, Ankara TR-06800, Turkey
  - Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94702, USA
9
Li J, Yang J, Qin Y, Zhang Y. Expert and Novice Goalkeepers' Perceptions of Changes During Open Play Soccer. Percept Mot Skills 2021; 128:2725-2744. [PMID: 34459301 DOI: 10.1177/00315125211040750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
In the present study we investigated expert and novice football (i.e., soccer) goalkeepers' three stages of perceiving changes in open play situations (detection, localization, and identification) with and without time constraints. We adopted the continual cycling flicker paradigm to investigate goalkeepers' perceptions when provided with sufficient time (Experiment 1), and we utilized the limited display one-shot change detection paradigm to study their perceptions under time constraints (Experiment 2). Images of goalkeepers' first-person views of open play soccer scenes were used as stimuli. Semantic or non-semantic changes in these scenes were produced by modifying one element in each image. Separate groups of expert and novice goalkeepers were required to detect, localize, and identify the scene changes. We found that expert goalkeepers detected scene changes more quickly than novices under both time allowances. Furthermore, compared to novices, experts localized the changes more accurately under time constraints and identified the changes more quickly when given sufficient time. Additionally, semantic changes were detected more quickly and localized and identified more accurately than non-semantic changes when there was sufficient time. Under time constraints, expert goalkeepers' greater efficiency was likely due to pre-attentive processing; with sufficient time, they were able to focus attention on extracting detailed information for identification.
Affiliation(s)
- Jie Li
  - Center for Cognition and Brain Disorders, the Affiliated Hospital, Hangzhou Normal University, Hangzhou, China; Institutes of Psychological Sciences, Hangzhou Normal University, Hangzhou, China; Zhejiang Key Laboratory for Research in Assessment of Cognitive Impairments, Hangzhou Normal University, Hangzhou, China; School of Psychology, Beijing Sport University, Beijing, China
- Jing Yang
  - School of Psychology, Beijing Sport University, Beijing, China; Beijing Jianhua Experimental Etown School, Beijing, China
- Yue Qin
  - School of Psychology, Beijing Sport University, Beijing, China
- Yu Zhang
  - School of Psychology, Beijing Sport University, Beijing, China
10
AIM: A network model of attention in auditory cortex. PLoS Comput Biol 2021; 17:e1009356. [PMID: 34449761 PMCID: PMC8462696 DOI: 10.1371/journal.pcbi.1009356] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/22/2020] [Revised: 09/24/2021] [Accepted: 08/18/2021] [Indexed: 11/19/2022] Open
Abstract
Attentional modulation of cortical networks is critical for the cognitive flexibility required to process complex scenes. Current theoretical frameworks for attention are based almost exclusively on studies in visual cortex, where attentional effects are typically modest and excitatory. In contrast, attentional effects in auditory cortex can be large and suppressive. A theoretical framework for explaining attentional effects in auditory cortex is lacking, preventing a broader understanding of cortical mechanisms underlying attention. Here, we present a cortical network model of attention in primary auditory cortex (A1). A key mechanism in our network is attentional inhibitory modulation (AIM) of cortical inhibitory neurons. In this mechanism, top-down inhibitory neurons disinhibit bottom-up cortical circuits, a prominent circuit motif observed in sensory cortex. Our results reveal that the same underlying mechanisms in the AIM network can explain diverse attentional effects on both spatial and frequency tuning in A1. We find that a dominant effect of disinhibition on cortical tuning is suppressive, consistent with experimental observations. Functionally, the AIM network may play a key role in solving the cocktail party problem. We demonstrate how attention can guide the AIM network to monitor an acoustic scene, select a specific target, or switch to a different target, providing flexible outputs for solving the cocktail party problem.
Selective attention plays a key role in how we navigate our everyday lives. For example, at a cocktail party, we can attend to a friend's speech amidst other speakers, music, and background noise. In stark contrast, hundreds of millions of people with hearing impairment and other disorders find such environments overwhelming and debilitating. Understanding the mechanisms underlying selective attention may lead to breakthroughs in improving the quality of life for those negatively affected.
Here, we propose a mechanistic network model of attention in primary auditory cortex based on attentional inhibitory modulation (AIM). In the AIM model, attention targets specific cortical inhibitory neurons, which then modulate local cortical circuits to emphasize a particular feature of sounds and suppress competing features. We show that the AIM model can account for experimental observations across different species and stimulus domains. We also demonstrate that the same mechanisms can enable listeners to flexibly switch between attending to specific target sounds and monitoring the environment in complex acoustic scenes, such as a cocktail party. The AIM network provides a theoretical framework which can work in tandem with new experiments to help unravel cortical circuits underlying attention.