1. Kelley MC, Perry SJ, Tucker BV. The Mason-Alberta Phonetic Segmenter: a forced alignment system based on deep neural networks and interpolation. Phonetica 2024:phon-2024-0015. [PMID: 39248125] [DOI: 10.1515/phon-2024-0015]
Abstract
Given an orthographic transcription, forced alignment systems automatically determine boundaries between segments in speech, facilitating the use of large corpora. In the present paper, we introduce a neural network-based forced alignment system, the Mason-Alberta Phonetic Segmenter (MAPS). MAPS serves as a testbed for two possible improvements we pursue for forced alignment systems. The first is treating the acoustic model as a tagger, rather than a classifier, motivated by the common understanding that segments are not truly discrete and often overlap. The second is an interpolation technique to allow more precise boundaries than the typical 10 ms limit in modern systems. During testing, all system configurations we trained significantly outperformed the state-of-the-art Montreal Forced Aligner in the 10 ms boundary placement tolerance threshold. The greatest difference achieved was a 28.13 % relative performance increase. The Montreal Forced Aligner began to slightly outperform our models at around a 30 ms tolerance. We also reflect on the training process for acoustic modeling in forced alignment, highlighting how the output targets for these models do not match phoneticians' conception of similarity between phones and that reconciling this tension may require rethinking the task and output targets or how speech itself should be segmented.
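The interpolation idea described in this abstract can be sketched as follows: with frame-level scores at a 10 ms hop, a boundary can be placed between frames by linearly interpolating to the point where the left segment's score crosses the right segment's. This is a hypothetical illustration of the general technique, not the MAPS implementation; the function name and scoring scheme are invented for the example.

```python
import numpy as np

def interpolated_boundary(post_a, post_b, hop_s=0.010):
    """Estimate a segment boundary at sub-frame precision.

    post_a, post_b: per-frame posterior-like scores for the left and
    right segments (one value per 10 ms frame). The boundary is placed
    where the two score curves cross, found by linear interpolation
    between the frames on either side of the sign change.
    """
    diff = np.asarray(post_a, dtype=float) - np.asarray(post_b, dtype=float)
    # last frame where the left segment still dominates before the right takes over
    sign_change = np.where((diff[:-1] > 0) & (diff[1:] <= 0))[0]
    if len(sign_change) == 0:
        return None  # no crossing found
    i = int(sign_change[0])
    # fraction of a frame past i where the score difference hits zero
    frac = diff[i] / (diff[i] - diff[i + 1])
    return (i + frac) * hop_s
```

With scores crossing a quarter of the way between frames 2 and 3, the boundary lands at 22.5 ms rather than snapping to a 10 ms grid point.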
Affiliation(s)
- Matthew C Kelley
- Department of English, Linguistics Program, George Mason University, Fairfax, VA, USA
- Scott James Perry
- Department of Linguistics, University of Alberta, Edmonton, AB, Canada
- Benjamin V Tucker
- Department of Linguistics, University of Alberta, Edmonton, AB, Canada
- Department of Communication Sciences and Disorders, Northern Arizona University, Flagstaff, AZ, USA
2. Shahmohammadi H, Heitmeier M, Shafaei-Bajestan E, Lensch HPA, Baayen RH. Language with vision: A study on grounded word and sentence embeddings. Behav Res Methods 2024; 56:5622-5646. [PMID: 38114881] [PMCID: PMC11335852] [DOI: 10.3758/s13428-023-02294-z]
Abstract
Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many attempts at language grounding, achieving an optimal equilibrium between textual representations of the language and our embodied experiences remains an open problem. Some common concerns are the following. Is visual grounding advantageous for abstract words, or is its effectiveness restricted to concrete words? What is the optimal way of bridging the gap between text and vision? To what extent is perceptual knowledge from images advantageous for acquiring high-quality embeddings? Leveraging the current advances in machine learning and natural language processing, the present study addresses these questions by proposing a simple yet very effective computational grounding model for pre-trained word embeddings. Our model effectively balances the interplay between language and vision by aligning textual embeddings with visual information while simultaneously preserving the distributional statistics that characterize word usage in text corpora. By applying a learned alignment, we are able to indirectly ground unseen words, including abstract words. A series of evaluations on a range of behavioral datasets shows that visual grounding is beneficial not only for concrete words but also for abstract words, lending support to the indirect theory of abstract concepts. Moreover, our approach offers advantages for contextualized embeddings, such as those generated by BERT (Devlin et al., 2018), but only when trained on corpora of modest, cognitively plausible sizes. Code and grounded embeddings for English are available at https://github.com/Hazel1994/Visually_Grounded_Word_Embeddings_2.
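The "learned alignment" strategy described in this abstract, fitting a mapping between textual and visual vectors on words that have images and then applying it to words that do not, can be sketched with synthetic data. This is a toy least-squares sketch under invented dimensions; the authors' actual model additionally preserves distributional statistics through further constraints.

```python
import numpy as np

rng = np.random.default_rng(0)
d_text, d_img, n_seen = 8, 4, 100

# synthetic stand-ins: text embeddings of "seen" words and their paired
# visual vectors (here generated by a hidden linear map for illustration)
T = rng.normal(size=(n_seen, d_text))
true_map = rng.normal(size=(d_text, d_img))
V = T @ true_map

# learn the text-to-vision alignment on seen words by least squares
M, *_ = np.linalg.lstsq(T, V, rcond=None)

# indirectly "ground" an unseen (e.g., abstract) word: project its
# text embedding through the learned alignment
t_unseen = rng.normal(size=d_text)
v_grounded = t_unseen @ M
```

Because the alignment is learned once over the seen vocabulary, any word with a text embedding can be projected into visual space, which is how grounding extends to words that never co-occur with images.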
3. Hsiao JHW. Understanding Human Cognition Through Computational Modeling. Top Cogn Sci 2024; 16:349-376. [PMID: 38781432] [DOI: 10.1111/tops.12737]
Abstract
One important goal of cognitive science is to understand the mind in terms of its representational and computational capacities, where computational modeling plays an essential role in providing theoretical explanations and predictions of human behavior and mental phenomena. In my research, I have been using computational modeling, together with behavioral experiments and cognitive neuroscience methods, to investigate the information processing mechanisms underlying learning and visual cognition in terms of perceptual representation and attention strategy. In perceptual representation, I have used neural network models to understand how the split architecture in the human visual system influences visual cognition, and to examine perceptual representation development as a result of expertise. In attention strategy, I have developed the Eye Movement analysis with Hidden Markov Models method for quantifying eye movement patterns and consistency using both spatial and temporal information, which has led to novel findings across disciplines not discoverable using traditional methods. By integrating it with deep neural networks (DNN), I have developed DNN+HMM to account for eye movement strategy learning in human visual cognition. The understanding of the human mind through computational modeling also facilitates research on artificial intelligence's (AI) comparability with human cognition, which can in turn help explainable AI systems infer humans' beliefs about AI's operations and provide human-centered explanations to enhance human-AI interaction and mutual understanding. Together, these demonstrate the essential role of computational modeling methods in providing theoretical accounts of the human mind as well as its interaction with its environment and AI systems.
4. Fitz H, Hagoort P, Petersson KM. Neurobiological Causal Models of Language Processing. Neurobiology of Language 2024; 5:225-247. [PMID: 38645618] [PMCID: PMC11025648] [DOI: 10.1162/nol_a_00133]
Abstract
The language faculty is physically realized in the neurobiological infrastructure of the human brain. Despite significant efforts, an integrated understanding of this system remains a formidable challenge. What is missing from most theoretical accounts is a specification of the neural mechanisms that implement language function. Computational models that have been put forward generally lack an explicit neurobiological foundation. We propose a neurobiologically informed causal modeling approach which offers a framework for how to bridge this gap. A neurobiological causal model is a mechanistic description of language processing that is grounded in, and constrained by, the characteristics of the neurobiological substrate. It intends to model the generators of language behavior at the level of implementational causality. We describe key features and neurobiological component parts from which causal models can be built and provide guidelines on how to implement them in model simulations. Then we outline how this approach can shed new light on the core computational machinery for language, the long-term storage of words in the mental lexicon and combinatorial processing in sentence comprehension. In contrast to cognitive theories of behavior, causal models are formulated in the "machine language" of neurobiology which is universal to human cognition. We argue that neurobiological causal modeling should be pursued in addition to existing approaches. Eventually, this approach will allow us to develop an explicit computational neurobiology of language.
Affiliation(s)
- Hartmut Fitz
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Peter Hagoort
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Karl Magnus Petersson
- Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Faculty of Medicine and Biomedical Sciences, University of Algarve, Faro, Portugal
5. Luthra S. Why are listeners hindered by talker variability? Psychon Bull Rev 2024; 31:104-121. [PMID: 37580454] [PMCID: PMC10864679] [DOI: 10.3758/s13423-023-02355-6]
Abstract
Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.
Affiliation(s)
- Sahil Luthra
- Department of Psychology, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA.
6. Magnuson JS, Crinnion AM, Luthra S, Gaston P, Grubb S. Contra assertions, feedback improves word recognition: How feedback and lateral inhibition sharpen signals over noise. Cognition 2024; 242:105661. [PMID: 37944313] [PMCID: PMC11238470] [DOI: 10.1016/j.cognition.2023.105661]
Abstract
Whether top-down feedback modulates perception has deep implications for cognitive theories. Debate has been vigorous in the domain of spoken word recognition, where competing computational models and agreement on at least one diagnostic experimental paradigm suggest that the debate may eventually be resolvable. Norris and Cutler (2021) revisit arguments against lexical feedback in spoken word recognition models. They also incorrectly claim that recent computational demonstrations that feedback promotes accuracy and speed under noise (Magnuson et al., 2018) were due to the use of the Luce choice rule rather than adding noise to inputs (noise was in fact added directly to inputs). They also claim that feedback cannot improve word recognition because feedback cannot distinguish signal from noise. We have two goals in this paper. First, we correct the record about the simulations of Magnuson et al. (2018). Second, we explain how interactive activation models selectively sharpen signals via joint effects of feedback and lateral inhibition that boost lexically-coherent sublexical patterns over noise. We also review a growing body of behavioral and neural results consistent with feedback and inconsistent with autonomous (non-feedback) architectures, and conclude that parsimony supports feedback. We close by discussing the potential for synergy between autonomous and interactive approaches.
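The mechanism this abstract argues for, lexical feedback combined with lateral inhibition boosting lexically coherent sublexical patterns over noise, can be caricatured in a tiny interactive-activation loop. This is a deliberately minimal sketch with invented units and parameters, not TRACE or the Magnuson et al. simulations:

```python
import numpy as np

# Two word units share the sublexical units /k/ and /ae/ and differ in
# their final unit: "cat" -> /k, ae, t/ and "cap" -> /k, ae, p/.
W = np.array([[1.0, 1.0, 1.0, 0.0],   # cat
              [1.0, 1.0, 0.0, 1.0]])  # cap

def run(inp, fb, steps=300, inhib=0.5, rate=0.1):
    """Relax a toy interactive-activation network on a fixed input.

    fb scales word-to-sublexical feedback; inhib scales lateral
    inhibition between word units.
    """
    phon = np.zeros(4)
    words = np.zeros(2)
    for _ in range(steps):
        # sublexical units track bottom-up input plus lexical feedback
        phon += rate * (inp + fb * (W.T @ words) - phon)
        # word units get bottom-up support minus lateral inhibition
        net = W @ phon - inhib * (words.sum() - words)
        words += rate * (net - words)
        words = words.clip(0.0, 1.0)
    return words

# degraded input: the final /t/ is only weakly favored over /p/
noisy = np.array([0.3, 0.3, 0.14, 0.10])
with_fb = run(noisy, fb=0.1)
no_fb = run(noisy, fb=0.0)
```

With feedback on, the more active word reinforces exactly the sublexical pattern that supports it, and inhibition converts that extra support into a larger separation between the competitors than bottom-up evidence alone produces.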
Affiliation(s)
- James S Magnuson
- University of Connecticut, Storrs, CT, USA; BCBL, Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
7. Xie X, Jaeger TF, Kurumada C. What we do (not) know about the mechanisms underlying adaptive speech perception: A computational framework and review. Cortex 2023; 166:377-424. [PMID: 37506665] [DOI: 10.1016/j.cortex.2023.05.003]
Abstract
Speech from unfamiliar talkers can be difficult to comprehend initially. These difficulties tend to dissipate with exposure, sometimes within minutes or less. Adaptivity in response to unfamiliar input is now considered a fundamental property of speech perception, and research over the past two decades has made substantial progress in identifying its characteristics. The mechanisms underlying adaptive speech perception, however, remain unknown. Past work has attributed facilitatory effects of exposure to any one of three qualitatively different hypothesized mechanisms: (1) low-level, pre-linguistic, signal normalization, (2) changes in/selection of linguistic representations, or (3) changes in post-perceptual decision-making. Direct comparisons of these hypotheses, or combinations thereof, have been lacking. We describe a general computational framework for adaptive speech perception (ASP) that, for the first time, implements all three mechanisms. We demonstrate how the framework can be used to derive predictions for experiments on perception from the acoustic properties of the stimuli. Using this approach, we find that, at the level of data analysis presently employed by most studies in the field, the signature results of influential experimental paradigms do not distinguish between the three mechanisms. This highlights the need for a change in research practices, so that future experiments provide more informative results. We recommend specific changes to experimental paradigms and data analysis. All data and code for this study are shared via OSF, including the R markdown document that this article is generated from, and an R library that implements the models we present.
Affiliation(s)
- Xin Xie
- Language Science, University of California, Irvine, USA.
- T Florian Jaeger
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA; Computer Science, University of Rochester, Rochester, NY, USA
- Chigusa Kurumada
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
8. Fernandez-Duque M, Hayakawa S, Marian V. Speakers of different languages remember visual scenes differently. Science Advances 2023; 9:eadh0064. [PMID: 37585537] [PMCID: PMC10431704] [DOI: 10.1126/sciadv.adh0064]
Abstract
Language can have a powerful effect on how people experience events. Here, we examine how the languages people speak guide attention and influence what they remember from a visual scene. When hearing a word, listeners activate other similar-sounding words before settling on the correct target. We tested whether this linguistic coactivation during a visual search task changes memory for objects. Bilinguals and monolinguals remembered English competitor words that overlapped phonologically with a spoken English target better than control objects without name overlap. High Spanish proficiency also enhanced memory for Spanish competitors that overlapped across languages. We conclude that linguistic diversity partly accounts for differences in higher cognitive functions such as memory, with multilinguals providing a fertile ground for studying the interaction between language and cognition.
Affiliation(s)
- Matias Fernandez-Duque
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Sayuri Hayakawa
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Department of Psychology, Oklahoma State University, Stillwater, OK 74078, USA
- Viorica Marian
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
9. Hintz F, Voeten CC, Scharenborg O. Recognizing non-native spoken words in background noise increases interference from the native language. Psychon Bull Rev 2023; 30:1549-1563. [PMID: 36544064] [PMCID: PMC10482792] [DOI: 10.3758/s13423-022-02233-7]
Abstract
Listeners frequently recognize spoken words in the presence of background noise. Previous research has shown that noise reduces phoneme intelligibility and hampers spoken-word recognition, especially for non-native listeners. In the present study, we investigated how noise influences lexical competition in both the non-native and the native language, reflecting the degree to which both languages are co-activated. We recorded the eye movements of native Dutch participants as they listened to English sentences containing a target word while looking at displays containing four objects. On target-present trials, the visual referent depicting the target word was present, along with three unrelated distractors. On target-absent trials, the target object (e.g., wizard) was absent. Instead, the display contained an English competitor, overlapping with the English target in phonological onset (e.g., window), a Dutch competitor, overlapping with the English target in phonological onset (e.g., wimpel, pennant), and two unrelated distractors. Half of the sentences were masked by speech-shaped noise; the other half were presented in quiet. Compared to speech in quiet, noise delayed fixations to the target objects on target-present trials. For target-absent trials, we observed that the likelihood of fixation biases towards the English and Dutch onset competitors (over the unrelated distractors) was larger in noise than in quiet. Our data thus show that the presence of background noise increases lexical competition in the task-relevant non-native (English) and in the task-irrelevant native (Dutch) language. The latter reflects stronger interference of one's native language during non-native spoken-word recognition under adverse conditions.
Affiliation(s)
- Florian Hintz
- Max Planck Institute for Psycholinguistics, P.O. Box 310, 6500 AH, Nijmegen, The Netherlands.
- Odette Scharenborg
- Multimedia Computing Group, Delft University of Technology, Delft, Netherlands
10. Persson A, Jaeger TF. Evaluating normalization accounts against the dense vowel space of Central Swedish. Front Psychol 2023; 14:1165742. [PMID: 37416548] [PMCID: PMC10322199] [DOI: 10.3389/fpsyg.2023.1165742]
Abstract
Talkers vary in the phonetic realization of their vowels. One influential hypothesis holds that listeners overcome this inter-talker variability through pre-linguistic auditory mechanisms that normalize the acoustic or phonetic cues that form the input to speech recognition. Dozens of competing normalization accounts exist, including both accounts specific to vowel perception and general-purpose accounts that can be applied to any type of cue. We add to the cross-linguistic literature on this matter by comparing normalization accounts against a new phonetically annotated vowel database of Swedish, a language with a particularly dense vowel inventory of 21 vowels differing in quality and quantity. We evaluate normalization accounts on how they differ in predicted consequences for perception. The results indicate that the best performing accounts either center or standardize formants by talker. The study also suggests that general-purpose accounts perform as well as vowel-specific accounts, and that vowel normalization operates in both temporal and spectral domains.
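One of the best-performing families of accounts mentioned here, standardizing formants by talker, is essentially Lobanov (z-score) normalization. A minimal sketch, assuming formant measurements have already been extracted (the example talkers and values are invented for illustration):

```python
import numpy as np

def standardize_by_talker(formants):
    """Lobanov-style normalization: z-score each formant (column)
    within one talker's tokens (rows)."""
    f = np.asarray(formants, dtype=float)
    return (f - f.mean(axis=0)) / f.std(axis=0)

# two talkers whose F1/F2 spaces (Hz) are shifted and scaled versions
# of the same underlying vowel configuration
talker_a = np.array([[300.0, 2300.0],
                     [500.0, 1500.0],
                     [700.0, 1100.0]])
talker_b = 1.2 * talker_a + 150.0
```

Because z-scoring is invariant to a talker-specific shift and scale, the two talkers' tokens land on identical normalized coordinates, which is exactly the sense in which such accounts remove inter-talker variability.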
Affiliation(s)
- Anna Persson
- Department of Swedish Language and Multilingualism, Stockholm University, Stockholm, Sweden
- T. Florian Jaeger
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, United States
- Computer Science, University of Rochester, Rochester, NY, United States
11. Ishikawa K, Pietrowicz M, Charney S, Orbelo D. Landmark-based analysis of speech differentiates conversational from clear speech in speakers with muscle tension dysphonia. JASA Express Letters 2023; 3:2888596. [PMID: 37140265] [DOI: 10.1121/10.0019354]
Abstract
This study evaluated the feasibility of differentiating conversational and clear speech produced by individuals with muscle tension dysphonia (MTD) using landmark-based analysis of speech (LMBAS). Thirty-four adult speakers with MTD recorded conversational and clear speech, with 27 of them able to produce clear speech. The recordings of these individuals were analyzed with the open-source LMBAS program SpeechMark® (MATLAB Toolbox, version 1.1.2). The results indicated that glottal landmarks, burst onset landmarks, and the duration between glottal landmarks differentiated conversational speech from clear speech. LMBAS shows potential as an approach for detecting the difference between conversational and clear speech in dysphonic individuals.
Affiliation(s)
- Keiko Ishikawa
- Department of Communication Sciences and Disorders, University of Kentucky, 900 South Limestone, Lexington, Kentucky 40536-0200, USA
- Mary Pietrowicz
- Applied Research Institute, University of Illinois at Urbana-Champaign, 2100 South Oak Street, Suite 206, Champaign, Illinois 61820, USA
- Sara Charney
- Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic Arizona, 5777 East Mayo Boulevard, Phoenix, Arizona 85054, USA
- Diana Orbelo
- Department of Otolaryngology-Head and Neck Surgery, Mayo Medical School, 200 1st Street Southwest, Rochester, Minnesota 55905, USA
12. Beguš G, Zhou A, Zhao TC. Encoding of speech in convolutional layers and the brain stem based on language experience. Sci Rep 2023; 13:6480. [PMID: 37081119] [PMCID: PMC10119295] [DOI: 10.1038/s41598-023-33384-9]
Abstract
Comparing artificial neural networks with outputs of neuroimaging techniques has recently seen substantial advances in (computer) vision and text-based language models. Here, we propose a framework to compare biological and artificial neural computations of spoken language representations and propose several new challenges to this paradigm. The proposed technique is based on a similar principle that underlies electroencephalography (EEG): averaging of neural (artificial or biological) activity across neurons in the time domain, and it allows comparison of the encoding of any acoustic property in the brain and in intermediate convolutional layers of an artificial neural network. Our approach allows a direct comparison of responses to a phonetic property in the brain and in deep neural networks that requires no linear transformations between the signals. We argue that the brain stem response (cABR) and the response in intermediate convolutional layers to the exact same stimulus are highly similar without applying any transformations, and we quantify this observation. The proposed technique not only reveals similarities, but also allows for analysis of the encoding of actual acoustic properties in the two signals: we compare peak latency (i) in cABR relative to the stimulus in the brain stem and (ii) in intermediate convolutional layers relative to the input/output in deep convolutional networks. We also examine and compare the effect of prior language exposure on the peak latency in cABR and in intermediate convolutional layers. Substantial similarities in peak latency encoding between the human brain and intermediate convolutional networks emerge based on results from eight trained networks (including a replication experiment). The proposed technique can be used to compare encoding between the human brain and intermediate convolutional layers for any acoustic property and for other neuroimaging techniques.
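The core averaging-and-latency idea can be sketched as follows: average a convolutional layer's activations across channels, the way EEG-style measures average activity across neurons, and read off the latency of the resulting peak. This is a schematic illustration with synthetic activations, not the authors' cABR pipeline; shapes and the sample rate are invented for the example.

```python
import numpy as np

def layer_waveform(acts):
    """Average unit activity across channels to get a single EEG-like
    time series. acts: (n_channels, n_samples) activations of one
    convolutional layer in response to one stimulus."""
    return np.asarray(acts, dtype=float).mean(axis=0)

def peak_latency(wave, sample_rate):
    """Latency in seconds of the largest-magnitude peak."""
    wave = np.asarray(wave, dtype=float)
    return int(np.argmax(np.abs(wave))) / sample_rate

# synthetic activations: 4 channels, each with a response peak at sample 50
acts = np.zeros((4, 100))
acts[:, 50] = [0.8, 1.0, 0.9, 1.1]
latency = peak_latency(layer_waveform(acts), sample_rate=100)
```

The same two functions apply unchanged to a measured brain-stem response, which is what makes latencies from the two signals directly comparable without any learned transformation.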
Affiliation(s)
- Gašper Beguš
- Department of Linguistics, University of California, Berkeley, USA.
- Alan Zhou
- Department of Cognitive Science, Johns Hopkins University, Baltimore, USA
- T Christina Zhao
- Institute for Learning and Brain Sciences, University of Washington, Seattle, USA
- Department of Speech and Hearing Sciences, University of Washington, Seattle, USA
13. Spivey MJ. Cognitive Science Progresses Toward Interactive Frameworks. Top Cogn Sci 2023; 15:219-254. [PMID: 36949655] [PMCID: PMC10123086] [DOI: 10.1111/tops.12645]
Abstract
Despite its many twists and turns, the arc of cognitive science generally bends toward progress, thanks to its interdisciplinary nature. By glancing at the last few decades of experimental and computational advances, it can be argued that, far from failing to converge on a shared set of conceptual assumptions, the field is indeed making steady consensual progress toward what can broadly be referred to as interactive frameworks. This inclination is apparent in the subfields of psycholinguistics, visual perception, embodied cognition, extended cognition, neural networks, dynamical systems theory, and more. This pictorial essay briefly documents this steady progress both from a bird's eye view and from the trenches. The conclusion is one of optimism that cognitive science is getting there, albeit slowly and arduously, like any good science should.
Affiliation(s)
- Michael J Spivey
- Department of Cognitive and Information Sciences, University of California, Merced
14. Avcu E, Hwang M, Brown KS, Gow DW. A tale of two lexica: Investigating computational pressures on word representation with neural networks. Front Artif Intell 2023; 6:1062230. [PMID: 37051161] [PMCID: PMC10083378] [DOI: 10.3389/frai.2023.1062230]
Abstract
Introduction: The notion of a single localized store of word representations has become increasingly less plausible as evidence has accumulated for the widely distributed neural representation of wordform grounded in motor, perceptual, and conceptual processes. Here, we attempt to combine machine learning methods and neurobiological frameworks to propose a computational model of brain systems potentially responsible for wordform representation. We tested the hypothesis that the functional specialization of word representation in the brain is driven partly by computational optimization. This hypothesis directly addresses the unique problem of mapping sound and articulation vs. mapping sound and meaning.

Results: We found that artificial neural networks trained on the mapping between sound and articulation performed poorly in recognizing the mapping between sound and meaning and vice versa. Moreover, a network trained on both tasks simultaneously could not discover the features required for efficient mapping between sound and higher-level cognitive states compared to the other two models. Furthermore, these networks developed internal representations reflecting specialized task-optimized functions without explicit training.

Discussion: Together, these findings demonstrate that different task-directed representations lead to more focused responses and better performance of a machine or algorithm and, hypothetically, the brain. Thus, we imply that the functional specialization of word representation mirrors a computational optimization strategy given the nature of the tasks that the human brain faces.
Affiliation(s)
- Enes Avcu
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- Kevin Scott Brown
- Department of Pharmaceutical Sciences and School of Chemical, Biological, and Environmental Engineering, Oregon State University, Corvallis, OR, United States
- David W. Gow
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States
- Department of Psychology, Salem State University, Salem, MA, United States
- Harvard-MIT Division of Health Sciences and Technology, Boston, MA, United States
15. Sadagopan S, Kar M, Parida S. Quantitative models of auditory cortical processing. Hear Res 2023; 429:108697. [PMID: 36696724] [PMCID: PMC9928778] [DOI: 10.1016/j.heares.2023.108697]
Abstract
To generate insight from experimental data, it is critical to understand the inter-relationships between individual data points and place them in context within a structured framework. Quantitative modeling can provide the scaffolding for such an endeavor. Our main objective in this review is to provide a primer on the range of quantitative tools available to experimental auditory neuroscientists. Quantitative modeling is advantageous because it can provide a compact summary of observed data, make underlying assumptions explicit, and generate predictions for future experiments. Quantitative models may be developed to characterize or fit observed data, to test theories of how a task may be solved by neural circuits, to determine how observed biophysical details might contribute to measured activity patterns, or to predict how an experimental manipulation would affect neural activity. In complexity, quantitative models can range from those that are highly biophysically realistic and that include detailed simulations at the level of individual synapses, to those that use abstract and simplified neuron models to simulate entire networks. Here, we survey the landscape of recently developed models of auditory cortical processing, highlighting a small selection of models to demonstrate how they help generate insight into the mechanisms of auditory processing. We discuss examples ranging from models that use details of synaptic properties to explain the temporal pattern of cortical responses to those that use modern deep neural networks to gain insight into human fMRI data. We conclude by discussing a biologically realistic and interpretable model that our laboratory has developed to explore aspects of vocalization categorization in the auditory pathway.
Affiliation(s)
- Srivatsun Sadagopan
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA; Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA.
- Manaswini Kar
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, PA, USA
- Satyabrata Parida
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
16
Nenadić F, Tucker BV, Ten Bosch L. Computational Modeling of an Auditory Lexical Decision Experiment Using DIANA. LANGUAGE AND SPEECH 2022:238309221111752. [PMID: 36000386 PMCID: PMC10394956 DOI: 10.1177/00238309221111752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
We present an implementation of DIANA, a computational model of spoken word recognition, to model responses collected in the Massive Auditory Lexical Decision (MALD) project. DIANA is an end-to-end model, including an activation and decision component that takes the acoustic signal as input, activates internal word representations, and outputs lexicality judgments and estimated response latencies. Simulation 1 presents the process of creating acoustic models required by DIANA to analyze novel speech input. Simulation 2 investigates DIANA's performance in determining whether the input signal is a word present in the lexicon or a pseudoword. In Simulation 3, we generate estimates of response latency and correlate them with general tendencies in participant responses in MALD data. We find that DIANA performs fairly well in free word recognition and lexical decision. However, the current approach for estimating response latency provides estimates opposite to those found in behavioral data. We discuss these findings and offer suggestions as to what a contemporary model of spoken word recognition should be able to do.
Affiliation(s)
- Filip Nenadić
- University of Alberta, Canada; Singidunum University, Serbia
17
DIANA, a Process-Oriented Model of Human Auditory Word Recognition. Brain Sci 2022; 12:brainsci12050681. [PMID: 35625067 PMCID: PMC9140177 DOI: 10.3390/brainsci12050681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Revised: 05/05/2022] [Accepted: 05/10/2022] [Indexed: 02/04/2023] Open
Abstract
This article presents DIANA, a new, process-oriented model of human auditory word recognition, which takes as its input the acoustic signal and can produce as its output word identifications and lexicality decisions, as well as reaction times. This makes it possible to compare its output with human listeners’ behavior in psycholinguistic experiments. DIANA differs from existing models in that it takes more of the available neuro-physiological evidence on speech processing into account. For instance, DIANA accounts for the effect of ambiguity in the acoustic signal on reaction times following the Hick–Hyman law, and it interprets the acoustic signal in the form of spectro-temporal receptive fields, which are attested in the human superior temporal gyrus, rather than in the form of abstract phonological units. The model consists of three components: activation, decision, and execution. The activation and decision components are described in detail, both at the conceptual level (in the running text) and at the computational level (in the Appendices). While the activation component is independent of the listener’s task, the functioning of the decision component depends on this task. The article also describes how DIANA could be improved in the future so that it resembles the behavior of human listeners even more closely.
18
Strauß A, Wu T, McQueen JM, Scharenborg O, Hintz F. The differential roles of lexical and sublexical processing during spoken-word recognition in clear and in noise. Cortex 2022; 151:70-88. [DOI: 10.1016/j.cortex.2022.02.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Revised: 01/21/2022] [Accepted: 02/13/2022] [Indexed: 02/03/2023]
19
Karaminis T, Hintz F, Scharenborg O. The Presence of Background Noise Extends the Competitor Space in Native and Non-Native Spoken-Word Recognition: Insights from Computational Modeling. Cogn Sci 2022; 46:e13110. [PMID: 35188686 PMCID: PMC9286693 DOI: 10.1111/cogs.13110] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 12/17/2021] [Accepted: 01/23/2022] [Indexed: 11/29/2022]
Abstract
Oral communication often takes place in noisy environments, which challenge spoken-word recognition. Previous research has suggested that the presence of background noise extends the number of candidate words competing with the target word for recognition and that this extension affects the time course and accuracy of spoken-word recognition. In this study, we further investigated the temporal dynamics of competition processes in the presence of background noise, and how these vary in listeners with different language proficiency (i.e., native and non-native), using computational modeling. We developed ListenIN (Listen-In-Noise), a neural-network model based on an autoencoder architecture, which learns to map phonological forms onto meanings in two languages and simulates native and non-native spoken-word comprehension. We also examined the model’s activation states during online spoken-word recognition. These analyses demonstrated that the presence of background noise increases the number of competitor words engaged in phonological competition, and that this happens in similar ways both intra- and interlinguistically, in native and in non-native listening. Taken together, our results support accounts positing a “many-additional-competitors scenario” for the effects of noise on spoken-word recognition.
Affiliation(s)
- Florian Hintz
- Department of Psychology of Language, Max Planck Institute for Psycholinguistics
20
Kurumada C, Roettger TB. Thinking probabilistically in the study of intonational speech prosody. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2021; 13:e1579. [PMID: 34599647 DOI: 10.1002/wcs.1579] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/31/2021] [Revised: 08/09/2021] [Accepted: 08/26/2021] [Indexed: 11/07/2022]
Abstract
Speech prosody, the melodic and rhythmic properties of a language, plays a critical role in our everyday communication. Researchers have identified unique patterns of prosody that segment words and phrases, highlight focal elements in a sentence, and convey holistic meanings and speech acts that interact with the information shared in context. The mapping between the sound and meaning represented in prosody is suggested to be probabilistic: the same physical instance of sounds can support multiple meanings across talkers and contexts, while the same meaning can be encoded in physically distinct sound patterns (e.g., pitch movements). The current overview presents an analysis framework for probing the nature of this probabilistic relationship. Illustrated by examples from the literature and a dataset of German focus marking, we discuss production variability within and across talkers and consider the challenges that this variability imposes on the comprehension system. A better understanding of these challenges, we argue, will illuminate how human perceptual, cognitive, and computational mechanisms may navigate the variability to arrive at a coherent understanding of speech prosody. The current paper is intended as an introduction for those who are interested in thinking probabilistically about the sound-meaning mapping in prosody. Open questions for future research are discussed, with proposals for examining prosodic production and comprehension within a comprehensive, mathematically motivated framework of probabilistic inference under uncertainty. This article is categorized under: Linguistics > Language in Mind and Brain; Psychology > Language.
Affiliation(s)
- Chigusa Kurumada
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, USA
- Timo B Roettger
- Department of Linguistics & Scandinavian Studies, Universitetet i Oslo, Oslo, Norway
21
Falandays JB, Nguyen B, Spivey MJ. Is prediction nothing more than multi-scale pattern completion of the future? Brain Res 2021; 1768:147578. [PMID: 34284021 DOI: 10.1016/j.brainres.2021.147578] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2020] [Revised: 05/28/2021] [Accepted: 06/29/2021] [Indexed: 11/18/2022]
Abstract
While the notion of the brain as a prediction machine has been extremely influential and productive in cognitive science, there are competing accounts of how best to model and understand the predictive capabilities of brains. One prominent framework is of a "Bayesian brain" that explicitly generates predictions and uses resultant errors to guide adaptation. We suggest that the prediction-generation component of this framework may involve little more than a pattern completion process. We first describe pattern completion in the domain of visual perception, highlighting its temporal extension, and show how this can entail a form of prediction in time. Next, we describe the forward momentum of entrained dynamical systems as a model for the emergence of predictive processing in non-predictive systems. Then, we apply this reasoning to the domain of language, where explicitly predictive models are perhaps most popular. Here, we demonstrate how a connectionist model, TRACE, exhibits hallmarks of predictive processing without any representations of predictions or errors. Finally, we present a novel neural network model, inspired by reservoir computing models, that is entirely unsupervised and memoryless, but nonetheless exhibits prediction-like behavior in its pursuit of homeostasis. These explorations demonstrate that brain-like systems can get prediction "for free," without the need to posit formal logical representations with Bayesian probabilities or an inference machine that holds them in working memory.
Affiliation(s)
- J Benjamin Falandays
- Department of Cognitive and Information Sciences, University of California, Merced, United States
- Benjamin Nguyen
- Department of Cognitive and Information Sciences, University of California, Merced, United States
- Michael J Spivey
- Department of Cognitive and Information Sciences, University of California, Merced, United States.
22
Does signal reduction imply predictive coding in models of spoken word recognition? Psychon Bull Rev 2021; 28:1381-1389. [PMID: 33852158 PMCID: PMC8367925 DOI: 10.3758/s13423-021-01924-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/24/2021] [Indexed: 12/29/2022]
Abstract
Pervasive behavioral and neural evidence for predictive processing has led to claims that language processing depends upon predictive coding. Formally, predictive coding is a computational mechanism where only deviations from top-down expectations are passed between levels of representation. In many cognitive neuroscience studies, a reduction of signal for expected inputs is taken as being diagnostic of predictive coding. In the present work, we show that despite not explicitly implementing prediction, the TRACE model of speech perception exhibits this putative hallmark of predictive coding, with reductions in total lexical activation, total lexical feedback, and total phoneme activation when the input conforms to expectations. These findings may indicate that interactive activation is functionally equivalent or approximant to predictive coding or that caution is warranted in interpreting neural signal reduction as diagnostic of predictive coding.
23
Fox NP, Leonard M, Sjerps MJ, Chang EF. Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife 2020; 9:e53051. [PMID: 32840483 PMCID: PMC7556862 DOI: 10.7554/elife.53051] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Accepted: 08/21/2020] [Indexed: 11/28/2022] Open
Abstract
In speech, listeners extract continuously-varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically-tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category, and are also sensitive to sub-phonetic VOT differences within a population's preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.
Affiliation(s)
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthew Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, United States