1
Naeini SA, Simmatis L, Jafari D, Yunusova Y, Taati B. Improving Dysarthric Speech Segmentation With Emulated and Synthetic Augmentation. IEEE J Transl Eng Health Med 2024; 12:382-389. [PMID: 38606392] [PMCID: PMC11008804] [DOI: 10.1109/jtehm.2024.3375323]
Abstract
Acoustic features extracted from speech can help with the diagnosis of neurological diseases and the monitoring of symptoms over time. Temporal segmentation of audio signals into individual words is an important pre-processing step needed prior to extracting acoustic features. Machine learning techniques could be used to automate speech segmentation via automatic speech recognition (ASR) and sequence-to-sequence alignment. While state-of-the-art ASR models achieve good performance on healthy speech, their performance significantly drops when evaluated on dysarthric speech. Fine-tuning ASR models on impaired speech can improve performance in dysarthric individuals, but it requires representative clinical data, which is difficult to collect and may raise privacy concerns. This study explores the feasibility of using two augmentation methods to increase ASR performance on dysarthric speech: 1) healthy individuals varying their speaking rate and loudness (as is often used in assessments of pathological speech); 2) synthetic speech with variations in speaking rate and accent (to ensure more diverse vocal representations and fairness). Experimental evaluations showed that fine-tuning a pre-trained ASR model with data from these two sources outperformed a model fine-tuned only on real clinical data and matched the performance of a model fine-tuned on the combination of real clinical data and synthetic speech. When evaluated on held-out acoustic data from 24 individuals with various neurological diseases, the best-performing model achieved an average word error rate of 5.7% and a mean correct count accuracy of 94.4%. In segmenting the data into individual words, a mean intersection-over-union of 89.2% was obtained against manual parsing (ground truth). It can be concluded that emulated and synthetic augmentations can significantly reduce the need for real clinical data of dysarthric speech when fine-tuning ASR models and, in turn, for speech segmentation.
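The two evaluation metrics quoted here are easy to make concrete. Below is a minimal Python sketch, not the authors' code: word error rate as token-level edit distance, and temporal intersection-over-union between a predicted and a manually parsed word segment. The example strings and interval values are invented.

```python
import numpy as np

def word_error_rate(ref, hyp):
    """Token-level Levenshtein distance divided by reference length."""
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,         # deletion
                          d[i, j - 1] + 1,         # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return d[len(ref), len(hyp)] / len(ref)

def segment_iou(a, b):
    """Temporal IoU of two (onset, offset) word segments in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

# Invented example values, for illustration only
print(word_error_rate("the dog ran".split(), "the dog can".split()))  # 0.333...
print(segment_iou((0.50, 0.92), (0.48, 0.90)))                        # ~0.91
```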
Affiliation(s)
- Saeid Alavi Naeini
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Leif Simmatis
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Deniz Jafari
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Yana Yunusova
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Department of Speech Language Pathology, Rehabilitation Sciences Institute, University of Toronto, Toronto, ON M5G 1V7, Canada
- Hurvitz Brain Sciences Program, Sunnybrook Research Institute (SRI), Toronto, ON M4N 3M5, Canada
- Babak Taati
- KITE, Toronto Rehabilitation Institute, University Health Network (UHN), Toronto, ON M5G 2A2, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
2
Dal Ben R, Prequero IT, Souza DDH, Hay JF. Speech Segmentation and Cross-Situational Word Learning in Parallel. Open Mind (Camb) 2023; 7:510-533. [PMID: 37637304] [PMCID: PMC10449405] [DOI: 10.1162/opmi_a_00095]
Abstract
Language learners track conditional probabilities to find words in continuous speech and to map words and objects across ambiguous contexts. It remains unclear, however, whether learners can leverage the structure of the linguistic input to do both tasks at the same time. To explore this question, we combined speech segmentation and cross-situational word learning into a single task. In Experiment 1, when adults (N = 60) simultaneously segmented continuous speech and mapped the newly segmented words to objects, they demonstrated better performance than when either task was performed alone. However, when the speech stream had conflicting statistics, participants were able to correctly map words to objects, but were at chance level on speech segmentation. In Experiment 2, we used a more sensitive speech segmentation measure to find that adults (N = 35), exposed to the same conflicting speech stream, correctly identified non-words as such, but were still unable to discriminate between words and part-words. Again, mapping was above chance. Our study suggests that learners can track multiple sources of statistical information to find and map words to objects in noisy environments. It also prompts questions on how to effectively measure the knowledge arising from these learning experiences.
Affiliation(s)
- Rodrigo Dal Ben
- Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
3
Menn KH, Ward EK, Braukmann R, van den Boomen C, Buitelaar J, Hunnius S, Snijders TM. Neural Tracking in Infancy Predicts Language Development in Children With and Without Family History of Autism. Neurobiol Lang (Camb) 2022; 3:495-514. [PMID: 37216063] [PMCID: PMC10158647] [DOI: 10.1162/nol_a_00074]
Abstract
During speech processing, neural activity in non-autistic adults and infants tracks the speech envelope. Recent research in adults indicates that this neural tracking relates to linguistic knowledge and may be reduced in autism. Such reduced tracking, if present already in infancy, could impede language development. In the current study, we focused on children with a family history of autism, who often show a delay in first language acquisition. We investigated whether differences in tracking of sung nursery rhymes during infancy relate to language development and autism symptoms in childhood. We assessed speech-brain coherence at either 10 or 14 months of age in a total of 22 infants with high likelihood of autism due to family history and 19 infants without family history of autism. We analyzed the relationship between speech-brain coherence in these infants and their vocabulary at 24 months as well as autism symptoms at 36 months. Our results showed significant speech-brain coherence in the 10- and 14-month-old infants. We found no evidence for a relationship between speech-brain coherence and later autism symptoms. Importantly, speech-brain coherence at the stressed-syllable rate (1-3 Hz) predicted later vocabulary. Follow-up analyses showed evidence for a relationship between tracking and vocabulary only in 10-month-olds but not in 14-month-olds and indicated possible differences between the likelihood groups. Thus, early tracking of sung nursery rhymes is related to language development in childhood.
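For readers unfamiliar with the measure: speech-brain coherence is typically computed as magnitude-squared coherence between the speech amplitude envelope and the neural signal, averaged over the band of interest. A minimal scipy sketch, assuming one EEG channel and audio already resampled to the EEG rate; the signals below are random placeholders, not the study's data or pipeline:

```python
import numpy as np
from scipy.signal import coherence, hilbert

fs = 250                                 # Hz; shared rate assumed for both signals
rng = np.random.default_rng(0)
audio = rng.standard_normal(fs * 60)     # placeholder for the sung-speech waveform
eeg = rng.standard_normal(fs * 60)       # placeholder for one EEG channel

envelope = np.abs(hilbert(audio))        # amplitude envelope of the speech

# Magnitude-squared coherence (Welch's method), 4-s segments
f, cxy = coherence(envelope, eeg, fs=fs, nperseg=fs * 4)

band = (f >= 1) & (f <= 3)               # stressed-syllable rate, 1-3 Hz
print("mean 1-3 Hz speech-brain coherence:", cxy[band].mean())
```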
Affiliation(s)
- Katharina H. Menn
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- International Max Planck Research School on Neuroscience of Communication: Function, Structure, and Plasticity, Leipzig, Germany
- Emma K. Ward
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Ricarda Braukmann
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Carlijn van den Boomen
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, The Netherlands
- Jan Buitelaar
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Radboud University Medical Center, Nijmegen, The Netherlands
- Sabine Hunnius
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Tineke M. Snijders
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Cognitive Neuropsychology Department, Tilburg University, Tilburg, The Netherlands
4
Abstract
To acquire language, infants must learn to segment words from running speech. A significant body of experimental research shows that infants use multiple cues to do so; however, little research has comprehensively examined the distribution of such cues in naturalistic speech. We conducted a comprehensive corpus analysis of German child-directed speech (CDS) using data from the Child Language Data Exchange System (CHILDES) database, investigating the availability of word stress, transitional probabilities (TPs), and lexical and sublexical frequencies as potential cues for word segmentation. Seven hours of data (~15,000 words) were coded, representing around an average day of speech to infants. The analysis revealed that for 97% of words, primary stress was carried by the initial syllable, implicating stress as a reliable cue to word onset in German CDS. Word identity was also marked by TPs between syllables, which were higher within than between words, and higher for backwards than forwards transitions. Words followed a Zipfian-like frequency distribution, and 78% of words were monosyllabic. Of the 50 most frequent words, 82% were function words, which accounted for 47% of word tokens in the entire corpus. Finally, 15% of all utterances comprised single words. These results give rich novel insights into the availability of segmentation cues in German CDS, and support the possibility that infants draw on multiple converging cues to segment their input. The data, which we make openly available to the research community, will help guide future experimental investigations on this topic.
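Transitional probabilities of the kind coded here are plain bigram statistics: the forward TP of a syllable pair xy is count(xy)/count(x), and the backward TP is count(xy)/count(y). A small sketch over an invented syllabified corpus (not the CHILDES data):

```python
from collections import Counter

# Invented toy corpus: utterances as lists of syllables
utterances = [["ba", "by", "mil", "ky"], ["ba", "by", "do", "ggy"]]

bigrams, first, second = Counter(), Counter(), Counter()
for utt in utterances:
    for x, y in zip(utt, utt[1:]):
        bigrams[(x, y)] += 1
        first[x] += 1     # occurrences of x as the left element
        second[y] += 1    # occurrences of y as the right element

def tp_forward(x, y):
    return bigrams[(x, y)] / first[x]    # P(y | x)

def tp_backward(x, y):
    return bigrams[(x, y)] / second[y]   # P(x | y)

print(tp_forward("ba", "by"), tp_backward("ba", "by"))  # 1.0 1.0 (within word)
print(tp_forward("by", "mil"))                          # 0.5 (across a boundary)
```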
Affiliation(s)
- Katja Stärk
- Language Development Department, Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands
- Evan Kidd
- Language Development Department, Max Planck Institute for Psycholinguistics, The Netherlands
- Research School of Psychology, The Australian National University, Australia
- ARC Centre of Excellence for the Dynamics of Language, Australia
- Rebecca L. A. Frost
- Language Development Department, Max Planck Institute for Psycholinguistics, The Netherlands
5
Abstract
Voice modulatory cues such as variations in fundamental frequency, duration and pauses are key factors for structuring vocal signals in human speech and vocal communication in other tetrapods. Voice modulation physiology is highly similar in humans and other tetrapods due to shared ancestry and shared functional pressures for efficient communication. This has led to similarly structured vocalizations across humans and other tetrapods. Nonetheless, in their details, structural characteristics may vary across species and languages. Because data concerning voice modulation in non-human tetrapod vocal production and especially perception are relatively scarce compared to human vocal production and perception, this review focuses on voice modulatory cues used for speech segmentation across human languages, highlighting comparative data where available. Cues that are used similarly across many languages may help indicate which cues may result from physiological or basic cognitive constraints, and which cues may be employed more flexibly and are shaped by cultural evolution. This suggests promising candidates for future investigation of cues to structure in non-human tetrapod vocalizations. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.
Affiliation(s)
- Theresa Matzinger
- Department of Behavioral and Cognitive Biology, University of Vienna, 1030 Vienna, Austria
- Department of English, University of Vienna, 1090 Vienna, Austria
- W. Tecumseh Fitch
- Department of Behavioral and Cognitive Biology, University of Vienna, 1030 Vienna, Austria
- Department of English, University of Vienna, 1090 Vienna, Austria
6
Gilbert AC, Lee JG, Coulter K, Wolpert MA, Kousaie S, Gracco VL, Klein D, Titone D, Phillips NA, Baum SR. Spoken Word Segmentation in First and Second Language: When ERP and Behavioral Measures Diverge. Front Psychol 2021; 12:705668. [PMID: 34603133] [PMCID: PMC8485064] [DOI: 10.3389/fpsyg.2021.705668]
Abstract
Previous studies of word segmentation in a second language have yielded equivocal results. This is not surprising given the differences in the bilingual experience and proficiency of the participants and the varied experimental designs that have been used. The present study tried to account for a number of relevant variables to determine if bilingual listeners are able to use native-like word segmentation strategies. Here, 61 French-English bilingual adults who varied in L1 (French or English) and language dominance took part in an audiovisual integration task while event-related brain potentials (ERPs) were recorded. Participants listened to sentences built around ambiguous syllable strings (which could be disambiguated based on different word segmentation patterns), during which an illustration was presented on screen. Participants were asked to determine if the illustration was related to the heard utterance or not. Each participant listened to both English and French utterances, providing segmentation patterns that included both their native language (used as reference) and their L2. Interestingly, different patterns of results were observed in the event-related potentials (online) and behavioral (offline) results, suggesting that L2 participants showed signs of being able to adapt their segmentation strategies to the specifics of the L2 (online ERP results), but that the extent of the adaptation varied as a function of listeners' language experience (offline behavioral results).
Affiliation(s)
- Annie C Gilbert
- School of Communication Sciences and Disorders, McGill University, Montréal, QC, Canada
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Jasmine G Lee
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
- Kristina Coulter
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Department of Psychology, Concordia University, Montréal, QC, Canada
- Center for Research in Human Development, Montréal, QC, Canada
- Max A Wolpert
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
- Shanna Kousaie
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Montreal Neurological Institute, McGill University, Montréal, QC, Canada
- School of Psychology, University of Ottawa, Ottawa, ON, Canada
- Vincent L Gracco
- School of Communication Sciences and Disorders, McGill University, Montréal, QC, Canada
- Haskins Laboratories, Yale University, New Haven, CT, United States
- Denise Klein
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Montreal Neurological Institute, McGill University, Montréal, QC, Canada
- Debra Titone
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Department of Psychology, McGill University, Montréal, QC, Canada
- Natalie A Phillips
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
- Department of Psychology, Concordia University, Montréal, QC, Canada
- Center for Research in Human Development, Montréal, QC, Canada
- Shari R Baum
- School of Communication Sciences and Disorders, McGill University, Montréal, QC, Canada
- Center for Research on Brain, Language and Music, Montréal, QC, Canada
7
Abstract
A prerequisite for spoken language learning is segmenting continuous speech into words. Amongst many possible cues to identify word boundaries, listeners can use both transitional probabilities between syllables and various prosodic cues. However, the relative importance of these cues remains unclear, and previous experiments have not directly compared the effects of contrasting multiple prosodic cues. We used artificial language learning experiments, in which native German-speaking participants extracted meaningless trisyllabic “words” from a continuous speech stream, to evaluate these factors. We compared a baseline condition (statistical cues only) to five test conditions, in which word-final syllables were either (a) followed by a pause, (b) lengthened, (c) shortened, (d) changed to a lower pitch, or (e) changed to a higher pitch. To evaluate robustness and generality we used three tasks varying in difficulty. Overall, pauses and final lengthening were perceived as converging with the statistical cues and facilitated speech segmentation, with pauses helping most. Final-syllable shortening hindered baseline speech segmentation, indicating that when cues conflict, prosodic cues can override statistical cues. Surprisingly, pitch cues had little effect, suggesting that duration may be more relevant for speech segmentation than pitch in our study context. We discuss our findings with regard to the contribution to speech segmentation of language-universal boundary cues vs. language-specific stress patterns.
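The construction of such streams is straightforward to sketch: nonsense words are concatenated in random order with no immediate repetition, so within-word syllable TPs are 1.0 while between-word TPs stay low; the prosodic manipulations then apply to word-final syllables at synthesis time. A toy Python version, with an invented lexicon; the actual experimental materials and constraints were certainly richer:

```python
import random

# Invented lexicon of trisyllabic nonsense words
words = [["tu", "pi", "ro"], ["go", "la", "bu"],
         ["bi", "da", "ku"], ["pa", "do", "ti"]]

def make_stream(n_tokens=100, seed=1):
    """Concatenate word tokens in random order with no immediate repeats,
    so within-word syllable TPs are 1.0 and between-word TPs stay low."""
    rng = random.Random(seed)
    stream, prev = [], None
    for _ in range(n_tokens):
        w = rng.choice([cand for cand in words if cand is not prev])
        stream.extend(w)
        prev = w
    return stream

stream = make_stream()
print(stream[:9])  # first three word tokens
# A cue condition would then lengthen, shorten, pitch-shift, or follow with a
# pause every third (word-final) syllable when synthesizing the audio.
```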
Affiliation(s)
- Theresa Matzinger
- Department of English, University of Vienna, Vienna, Austria
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- Nikolaus Ritt
- Department of English, University of Vienna, Vienna, Austria
- W Tecumseh Fitch
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- Cognitive Science Hub, University of Vienna, Vienna, Austria
8
Hu G, Determan SC, Dong Y, Beeve AT, Collins JE, Gai Y. Spectral and Temporal Envelope Cues for Human and Automatic Speech Recognition in Noise. J Assoc Res Otolaryngol 2019; 21:73-87. [PMID: 31758279] [DOI: 10.1007/s10162-019-00737-z]
Abstract
Acoustic features of speech include various spectral and temporal cues. It is known that the temporal envelope plays a critical role in speech recognition by human listeners, while automated speech recognition (ASR) relies heavily on spectral analysis. This study compared sentence-recognition scores of humans and a commercial ASR system (Dragon) when spectral and temporal-envelope cues were manipulated in background noise. Temporal fine structure of meaningful sentences was reduced by noise or tone vocoders. Three types of background noise were introduced: white noise, time-reversed multi-talker noise, and fake-formant noise. Spectral information was manipulated by changing the number of frequency channels. With a 20-dB signal-to-noise ratio (SNR) and four vocoding channels, white noise had a stronger disruptive effect than the fake-formant noise. The same observation with 22 channels was made when SNR was lowered to 0 dB. In contrast, ASR was unable to function with four vocoding channels even with a 20-dB SNR. Its performance was least affected by white noise and most affected by the fake-formant noise. Increasing the number of channels, which improved the spectral resolution, generated non-monotonic behaviors for the ASR with white noise but not with colored noise. The ASR also showed highly improved performance with tone vocoders. It is possible that fake-formant noise affected the software's performance by disrupting spectral cues, whereas white noise affected performance by compromising speech segmentation. Overall, these results suggest that human listeners and ASR utilize different listening strategies in noise.
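Noise and tone vocoding of this sort split the signal into frequency channels, extract each channel's envelope, and use it to modulate a carrier, discarding temporal fine structure. A generic noise-vocoder sketch follows; the filter orders, channel edges, and absence of envelope smoothing are illustrative choices, not the study's exact processing:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=4, fmin=100.0, fmax=8000.0):
    """Replace temporal fine structure with band-limited noise carriers,
    keeping each channel's temporal envelope. fmax must stay below fs/2."""
    edges = np.geomspace(fmin, fmax, n_channels + 1)  # log-spaced channel edges
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))           # channel envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier                  # envelope-modulated noise
    return out / (np.max(np.abs(out)) + 1e-12)
```

A tone vocoder is the same construction with a sinusoid at each channel's center frequency in place of the noise carrier.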
Affiliation(s)
- Guangxin Hu
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Sarah C Determan
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Yue Dong
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Alec T Beeve
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Joshua E Collins
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
- Yan Gai
- Biomedical Engineering Department, Saint Louis University, 3007 Lindell Blvd Suite 2007, St Louis, MO, 63103, USA
9
Frost RLA, Monaghan P, Christiansen MH. Mark my words: High frequency marker words impact early stages of language learning. J Exp Psychol Learn Mem Cogn 2019; 45:1883-1898. [PMID: 30652894] [PMCID: PMC6746567] [DOI: 10.1037/xlm0000683]
Abstract
High frequency words have been suggested to benefit both speech segmentation and grammatical categorization of the words around them. Despite utilizing similar information, these tasks are usually investigated separately in studies examining learning. We determined whether including high frequency words in continuous speech could support categorization when words are being segmented for the first time. We familiarized learners with continuous artificial speech comprising repetitions of target words, which were preceded by high-frequency marker words. Crucially, marker words distinguished targets into 2 distributionally defined categories. We measured learning with segmentation and categorization tests and compared performance against a control group that heard the artificial speech without these marker words (i.e., just the targets, with no cues for categorization). Participants segmented the target words from speech in both conditions, but critically when the marker words were present, they influenced acquisition of word-referent mappings in a subsequent transfer task, with participants demonstrating better early learning for mappings that were consistent (rather than inconsistent) with the distributional categories. We propose that high-frequency words may assist early grammatical categorization, while speech segmentation is still being learned.
10
Abstract
Speech and action sequences are continuous streams of information that can be segmented into sub-units. In both domains, this segmentation can be facilitated by perceptual cues contained within the information stream. In speech, prosodic cues (e.g., a pause, pre-boundary lengthening, and pitch rise) mark boundaries between words and phrases, while boundaries between actions of an action sequence can be marked by kinematic cues (e.g., a pause, pre-boundary deceleration). The processing of prosodic boundary cues evokes an event-related potential (ERP) component known as the Closure Positive Shift (CPS), and it is possible that the CPS reflects domain-general cognitive processes involved in segmentation, given that the CPS is also evoked by boundaries between subunits of non-speech auditory stimuli. This study further probed the domain-generality of the CPS and its underlying processes by investigating electrophysiological correlates of the processing of boundary cues in sequences of spoken verbs (auditory stimuli; Experiment 1; N = 23 adults) and actions (visual stimuli; Experiment 2; N = 23 adults). The EEG data from both experiments revealed a CPS-like broadly distributed positivity during the 250 ms prior to the onset of the post-boundary word or action, indicating similar electrophysiological correlates of boundary processing across domains, suggesting that the cognitive processes underlying speech and action segmentation might also be shared.
Affiliation(s)
- Matt Hilton
- Department of Psychology, Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Romy Räling
- Department of Linguistics, Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Isabell Wartenburger
- Department of Linguistics, Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Birgit Elsner
- Department of Psychology, Cognitive Sciences, University of Potsdam, Potsdam, Germany
11
Fló A, Brusini P, Macagno F, Nespor M, Mehler J, Ferry AL. Newborns are sensitive to multiple cues for word segmentation in continuous speech. Dev Sci 2019; 22:e12802. [PMID: 30681763] [DOI: 10.1111/desc.12802]
Abstract
Before infants can learn words, they must identify those words in continuous speech. Yet, the speech signal lacks obvious boundary markers, which poses a potential problem for language acquisition (Swingley, Philos Trans R Soc Lond. Series B, Biol Sci 364(1536), 3617-3632, 2009). By the middle of the first year, infants seem to have solved this problem (Bergelson & Swingley, Proc Natl Acad Sci 109(9), 3253-3258, 2012; Jusczyk & Aslin, Cogn Psychol 29, 1-23, 1995), but it is unknown if segmentation abilities are present from birth, or if they only emerge after sufficient language exposure and/or brain maturation. Here, in two independent experiments, we looked at two cues known to be crucial for the segmentation of human speech: the computation of statistical co-occurrences between syllables and the use of the language's prosody. After a brief familiarization of about 3 min with continuous speech, using functional near-infrared spectroscopy, neonates showed differential brain responses on a recognition test to words that violated either the statistical (Experiment 1) or prosodic (Experiment 2) boundaries of the familiarization, compared to words that conformed to those boundaries. Importantly, word recognition in Experiment 2 occurred even in the absence of prosodic information at test, meaning that newborns encoded the phonological content independently of its prosody. These data indicate that humans are born with operational language processing and memory capacities and can use at least two types of cues to segment otherwise continuous speech, a key first step in language acquisition.
Affiliation(s)
- Ana Fló
- Language, Cognition, and Development Laboratory, Scuola Internazionale di Studi Avanzati, Trieste, Italy
- Cognitive Neuroimaging Unit, Commissariat à l'Energie Atomique (CEA), Institut National de la Santé et de la Recherche Médicale (INSERM) U992, NeuroSpin Center, Gif-sur-Yvette, France
- Perrine Brusini
- Language, Cognition, and Development Laboratory, Scuola Internazionale di Studi Avanzati, Trieste, Italy
- Institute of Psychology Health and Society, University of Liverpool, Liverpool, UK
- Francesco Macagno
- Neonatology Unit, Azienda Ospedaliera Santa Maria della Misericordia, Udine, Italy
- Marina Nespor
- Language, Cognition, and Development Laboratory, Scuola Internazionale di Studi Avanzati, Trieste, Italy
- Jacques Mehler
- Language, Cognition, and Development Laboratory, Scuola Internazionale di Studi Avanzati, Trieste, Italy
- Alissa L Ferry
- Language, Cognition, and Development Laboratory, Scuola Internazionale di Studi Avanzati, Trieste, Italy
- Division of Human Communication, Hearing, and Development, University of Manchester, Manchester, UK
12
Abstract
Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, as well as assessing how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.
Affiliation(s)
- David M Gómez
- Institute for Educational Sciences, Universidad de O'Higgins, Chile
- Center for Advanced Research in Education (CIAE), Universidad de Chile, Chile
- Peggy Mok
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong
- Mikhail Ordin
- Basque Centre on Cognition, Brain, and Language (BCBL), Spain
- Basque Foundation for Science (IKERBASQUE), Spain
- Jacques Mehler
- Language, Cognition, and Development Lab, International School for Advanced Studies (SISSA), Italy
- Marina Nespor
- Language, Cognition, and Development Lab, International School for Advanced Studies (SISSA), Italy
13
Sidiras C, Iliadou V, Nimatoudis I, Reichenbach T, Bamiou DE. Spoken Word Recognition Enhancement Due to Preceding Synchronized Beats Compared to Unsynchronized or Unrhythmic Beats. Front Neurosci 2017; 11:415. [PMID: 28769752] [PMCID: PMC5513984] [DOI: 10.3389/fnins.2017.00415]
Abstract
The relation between rhythm and language has been investigated over the last decades, with evidence that these share overlapping perceptual mechanisms emerging from several different strands of research. The Dynamic Attending Theory posits that neural entrainment to musical rhythm results in synchronized oscillations in attention, enhancing perception of other events occurring at the same rate. In this study, this prediction was tested in 10-year-old children by means of a psychoacoustic speech-recognition-in-babble paradigm. It was hypothesized that rhythm effects evoked via a short isochronous sequence of beats would provide optimal word recognition in babble when beats and words are in sync. We compared speech-recognition-in-babble performance in the presence of an isochronous, in-sync sequence of beats vs. a non-isochronous or out-of-sync sequence. Results showed that (a) word recognition was best when rhythm and word were in sync, and (b) the effect was not uniform across syllables and the gender of subjects. Our results suggest that pure tone beats affect speech recognition at early levels of sensory or phonemic processing.
Affiliation(s)
- Christos Sidiras
- Clinical Psychoacoustics Laboratory, Neuroscience Division, 3rd Psychiatric Department, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vasiliki Iliadou
- Clinical Psychoacoustics Laboratory, Neuroscience Division, 3rd Psychiatric Department, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Ioannis Nimatoudis
- Clinical Psychoacoustics Laboratory, Neuroscience Division, 3rd Psychiatric Department, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Tobias Reichenbach
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Doris-Eva Bamiou
- Faculty of Brain Sciences, UCL Ear Institute, University College London, London, United Kingdom
14
Abstract
Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. Empirical evidence for this impression has, so far, come from explicit rate judgments. The aim of the present study was to test whether such perceived rate differences between native and foreign languages (FLs) have effects on implicit speech processing. Our measure of implicit rate perception was “normalization for speech rate”: an ambiguous vowel between short /a/ and long /a:/ is interpreted as /a:/ following a fast but as /a/ following a slow carrier sentence. That is, listeners did not judge speech rate itself; instead, they categorized ambiguous vowels whose perception was implicitly affected by the rate of the context. We asked whether a bias towards long /a:/ might be observed when the context is not actually faster but simply spoken in a FL. A fully symmetrical experimental design was used: Dutch and German participants listened to rate matched (fast and slow) sentences in both languages spoken by the same bilingual speaker. Sentences were followed by non-words that contained vowels from an /a-a:/ duration continuum. Results from Experiments 1 and 2 showed a consistent effect of rate normalization for both listener groups. Moreover, for German listeners, across the two experiments, foreign sentences triggered more /a:/ responses than (rate matched) native sentences, suggesting that foreign sentences were indeed perceived as faster. Moreover, this FL effect was modulated by participants’ ability to understand the FL: those participants that scored higher on a FL translation task showed less of a FL effect. However, opposite effects were found for the Dutch listeners. For them, their native rather than the FL induced more /a:/ responses. Nevertheless, this reversed effect could be reduced when additional spectral properties of the context were controlled for. Experiment 3, using explicit rate judgments, replicated the effect for German but not Dutch listeners. We therefore conclude that the subjective impression that FLs sound fast may have an effect on implicit speech processing, with implications for how language learners perceive spoken segments in a FL.
Affiliation(s)
- Hans Rutger Bosker
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Eva Reinisch
- Institute of Phonetics and Speech Processing, Ludwig Maximilian University of Munich, Munich, Germany
15
Abstract
The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.
16
Kösem A, Basirat A, Azizi L, van Wassenhove V. High-frequency neural activity predicts word parsing in ambiguous speech streams. J Neurophysiol 2016; 116:2497-2512. [PMID: 27605528] [DOI: 10.1152/jn.00074.2016]
Abstract
During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept.
Affiliation(s)
- Anne Kösem
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, Institut National de la Santé et de la Recherche Médicale, Université Paris-Sud, Université Paris-Saclay, Gif/Yvette, France
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Anahita Basirat
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, Institut National de la Santé et de la Recherche Médicale, Université Paris-Sud, Université Paris-Saclay, Gif/Yvette, France
- SCALab, Centre National de la Recherche Scientifique UMR 9193, Université Lille, Lille, France
- Leila Azizi
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, Institut National de la Santé et de la Recherche Médicale, Université Paris-Sud, Université Paris-Saclay, Gif/Yvette, France
- Virginie van Wassenhove
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, Institut National de la Santé et de la Recherche Médicale, Université Paris-Sud, Université Paris-Saclay, Gif/Yvette, France
17
Tremblay A, Broersma M, Coughlin CE, Choi J. Effects of the Native Language on the Learning of Fundamental Frequency in Second-Language Speech Segmentation. Front Psychol 2016; 7:985. [PMID: 27445943] [PMCID: PMC4925665] [DOI: 10.3389/fpsyg.2016.00985]
Abstract
This study investigates whether the learning of prosodic cues to word boundaries in speech segmentation is more difficult if the native and second/foreign languages (L1 and L2) have similar (though non-identical) prosodies than if they have markedly different prosodies (Prosodic-Learning Interference Hypothesis). It does so by comparing French, Korean, and English listeners' use of fundamental-frequency (F0) rise as a cue to word-final boundaries in French. F0 rise signals phrase-final boundaries in French and Korean but word-initial boundaries in English. Korean-speaking and English-speaking L2 learners of French, who were matched in their French proficiency and French experience, and native French listeners completed a visual-world eye-tracking experiment in which they recognized words whose final boundary was or was not cued by an increase in F0. The results showed that Korean listeners had greater difficulty using F0 rise as a cue to word-final boundaries in French than French and English listeners. This suggests that L1-L2 prosodic similarity can make the learning of an L2 segmentation cue difficult, in line with the proposed Prosodic-Learning Interference Hypothesis. We consider mechanisms that may underlie this difficulty and discuss the implications of our findings for understanding listeners' phonological encoding of L2 words.
18
Abstract
Speech is inextricably multisensory: both auditory and visual components provide critical information for all aspects of speech processing, including speech segmentation, the visual components of which have been the target of a growing number of studies. In particular, a recent study (Mitchel and Weiss, 2014) established that adults can utilize facial cues (i.e., visual prosody) to identify word boundaries in fluent speech. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2014). Subjects spent the most time watching the eyes and mouth. A significant trend in gaze durations was found with the longest gaze duration on the mouth, followed by the eyes and then the nose. In addition, eye-gaze patterns changed across familiarization as subjects learned the word boundaries, showing decreased attention to the mouth in later blocks while attention on other facial features remained consistent. These findings highlight the importance of the visual component of speech processing and suggest that the mouth may play a critical role in visual speech segmentation.
Affiliation(s)
- Laina G Lusk
- Neuroscience Program, Bucknell University, Lewisburg, PA, USA
- Aaron D Mitchel
- Neuroscience Program, Bucknell University, Lewisburg, PA, USA
- Department of Psychology, Bucknell University, Lewisburg, PA, USA
19
Chait M, Greenberg S, Arai T, Simon JZ, Poeppel D. Multi-time resolution analysis of speech: evidence from psychophysics. Front Neurosci 2015; 9:214. [PMID: 26136650] [PMCID: PMC4468943] [DOI: 10.3389/fnins.2015.00214]
Abstract
How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10-40 Hz modulation frequency) and syllable-sized (2-10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; S_low) and rapid (~33 Hz; S_high) modulations - corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively - were selectively extracted. Although S_low and S_high have low intelligibility when presented separately, dichotic presentation of S_high with S_low results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the S_low and S_high signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility - a view compatible with recent insights from neuroscience implicating multi-timescale auditory processing.
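The slow and rapid components can be pictured as band-limited versions of the temporal envelope. A rough illustration of separating syllable-rate from phoneme-rate envelope modulations follows; this is not the authors' resynthesis procedure (which produced intelligible signals), it only shows the two modulation bands on a placeholder signal:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def modulation_band(x, fs, lo, hi):
    """Band-limit the temporal envelope of x to a modulation-rate range."""
    env = np.abs(hilbert(x))                  # broadband temporal envelope
    sos = butter(2, [lo, hi], btype='bandpass', fs=fs, output='sos')
    return sosfiltfilt(sos, env)

fs = 16000
speech = np.random.default_rng(0).standard_normal(fs * 2)  # placeholder signal
slow = modulation_band(speech, fs, 2, 10)    # syllable-sized units (S_low range)
fast = modulation_band(speech, fs, 10, 40)   # phoneme-sized units (S_high range)
```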
Affiliation(s)
- Maria Chait
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD, USA
- Department of Linguistics, University of Maryland, College Park, MD, USA
- Takayuki Arai
- Department of Information and Communication Sciences, Sophia University, Tokyo, Japan
- Jonathan Z Simon
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD, USA
- Department of Biology, University of Maryland, College Park, MD, USA
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA
- Institute for Systems Research, University of Maryland, College Park, MD, USA
- David Poeppel
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD, USA
- Department of Linguistics, University of Maryland, College Park, MD, USA
- Department of Psychology, New York University, New York, NY, USA
- Department of Neuroscience, Max-Planck-Institute, Frankfurt, Germany
20
Peñaloza C, Benetello A, Tuomiranta L, Heikius IM, Järvinen S, Majos MC, Cardona P, Juncadella M, Laine M, Martin N, Rodríguez-Fornells A. Speech segmentation in aphasia. Aphasiology 2014; 29:724-743. [PMID: 28824218] [PMCID: PMC5560767] [DOI: 10.1080/02687038.2014.982500]
Abstract
BACKGROUND: Speech segmentation is one of the initial and mandatory phases of language learning. Although some people with aphasia have shown a preserved ability to learn novel words, their speech segmentation abilities have not been explored.
AIMS: We examined the ability of individuals with chronic aphasia to segment words from running speech via statistical learning. We also explored the relationships of speech segmentation with aphasia severity and short-term memory capacity, and further examined the role of lesion location in speech segmentation and short-term memory performance.
METHODS & PROCEDURES: The experimental task was first validated with a group of young adults (n = 120). Participants with chronic aphasia (n = 14) were exposed to an artificial language and were evaluated in their ability to segment words using a speech segmentation test. Their performance was contrasted against chance level and compared to that of a group of elderly matched controls (n = 14) using group and case-by-case analyses.
OUTCOMES & RESULTS: As a group, participants with aphasia were significantly above chance level in their ability to segment words from the novel language and did not significantly differ from the group of elderly controls. Speech segmentation ability in the aphasic participants was not associated with aphasia severity, although it significantly correlated with word pointing span, a measure of verbal short-term memory. Case-by-case analyses identified four individuals with aphasia who performed above chance level on the speech segmentation task, all with predominantly posterior lesions and mild fluent aphasia. Their short-term memory capacity was also better preserved than in the rest of the group.
CONCLUSIONS: Our findings indicate that speech segmentation via statistical learning can remain functional in people with chronic aphasia and suggest that this initial language learning mechanism is associated with the functionality of the verbal short-term memory system and the integrity of the left inferior frontal region.
Affiliation(s)
- Claudia Peñaloza
- Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute – IDIBELL, Barcelona, Spain
- Annalisa Benetello
- Department of Communication Sciences and Disorders, Eleanor M. Saffran Center for Cognitive Neuroscience, Temple University, Philadelphia, PA, USA
- Department of Psychology, University of Milano-Bicocca, Milan, Italy
- Leena Tuomiranta
- Department of Psychology and Logopedics, Åbo Akademi University, Turku, Finland
- Ida-Maria Heikius
- Department of Psychology and Logopedics, Åbo Akademi University, Turku, Finland
- Sonja Järvinen
- Department of Psychology and Logopedics, Åbo Akademi University, Turku, Finland
- Maria Carmen Majos
- Hospital Universitari de Bellvitge (HUB), Rehabilitation Section, Campus Bellvitge, University of Barcelona, Barcelona, Spain
- Pedro Cardona
- Hospital Universitari de Bellvitge (HUB), Neurology Section, Campus Bellvitge, University of Barcelona, Barcelona, Spain
- Montserrat Juncadella
- Hospital Universitari de Bellvitge (HUB), Neurology Section, Campus Bellvitge, University of Barcelona, Barcelona, Spain
- Matti Laine
- Department of Psychology and Logopedics, Åbo Akademi University, Turku, Finland
- Nadine Martin
- Department of Communication Sciences and Disorders, Eleanor M. Saffran Center for Cognitive Neuroscience, Temple University, Philadelphia, PA, USA
- Antoni Rodríguez-Fornells
- Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute – IDIBELL, Barcelona, Spain
- Department of Basic Psychology, Campus Bellvitge, University of Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, ICREA, Barcelona, Spain
21
Abstract
Recent behavioral and electrophysiological evidence has highlighted the long-term importance for language skills of an early ability to recognize words in continuous speech. We here present further tests of this long-term link in the form of follow-up studies conducted with two (separate) groups of infants who had earlier participated in speech segmentation tasks. Each study extends prior follow-up tests: Study 1 by using a novel follow-up measure that taps into online processing, Study 2 by assessing language performance relationships over a longer time span than previously tested. Results of Study 1 show that brain correlates of speech segmentation ability at 10 months are positively related to 16-month-olds' target fixations in a looking-while-listening task. Results of Study 2 show that infant speech segmentation ability no longer directly predicts language profiles at the age of five. However, a meta-analysis across our results and those of similar studies (Study 3) reveals that age at follow-up does not moderate effect size. Together, the results suggest that infants' ability to recognize words in speech certainly benefits early vocabulary development; further observed relationships of later language skills to early word recognition may be consequent upon this vocabulary size effect.
Affiliation(s)
- Caroline Junge
- Utrecht University, Heidelberglaan 1, 3584 CS Utrecht, The Netherlands
- Anne Cutler
- MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith, NSW 2751, Australia
22
Breen M, Dilley LC, McAuley JD, Sanders LD. Auditory evoked potentials reveal early perceptual effects of distal prosody on speech segmentation. Lang Cogn Neurosci 2014; 29:1132-1146. [PMID: 29911124] [PMCID: PMC5998818] [DOI: 10.1080/23273798.2014.894642]
Abstract
Prosodic context several syllables prior (i.e., distal) to an ambiguous word boundary influences speech segmentation. To assess whether distal prosody influences early perceptual processing or later lexical competition, EEG was recorded while subjects listened to eight-syllable sequences with ambiguous word boundaries for the last four syllables (e.g., tie murder bee vs. timer derby). Pitch and duration of the first 5 syllables were manipulated to induce sequence segmentation with either a monosyllabic or disyllabic final word. Behavioral results confirmed a successful manipulation. Moreover, penultimate syllables (e.g., der) elicited a larger anterior positivity 200-500 ms after onset for prosodic contexts predicted to induce word-initial perception of these syllables. Final syllables (e.g., bee) elicited a similar anterior positivity in the context predicted to induce word-initial perception of these syllables. Additionally, these final syllables elicited a larger positive-to-negative deflection (P1-N1) 60-120 ms after onset, and a larger N400. The finding that prosodic characteristics of speech several syllables prior to ambiguous word boundaries modulate both early and late ERPs elicited by subsequent syllable onsets provides evidence that distal prosody influences early perceptual processing, and later lexical competition.
Affiliation(s)
- Mara Breen
- Mount Holyoke College, Department of Psychology and Education
- University of Massachusetts, Department of Psychology
- Laura C. Dilley
- Michigan State University, Department of Communicative Sciences and Disorders
- Michigan State University, Department of Psychology
- Michigan State University, Department of Linguistics and Germanic, Slavic, Asian, and African Languages
23
Abstract
Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative to word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.
Affiliation(s)
- Aaron D. Mitchel
- Department of Psychology, Bucknell University, Lewisburg, PA 17837, USA
- Daniel J. Weiss
- Department of Psychology and Program in Linguistics, The Pennsylvania State University, 643 Moore Building, University Park, PA 16802, USA
24
Abstract
The ability to extract word forms from continuous speech is a prerequisite for constructing a vocabulary and emerges in the first year of life. Electrophysiological (ERP) studies of speech segmentation by 9- to 12-month-old listeners in several languages have found a left-localized negativity linked to word onset as a marker of word detection. We report an ERP study showing significant evidence of speech segmentation in Dutch-learning 7-month-olds. In contrast to the left-localized negative effect reported with older infants, the observed overall mean effect had a positive polarity. Inspection of individual results revealed two participant subgroups: a majority showing a positive-going response, and a minority showing the left negativity observed in older age groups. We retested participants at age three on vocabulary comprehension and on word and sentence production. On every test, children who at 7 months had shown the negativity associated with segmentation of words from speech outperformed those who had produced positive-going brain responses to the same input. The earlier that infants show the left-localized brain responses typically indicating detection of words in speech, the better their early childhood language skills.
Affiliation(s)
- Valesca Kooijman
- Food and Biobased Research, Wageningen University and Research Centre, Wageningen, Netherlands
25
White L, Mattys SL, Wiget L. Segmentation cues in conversational speech: robust semantics and fragile phonotactics. Front Psychol 2012; 3:375. [PMID: 23060839 PMCID: PMC3464055 DOI: 10.3389/fpsyg.2012.00375] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2012] [Accepted: 09/12/2012] [Indexed: 11/13/2022] Open
Abstract
Multiple cues influence listeners' segmentation of connected speech into words, but most previous studies have used stimuli elicited in careful readings rather than natural conversation. Discerning word boundaries in conversational speech may differ from doing so with laboratory speech. In particular, a speaker's articulatory effort (hyperarticulation vs. hypoarticulation, H&H) may vary according to communicative demands, suggesting a compensatory relationship whereby acoustic-phonetic cues are attenuated when other information sources strongly guide segmentation. We examined how listeners' interpretation of segmentation cues is affected by speech style (spontaneous conversation vs. read speech), using cross-modal identity priming. To elicit spontaneous stimuli, we used a map task in which speakers discussed routes around stylized landmarks. These landmarks were two-word phrases in which the strength of potential segmentation cues (semantic likelihood and cross-boundary diphone phonotactics) was systematically varied. Landmark-carrying utterances were transcribed and later re-recorded as read speech. Independent of speech style, we found an interaction between cue valence (favorable/unfavorable) and cue type (phonotactics/semantics): there was an effect of semantic plausibility but no effect of cross-boundary phonotactics, indicating that the importance of phonotactic cues to segmentation may have been overstated in studies where lexical information was artificially suppressed. These patterns were unaffected by whether the stimuli were elicited in a spontaneous or read context, even though the difference in speech styles was evident in a main effect. Durational analyses suggested speaker-driven cue trade-offs congruent with an H&H account, but these modulations did not affect listener behavior. We conclude that previous research exploiting read speech is reliable in indicating the primacy of lexically based cues in the segmentation of natural conversational speech.
26
Yurovsky D, Yu C, Smith LB. Statistical speech segmentation and word learning in parallel: scaffolding from child-directed speech. Front Psychol 2012; 3:374. [PMID: 23162487 PMCID: PMC3498894 DOI: 10.3389/fpsyg.2012.00374] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2012] [Accepted: 09/11/2012] [Indexed: 11/29/2022] Open
Abstract
In order to acquire their native languages, children must learn richly structured systems with regularities at multiple levels. While structure at different levels could be learned serially, e.g., speech segmentation coming before word-object mapping, redundancies across levels make parallel learning more efficient. For instance, a series of syllables is likely to be a word not only because of high transitional probabilities, but also because of a consistently co-occurring object. But additional statistics require additional processing, and thus might not be useful to cognitively constrained learners. We show that the structure of child-directed speech makes simultaneous speech segmentation and word learning tractable for human learners. First, a corpus of child-directed speech was recorded from parents and children engaged in a naturalistic free-play task. Analyses revealed two consistent regularities in the sentence structure of naming events. These regularities were subsequently encoded in an artificial language to which adult participants were exposed in the context of simultaneous statistical speech segmentation and word learning. Either regularity was independently sufficient to support successful learning, but no learning occurred in the absence of both regularities. Thus, the structure of child-directed speech plays an important role in scaffolding speech segmentation and word learning in parallel.
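For readers unfamiliar with the transitional-probability (TP) statistic at the core of this paradigm, the sketch below illustrates it in Python. Everything here is an illustrative assumption rather than material from the study: the three-"word" lexicon (pabiku, tibudo, golatu), the 0.5 boundary threshold, the stream length, and the helper names transitional_probabilities and segment are all invented for this example.

```python
# Minimal sketch of transitional-probability (TP) speech segmentation:
# TP(A -> B) = count(A followed by B) / count(A). Syllable pairs inside a
# word have high TPs; pairs spanning a word boundary have low TPs.
import random
from collections import Counter

def transitional_probabilities(syllables):
    # Count adjacent pairs and normalize by the frequency of the first member.
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(syllables, threshold=0.5):
    # Posit a word boundary wherever the TP between adjacent syllables
    # falls below the (illustrative) threshold.
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Build a continuous stream from an invented three-word artificial lexicon.
random.seed(1)
lexicon = ["pabiku", "tibudo", "golatu"]
stream = []
for _ in range(90):
    word = random.choice(lexicon)
    stream.extend(word[i:i + 2] for i in range(0, len(word), 2))

print(sorted(set(segment(stream))))  # recovers golatu, pabiku, tibudo
```

With randomly ordered words, within-word TPs are 1.0 while cross-boundary TPs hover around 1/3, so the threshold cleanly recovers the lexicon. The abstract's point is that a consistently co-occurring object supplies a second, redundant cue on top of exactly this statistic, which is what makes parallel segmentation and word learning tractable.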
Affiliation(s)
- Daniel Yurovsky
- Department of Psychology, Stanford University, Stanford, CA, USA
- Chen Yu
- Department of Psychological and Brain Sciences and Program in Cognitive Science, Indiana University, Bloomington, IN, USA
- Linda B. Smith
- Department of Psychological and Brain Sciences and Program in Cognitive Science, Indiana University, Bloomington, IN, USA
27
Abstract
Linguistic stress and sequential statistical cues to word boundaries interact during speech segmentation in infancy. However, little is known about how the different acoustic components of stress constrain statistical learning. The current studies were designed to investigate whether intensity and duration each function independently as cues to initial prominence (trochaic-based hypothesis) or whether, as predicted by the Iambic-Trochaic Law (ITL), intensity and duration have characteristic and separable effects on rhythmic grouping (ITL-based hypothesis) in a statistical learning task. Infants were familiarized with an artificial language (Experiments 1 & 3) or a tone stream (Experiment 2) in which there was an alternation in either intensity or duration. In addition to potential acoustic cues, the familiarization sequences also contained statistical cues to word boundaries. In speech (Experiment 1) and non-speech (Experiment 2) conditions, 9-month-old infants demonstrated discrimination patterns consistent with an ITL-based hypothesis: intensity signaled initial prominence and duration signaled final prominence. The results of Experiment 3, in which 6.5-month-old infants were familiarized with the speech streams from Experiment 1, suggest that there is a developmental change in infants' willingness to treat increased duration as a cue to word offsets in fluent speech. Infants' perceptual systems interact with linguistic experience to constrain how infants learn from their auditory environment.
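The ITL-based hypothesis tested in this abstract can be stated as a compact grouping rule, sketched below. The function itl_group, the boolean prominence marks, and the toy syllable stream are hypothetical simplifications for illustration, not the study's stimuli or analysis.

```python
# Illustrative sketch of the Iambic-Trochaic Law (ITL) grouping rule: elements
# prominent in INTENSITY are heard as group-initial (trochaic grouping), while
# elements prominent in DURATION are heard as group-final (iambic grouping).

def itl_group(elements, prominent, cue):
    """Group a stream into chunks under the ITL.

    elements:  the syllables or tones, in order.
    prominent: parallel list of booleans marking the acoustically
               prominent elements (alternating, as in the experiments).
    cue:       "intensity" (prominent element opens a chunk) or
               "duration"  (prominent element closes a chunk).
    """
    groups, current = [], []
    for element, is_prominent in zip(elements, prominent):
        if cue == "intensity" and is_prominent and current:
            groups.append(current)      # loud element starts a new chunk
            current = []
        current.append(element)
        if cue == "duration" and is_prominent:
            groups.append(current)      # long element ends the current chunk
            current = []
    if current:
        groups.append(current)
    return groups

stream = ["ba", "di", "ba", "di", "ba", "di"]
alternating = [True, False, True, False, True, False]
print(itl_group(stream, alternating, "intensity"))  # [['ba', 'di']] * 3
print(itl_group(stream, alternating, "duration"))   # [['ba'], ['di', 'ba'], ['di', 'ba'], ['di']]
```

The same alternating stream thus yields prominence-initial pairs under an intensity cue and prominence-final pairs under a duration cue, which is the asymmetry the 9-month-olds' discrimination patterns were consistent with.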
Affiliation(s)
- Jessica F Hay
- Department of Psychology, University of Tennessee, Knoxville
28
Abstract
It is currently unknown whether statistical learning is supported by modality-general or modality-specific mechanisms. One issue within this debate concerns the independence of learning in one modality from learning in other modalities. In the present study, the authors examined the extent to which statistical learning across modalities is independent by simultaneously presenting learners with auditory and visual streams. After establishing baseline rates of learning for each stream independently, they systematically varied the amount of audiovisual correspondence across 3 experiments. They found that learners were able to segment both streams successfully only when the boundaries of the audio and visual triplets were in alignment. This pattern of results suggests that learners are able to extract multiple statistical regularities across modalities provided that there is some degree of cross-modal coherence. They discuss the implications of their results in light of recent claims that multisensory statistical learning is guided by modality-independent mechanisms.
Affiliation(s)
- Aaron D Mitchel
- Department of Psychology and Program in Linguistics, Pennsylvania State University, 643 Moore Building, University Park, PA 16802, USA.
29
Schmidt-Kassow M, Roncaglia-Denissen MP, Kotz SA. Why pitch sensitivity matters: event-related potential evidence of metric and syntactic violation detection among Spanish late learners of German. Front Psychol 2011; 2:131. [PMID: 21734898 PMCID: PMC3120976 DOI: 10.3389/fpsyg.2011.00131] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2011] [Accepted: 06/05/2011] [Indexed: 11/13/2022] Open
Abstract
Event-related potential (ERP) data in monolingual German speakers have shown that sentential metric expectancy violations elicit a biphasic ERP pattern consisting of an anterior negativity and a posterior positivity (P600). This pattern is comparable to that elicited by syntactic violations. However, proficient French late learners of German do not detect violations of metric expectancy in German, and they show qualitatively and quantitatively different ERP responses to metric and syntactic violations. We followed up on the questions of whether (1) the latter finding results from a potential insensitivity to pitch cues in speech segmentation among French speakers, or (2) the result is founded in rhythmic differences between the languages. We therefore tested Spanish late learners of German, as Spanish, contrary to French, uses pitch as a segmentation cue even though the basic segmentation unit is the same in both languages (the syllable). We report ERP responses showing that Spanish L2 learners are sensitive to syntactic as well as metric violations in German sentences, independent of attention to the task, as reflected in a P600 response. Overall, their behavioral performance resembles that of German native speakers. The current data suggest that Spanish L2 learners are able to extract metric units (trochees) in their L2 (German) even though their basic segmentation unit in Spanish is the syllable. In addition, Spanish L2 learners of German, in contrast to French L2 learners, are sensitive to syntactic violations, indicating a tight link between syntactic and metric competence. This finding emphasizes the relevant role of metric cues not only in L2 prosodic processing but also in syntactic processing.
Affiliation(s)
- Maren Schmidt-Kassow
- Institute of Medical Psychology, Goethe University Frankfurt, Frankfurt am Main, Germany
30
Rodríguez-Fornells A, Cunillera T, Mestres-Missé A, de Diego-Balaguer R. Neurophysiological mechanisms involved in language learning in adults. Philos Trans R Soc Lond B Biol Sci 2009; 364:3711-35. [PMID: 19933142 PMCID: PMC2846313 DOI: 10.1098/rstb.2009.0130] [Citation(s) in RCA: 136] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Little is known about the brain mechanisms involved in word learning during infancy and in second language acquisition, or about the way these new words become stable representations that sustain language processing. In several studies we have adopted the human simulation perspective, studying the effects of brain lesions and combining different neuroimaging techniques, such as event-related potentials and functional magnetic resonance imaging, in order to examine the language learning (LL) process. In the present article, we review this evidence focusing on how different brain signatures relate to (i) the extraction of words from speech, (ii) the discovery of their embedded grammatical structure, and (iii) how meaning derived from verbal contexts can inform us about the cognitive mechanisms underlying the learning process. We compile these findings and frame them into an integrative neurophysiological model that tries to delineate the major neural networks that might be involved in the initial stages of LL. Finally, we propose that LL simulations can help us to understand natural language processing and how the recovery from language disorders in infants and adults can be accomplished.
31
Sanders LD, Neville HJ, Woldorff MG. Speech segmentation by native and non-native speakers: the use of lexical, syntactic, and stress-pattern cues. J Speech Lang Hear Res 2002; 45:519-530. [PMID: 12069004 PMCID: PMC2532534 DOI: 10.1044/1092-4388(2002/041)] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Varying degrees of plasticity in different subsystems of language have been demonstrated by studies showing that some aspects of language are processed similarly by native speakers and late learners, whereas other aspects are processed differently by the two groups. The study of speech segmentation provides a means by which the ability to process different types of linguistic information can be measured within the same task, because lexical, syntactic, and stress-pattern information can all indicate where one word ends and the next begins in continuous speech. In this study, native Japanese and native Spanish late learners of English (as well as near-monolingual Japanese and Spanish speakers) were asked to determine whether specific sounds fell at the beginning or in the middle of words in English sentences. Like native English speakers, late learners employed lexical information to perform the segmentation task. However, nonnative speakers did not use syntactic information to the same extent as native English speakers. Although both groups of late learners of English used stress pattern as a segmentation cue, the extent to which this cue was relied upon depended on the stress-pattern characteristics of their native language. These findings support the hypothesis that learning a second language later in life has differential effects on subsystems within language.