1. The Language-Specificity of Phonetic Adaptation to Talkers. Language and Speech 2023:238309231214244. PMID: 38054422. DOI: 10.1177/00238309231214244.
Abstract
Listeners adapt efficiently to new talkers by using lexical knowledge to resolve perceptual uncertainty. This adaptation has been widely observed, both in first (L1) and in second languages (L2). Here, adaptation was tested in both the L1 and L2 of speakers of Mandarin and English, two very dissimilar languages. A sound midway between /f/ and /s/ replacing either /f/ or /s/ in Mandarin words presented for lexical decision (e.g., bu4fa3 "illegal"; kuan1song1 "loose") prompted the expected adaptation; it induced an expanded /f/ category in phoneme categorization when it had replaced /f/, but an expanded /s/ category when it had replaced /s/. Both L1 listeners and English-native listeners with L2 Mandarin showed this effect. In English, however (with e.g., traffic; insane), we observed adaptation in L1 but not in L2; Mandarin-native listeners, despite scoring highly in the English lexical decision training, did not adapt their category boundaries for /f/ and /s/. Whether the ambiguous sound appeared syllable-initially (as in Mandarin phonology) versus word-finally (providing more word identity information) made no difference. Perceptual learning for talker adaptation is language-specific in that successful lexically guided adaptation in one language does not guarantee adaptation in other known languages; the enabling conditions for adaptation may be multiple and diverse.
2. Early neuro-electric indication of lexical match in English spoken-word recognition. PLoS One 2023; 18:e0285286. PMID: 37200324. DOI: 10.1371/journal.pone.0285286.
Abstract
We investigated early electrophysiological responses to spoken English words embedded in neutral sentence frames, using a lexical decision paradigm. As words unfold in time, similar-sounding lexical items compete for recognition within 200 milliseconds after word onset. A small number of studies have previously investigated event-related potentials in this time window in English and French, with results differing in direction of effects as well as component scalp distribution. Investigations of spoken-word recognition in Swedish have reported an early left-frontally distributed event-related potential that increases in amplitude as a function of the probability of a successful lexical match as the word unfolds. Results from the present study indicate that the same process may occur in English: we propose that increased certainty of a 'word' response in a lexical decision task is reflected in the amplitude of an early left-anterior brain potential beginning around 150 milliseconds after word onset. This in turn is proposed to be connected to the probabilistically driven activation of possible upcoming word forms.
3. 312 Racial Differences in Adult Emergency Department Triage Assignment Across an Urban-Rural Health System. Ann Emerg Med 2022. DOI: 10.1016/j.annemergmed.2022.08.340.
4. In Search of Salience: Focus Detection in the Speech of Different Talkers. Language and Speech 2022; 65:650-680. PMID: 34841933. DOI: 10.1177/00238309211046029.
Abstract
Many different prosodic cues can help listeners predict upcoming speech. However, no research to date has assessed listeners' processing of preceding prosody from different speakers. The present experiments examine (1) whether individual speakers (of the same language variety) are likely to vary in their production of preceding prosody; (2) to the extent that there is talker variability, whether listeners are flexible enough to use any prosodic cues signaled by the individual speaker; and (3) whether types of prosodic cues (e.g., F0 versus duration) vary in informativeness. Using a phoneme-detection task, we examined whether listeners can entrain to different combinations of preceding prosodic cues to predict where focus will fall in an utterance. We used unsynthesized sentences recorded by four female native speakers of Australian English who happened to have used different preceding cues to produce sentences with prosodic focus: a combination of pre-focus overall duration cues, F0 and intensity (mean, maximum, range), and longer pre-target interval before the focused word onset (Speaker 1), only mean F0 cues, mean and maximum intensity, and longer pre-target interval (Speaker 2), only pre-target interval duration (Speaker 3), and only pre-focus overall duration and maximum intensity (Speaker 4). Listeners could entrain to almost every speaker's cues (the exception being Speaker 4's use of only pre-focus overall duration and maximum intensity), and could use whatever cues were available even when one of the cue sources was rendered uninformative. Our findings demonstrate both speaker variability and listener flexibility in the processing of prosodic focus.
5. Auditory perceptual learning in autistic adults. Autism Res 2022; 15:1495-1507. DOI: 10.1002/aur.2778.
6. Asymmetric memory for birth language perception versus production in young international adoptees. Cognition 2021; 213:104788. PMID: 34226063. DOI: 10.1016/j.cognition.2021.104788.
Abstract
Adults who as children were adopted into a different linguistic community retain knowledge of their birth language. The possession (without awareness) of such knowledge is known to facilitate the (re)learning of birth-language speech patterns; this perceptual learning predicts such adults' production success as well, indicating that the retained linguistic knowledge is abstract in nature. Adoptees' acquisition of their adopted language is fast and complete; birth-language mastery disappears rapidly, although this latter process has been little studied. Here, 46 international adoptees from China aged four to 10 years, with Dutch as their new language, plus 47 matched non-adopted Dutch-native controls and 40 matched non-adopted Chinese controls, undertook across a two-week period 10 blocks of training in perceptually identifying Chinese speech contrasts (one segmental, one tonal) which were unlike any Dutch contrasts. Chinese controls easily accomplished all these tasks. The same participants also provided speech production data in an imitation task. In perception, adoptees and Dutch controls scored equivalently poorly at the outset of training; with training, the adoptees significantly improved while the Dutch controls did not. In production, adoptees' imitations both before and after training could be better identified, and received higher goodness ratings, than those of Dutch controls. The perception results confirm that birth-language knowledge is stored and can facilitate re-learning in post-adoption childhood; the production results suggest that although processing of phonological category detail appears to depend on access to the stored knowledge, general articulatory dimensions can at this age also still be remembered, and may facilitate spoken imitation.
7. Special issue in honor of Jacques Mehler, Cognition's founding editor. Cognition 2021; 213:104786. PMID: 34116795. DOI: 10.1016/j.cognition.2021.104786.
8.
Abstract
Prominence, the expression of informational weight within utterances, can be signaled by prosodic highlighting (head-prominence, as in English) or by position (edge-prominence, as in Korean). Prominence confers processing advantages, even if conveyed only by discourse manipulations. Here we compared processing of prominence in English and Korean, using a task that indexes processing success, namely recognition memory. In each language, participants' memory was tested for target words heard in sentences in which they were prominent due to prosody, position, both or neither. Prominence produced a recall advantage, but the relative effects differed across languages. For Korean listeners the positional advantage was greater, but for English listeners prosodic and syntactic prominence had equivalent and additive effects. In a further experiment, semantic and phonological foils tested the depth of processing of the recall targets. Both foil types were correctly rejected, suggesting that semantic processing had not reached the level at which word form was no longer available. Together the results suggest that prominence processing is primarily driven by universal effects of information structure, but language-specific differences in frequency of experience prompt different relative advantages of prominence signal types. Processing efficiency increases in each case, however, creating more accurate and more rapidly contactable memory representations.
9. More why, less how: What we need from models of cognition. Cognition 2021; 213:104688. PMID: 33775402. PMCID: PMC8346944. DOI: 10.1016/j.cognition.2021.104688.
Abstract
Science regularly experiences periods in which simply describing the world is prioritised over attempting to explain it. Cognition, this journal, came into being some 45 years ago as an attempt to lay one such period to rest; without doubt, it has helped create the current cognitive science climate in which theory is decidedly welcome. Here we summarise the reasons why a theoretical approach is imperative in our field, and call attention to some potentially counter-productive trends in which cognitive models are concerned too exclusively with how processes work at the expense of why the processes exist in the first place and thus what the goal of modelling them must be. We suggest that theories should try to explain ‘why’ not just ‘how’. We reinforce Darwin’s (1861) point that “All [scientific] observation must be for or against some view if it is to be of [scientific] service!”. Jacques Mehler and Cognition revitalised theory in cognitive psychology.
10. P37 Characterizing community-level abortion stigma in the US. Contraception 2020. DOI: 10.1016/j.contraception.2020.07.056.
11.
Abstract
When listeners experience difficulty in understanding a speaker, lexical and audiovisual (or lipreading) information can be a helpful source of guidance. These two types of information embedded in speech can also guide perceptual adjustment, also known as recalibration or perceptual retuning. With retuning or recalibration, listeners can use these contextual cues to temporarily or permanently reconfigure internal representations of phoneme categories to adjust to and understand novel interlocutors more easily. These two types of perceptual learning, previously investigated in large part separately, are highly similar in allowing listeners to use speech-external information to make phoneme boundary adjustments. This study explored whether the two sources may work in conjunction to induce adaptation, thus emulating real life, in which listeners are indeed likely to encounter both types of cue together. Listeners who received combined audiovisual and lexical cues showed perceptual learning effects similar to listeners who only received audiovisual cues, while listeners who received only lexical cues showed weaker effects compared with the two other groups. The combination of cues did not lead to additive retuning or recalibration effects, suggesting that lexical and audiovisual cues operate differently with regard to how listeners use them for reshaping perceptual categories. Reaction times did not significantly differ across the three conditions, so none of the forms of adjustment were either aided or hindered by processing time differences. Mechanisms underlying these forms of perceptual learning may diverge in numerous ways despite similarities in experimental applications.
12. Neural Correlates of Phonetic Adaptation as Induced by Lexical and Audiovisual Context. J Cogn Neurosci 2020; 32:2145-2158. PMID: 32662723. DOI: 10.1162/jocn_a_01608.
Abstract
When speech perception is difficult, one way listeners adjust is by reconfiguring phoneme category boundaries, drawing on contextual information. Both lexical knowledge and lipreading cues are used in this way, but it remains unknown whether these two differing forms of perceptual learning are similar at a neural level. This study compared phoneme boundary adjustments driven by lexical or audiovisual cues, using ultra-high-field 7-T fMRI. During imaging, participants heard exposure stimuli and test stimuli. Exposure stimuli for lexical retuning were audio recordings of words, and those for audiovisual recalibration were audio-video recordings of lip movements during utterances of pseudowords. Test stimuli were ambiguous phonetic strings presented without context, and listeners reported what phoneme they heard. Reports reflected phoneme biases in preceding exposure blocks (e.g., more reported /p/ after /p/-biased exposure). Analysis of corresponding brain responses indicated that both forms of cue use were associated with a network of activity across the temporal cortex, plus parietal, insula, and motor areas. Audiovisual recalibration also elicited significant occipital cortex activity despite the lack of visual stimuli. Activity levels in several ROIs also covaried with strength of audiovisual recalibration, with greater activity accompanying larger recalibration shifts. Similar activation patterns appeared for lexical retuning, but here, no significant ROIs were identified. Audiovisual and lexical forms of perceptual learning thus induce largely similar brain response patterns. However, audiovisual recalibration involves additional visual cortex contributions, suggesting that previously acquired visual information (on lip movements) is retrieved and deployed to disambiguate auditory perception.
13. Universals of listening: Equivalent prosodic entrainment in tone and non-tone languages. Cognition 2020; 202:104311. PMID: 32502869. DOI: 10.1016/j.cognition.2020.104311.
Abstract
In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. Here, we ask whether this strategy is universally available, even in languages with very different phonological systems (e.g., tone versus non-tone languages). In a phoneme detection experiment, we examined whether prosodic entrainment also occurs in Mandarin Chinese, a tone language, where the use of various suprasegmental cues to lexical identity may take precedence over their use in signaling salience. Consistent with the results from Germanic languages, responses were faster when preceding intonation predicted high stress on the target-bearing word, and the lexical tone of the target word (i.e., rising versus falling) did not affect the Mandarin listeners' responses. Further, the extent to which prosodic entrainment was used to detect the target phoneme was the same in both English and Mandarin listeners. Nevertheless, native Mandarin speakers did not adopt an entrainment strategy when the sentences were presented in English, consistent with the suggestion that L2 listening may be strained by additional functional load from prosodic processing. These findings have implications for how universal and language-specific mechanisms interact in the perception of focus structure in everyday discourse.
14. Commentary on "Interaction in Spoken Word Recognition Models". Front Psychol 2018; 9:1568. PMID: 30233453. PMCID: PMC6129619. DOI: 10.3389/fpsyg.2018.01568.
15.

16.
Abstract
Participants made visual lexical decisions to upper-case words and nonwords, and then categorized an ambiguous N–H letter continuum. The lexical decision phase included different exposure conditions: Some participants saw an ambiguous letter “?”, midway between N and H, in N-biased lexical contexts (e.g., REIG?), plus words with unambiguous H (e.g., WEIGH); others saw the reverse (e.g., WEIG?, REIGN). The first group categorized more of the test continuum as N than did the second group. Control groups, who saw “?” in nonword contexts (e.g., SMIG?), plus either of the unambiguous word sets (e.g., WEIGH or REIGN), showed no such subsequent effects. Perceptual learning about ambiguous letters therefore appears to be based on lexical knowledge, just as in an analogous speech experiment (Norris, McQueen, & Cutler, 2003) which showed similar lexical influence in learning about ambiguous phonemes. We argue that lexically guided learning is an efficient general strategy available for exploitation by different specific perceptual tasks.
17. Abstraction and the (Misnamed) Language Familiarity Effect. Cogn Sci 2017; 42:633-645. DOI: 10.1111/cogs.12520.
18. Early development of abstract language knowledge: evidence from perception-production transfer of birth-language memory. Royal Society Open Science 2017; 4:160660. PMID: 28280567. PMCID: PMC5319333. DOI: 10.1098/rsos.160660.
Abstract
Children adopted early in life into another linguistic community typically forget their birth language but retain, unaware, relevant linguistic knowledge that may facilitate (re)learning of birth-language patterns. Understanding the nature of this knowledge can shed light on how language is acquired. Here, international adoptees from Korea with Dutch as their current language, and matched Dutch-native controls, provided speech production data on a Korean consonantal distinction unlike any Dutch distinctions, at the outset and end of an intensive perceptual training. The productions, elicited in a repetition task, were identified and rated by Korean listeners. Adoptees' production scores improved significantly more across the training period than control participants' scores, and, for adoptees only, relative production success correlated significantly with the rate of learning in perception (which had, as predicted, also surpassed that of the controls). Of the adoptee group, half had been adopted at 17 months or older (when talking would have begun), while half had been prelinguistic (under six months). The former group, with production experience, showed no advantage over the group without. Thus the adoptees' retained knowledge of Korean transferred from perception to production and appears to be abstract in nature rather than dependent on the amount of experience.
19. Stress Effects in Vowel Perception as a Function of Language-Specific Vocabulary Patterns. Phonetica 2016; 74:81-106. PMID: 27710964. DOI: 10.1159/000447428.
Abstract
BACKGROUND/AIMS: Evidence from spoken word recognition suggests that for English listeners, distinguishing full versus reduced vowels is important, but discerning stress differences involving the same full vowel (as in mu- from music or museum) is not. In Dutch, in contrast, the latter distinction is important. This difference arises from the relative frequency of unstressed full vowels in the two vocabularies. The goal of this paper is to determine how this difference in the lexicon influences the perception of stressed versus unstressed vowels.
METHODS: All possible sequences of two segments (diphones) in Dutch and in English were presented to native listeners in gated fragments. We recorded identification performance over time throughout the speech signal. The data were here analysed specifically for patterns in perception of stressed versus unstressed vowels.
RESULTS: The data reveal significantly larger stress effects (whereby unstressed vowels are harder to identify than stressed vowels) in English than in Dutch. Both language-specific and shared patterns appear regarding which vowels show stress effects.
CONCLUSION: We explain the larger stress effect in English as reflecting the processing demands caused by the difference in use of unstressed vowels in the lexicon. The larger stress effect in English is due to relative inexperience with processing unstressed full vowels.
20. Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience 2016; 31:4-18. PMID: 26740960. PMCID: PMC4685608. DOI: 10.1080/23273798.2015.1081703.
Abstract
Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
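The Bayesian-optimal recognition idea summarized above can be illustrated with a toy posterior computation: each lexical candidate's posterior combines a prior (e.g., its frequency of occurrence) with the likelihood of the acoustic evidence. The candidate words and all probability values below are invented for illustration and are not taken from the article or any specific model.

```python
# Toy sketch of Bayesian word recognition: P(word | evidence) is
# proportional to P(word) * P(evidence | word), normalized over the
# competitor set. All numbers here are hypothetical.
def posterior(priors, likelihoods):
    """Return normalized posterior P(word | evidence) per candidate."""
    unnorm = {w: priors[w] * likelihoods[w] for w in priors}
    total = sum(unnorm.values())
    return {w: p / total for w, p in unnorm.items()}

# Hypothetical candidates competing for an ambiguous stretch of speech.
priors = {"speech": 0.6, "speed": 0.3, "peach": 0.1}        # lexical priors
likelihoods = {"speech": 0.5, "speed": 0.4, "peach": 0.2}   # acoustic match

post = posterior(priors, likelihoods)
best = max(post, key=post.get)   # the most probable word given the evidence
```

On this view, "prediction" needs no activation feedback: combining priors with incoming likelihoods is itself the optimal inference step.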
21.
Abstract
Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
22. The impact of pregnancy diagnosis on depression: does it differ by self-assessment of pregnancy context? Contraception 2015. DOI: 10.1016/j.contraception.2015.06.146.
23. Celebrity Chef Recipes Compared to the Dietary Guidelines: Lessons for Americans. J Acad Nutr Diet 2015. DOI: 10.1016/j.jand.2015.06.151.
24.
Abstract
In an auditory lexical decision experiment, 5541 spoken content words and pseudowords were presented to 20 native speakers of Dutch. The words vary in phonological make-up and in number of syllables and stress pattern, and are further representative of the native Dutch vocabulary in that most are morphologically complex, comprising two stems or one stem plus derivational and inflectional suffixes, with inflections representing both regular and irregular paradigms; the pseudowords were matched in these respects to the real words. The BALDEY ("biggest auditory lexical decision experiment yet") data file includes response times and accuracy rates, together with, for each item, morphological information plus phonological and acoustic information derived from automatic phonemic segmentation of the stimuli. Two initial analyses illustrate how this data set can be used. First, we discuss several measures of the point at which a word has no further neighbours and compare the degree to which each measure predicts our lexical decision response outcomes. Second, we investigate how well four different measures of frequency of occurrence (from written corpora, spoken corpora, subtitles, and frequency ratings by 75 participants) predict the same outcomes. These analyses motivate general conclusions about the auditory lexical decision task. The (publicly available) BALDEY database lends itself to many further analyses.
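The second analysis type mentioned above, asking how well a frequency measure predicts lexical decision outcomes, can be sketched as a simple correlation between log frequency and mean response time. The items and numbers below are synthetic; the actual BALDEY file has its own field names and many more variables.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic items: log word frequency and mean lexical decision RT (ms).
# Higher-frequency words are typically responded to faster.
log_freq = [1.2, 2.5, 3.1, 0.8, 4.0, 2.2]
rt_ms    = [820, 700, 655, 870, 610, 720]

r = pearson(log_freq, rt_ms)   # expected: strongly negative
```

Comparing the magnitude of such correlations across frequency measures (written, spoken, subtitle, rated) is one way the abstract's competing predictors could be evaluated.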
25. Early word recognition and later language skills. Brain Sci 2014; 4:532-559. PMID: 25347057. PMCID: PMC4279141. DOI: 10.3390/brainsci4040532.
Abstract
Recent behavioral and electrophysiological evidence has highlighted the long-term importance for language skills of an early ability to recognize words in continuous speech. We here present further tests of this long-term link in the form of follow-up studies conducted with two (separate) groups of infants who had earlier participated in speech segmentation tasks. Each study extends prior follow-up tests: Study 1 by using a novel follow-up measure that taps into online processing, Study 2 by assessing language performance relationships over a longer time span than previously tested. Results of Study 1 show that brain correlates of speech segmentation ability at 10 months are positively related to 16-month-olds' target fixations in a looking-while-listening task. Results of Study 2 show that infant speech segmentation ability no longer directly predicts language profiles at the age of five. However, a meta-analysis across our results and those of similar studies (Study 3) reveals that age at follow-up does not moderate effect size. Together, the results suggest that infants' ability to recognize words in speech certainly benefits early vocabulary development; further observed relationships of later language skills to early word recognition may be consequent upon this vocabulary size effect.
26. Tracking perception of the sounds of English. The Journal of the Acoustical Society of America 2014; 135:2995-3006. PMID: 24815279. DOI: 10.1121/1.4870486.
Abstract
Twenty American English listeners identified gated fragments of all 2288 possible English within-word and cross-word diphones, providing a total of 538,560 phoneme categorizations. The results show orderly uptake of acoustic information in the signal and provide a view of where information about segments occurs in time. Information locus depends on each speech sound's identity and phonological features. Affricates and diphthongs have highly localized information so that listeners' perceptual accuracy rises during a confined time range. Stops and sonorants have more distributed and gradually appearing information. The identity and phonological features (e.g., vowel vs consonant) of the neighboring segment also influence when acoustic information about a segment is available. Stressed vowels are perceived significantly more accurately than unstressed vowels, but this effect is greater for lax vowels than for tense vowels or diphthongs. The dataset charts the availability of perceptual cues to segment identity across time for the full phoneme repertoire of English in all attested phonetic contexts.
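The kind of dataset described above, charting when acoustic information becomes available, amounts to tabulating identification accuracy per gate. The following is a minimal sketch of that tabulation; the trial data, gate indices, and segment labels are invented, not drawn from the actual dataset.

```python
from collections import defaultdict

def accuracy_by_gate(responses):
    """responses: iterable of (gate_index, reported, intended) tuples.
    Returns the proportion of correct identifications at each gate."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for gate, reported, intended in responses:
        total[gate] += 1
        correct[gate] += reported == intended  # bool counts as 0/1
    return {g: correct[g] / total[g] for g in sorted(total)}

# Hypothetical trials for one diphone: early gates carry little
# information about the segment; later gates disambiguate it.
trials = [
    (1, "f", "s"), (1, "s", "s"),   # gate 1: accuracy near chance
    (2, "s", "s"), (2, "s", "s"),   # gate 2: information has accrued
]
acc = accuracy_by_gate(trials)
```

Plotting such per-gate accuracy curves for each segment in each context is what yields the "information locus" picture the abstract describes.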
27.
Abstract
Listeners resolve ambiguity in speech by consulting context. Extensive research on this issue has largely relied on continua of sounds constructed to vary incrementally between two phonemic endpoints. In this study we presented listeners instead with phonetic ambiguity of a kind with which they have natural experience: varying degrees of word-final /t/-reduction. In two experiments, Dutch listeners decided whether or not the verb in a sentence such as Maar zij ren(t) soms 'But she sometimes run(s)' ended in /t/. In Dutch, presence versus absence of final /t/ distinguishes third- from first-person singular present-tense verbs. Acoustic evidence for /t/ varied from clear to absent, and immediately preceding phonetic context was consistent with more versus less likely deletion of /t/. In both experiments, listeners reported more /t/s in sentences in which /t/ would be syntactically correct. In Experiment 1, the disambiguating syntactic information preceded the target verb, as above, while in Experiment 2, it followed the verb. The syntactic bias was greater for fast than for slow responses in Experiment 1, but no such difference appeared in Experiment 2. We conclude that syntactic information does not directly influence pre-lexical processing, but is called upon in making phoneme decisions.
28. Successful Word Recognition by 10-Month-Olds Given Continuous Speech Both at Initial Exposure and Test. Infancy 2013. DOI: 10.1111/infa.12040.
29.
Abstract
Analysis of a corpus of spontaneously produced Japanese puns from a single speaker over a two-year period provides a view of how a punster selects a source word for a pun and transforms it into another word for humorous effect. The pun-making process is driven by a principle of similarity: the source word should as far as possible be preserved (in terms of segmental sequence) in the pun. This renders homophones (English example: band-banned) the pun type of choice, with part-whole relationships of embedding (cap-capture), and mutations of the source word (peas-bees) rather less favored. Similarity also governs mutations in that single-phoneme substitutions outnumber larger changes, and in phoneme substitutions, subphonemic features tend to be preserved. The process of spontaneous punning thus applies, on line, the same similarity criteria as govern explicit similarity judgments and offline decisions about pun success (e.g., for inclusion in published collections). Finally, the process of spoken-word recognition is word-play-friendly in that it involves multiple word-form activation and competition, which, coupled with known techniques in use in difficult listening conditions, enables listeners to generate most pun types as offshoots of normal listening procedures.
Collapse
|
30
|
A multimodal corpus of speech to infant and adult listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:EL534. [PMID: 25669300 DOI: 10.1121/1.4828977] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
An audio and video corpus of speech addressed to 28 11-month-olds is described. The corpus allows comparisons between adult speech directed toward infants, familiar adults, and unfamiliar adult addressees as well as of caregivers' word teaching strategies across word classes. Summary data show that infant-directed speech differed more from speech to unfamiliar than familiar adults, that word teaching strategies for nominals versus verbs and adjectives differed, that mothers mostly addressed infants with multi-word utterances, and that infants' vocabulary size was unrelated to speech rate, but correlated positively with predominance of continuous caregiver speech (not of isolated words) in the input.
Collapse
|
31
|
Lexically guided retuning of visual phonetic categories. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:562-571. [PMID: 23862831 DOI: 10.1121/1.4807814] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Listeners retune the boundaries between phonetic categories to adjust to individual speakers' productions. Lexical information, for example, indicates what an unusual sound is supposed to be, and boundary retuning then enables the speaker's sound to be included in the appropriate auditory phonetic category. This study investigated whether lexical knowledge that is known to guide the retuning of auditory phonetic categories can also retune visual phonetic categories. In Experiment 1, exposure to a visual idiosyncrasy in ambiguous audiovisually presented target words in a lexical decision task indeed resulted in retuning of the visual category boundary based on the disambiguating lexical context. In Experiment 2, it was tested whether lexical information retunes visual categories directly, or indirectly through generalization from retuned auditory phonetic categories. Here, participants were exposed to auditory-only versions of the same ambiguous target words as in Experiment 1. Auditory phonetic categories were retuned by lexical knowledge, but no shifts were observed for the visual phonetic categories. Lexical knowledge can therefore guide retuning of visual phonetic categories, but lexically guided retuning of auditory phonetic categories is not generalized to visual categories. Rather, listeners adjust auditory and visual phonetic categories to talker idiosyncrasies separately.
Collapse
|
33
|
Abstract
The ability to extract word forms from continuous speech is a prerequisite for constructing a vocabulary and emerges in the first year of life. Electrophysiological (ERP) studies of speech segmentation by 9- to 12-month-old listeners in several languages have found a left-localized negativity linked to word onset as a marker of word detection. We report an ERP study showing significant evidence of speech segmentation in Dutch-learning 7-month-olds. In contrast to the left-localized negative effect reported with older infants, the observed overall mean effect had a positive polarity. Inspection of individual results revealed two participant sub-groups: a majority showing a positive-going response, and a minority showing the left negativity observed in older age groups. We retested participants at age three, on vocabulary comprehension and word and sentence production. On every test, children who at 7 months had shown the negativity associated with segmentation of words from speech outperformed those who had produced positive-going brain responses to the same input. The earlier that infants show the left-localized brain responses typically indicating detection of words in speech, the better their early childhood language skills.
Collapse
|
34
|
Electrophysiological evidence of early word learning. Neuropsychologia 2012; 50:3702-12. [DOI: 10.1016/j.neuropsychologia.2012.10.012] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2011] [Revised: 10/10/2012] [Accepted: 10/12/2012] [Indexed: 10/27/2022]
|
35
|
Phonologically determined asymmetries in vocabulary structure across languages. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 132:EL155-EL160. [PMID: 22894315 DOI: 10.1121/1.4737596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
Studies of spoken-word recognition have revealed that competition from embedded words differs in strength as a function of where in the carrier word the embedded word is found and have further shown embedding patterns to be skewed such that embeddings in initial position in carriers outnumber embeddings in final position. Lexico-statistical analyses show that this skew is highly attenuated in Japanese, a noninflectional language. Comparison of the extent of the asymmetry in the three Germanic languages English, Dutch, and German allows the source to be traced to a combination of suffixal morphology and vowel reduction in unstressed syllables.
Collapse
|
36
|
Abstract
Infants' ability to recognize words in continuous speech is vital for building a vocabulary. We here examined the amount and type of exposure needed for 10-month-olds to recognize words. Infants first heard a word, either embedded within an utterance or in isolation, then recognition was assessed by comparing event-related potentials to this word versus a word that they had not heard directly before. Although all 10-month-olds showed recognition responses to words first heard in isolation, not all infants showed such responses to words they had first heard within an utterance. Those that did succeed in the latter, harder, task, however, understood more words and utterances when re-tested at 12 months, and understood more words and produced more words at 24 months, compared with those who had shown no such recognition response at 10 months. The ability to rapidly recognize the words in continuous utterances is clearly linked to future language development.
Collapse
|
38
|
An orthographic effect in phoneme processing, and its limitations. Front Psychol 2012; 3:18. [PMID: 22347203 PMCID: PMC3273718 DOI: 10.3389/fpsyg.2012.00018] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2011] [Accepted: 01/14/2012] [Indexed: 11/13/2022] Open
Abstract
In three phoneme goodness rating experiments, listeners heard phonetic tokens varying along a continuum centered on /s/, occurring finally in isolated word or non-word tokens. An effect of spelling appeared in Experiment 1: native English-speakers' goodness ratings for the best /s/ tokens were significantly higher in words spelled with S (e.g., bless) than in words spelled with C (e.g., voice). Since the tokens were in fact identical in each word, this effect indicates less than optimal evaluation performance. No spelling effect appeared when non-native speakers rated the same materials in Experiment 2, indicating that the observed difference could not be due to acoustic characteristics of the S- versus C-words. In Experiment 3, native English-speakers' ratings for /s/ did not differ in non-words rhyming with words consistently spelled with S (e.g., pless) or with words consistently spelled with C (e.g., floice); i.e., no effects of lexical rhyme analogs appeared. It is concluded that the findings are better explained in terms of phonemic decisions drawing upon lexical information where convenient than by obligatory influence of lexical knowledge upon pre-lexical processing.
Collapse
|
40
|
Perception of intrusive /r/ in English by native, cross-language and cross-dialect listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:1643-1652. [PMID: 21895101 DOI: 10.1121/1.3619793] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
In sequences such as law and order, speakers of British English often insert /r/ between law and and. Acoustic analyses revealed such "intrusive" /r/ to be significantly shorter than canonical /r/. In a 2AFC experiment, native listeners heard British English sentences in which /r/ duration was manipulated across a word boundary [e.g., saw (r)ice], and orthographic and semantic factors were varied. These listeners responded categorically on the basis of acoustic evidence for /r/ alone, reporting ice after short /r/s, rice after long /r/s; orthographic and semantic factors had no effect. Dutch listeners proficient in English who heard the same materials relied less on durational cues than the native listeners, and were affected by both orthography and semantic bias. American English listeners produced intermediate responses to the same materials, being sensitive to duration (less so than native, more so than Dutch listeners), and to orthography (less so than the Dutch), but insensitive to the semantic manipulation. Listeners from language communities without common use of intrusive /r/ may thus interpret intrusive /r/ as canonical /r/, with a language difference increasing this propensity more than a dialect difference. Native listeners, however, efficiently distinguish intrusive from canonical /r/ by exploiting the relevant acoustic variation.
Collapse
|
43
|
Abstract
Spoken-word recognition in a nonnative language is particularly difficult where it depends on discrimination between confusable phonemes. Four experiments here examine whether this difficulty is in part due to phantom competition from "near-words" in speech. Dutch listeners confuse English /æ/ and /ε/, which could lead to the sequence daf being interpreted as deaf, or lemp being interpreted as lamp. In auditory lexical decision, Dutch listeners indeed accepted such near-words as real English words more often than English listeners did. In cross-modal priming, near-words extracted from word or phrase contexts (daf from DAFfodil, lemp from eviL EMPire) induced activation of corresponding real words (deaf; lamp) for Dutch, but again not for English, listeners. Finally, by the end of untruncated carrier words containing embedded words or near-words (definite; daffodil) no activation of the real embedded forms (deaf in definite) remained for English or Dutch listeners, but activation of embedded near-words (deaf in daffodil) did still remain, for Dutch listeners only. Misinterpretation of the initial vowel here favoured the phantom competitor and disfavoured the carrier (lexically represented as containing a different vowel). Thus, near-words compete for recognition and continue competing for longer than actually embedded words; nonnative listening indeed involves phantom competition.
Collapse
|
46
|
Abstract
The phoneme detection task is widely used in spoken-word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realized. Listeners detected the target sounds [b, m, t, f, s, k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b, m, t], which have consistent word-initial spelling, than to the targets [f, s, k], which are inconsistently spelled, but only when spelling was rendered salient by the presence in the experiment of many irregularly spelled filler words. Within the inconsistent targets [f, s, k], there was no significant difference between responses to targets in words with more usual (foam, seed, cattle) versus less usual (phone, cede, kettle) spellings. Phoneme detection is thus not necessarily sensitive to orthographic effects; knowledge of spelling stored in the lexical representations of words does not automatically become available as word candidates are activated. However, salient orthographic manipulations in experimental input can induce such sensitivity. We attribute this to listeners' experience of the value of spelling in everyday situations that encourage phonemic decisions (such as learning new names).
Collapse
|
47
|
Prosodic Structure in Early Word Segmentation: ERP Evidence From Dutch Ten-Month-Olds. INFANCY 2009; 14:591-612. [DOI: 10.1080/15250000903263957] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
48
|
Cross-language differences in cue use for speech segmentation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:367-76. [PMID: 19603893 PMCID: PMC2723901 DOI: 10.1121/1.3129127] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/17/2008] [Revised: 03/24/2009] [Accepted: 04/13/2009] [Indexed: 05/19/2023]
Abstract
Two artificial-language learning experiments directly compared English, French, and Dutch listeners' use of suprasegmental cues for continuous-speech segmentation. In both experiments, listeners heard unbroken sequences of consonant-vowel syllables, composed of recurring three- and four-syllable "words." These words were demarcated by (a) no cue other than transitional probabilities induced by their recurrence, (b) a consistent left-edge cue, or (c) a consistent right-edge cue. Experiment 1 examined a vowel lengthening cue. All three listener groups benefited from this cue in right-edge position; none benefited from it in left-edge position. Experiment 2 examined a pitch-movement cue. English listeners used this cue in left-edge position, French listeners used it in right-edge position, and Dutch listeners used it in both positions. These findings are interpreted as evidence of both language-universal and language-specific effects. Final lengthening is a language-universal effect expressing a more general (non-linguistic) mechanism. Pitch movement expresses prominence which has characteristically different placements across languages: typically at right edges in French, but at left edges in English and Dutch. Finally, stress realization in English versus Dutch encourages greater attention to suprasegmental variation by Dutch than by English listeners, allowing Dutch listeners to benefit from an informative pitch-movement cue even in an uncharacteristic position.
Collapse
|
49
|
Greater sensitivity to prosodic goodness in non-native than in native listeners (L). THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 125:3522-3525. [PMID: 19507933 DOI: 10.1121/1.3117434] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
English listeners largely disregard suprasegmental cues to stress in recognizing words. Evidence for this includes the demonstration of Fear et al. [J. Acoust. Soc. Am. 97, 1893-1904 (1995)] that cross-splicings are tolerated between stressed and unstressed full vowels (e.g., au- of autumn, automata). Dutch listeners, however, do exploit suprasegmental stress cues in recognizing native-language words. In this study, Dutch listeners were presented with English materials from the study of Fear et al. Acceptability ratings by these listeners revealed sensitivity to suprasegmental mismatch, in particular, in replacements of unstressed full vowels by higher-stressed vowels, thus evincing greater sensitivity to prosodic goodness than had been shown by the original native listener group.
Collapse
|
50
|
Vowel devoicing and the perception of spoken Japanese words. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 125:1693-1703. [PMID: 19275326 DOI: 10.1121/1.3075556] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Three experiments, in which Japanese listeners detected Japanese words embedded in nonsense sequences, examined the perceptual consequences of vowel devoicing in that language. Since vowelless sequences disrupt speech segmentation [Norris et al. (1997). Cognit. Psychol. 34, 191-243], devoicing is potentially problematic for perception. Words in initial position in nonsense sequences were detected more easily when followed by a sequence containing a vowel than by a vowelless segment (with or without further context), and vowelless segments that were potential devoicing environments were no easier than those not allowing devoicing. Thus asa, "morning," was easier in asau or asazu than in all of asap, asapdo, asaf, or asafte, despite the fact that the /f/ in the latter two is a possible realization of fu, with devoiced [u]. Japanese listeners thus do not treat devoicing contexts as if they always contain vowels. Words in final position in nonsense sequences, however, produced a different pattern: here, preceding vowelless contexts allowing devoicing impeded word detection less strongly (so, sake was detected less accurately, but not less rapidly, in nyaksake, possibly arising from nyakusake, than in nyagusake). This is consistent with listeners treating consonant sequences as potential realizations of parts of existing lexical candidates wherever possible.
Collapse
|