1
Jasmin K, Tierney A, Obasih C, Holt L. Short-term perceptual reweighting in suprasegmental categorization. Psychon Bull Rev 2023; 30:373-382. [PMID: 35915382 PMCID: PMC9971089 DOI: 10.3758/s13423-022-02146-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 07/05/2022] [Indexed: 11/08/2022]
Abstract
Segmental speech units such as phonemes are described as multidimensional categories whose perception involves contributions from multiple acoustic input dimensions, and the relative perceptual weights of these dimensions respond dynamically to context. For example, when speech is altered to create an "accent" in which two acoustic dimensions are correlated in a manner opposite that of long-term experience, the dimension that carries less perceptual weight is down-weighted to contribute less in category decisions. It remains unclear, however, whether this short-term reweighting extends to perception of suprasegmental features that span multiple phonemes, syllables, or words, in part because it has remained debatable whether suprasegmental features are perceived categorically. Here, we investigated the relative contribution of two acoustic dimensions to word emphasis. Participants categorized instances of a two-word phrase pronounced with typical covariation of fundamental frequency (F0) and duration, and in the context of an artificial "accent" in which F0 and duration (established in prior research on English speech as "primary" and "secondary" dimensions, respectively) covaried atypically. When categorizing "accented" speech, listeners rapidly down-weighted the secondary dimension (duration). This result indicates that listeners continually track short-term regularities across speech input and dynamically adjust the weight of acoustic evidence for suprasegmental decisions. Thus, dimension-based statistical learning appears to be a widespread phenomenon in speech perception extending to both segmental and suprasegmental categorization.
Affiliation(s)
- Kyle Jasmin
- Department of Psychology, Wolfson Building, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK.
- Lori Holt
- Carnegie Mellon University, Pittsburgh, PA, USA
2
Word Order, Intonation, and Prosodic Phrasing: Individual Differences in the Production and Identification of Narrow and Wide Focus in Urdu. LANGUAGES 2022. [DOI: 10.3390/languages7020103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
This study investigates speaker-based variation in the use of word order and intonation to mark narrow and wide focus in Urdu. The identification of focus type and position, as well as the prosodic phrasing of declarative sentences produced in the target focus conditions, is also discussed. The results of a semi-spontaneous production experiment indicated no preference for a linear position, as the focused nouns were mostly placed in situ (89%). The analysis of phonetic cues showed significant inter- and intraspeaker variation in participants’ use of longer noun duration, higher F0 peak, and wider F0 range in the narrowly focused nouns, as compared with their counterparts produced in wide focus. In the identification survey conducted online, the consistent use of phonetic cues in speech production was found to influence the correct identification of narrow focus and the position of focused nouns. Another online survey, concerning the prosodic phrasing of sentences produced in narrow and wide focus, showed participants’ slight preference for a recursive Intonational Phrase boundary on the left edge of the narrowly focused nouns. The results of both surveys show that Urdu speakers vary in their identification of focus as well as their choice of prosodic phrasing in the target contexts. This research highlights the role of individual variation in the use of word order and phonetic cues to mark narrow and wide focus in Urdu. It also illustrates that the identification of focus type and phrasing is far from uniform. These findings have implications for the analysis of intonation in general: the production and identification of intonation and prosodic phrasing are not invariant, and speakers and listeners differ in their use of the available linguistic means (word order vs. intonational categories) and in their selection and manipulation of phonetic cues.
3
Jasmin K, Dick F, Tierney AT. The Multidimensional Battery of Prosody Perception (MBOPP). Wellcome Open Res 2021; 5:4. [PMID: 35282675 PMCID: PMC8881696 DOI: 10.12688/wellcomeopenres.15607.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 09/20/2021] [Indexed: 11/20/2022]
Abstract
Prosody can be defined as the rhythm and intonation patterns spanning words, phrases and sentences. Accurate perception of prosody is an important component of many aspects of language processing, such as parsing grammatical structures, recognizing words, and determining where emphasis may be placed. Prosody perception is important for language acquisition and can be impaired in language-related developmental disorders. However, existing assessments of prosodic perception suffer from some shortcomings. These include being unsuitable for use with typically developing adults due to ceiling effects and failing to allow the investigator to distinguish the unique contributions of individual acoustic features such as pitch and temporal cues. Here we present the Multi-Dimensional Battery of Prosody Perception (MBOPP), a novel tool for the assessment of prosody perception. It consists of two subtests: Linguistic Focus, which measures the ability to hear emphasis or sentential stress, and Phrase Boundaries, which measures the ability to hear where in a compound sentence one phrase ends, and another begins. Perception of individual acoustic dimensions (Pitch and Duration) can be examined separately, and test difficulty can be precisely calibrated by the experimenter because stimuli were created using a continuous voice morph space. We present validation analyses from a sample of 59 individuals and discuss how the battery might be deployed to examine perception of prosody in various populations.
Affiliation(s)
- Kyle Jasmin
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Frederic Dick
- Psychological Sciences, Birkbeck University of London, London, WC1E 7HX, UK
4
Jasmin K, Dick F, Tierney AT. The Multidimensional Battery of Prosody Perception (MBOPP). Wellcome Open Res 2021; 5:4. [PMID: 35282675 PMCID: PMC8881696 DOI: 10.12688/wellcomeopenres.15607.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 09/20/2021] [Indexed: 09/03/2023]
5
Lee A, Nyland J, Peppé S. Irish English PEPS-C (2015 edition) and Learners of ESL. Folia Phoniatr Logop 2021; 73:527-536. [PMID: 33486498 DOI: 10.1159/000513082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2020] [Accepted: 11/13/2020] [Indexed: 11/19/2022]
Abstract
INTRODUCTION Profiling Elements of Prosody in Speech-Communication (PEPS-C) is a computerised test for assessing the abilities to understand and use speech prosody in communication. It has been used to obtain a profile of strengths and weaknesses in different prosodic forms and functions for different clinical populations (e.g., autism spectrum disorders, Down syndrome) and second language learners. The 2015 edition of PEPS-C incorporates four new subtests addressing the understanding and production of lexical stress and phrasal stress, and collapses four form subtests (Intonation/Short-Item Input and Output, Prosody/Long-Item Input and Output) into two (Discrimination, Imitation). However, the suitability of these new tasks has not been reported in any published studies, although they are likely to be relevant for learners of English. Moreover, the present authors updated the Irish English (IE) version of PEPS-C to the 2015 edition for another research project on prosodic skills in children with spina bifida. Hence, this paper reports the making of PEPS-C 2015 (IE) and examines the usefulness of this test in assessing a group of non-native speakers of English, as compared to a group of native speakers of IE. METHODS PEPS-C 2015 for Irish English was developed by adapting the test items of the UK English version of the test. The PEPS-C 2015 (IE) was then trialled on 25 native speakers of Irish English and 10 Spanish speakers who speak English as a second language (ESL). RESULTS All native speakers of Irish English showed competence (scored ≥75% correct) in the comprehension and expression of prosodic form and functions assessed. For the ESL speakers, the test identified two areas of possible difficulty: Phrase Stress Comprehension and Expression, and Contrastive Stress Comprehension. CONCLUSION The PEPS-C 2015, with its extra stress tasks, might therefore be useful as a prosody assessment tool for ESL speakers, particularly those with a Romance first language or at an early stage of learning, but further research is needed.
Affiliation(s)
- Alice Lee
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
- Joseph Nyland
- Department of Speech and Hearing Sciences, University College Cork, Cork, Ireland
6
Yuen I, Xu Rattanasone N, Schmidt E, Macdonald G, Holt R, Demuth K. Five-year-olds produce prosodic cues to distinguish compounds from lists in Australian English. JOURNAL OF CHILD LANGUAGE 2021; 48:110-128. [PMID: 32398184 DOI: 10.1017/s0305000920000227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Although previous research has indicated that five-year-olds can use acoustic cues to disambiguate compounds (N1 + N2) from lists (N1, N2) (e.g., 'ice-cream' vs. 'ice, cream') (Yoshida & Katz, 2004, 2006), their productions are not yet fully adult-like (Wells, Peppé & Goulandris, 2004). The goal of this study was to examine this issue in Australian English-speaking children, with a focus on their use of F0, word duration, and pauses. Twenty-four five-year-olds and 20 adults participated in an elicited production experiment. Like adults, children produced distinct F0 patterns for the two structures. They also used longer word durations and more pauses in lists compared to compounds, indicating the presence of a boundary in lists. However, unlike adults, they also inappropriately inserted more pauses within the compound, suggesting the presence of a boundary in compounds as well. The implications for understanding children's developing knowledge of how to map acoustic cues to prosodic structures are discussed.
Affiliation(s)
- Ivan Yuen
- Macquarie University, Department of Linguistics, 16 University Avenue, Macquarie University, North Ryde, Sydney, NSW2109, Australia
- Nan Xu Rattanasone
- Macquarie University, Department of Linguistics, 16 University Avenue, Macquarie University, North Ryde, Sydney, NSW2109, Australia
- Rebecca Holt
- Macquarie University, Department of Linguistics, 16 University Avenue, Macquarie University, North Ryde, Sydney, NSW2109, Australia
- Katherine Demuth
- Macquarie University, Department of Linguistics, 16 University Avenue, Macquarie University, North Ryde, Sydney, NSW2109, Australia
7
Roettger TB, Rimland K. Listeners' adaptation to unreliable intonation is speaker-sensitive. Cognition 2020; 204:104372. [DOI: 10.1016/j.cognition.2020.104372] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/18/2019] [Revised: 05/19/2020] [Accepted: 06/05/2020] [Indexed: 11/24/2022]
8
Roettger TB, Franke M. Evidential Strength of Intonational Cues and Rational Adaptation to (Un‐)Reliable Intonation. Cogn Sci 2019; 43:e12745. [DOI: 10.1111/cogs.12745] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/08/2017] [Revised: 04/11/2019] [Accepted: 04/15/2019] [Indexed: 11/28/2022]
Affiliation(s)
- Timo B. Roettger
- Department of Linguistics, Northwestern University & University of Cologne
9
A hierarchical linguistic information-based model of English prosody: L2 data analysis and implications for computer-assisted language learning. COMPUT SPEECH LANG 2018. [DOI: 10.1016/j.csl.2018.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
10
Kalathottukaren RT, Purdy R, McCormick SC, Ballard E. Behavioral Measures to Evaluate Prosodic Skills: A Review of Assessment Tools for Children and Adults. Contemporary Issues in Communication Science and Disorders 2015. [DOI: 10.1044/cicsd_42_s_138] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
11
Patel R, Niziolek C, Reilly K, Guenther FH. Prosodic adaptations to pitch perturbation in running speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2011; 54:1051-9. [PMID: 21173388 PMCID: PMC3352853 DOI: 10.1044/1092-4388(2010/10-0162)] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 05/13/2023]
Abstract
PURPOSE A feedback perturbation paradigm was used to investigate whether prosodic cues are controlled independently or in an integrated fashion during sentence production. METHOD Twenty-one healthy speakers of American English were asked to produce sentences with emphatic stress while receiving real-time auditory feedback of their productions. The fundamental frequency (F0) of the stressed word in each 4-word sentence was selectively shifted in a sensorimotor adaptation protocol. Speakers experienced either an upward or a downward shift of the stressed word, which gradually altered the perceived stress of the sentence. RESULTS Participants in the Up and Down groups adapted to F0 shifts by altering the contrast between stressed and unstressed words differentially, such that the two groups deviated from each other in the perturbation phase. Furthermore, selective F0 perturbation in sentences with emphatic stress resulted in compensatory changes in both F0 and intensity. CONCLUSIONS Present findings suggest that F0 and intensity are controlled in an integrated fashion to maintain the contrast between stressed and unstressed words. When a cue is impaired through perturbation, speakers not only oppose the perturbation but enhance other prosodic cues to achieve emphatic stress.
12
Patel R, Brayton JT. Identifying prosodic contrasts in utterances produced by 4-, 7-, and 11-year-old children. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2009; 52:790-801. [PMID: 18978213 DOI: 10.1044/1092-4388(2008/07-0137)] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
PURPOSE Acquisition of prosodic control appears to evolve across development with younger children relying on durational cues and older children utilizing a broader spectrum of cues including fundamental frequency, intensity, and duration. This study aimed to determine whether unfamiliar listeners could identify prosodic contrasts produced by 4-, 7-, and 11-year-olds despite differences in acoustic cues used by each age group. METHOD Thirty-six adult monolingual speakers of American English participated as listeners. A previous study yielded speech recordings from 12 children (2 male, 2 female from each age group) producing 2 linguistic contrasts, question-statement and contrastive stress, which served as listening stimuli. RESULTS In both tasks, listener accuracy ranged from 39.7% to 100% with significant differences between 4-year-olds and both older age groups. Listeners had difficulty deciphering the 4-year-olds' questions compared with statements and were more accurate in identifying contrastive stress placed on sentence-initial words compared with sentence-final words across all age groups. CONCLUSION Although listeners identified prosodic contrasts produced by all 3 age groups, accuracy was significantly higher for 7- and 11-year-old productions. Findings are consistent with production studies that suggest relative stabilization of prosodic control between ages 4 and 7. Parallels between prosodic and segmental acquisition are discussed.
Affiliation(s)
- Rupal Patel
- Department of Speech Language Pathology and Audiology, Northeastern University, 106 Forsyth Building, Boston, MA 02115, USA.
13
Patel R, Campellone P. Acoustic and perceptual cues to contrastive stress in dysarthria. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2009; 52:206-222. [PMID: 18695019 DOI: 10.1044/1092-4388(2008/07-0078)] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 05/26/2023]
Abstract
PURPOSE In this study, the authors sought to understand acoustic and perceptual cues to contrastive stress in speakers with dysarthria (DYS) and healthy controls (HC). METHOD The production experiment examined the ability of 12 DYS (9 male, 3 female; M=39 years of age) and 12 age- and gender-matched HC (9 male, 3 female; M=37.5 years of age) to signal contrastive stress within short sentences. Acoustic changes in fundamental frequency (F0), intensity, and duration were studied. The perceptual experiment explored whether 48 unfamiliar listeners (24 male, 24 female; M=23.4 years of age) could identify the intended stress location in DYS and HC productions. RESULTS Although both speaker groups used all 3 prosodic cues, DYS relied more heavily on duration. Despite reduced F0 and intensity variation within DYS utterances, listeners were highly accurate at identifying both DYS (>93%) and HC (>97%) productions. Acoustic predictors of listener accuracy included heightened prosodic cues on stressed words along with marked decreases in these variables for neighboring nonstressed words. CONCLUSIONS Speakers signaled contrastive stress using relative changes in one or more prosodic cues. Although individual speakers employed different cue combinations, listeners were highly adept at discerning the intended stress location. The communicative potential of prosody in speakers with congenital dysarthria is discussed.
Affiliation(s)
- Rupal Patel
- Bouvé College of Health Sciences, Department of Speech-Language Pathology and Audiology, Northeastern University, 360 Huntington Avenue, Room 102 Forsyth Building, Boston, MA 02115, USA.
14
Wells B, Peppé S. Intonation abilities of children with speech and language impairments. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2003; 46:5-20. [PMID: 12647884 DOI: 10.1044/1092-4388(2003/001)] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.4] [Indexed: 05/24/2023]
Abstract
Intonation has been little studied in children with speech and language impairments, although deficits in related aspects of prosody have been hypothesized to underlie specific language impairment. In this study a new intonation battery, the Profiling Elements of Prosodic Systems-Child version (PEPS-C), was administered to 18 children with speech and/or language impairments (LI). PEPS-C comprises 16 tasks (8 x 2, Input x Output) tapping phonetic and functional aspects of intonation in four areas: grammar, affect, interaction, and pragmatics. Scores were compared to a chronological age (CA) matched group of 28 children and a group of 18 children matched for language comprehension (LC). Measures of language comprehension, expressive language, nonverbal intelligence, and segmental phonology were also taken. The LI group did not score significantly below the LC group on any PEPS-C task. On 5 of 16 tasks, the LI group scored significantly lower than the CA group. In the LI group, there were just 2 significant correlations between a PEPS-C task and 1 of the nonprosodic measures. The results support the view that intonation is relatively discrete from other levels of speech and language while suggesting some specific areas of possible vulnerability: auditory memory for longer prosodic strings and the use of prosody for pragmatic/interactional purposes.
Affiliation(s)
- Bill Wells
- Department of Human Communication Sciences, University of Sheffield, Sheffield, UK.