1
Cristia A, Gautheron L, Zhang Z, Schuller B, Scaff C, Rowland C, Räsänen O, Peurey L, Lavechin M, Havard W, Fausey CM, Cychosz M, Bergelson E, Anderson H, Al Futaisi N, Soderstrom M. Establishing the reliability of metrics extracted from long-form recordings using LENA and the ACLEW pipeline. Behav Res Methods 2024; 56:8588-8607. [PMID: 39304601 DOI: 10.3758/s13428-024-02493-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/31/2024] [Indexed: 09/22/2024]
Abstract
Long-form audio recordings are increasingly used to study individual variation, group differences, and many other topics in theoretical and applied fields of developmental science, particularly for the description of children's language input (typically speech from adults) and children's language output (ranging from babble to sentences). The proprietary LENA software has been available for over a decade, and with it, users have come to rely on derived metrics like adult word count (AWC) and child vocalization counts (CVC), which have also more recently been derived using an open-source alternative, the ACLEW pipeline. Yet, there is relatively little work assessing the reliability of long-form metrics in terms of the stability of individual differences across time. Filling this gap, we analyzed eight spoken-language datasets: four from North American English-learning infants, and one each from British English-, French-, American English-/Spanish-, and Quechua-/Spanish-learning infants. The audio data were analyzed using two types of processing software: LENA and the ACLEW open-source pipeline. When all corpora were included, we found relatively low to moderate reliability (across multiple recordings, the intraclass correlation coefficient attributed to child identity [Child ICC] was < 50% for most metrics). There were few differences between the two pipelines. Exploratory analyses suggested some differences as a function of child age and corpora. These findings suggest that, while reliability is likely sufficient for various group-level analyses, caution is needed when using either LENA or ACLEW tools to study individual variation. We also encourage improvement of extant tools, specifically targeting accurate measurement of individual variation.
Affiliation(s)
- Alejandrina Cristia
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
- Lucas Gautheron
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
  - Interdisciplinary Centre for Science and Technology Studies (IZWT) Wuppertal, University of Wuppertal, Nordrhein-Westfalen, Germany
- Zixing Zhang
  - School of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China
- Björn Schuller
  - Technische Universität München, Institute for Human-Machine Communication, Munich, Germany
  - Imperial College London, GLAM - Group on Language, Audio, & Music, London, UK
- Camila Scaff
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
  - Human Ecology group, Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland
- Caroline Rowland
  - Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Okko Räsänen
  - Unit of Computing Sciences, Tampere University, Tampere, Finland
- Loann Peurey
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
- Marvin Lavechin
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
- William Havard
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'Etudes cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
  - LLL, Université d'Orléans, CNRS, Orléans, France
- Margaret Cychosz
  - Department of Hearing and Speech Sciences, University of Maryland at College Park, College Park, MD, USA
- Najla Al Futaisi
  - Imperial College London, GLAM - Group on Language, Audio, & Music, London, UK
2
Ferjan Ramírez N, Marjanovič Umek L, Fekonja U. Language environment and early language production in Slovenian infants: An exploratory study using daylong recordings. Infancy 2024; 29:811-837. [PMID: 39044327 DOI: 10.1111/infa.12615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 05/20/2024] [Accepted: 06/02/2024] [Indexed: 07/25/2024]
Abstract
Daylong recordings provide an ecologically valid option for analyzing language input, and have become a central method for studying child language development. However, the vast majority of this work has been conducted in North America. We harnessed a unique collection of daylong recordings from Slovenian infants (age: 16-30 months, N = 40, 18 girls), and focus our attention on manually annotated measures of parentese (infant-directed speech with a higher pitch, slower tempo, and exaggerated intonation), conversational turns, infant words, and word combinations. Measures from daylong recordings showed large variation, but were comparable to previous studies with North American samples. Infants heard almost twice as much speech and parentese from mothers compared to fathers, but there were no differences in language input to boys and girls. Positive associations were found between the social-interactional features of language input (parentese, turn-taking) and infants' concurrent language production. Measures of child speech from daylong recordings were positively correlated with measures obtained through the Slovenian MacArthur-Bates Communicative Development Inventory. These results support the notion that the social-interactional features of parental language input are the foundation of infants' language skills, even in an environment where infants spend much of their waking hours in childcare settings, as they do in Slovenia.
Affiliation(s)
- Naja Ferjan Ramírez
  - Department of Linguistics, University of Washington, Seattle, Washington, USA
- Urška Fekonja
  - Department of Psychology, Faculty of Arts, University of Ljubljana, Ljubljana, Slovenia
3
Kosie JE, Lew-Williams C. Infant-directed communication: Examining the many dimensions of everyday caregiver-infant interactions. Dev Sci 2024; 27:e13515. [PMID: 38618899 PMCID: PMC11333185 DOI: 10.1111/desc.13515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 03/22/2024] [Accepted: 03/30/2024] [Indexed: 04/16/2024]
Abstract
Everyday caregiver-infant interactions are dynamic and multidimensional. However, existing research underestimates the dimensionality of infants' experiences, often focusing on one or two communicative signals (e.g., speech alone, or speech and gesture together). Here, we introduce "infant-directed communication" (IDC): the suite of communicative signals from caregivers to infants including speech, action, gesture, emotion, and touch. We recorded 10 min of at-home play between 44 caregivers and their 18- to 24-month-old infants from predominantly white, middle-class, English-speaking families in the United States. Interactions were coded for five dimensions of IDC as well as infants' gestures and vocalizations. Most caregivers used all five dimensions of IDC throughout the interaction, and these dimensions frequently overlapped. For example, over 60% of the speech that infants heard was accompanied by one or more non-verbal communicative cues. However, we saw marked variation across caregivers in their use of IDC, likely reflecting tailored communication to the behaviors and abilities of their infant. Moreover, caregivers systematically increased the dimensionality of IDC, using more overlapping cues in response to infant gestures and vocalizations, and more IDC with infants who had smaller vocabularies. Understanding how and when caregivers use all five signals, together and separately, in interactions with infants has the potential to redefine how developmental scientists conceive of infants' communicative environments, and enhance our understanding of the relations between caregiver input and early learning.
RESEARCH HIGHLIGHTS:
- Infants' everyday interactions with caregivers are dynamic and multimodal, but existing research has underestimated the multidimensionality (i.e., the diversity of simultaneously occurring communicative cues) inherent in infant-directed communication.
- Over 60% of the speech that infants encounter during at-home, free play interactions overlaps with one or more of a variety of non-speech communicative cues.
- The multidimensionality of caregivers' communicative cues increases in response to infants' gestures and vocalizations, providing new information about how infants' own behaviors shape their input.
- These findings emphasize the importance of understanding how caregivers use a diverse set of communicative behaviors, both separately and together, during everyday interactions with infants.
Affiliation(s)
- Jessica E Kosie
  - Department of Psychology, Princeton University, Princeton, New Jersey, USA
  - School of Social and Behavioral Sciences, Arizona State University, Phoenix, Arizona, USA
- Casey Lew-Williams
  - Department of Psychology, Princeton University, Princeton, New Jersey, USA
4
Ferjan Ramírez N, Hippe DS. Estimating infants' language exposure: A comparison of random and volume sampling from daylong recordings collected in a bilingual community. Infant Behav Dev 2024; 75:101943. [PMID: 38537574 DOI: 10.1016/j.infbeh.2024.101943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 02/26/2024] [Accepted: 03/13/2024] [Indexed: 06/11/2024]
Abstract
In North America, the characteristics of a child's language environment predict language outcomes. For example, differences in bilingual language exposure, exposure to electronic media, and exposure to child-directed speech (CDS) relate to children's language growth. Recently, these predictors have been studied through the use of daylong recordings, followed by manual annotation of audio samples selected from these recordings. Using a dataset of daylong recordings collected from bilingually raised infants in the United States as an example, we ask whether two of the most commonly used sampling methods, random sampling and sampling based on high adult speech, differ from each other with regard to estimating the frequencies of specific language behaviors. Daylong recordings from 37 Spanish-English speaking families with infants between 4 and 22 months of age were analyzed. From each child's recording, samples were extracted in two ways (at random/based on high adult speech) and then annotated for Language (Spanish/English/Mixed), CDS, Electronic Media, Social Context, Turn-Taking, and Infant Babbling. Correlation and agreement analyses were performed, in addition to paired sample t-tests, to assess how the choice of one or the other sampling method may affect the estimates. For most behaviors studied, correlation and agreement between the two sampling methods were high (Pearson r values between 0.79 and 0.99 for 16 of 17 measures; Intraclass Correlation Coefficient values between 0.78 and 0.99 for 13 of 17 measures). However, interesting between-sample differences also emerged: the degree of language mixing, the amount of CDS, and the number of conversational turns were all significantly higher when sampling was performed based on high adult speech compared to random sampling. By contrast, the presence of electronic media and one-on-one social contexts was higher when sampling was performed at random. We discuss advantages of choosing one sampling technique over the other, depending on the research question and variables at hand.
Affiliation(s)
- Daniel S Hippe
  - Clinical Research Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
5
Katus L, Crespo-Llado MM, Milosavljevic B, Saidykhan M, Njie O, Fadera T, McCann S, Acolatse L, Perapoch Amadó M, Rozhko M, Moore SE, Elwell CE, Lloyd-Fox S. It takes a village: Caregiver diversity and language contingency in the UK and rural Gambia. Infant Behav Dev 2024; 74:101913. [PMID: 38056188 DOI: 10.1016/j.infbeh.2023.101913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 09/28/2023] [Accepted: 11/26/2023] [Indexed: 12/08/2023]
Abstract
INTRODUCTION: There is substantial diversity within and between contexts globally in caregiving practices and family composition, which may have implications for the early interactions infants engage in. We draw on data from the Brain Imaging for Global Health (BRIGHT, www.globalfnirs.org/the-bright-project) project, which longitudinally examined infants in the UK and in rural Gambia, West Africa. In The Gambia, households are commonly characterized by multigenerational, frequently polygamous family structures, which, in part, is reflected in the diversity of caregivers a child spends time with. In this paper, we aim to 1) evaluate and validate the Language Environment Analysis (LENA) for use in the Mandinka-speaking families in The Gambia, 2) examine the nature (i.e., prevalence of turn taking) and amount (i.e., adult and child vocalizations) of conversation that infants are exposed to from 12 to 24 months of age and 3) investigate the link between caregiver diversity and child language outcomes, examining the mediating role of contingent turn taking.
METHOD: We obtained naturalistic seven-hour-long LENA recordings at 12, 18 and 24 months of age from a cohort of N = 204 infants from Mandinka-speaking households in The Gambia and N = 61 infants in the UK. We examined developmental changes and site differences in LENA counts of adult word counts (AWC), contingent turn taking (CTT) and child vocalizations (CVC). In the larger and more heterogeneous Gambian sample, we also investigated caregiver predictors of turn taking frequency. Specifically, we examined the number of caregivers present over the recording day and the consistency of caregivers across two subsequent days per age point. We controlled for children's cognitive development via the Mullen Scales of Early Learning (MSEL).
RESULTS: Our LENA validation showed high internal consistency between the human coders and automated LENA outputs (Cronbach's alphas all > .8). All LENA counts were higher in the UK compared to the Gambian cohort. In The Gambia, controlling for overall neurodevelopment via the MSEL, CTT at 12 and 18 months predicted CVC at 18 and 24 months. Caregiver consistency was associated with CTT counts at 18 and 24 months. The number of caregivers and CTT counts showed an inverted u-shape relationship at 18 and 24 months, with an intermediate number of caregivers being associated with the highest CTT frequencies. Mediation analyses showed a partial mediation by number of caregivers and CTT and 24-month CVC.
DISCUSSION: The LENA provided reliable estimates for the Mandinka language in the home recording context. We showed that turn taking is associated with subsequent child vocalizations and explored contextual caregiving factors contributing to turn taking in the Gambian cohort.
Affiliation(s)
- Laura Katus
  - Institute for Lifecourse Development, School of Human Sciences, University of Greenwich, UK; Centre for Family Research, University of Cambridge, UK
- Bosiljka Milosavljevic
  - Department of Biological and Experimental Psychology, Queen Mary University of London, UK
- Mariama Saidykhan
  - Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, UK
- Omar Njie
  - Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, UK
- Tijan Fadera
  - Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, UK
- Samantha McCann
  - Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, UK; Department of Women and Children's Health, King's College London, UK
- Lena Acolatse
  - Nutrition Innovation Centre for Food and Health (NICHE), School of Biomedical Sciences, Ulster University, UK
- Maria Rozhko
  - Department of Psychology, University of Cambridge, UK
- Sophie E Moore
  - Medical Research Council Unit The Gambia at London School of Hygiene and Tropical Medicine, UK; Department of Women and Children's Health, King's College London, UK
- Clare E Elwell
  - Department of Medical Physics and Biomedical Engineering, UK
6
Berger SE, Baria AT. Assessing Pain Research: A Narrative Review of Emerging Pain Methods, Their Technosocial Implications, and Opportunities for Multidisciplinary Approaches. Frontiers in Pain Research 2022; 3:896276. [PMID: 35721658 PMCID: PMC9201034 DOI: 10.3389/fpain.2022.896276] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/14/2022] [Accepted: 05/12/2022] [Indexed: 11/13/2022] [Open access]
Abstract
Pain research traverses many disciplines and methodologies. Yet, despite our understanding and field-wide acceptance of the multifactorial essence of pain as a sensory perception, emotional experience, and biopsychosocial condition, pain scientists and practitioners often remain siloed within their domain expertise and associated techniques. The context in which the field finds itself today (with increasing reliance on digital technologies, an ongoing pandemic, and continued disparities in pain care) requires new collaborations and different approaches to measuring pain. Here, we review the state-of-the-art in human pain research, summarizing emerging practices and cutting-edge techniques across multiple methods and technologies. For each, we outline foreseeable technosocial considerations, reflecting on implications for standards of care, pain management, research, and societal impact. Through overviewing alternative data sources and varied ways of measuring pain and by reflecting on the concerns, limitations, and challenges facing the field, we hope to create critical dialogues, inspire more collaborations, and foster new ideas for future pain research methods.
Affiliation(s)
- Sara E. Berger
  - Responsible and Inclusive Technologies Research, Exploratory Sciences Division, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, United States
7
Cychosz M, Cristia A. Using big data from long-form recordings to study development and optimize societal impact. Advances in Child Development and Behavior 2022; 62:1-36. [PMID: 35249679 DOI: 10.1016/bs.acdb.2021.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Big data are everywhere. In this chapter, we focus on one source: long-form, child-centered recordings collected using wearable technologies. Because these recordings are simultaneously unobtrusive and encompassing, they may be a breakthrough technology for clinicians and researchers from several diverse fields. We demonstrate this possibility by outlining three applications for the recordings (clinical treatment, large-scale interventions, and language documentation) where we see the greatest potential. We argue that incorporating these recordings into basic and applied research will result in more equitable treatment of patients, more reliable measurements of the effects of interventions on real-world behavior, and deeper scientific insights with less observational bias. We conclude by outlining a proposal for a semistructured online platform where vast numbers of long-form recordings could be hosted and more representative, less biased algorithms could be trained.
Affiliation(s)
- Margaret Cychosz
  - Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, United States; Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park, MD, United States
- Alejandrina Cristia
  - Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'études cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
8
Gautheron L, Rochat N, Cristia A. Managing, storing, and sharing long-form recordings and their annotations. Lang Resour Eval 2022. [DOI: 10.1007/s10579-022-09579-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
9
Warlaumont AS, Sobowale K, Fausey CM. Daylong Mobile Audio Recordings Reveal Multitimescale Dynamics in Infants' Vocal Productions and Auditory Experiences. Current Directions in Psychological Science 2022; 31:12-19. [PMID: 35707791 PMCID: PMC9197087 DOI: 10.1177/09637214211058166] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 09/20/2023]
Abstract
The sounds of human infancy (baby babbling, adult talking, lullaby singing, and more) fluctuate over time. Infant-friendly wearable audio recorders can now capture very large quantities of these sounds throughout infants' everyday lives at home. Here, we review recent discoveries about how infants' soundscapes are organized over the course of a day based on analyses designed to detect patterns at multiple timescales. Analyses of infants' day-long audio have revealed that everyday vocalizations are clustered hierarchically in time, vocal explorations are consistent with foraging dynamics, and musical tunes are distributed such that some are much more available than others. This approach focusing on the multi-scale distributions of sounds heard and produced by infants provides new, fundamental insights on human communication development from a complex systems perspective.
Affiliation(s)
- Kunmi Sobowale
  - Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles
10
Meylan SC, Bergelson E. Learning Through Processing: Toward an Integrated Approach to Early Word Learning. Annual Review of Linguistics 2021; 8:77-99. [PMID: 35481110 PMCID: PMC9037961 DOI: 10.1146/annurev-linguistics-031220-011146] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/14/2023]
Abstract
Children's linguistic knowledge and the learning mechanisms by which they acquire it grow substantially in infancy and toddlerhood, yet theories of word learning largely fail to incorporate these shifts. Moreover, researchers' often-siloed focus on either familiar word recognition or novel word learning limits the critical consideration of how these two relate. As a step toward a mechanistic theory of language acquisition, we present a framework of "learning through processing" and relate it to the prevailing methods used to assess children's early knowledge of words. Incorporating recent empirical work, we posit a specific, testable timeline of qualitative changes in the learning process in this interval. We conclude with several challenges and avenues for building a comprehensive theory of early word learning: better characterization of the input, reconciling results across approaches, and treating lexical knowledge in the nascent grammar with sufficient sophistication to ensure generalizability across languages and development.
Affiliation(s)
- Stephan C Meylan
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  - Department of Psychology and Neuroscience, Duke University, Durham, North Carolina, USA
- Elika Bergelson
  - Department of Psychology and Neuroscience, Duke University, Durham, North Carolina, USA
11
Mendoza JK, Fausey CM. Quantifying Everyday Ecologies: Principles for Manual Annotation of Many Hours of Infants' Lives. Front Psychol 2021; 12:710636. [PMID: 34552533 PMCID: PMC8450442 DOI: 10.3389/fpsyg.2021.710636] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/16/2021] [Accepted: 07/20/2021] [Indexed: 11/25/2022] [Open access]
Abstract
Everyday experiences are the experiences available to shape developmental change. Remarkable advances in devices used to record infants' and toddlers' everyday experiences, as well as in repositories to aggregate and share such recordings across teams of theorists, have yielded a potential gold mine of insights to spur next-generation theories of experience-dependent change. Making full use of these advances, however, currently requires manual annotation. Manually annotating many hours of everyday life is a dedicated pursuit requiring significant time and resources, and in many domains is an endeavor currently lacking foundational facts to guide potentially consequential implementation decisions. These realities make manual annotation a frequent barrier to discoveries, as theorists instead opt for narrower scoped activities. Here, we provide theorists with a framework for manually annotating many hours of everyday life designed to reduce both theoretical and practical overwhelm. We share insights based on our team's recent adventures in the previously uncharted territory of everyday music. We identify principles, and share implementation examples and tools, to help theorists achieve scalable solutions to challenges that are especially fierce when annotating extended timescales. These principles for quantifying everyday ecologies will help theorists collectively maximize return on investment in databases of everyday recordings and will enable a broad community of scholars—across institutions, skillsets, experiences, and working environments—to make discoveries about the experiences upon which development may depend.
Affiliation(s)
- Jennifer K Mendoza
  - Department of Psychology, University of Oregon, Eugene, OR, United States
- Caitlin M Fausey
  - Department of Psychology, University of Oregon, Eugene, OR, United States
12
Semenzin C, Hamrick L, Seidl A, Kelleher BL, Cristia A. Describing Vocalizations in Young Children: A Big Data Approach Through Citizen Science Annotation. Journal of Speech, Language, and Hearing Research: JSLHR 2021; 64:2401-2416. [PMID: 34098723 PMCID: PMC8632511 DOI: 10.1044/2021_jslhr-20-00661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Revised: 02/19/2021] [Accepted: 02/19/2021] [Indexed: 05/13/2023]
Abstract
Purpose Recording young children's vocalizations through wearables is a promising method to assess language development. However, accurately and rapidly annotating these files remains challenging. Online crowdsourcing with the collaboration of citizen scientists could be a feasible solution. In this article, we assess the extent to which citizen scientists' annotations align with those gathered in the lab for recordings collected from young children. Method Segments identified by Language ENvironment Analysis as produced by the key child were extracted from one daylong recording for each of 20 participants: 10 low-risk control children and 10 children diagnosed with Angelman syndrome, a neurogenetic syndrome characterized by severe language impairments. Speech samples were annotated by trained annotators in the laboratory as well as by citizen scientists on Zooniverse. All annotators assigned one of five labels to each sample: Canonical, Noncanonical, Crying, Laughing, and Junk. This allowed the derivation of two child-level vocalization metrics: the Linguistic Proportion and the Canonical Proportion. Results At the segment level, Zooniverse classifications had moderate precision and recall. More importantly, the Linguistic Proportion and the Canonical Proportion derived from Zooniverse annotations were highly correlated with those derived from laboratory annotations. Conclusions Annotations obtained through a citizen science platform can help us overcome challenges posed by the process of annotating daylong speech recordings. Particularly when used in composites or derived metrics, such annotations can be used to investigate early markers of language delays.
Affiliation(s)
- Chiara Semenzin
  - Laboratoire de Sciences Cognitives et Psycholinguistique, Département d'Etudes Cognitives, Ecole Normale Supérieure, EHESS, Centre Nationale de la Recherche Scientifique, PSL University, Paris, France
- Alejandrina Cristia
  - Laboratoire de Sciences Cognitives et Psycholinguistique, Département d'Etudes Cognitives, Ecole Normale Supérieure, EHESS, Centre Nationale de la Recherche Scientifique, PSL University, Paris, France
13
Casillas M, Brown P, Levinson SC. Early language experience in a Papuan community. Journal of Child Language 2021; 48:792-814. [PMID: 32988426 DOI: 10.1017/s0305000920000549] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 06/11/2023]
Abstract
The rate at which young children are directly spoken to varies due to many factors, including (a) caregiver ideas about children as conversational partners and (b) the organization of everyday life. Prior work suggests cross-cultural variation in rates of child-directed speech is due to the former factor, but has been fraught with confounds in comparing postindustrial and subsistence farming communities. We investigate the daylong language environments of children (0;0-3;0) on Rossel Island, Papua New Guinea, a small-scale traditional community where prior ethnographic study demonstrated contingency-seeking child interaction styles. In fact, children were infrequently directly addressed and linguistic input rate was primarily affected by situational factors, though children's vocalization maturity showed no developmental delay. We compare the input characteristics between this community and a Tseltal Mayan one in which near-parallel methods produced comparable results, then briefly discuss the models and mechanisms for learning best supported by our findings.
14
ALICE: An open-source tool for automatic measurement of phoneme, syllable, and word counts from child-centered daylong recordings. Behav Res Methods 2021; 53:818-835. [PMID: 32875399 PMCID: PMC8062390 DOI: 10.3758/s13428-020-01460-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/08/2022]
Abstract
Recordings captured by wearable microphones are a standard method for investigating young children's language environments. A key measure to quantify from such data is the amount of speech present in children's home environments. To this end, the LENA recorder and software, a popular system for measuring linguistic input, estimates the number of adult words that children may hear over the course of a recording. However, word count estimation is challenging to do in a language-independent manner; the relationship between observable acoustic patterns and language-specific lexical entities is far from uniform across human languages. In this paper, we ask whether some alternative linguistic units, namely phone(me)s or syllables, could be measured instead of, or in parallel with, words in order to achieve improved cross-linguistic applicability and comparability of an automated system for measuring child language input. We discuss the advantages and disadvantages of measuring different units from theoretical and technical points of view. We also investigate the practical applicability of measuring such units using a novel system called Automatic LInguistic unit Count Estimator (ALICE) together with audio from seven child-centered daylong audio corpora from diverse cultural and linguistic environments. We show that language-independent measurement of phoneme counts is somewhat more accurate than syllables or words, but all three are highly correlated with human annotations on the same data. We share an open-source implementation of ALICE for use by the language research community, enabling automatic phoneme, syllable, and word count estimation from child-centered audio recordings.
|
15
|
Lourenço V, Coutinho J, Pereira AF. Advances in microanalysis: Magnifying the social microscope on mother-infant interactions. Infant Behav Dev 2021; 64:101571. [PMID: 34022549 DOI: 10.1016/j.infbeh.2021.101571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2020] [Revised: 04/28/2021] [Accepted: 04/30/2021] [Indexed: 10/21/2022]
Abstract
Microanalysis is a method for recording and coding interactional behavior. It has often been compared to a social microscope for its power in detailing the second-by-second dynamics of social interaction. Microanalysis has deep multidisciplinary foundations that privilege the description of interactions as they naturally occur, with the purpose of understanding the relations between multiple and simultaneous streams of behaviors. In developmental science, microanalysis has uncovered structural and temporal elements in mother-infant interactions, improving our understanding of the effects of mother-infant interpersonal adaptation in the infant's cognitive and social-emotional development. Detailed manual coding is time intensive and resource demanding, imposing restrictions on sample size and on the ability to analyze multiple behavioral modalities. Moreover, recent increases in the density of multivariate data require different tools. We review present-day techniques that tackle those challenges: (1) sensing techniques for motion tracking and physiological recording; (2) exploratory techniques for detecting patterns from high-density data; and (3) inferential and modeling techniques for understanding contingencies between interactional time series. Two illustrations, from recent developmental research, reveal the power of bringing new lenses to our social microscope: (1) egocentric vision, the use of head mounted cameras and eye-trackers in capturing the infant's first-person perspective of a social exchange; and (2) daily activity sensing, wearable multimodal sensing that brought mother-infant interaction research to the environments where it naturally unfolds.
Affiliation(s)
- Vladimiro Lourenço
- Development and Psychopathology, CIPsi, School of Psychology, University of Minho, Portugal
- Joana Coutinho
- Psychological Neuroscience, CIPsi, School of Psychology, University of Minho, Portugal
- Alfredo F Pereira
- Development and Psychopathology, CIPsi, School of Psychology, University of Minho, Portugal.
|
16
|
Cychosz M, Munson B, Edwards JR. Practice and experience predict coarticulation in child speech. Lang Learn Dev 2021; 17:366-396. [PMID: 34483779 PMCID: PMC8412131 DOI: 10.1080/15475441.2021.1890080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Much research in child speech development suggests that young children coarticulate more than adults. There are multiple, not mutually exclusive, explanations for this pattern. For example, children may coarticulate more because they are limited by immature motor control. Or they may coarticulate more if they initially represent phonological segments in larger, more holistic units such as syllables or feet. We tested the importance of several different explanations for coarticulation in child speech by evaluating how four-year-olds' language experience, speech practice, and speech planning predicted their coarticulation between adjacent segments in real words and paired nonwords. Children with larger vocabularies coarticulated less, especially in real words, though there were no reliable coarticulatory differences between real words and nonwords after controlling for word duration. Children who vocalized more throughout a daylong audio recording also coarticulated less. Quantity of child vocalizations was more predictive of the degree of children's coarticulation than a measure of receptive language experience, adult word count. Overall, these results suggest strong roles for children's phonological representations and speech practice, as well as their immature fine motor control, for coarticulatory development.
Affiliation(s)
- Margaret Cychosz
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park
- Benjamin Munson
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities
- Jan R. Edwards
- Department of Hearing and Speech Sciences, University of Maryland, College Park
|
17
|
Roete I, Frank SL, Fikkert P, Casillas M. Modeling the Influence of Language Input Statistics on Children's Speech Production. Cogn Sci 2020; 44:e12924. [PMID: 33349953 DOI: 10.1111/cogs.12924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/29/2019] [Revised: 08/14/2020] [Accepted: 10/01/2020] [Indexed: 11/30/2022]
Abstract
We trained a computational model (the Chunk-Based Learner; CBL) on a longitudinal corpus of child-caregiver interactions in English to test whether one proposed statistical learning mechanism, backward transitional probability, is able to predict children's speech productions with stable accuracy throughout the first few years of development. We predicted that the model less accurately reconstructs children's speech productions as they grow older because children gradually begin to generate speech using abstracted forms rather than specific "chunks" from their speech environment. To test this idea, we trained the model on both recently encountered and cumulative speech input from a longitudinal child language corpus. We then assessed whether the model could accurately reconstruct children's speech. Controlling for utterance length and the presence of duplicate chunks, we found no evidence that the CBL becomes less accurate in its ability to reconstruct children's speech with age.
Affiliation(s)
- Ingeborg Roete
- Language Development Department, Max Planck Institute for Psycholinguistics
- Centre for Language Studies, Radboud University
- Marisa Casillas
- Language Development Department, Max Planck Institute for Psycholinguistics
|
18
|
Cychosz M, Romeo R, Soderstrom M, Scaff C, Ganek H, Cristia A, Casillas M, de Barbaro K, Bang JY, Weisleder A. Longform recordings of everyday life: Ethics for best practices. Behav Res Methods 2020; 52:1951-1969. [PMID: 32103465 PMCID: PMC7483614 DOI: 10.3758/s13428-020-01365-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 01/10/2023]
Abstract
Recent advances in large-scale data storage and processing offer unprecedented opportunities for behavioral scientists to collect and analyze naturalistic data, including from underrepresented groups. Audio data, particularly real-world audio recordings, are of particular interest to behavioral scientists because they provide high-fidelity access to subtle aspects of daily life and social interactions. However, these methodological advances pose novel risks to research participants and communities. In this article, we outline the benefits and challenges associated with collecting, analyzing, and sharing multi-hour audio recording data. Guided by the principles of autonomy, privacy, beneficence, and justice, we propose a set of ethical guidelines for the use of longform audio recordings in behavioral research. This article is also accompanied by an Open Science Framework Ethics Repository that includes informed consent resources such as frequent participant concerns and sample consent forms.
Affiliation(s)
- Margaret Cychosz
- Department of Linguistics, University of California, 1203 Dwinelle Hall, Berkeley, CA, 94720, USA.
- Rachel Romeo
- Boston Children's Hospital and Massachusetts Institute of Technology, Boston, MA, USA
- Camila Scaff
- Human Ecology Group, Institute of Evolutionary Medicine, University of Zurich, Zürich, Switzerland
- Alejandrina Cristia
- Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d'études cognitives, ENS, EHESS, CNRS, PSL University, Paris, France
- Marisa Casillas
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Kaya de Barbaro
- Department of Psychology, The University of Texas at Austin, Austin, TX, USA
- Janet Y Bang
- Department of Psychology, Stanford University, Stanford, CA, USA
- Adriana Weisleder
- Department of Communication Sciences and Disorders, Northwestern University, 2240 Campus Dr., Frances Searle Building, Room 3-358, Evanston, IL, 60208, USA.
|
19
|
Swanson MR. The role of caregiver speech in supporting language development in infants and toddlers with autism spectrum disorder. Dev Psychopathol 2020; 32:1230-1239. [PMID: 32893764 PMCID: PMC7872436 DOI: 10.1017/s0954579420000838] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/07/2022]
Abstract
Parents play an essential role in supporting child development by providing a safe home, proper nutrition, and rich educational opportunities. In this article we focus on the role of caregiver speech in supporting development of young children with autism spectrum disorder (ASD). We review studies from typically developing children and children with autism showing that rich and responsive caregiver speech supports language development. Autism intervention studies that target caregiver speech are reviewed as are recent scientific advances from studies of typical development. The strengths and weaknesses of different techniques for collecting language data from caregivers and children are reviewed, and natural language samples are recommended as best practice for language research in autism. We conclude that caregivers play a powerful role in shaping their children's development and encourage researchers to adapt parent-mediated intervention studies to acknowledge individual differences in parents by using a personalized medicine approach.
Affiliation(s)
- Meghan R Swanson
- School of Behavioral and Brain Sciences, University of Texas at Dallas, TX, USA
|