1. Rubino M, Dietrich M, Abbott KV. Initial Theoretical Discussion of Identity as Barrier and Facilitator in Voice Habilitation and Rehabilitation. J Voice 2023:S0892-1997(23)00295-3. PMID: 37867071. DOI: 10.1016/j.jvoice.2023.09.020.
Abstract
OBJECTIVES The purpose of this paper is to review seminal identity theories grounded in social psychology, along with one concept from voice science, and to explain how together they may point to identity factors facilitating or impeding voice habilitation and rehabilitation. METHODS Identity theories from the social psychology literature (Dramaturgical Theory, Self-Categorization Theory, Self-Determination Theory, Identity Negotiation Theory) and the concept of vocal congruence are described. These concepts are synthesized with voice science research to explore potential identity-behavior relations at play in voice habilitation and rehabilitation. RESULTS Applicable concepts from social psychology and voice science suggest identity-related processes by which a client may or may not develop a voice difference/disorder, seek intervention, and achieve goals in intervention. A bidirectional relationship between identity and behavior is well established in the social psychology literature; however, the relevance of vocal behavior has yet to be formally examined within that literature. Importantly, although the voice science literature has established connections between behavioral tendencies and voice disorders, as well as the contribution of identity to gender-affirming voice treatment, consideration of identity's possible role in voice habilitation and rehabilitation in cisgender individuals has thus far been scant. CONCLUSIONS Research into identity and voice habilitation and rehabilitation may help to improve voice intervention outcomes. A possible adjunct to human studies is agent-based modeling or other computational approaches to assess the myriad factors that may be relevant within this line of inquiry.
Affiliation(s)
- Marianna Rubino: Department of Communication Sciences and Disorders, University of Houston, Houston, Texas
- Maria Dietrich: Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany
- Katherine Verdolini Abbott: Department of Linguistics and Cognitive Science and Department of Communication Sciences and Disorders, University of Delaware, Newark, Delaware
2. Lahiri R, Nasir M, Kumar M, Kim SH, Bishop S, Lord C, Narayanan S. Interpersonal synchrony across vocal and lexical modalities in interactions involving children with autism spectrum disorder. JASA Express Lett 2022; 2:095202. PMID: 36097603. PMCID: PMC9462442. DOI: 10.1121/10.0013421.
Abstract
Quantifying behavioral synchrony can inform clinical diagnosis, long-term monitoring, and individualized interventions in neurodevelopmental disorders characterized by deficits in communication and social interaction, such as autism spectrum disorder. In this work, three different objective measures of interpersonal synchrony are evaluated across vocal and linguistic communication modalities. For vocal prosodic and spectral features, dynamic time warping distance and squared cosine distance of (feature-wise) complexity are used; for lexical features, word mover's distance is applied to capture behavioral synchrony. It is shown that these interpersonal vocal and linguistic synchrony measures capture complementary information that helps characterize overall behavioral patterns.
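As a point of reference for the first measure named above, dynamic time warping can be sketched in a few lines of Python. This is a generic textbook implementation applied to invented pitch contours, not the authors' code; a lower distance indicates greater prosodic synchrony between the two speakers.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature contours
    of possibly different lengths (smaller = more similar)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local mismatch
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m]

# Invented pitch contours (Hz) for two interacting speakers
child = [210.0, 215.0, 230.0, 228.0, 220.0]
adult = [208.0, 214.0, 229.0, 221.0]
distance = dtw_distance(child, adult)
```

Word mover's distance plays the analogous role for the lexical modality, replacing the pointwise mismatch with distances between word embeddings.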
Affiliation(s)
- Rimita Lahiri: Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, California 90089, USA
- Md Nasir: Microsoft Artificial Intelligence for Good Research Lab, Redmond, Washington 98052, USA
- Manoj Kumar: Amazon Alexa Artificial Intelligence, Cambridge, Massachusetts 02142, USA
- So Hyun Kim: Center for Autism and the Developing Brain, Weill Cornell Medicine, New York, New York 10065, USA
- Somer Bishop: Department of Psychiatry, University of California, San Francisco, California 94143, USA
- Catherine Lord: Semel Institute of Neuroscience and Human Behavior, University of California, Los Angeles, California 90024, USA
- Shrikanth Narayanan: Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, California 90089, USA
3. Ostrand R, Chodroff E. It's alignment all the way down, but not all the way up: Speakers align on some features but not others within a dialogue. J Phonetics 2021; 88:101074. PMID: 34366499. PMCID: PMC8345023. DOI: 10.1016/j.wocn.2021.101074.
Abstract
During conversation, speakers modulate characteristics of their production to match their interlocutors' characteristics, a behavior known as alignment. Speakers align at many linguistic levels, including the syntactic, lexical, and phonetic levels. As a result, alignment is often treated as a unitary phenomenon, in which evidence of alignment on one feature is cast as alignment of the entire linguistic level. This experiment investigates whether alignment can occur at some levels but not others, and on some features but not others, within a given dialogue. Participants interacted with two experimenters with highly contrasting acoustic-phonetic and syntactic profiles. The experimenters each described sets of pictures using a consistent acoustic-phonetic and syntactic profile; the participants then described new pictures to each experimenter individually. Alignment was measured as the degree to which participants matched their current listener's speech (vs. their non-listener's) on each of several individual acoustic-phonetic and syntactic features. Additionally, a holistic measure of phonetic alignment was assessed using 323 acoustic-phonetic features analyzed jointly in a machine learning classifier. Although participants did not align on several individual spectral-phonetic or syntactic features, they did align on individual temporal-phonetic features and on the holistic acoustic-phonetic profile. Thus, alignment can simultaneously occur at some levels but not others within a given dialogue, and is not a single phenomenon but rather a constellation of loosely related effects. These findings suggest that the mechanism underlying alignment is not a primitive, automatic priming mechanism but rather is guided by communicative or social factors.
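The listener-specific measure described above can be reduced to a simple difference-in-distance score. The sketch below is a minimal operationalization for a single feature with invented speech-rate values; it is not the paper's actual analysis, which examined multiple features and a 323-feature holistic classifier.

```python
def alignment_score(subject, listener, non_listener):
    """Evidence of alignment on one feature: positive when the subject's
    value is closer to the current listener's than to the non-listener's."""
    return abs(subject - non_listener) - abs(subject - listener)

# Invented articulation rates (syllables/s): the subject (4.1) sits nearer
# the current listener (4.0) than the non-listener (5.2), so the score is
# positive, i.e., consistent with alignment toward the listener
score = alignment_score(subject=4.1, listener=4.0, non_listener=5.2)
```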
Affiliation(s)
- Rachel Ostrand (corresponding author): IBM Research, Yorktown Heights, NY, USA
- Eleanor Chodroff: Department of Language and Linguistic Science, University of York, Heslington, York, United Kingdom
4. Nallan Chakravarthula S, Baucom BR, Narayanan S, Georgiou P. An analysis of observation length requirements for machine understanding of human behaviors from spoken language. Comput Speech Lang 2021. DOI: 10.1016/j.csl.2020.101162.
5. Borrie SA, Wynn CJ, Berisha V, Lubold N, Willi MM, Coelho CA, Barrett TS. Conversational Coordination of Articulation Responds to Context: A Clinical Test Case With Traumatic Brain Injury. J Speech Lang Hear Res 2020; 63:2567-2577. PMID: 32755503. PMCID: PMC7872735. DOI: 10.1044/2020_jslhr-20-00104.
Abstract
Purpose Coordination of communicative behavior supports shared understanding in conversation. The current study brings together analysis of two speech coordination strategies, entrainment and compensation of articulation, in a preliminary investigation into whether strategy organization is shaped by a challenging communicative context: conversing with a person who has a communication disorder. Method As an initial clinical test case, an automated measure of articulatory precision was analyzed in a corpus of spoken dialogue in which a confederate conversed with participants with traumatic brain injury (n = 28) and participants with no brain injury (n = 48). Results Overall, the confederate engaged in significant entrainment and high compensation (hyperarticulation) in conversations with participants with traumatic brain injury, relative to significant entrainment and low compensation (hypoarticulation) in conversations with participants with no brain injury. Furthermore, the confederate's articulatory precision changed over the course of the conversations. Conclusions Findings suggest that the organization of conversational coordination is sensitive to context, supporting synergistic models of spoken dialogue. While corpus limitations are acknowledged, these initial results point to differences in the way speech strategies are realized in challenging communicative contexts, highlighting a viable and important target for investigation with clinical populations. A framework for investigating speech coordination strategies in tandem, and ideas for advancing this line of inquiry, serve as key contributions of this work.
Affiliation(s)
- Stephanie A. Borrie: Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Camille J. Wynn: Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Visar Berisha: School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe
- Nichola Lubold: School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe
- Megan M. Willi: Communication Sciences and Disorders Program, California State University, Chico
- Carl A. Coelho: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs
6. Borrie SA, Barrett TS, Liss JM, Berisha V. Sync Pending: Characterizing Conversational Entrainment in Dysarthria Using a Multidimensional, Clinically Informed Approach. J Speech Lang Hear Res 2020; 63:83-94. PMID: 31855608. PMCID: PMC7213480. DOI: 10.1044/2019_jslhr-19-00194.
Abstract
Purpose Despite the import of conversational entrainment to successful spoken dialogue, the systematic characterization of this behavioral syncing phenomenon represents a critical gap in the field of speech pathology. The goal of this study was to acoustically characterize conversational entrainment in the context of dysarthria using a multidimensional approach previously validated in healthy populations (healthy conversations; Borrie, Barrett, Willi, & Berisha, 2019). Method A large corpus of goal-oriented conversations between participants with dysarthria and healthy participants (disordered conversations) was elicited using a "spot the difference" task. Expert clinical assessment of entrainment and a measure of conversational success (communicative efficiency) were obtained for each of the audio-recorded conversations. Conversational entrainment of acoustic features representing rhythmic, articulatory, and phonatory dimensions of speech was identified using cross-recurrence quantification analysis with clinically informed model parameters, and validated with a sham condition involving conversational participants who did not converse with one another. The relationship between conversational entrainment and communicative efficiency was then examined. Results Acoustic evidence of entrainment was observed in phonatory, but not rhythmic or articulatory, behavior, a finding that differs from healthy conversations, in which entrainment was observed in all speech signal dimensions. This result, that disordered conversations showed less acoustic entrainment than healthy conversations, is corroborated by clinical assessment of entrainment, in which the disordered conversations were rated, overall, as being less in sync than healthy conversations. Furthermore, acoustic entrainment was predictive of communicative efficiency, corroborated by a relationship between clinical assessment and the same outcome measure.
Conclusions The findings confirm our hypothesis that the pathological speech production parameters of dysarthria disrupt the seemingly ubiquitous phenomenon of conversational entrainment, thus advancing entrainment deficits as an important variable in dysarthria, one that may have causative effects on the success of everyday communication. Results further reveal that while this approach provides a broad overview, methodologies for characterizing conversational entrainment in dysarthria must continue to be developed and refined, with a focus on clinical utility. Supplemental Material https://osf.io/ktg5q.
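Cross-recurrence quantification analysis, the technique named in the Method above, has many parameters (embedding dimension, delay, radius). The toy sketch below computes only its simplest output, the cross-recurrence rate, on unembedded invented intensity contours; it illustrates the idea rather than the study's implementation.

```python
import numpy as np

def cross_recurrence_rate(x, y, radius):
    """Share of time-point pairs (i, j) at which speaker x's feature value
    falls within `radius` of speaker y's; a crude index of entrainment."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # Cross-recurrence matrix via broadcasting: R[i, j] = 1 when close
    recurrence = np.abs(x[:, None] - y[None, :]) < radius
    return recurrence.mean()

# Invented vocal intensity contours (dB) for two conversational partners;
# a higher rate suggests the partners revisit each other's values more often
speaker_a = [60.0, 62.0, 61.0, 63.0]
speaker_b = [60.5, 61.5, 62.5, 60.0]
rate = cross_recurrence_rate(speaker_a, speaker_b, radius=1.0)
```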
Affiliation(s)
- Stephanie A. Borrie: Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Julie M. Liss: Department of Speech and Hearing Science, Arizona State University, Tempe
- Visar Berisha: Department of Speech and Hearing Science and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe
7. Chen CP, Gau SSF, Lee CC. Toward differential diagnosis of autism spectrum disorder using multimodal behavior descriptors and executive functions. Comput Speech Lang 2019. DOI: 10.1016/j.csl.2018.12.003.
8. Borrie SA, Barrett TS, Willi MM, Berisha V. Syncing Up for a Good Conversation: A Clinically Meaningful Methodology for Capturing Conversational Entrainment in the Speech Domain. J Speech Lang Hear Res 2019; 62:283-296. PMID: 30950701. PMCID: PMC6436892. DOI: 10.1044/2018_jslhr-s-18-0210.
Abstract
Purpose Conversational entrainment, the phenomenon whereby communication partners synchronize their behavior, is considered essential for productive and fulfilling conversation. Lack of entrainment could, therefore, negatively impact conversational success. Although studied in many disciplines, entrainment has received limited attention in the field of speech-language pathology, where its implications may have direct clinical relevance. Method A novel computational methodology, informed by expert clinical assessment of conversation, was developed to investigate conversational entrainment across multiple speech dimensions in a corpus of experimentally elicited conversations involving healthy participants. The predictive relationship between the methodology output and an objective measure of conversational success, communicative efficiency, was then examined. Results Using a real versus sham validation procedure, we find evidence of sustained entrainment in rhythmic, articulatory, and phonatory dimensions of speech. We further validate the methodology, showing that models built on speech signal entrainment measures consistently outperform models built on nonentrained speech signal measures in predicting communicative efficiency of the conversations. Conclusions A multidimensional, clinically meaningful methodology for capturing conversational entrainment, validated in healthy populations, has implications for disciplines such as speech-language pathology where conversational entrainment represents a critical knowledge gap in the field, as well as a potential target for remediation.
Affiliation(s)
- Stephanie A. Borrie: Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Tyson S. Barrett: Department of Kinesiology and Health Sciences, Utah State University, Logan
- Megan M. Willi: Department of Communication Sciences and Disorders, California State University, Chico
- Visar Berisha: Department of Speech and Hearing Science and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe
9. Weusthoff S, Gaut G, Steyvers M, Atkins DC, Hahlweg K, Hogan J, Zimmermann T, Fischer MS, Baucom DH, Georgiou P, Narayanan S, Baucom BR. The language of interpersonal interaction: An interdisciplinary approach to assessing and processing vocal and speech data. Eur J Couns Psychol 2018. DOI: 10.5964/ejcop.v7i1.82.
Abstract
Verbal and non-verbal information is central to social interaction between humans and has been studied intensively in psychology. Dyadic interactions (e.g., between romantic partners, or between psychotherapist and patient) are especially relevant to a number of psychological research areas. However, the psychological methods applied so far have not been able to handle the vast amount of data resulting from human interactions, impeding scientific discovery and progress. This paper presents an interdisciplinary approach that uses technology from engineering and computer science to work with continuous data from human communication and interaction at the verbal (e.g., use of words, content) and non-verbal (e.g., vocal features of the human voice) level. Text-mining techniques such as topic models take into account the semantic and syntactic information of written text (such as therapy session transcripts), its structure, and its intercorrelations. Speech signal processing focuses on the vocal information in a speaker's voice (e.g., based on audio- or videotaped interactions). For both areas, we provide an introduction defining the respective method and related procedures, together with sample applications from psychological publications in which these techniques complement or generate behavioral codes (e.g., in addition to cardiovascular indices of arousal, or as a way to encode empathy). We close with a summary of the opportunities and challenges of learning and applying these novel approaches in different areas of psychological research, and provide the interested reader with a list of additional readings on the technical aspects of topic modeling and speech signal processing.
10. Chong S, Yue G, Bingxu J, Hongguang L. An affective cognition based approach to multi-attribute group decision making. J Intell Fuzzy Syst 2018. DOI: 10.3233/jifs-169563.
Affiliation(s)
- Su Chong: Beijing University of Chemical Technology, Beijing, China
- Gao Yue: Beijing University of Chemical Technology, Beijing, China
- Jiang Bingxu: Beijing University of Chemical Technology, Beijing, China
- Li Hongguang: Beijing University of Chemical Technology, Beijing, China
11. Reblin M, Heyman RE, Ellington L, Baucom BRW, Georgiou PG, Vadaparampil ST. Everyday couples' communication research: Overcoming methodological barriers with technology. Patient Educ Couns 2018; 101:551-556. PMID: 29111310. DOI: 10.1016/j.pec.2017.10.019.
Abstract
Relationship behaviors contribute to compromised health or resilience. Everyday communication between intimate partners represents the vast majority of their interactions. When intimate partners take on new roles as patients and caregivers, everyday communication takes on a new and important role in managing both the transition and the adaptation to the change in health status. However, everyday communication and its relation to health has been little studied, likely due to barriers in collecting and processing this kind of data. The goal of this paper is to describe deterrents to capturing naturalistic, day-in-the-life communication data and share how technological advances have helped surmount them. We provide examples from a current study and describe how we anticipate technology will further change research capabilities.
Affiliation(s)
- Maija Reblin: Department of Health Outcomes & Behavior, Moffitt Cancer Center, Tampa, USA
- Richard E Heyman: Family Translational Research Group, New York University, New York, USA
- Lee Ellington: College of Nursing, University of Utah, Salt Lake City, USA
- Brian R W Baucom: Department of Psychology, University of Utah, Salt Lake City, USA
- Panayiotis G Georgiou: Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, USA
12. Nasir M, Baucom BR, Georgiou P, Narayanan S. Predicting couple therapy outcomes based on speech acoustic features. PLoS One 2017; 12:e0185123. PMID: 28934302. PMCID: PMC5608311. DOI: 10.1371/journal.pone.0185123.
Abstract
Automated assessment and prediction of marital outcome in couples therapy is a challenging task but promises to be a potentially useful tool for clinical psychologists. Computational approaches for inferring therapy outcomes using observable behavioral information obtained from conversations between spouses offer objective means for understanding relationship dynamics. In this work, we explore whether the acoustics of the spoken interactions of clinically distressed spouses provide information toward assessment of therapy outcomes. The therapy outcome prediction task in this work includes detecting whether there was a relationship improvement or not (posed as a binary classification) as well as discerning varying levels of improvement or decline in the relationship status (posed as a multiclass recognition task). We use each interlocutor's acoustic speech signal characteristics, such as vocal intonation and intensity, both independently and in relation to one another, as cues for predicting the therapy outcome. We also compare prediction performance with that obtained using standardized behavioral codes, characterizing the relationship dynamics as provided by human experts, as features for automated classification. Our experiments, using data from a longitudinal clinical study of couples in distressed relations, showed that predictions of relationship outcomes obtained directly from vocal acoustics are comparable or superior to those obtained using human-rated behavioral codes as prediction features. In addition, combining direct signal-derived features with manually coded behavioral features improved the prediction performance in most cases, indicating the complementarity of relevant information captured by humans and machine algorithms. Additionally, considering the vocal properties of the interlocutors in relation to one another, rather than in isolation, was shown to be important for improving the automatic prediction. This finding supports the notion that behavioral outcome, like many other behavioral aspects, is closely related to the dynamics and mutual influence of the interlocutors during their interaction and their resulting behavioral patterns.
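Vocal intensity, one of the cues named above, is commonly operationalized as frame-level RMS energy in decibels. The sketch below is a generic illustration, not the authors' feature pipeline; the 25 ms window and 10 ms hop at a 16 kHz sampling rate are assumed, illustrative values.

```python
import numpy as np

def frame_intensity_db(signal, frame_len=400, hop=160, eps=1e-12):
    """Frame-level vocal intensity: RMS energy per frame, in dB
    (frame_len=400 and hop=160 are 25 ms / 10 ms at 16 kHz)."""
    signal = np.asarray(signal, dtype=float)
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    return 20.0 * np.log10(rms + eps)  # eps guards against log(0) in silence

# A constant-amplitude signal yields a flat intensity track
intensity = frame_intensity_db(0.5 * np.ones(1600))
```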
Affiliation(s)
- Md Nasir: Department of Electrical Engineering, University of Southern California, Los Angeles, United States of America
- Brian Robert Baucom: Department of Psychology, University of Utah, Salt Lake City, Utah, United States of America
- Panayiotis Georgiou: Department of Electrical Engineering, University of Southern California, Los Angeles, United States of America
- Shrikanth Narayanan: Department of Electrical Engineering, University of Southern California, Los Angeles, United States of America
13. Piryani R, Madhavi D, Singh V. Analytical mapping of opinion mining and sentiment analysis research during 2000–2015. Inf Process Manag 2017. DOI: 10.1016/j.ipm.2016.07.001.
14. Koole SL, Tschacher W. Synchrony in Psychotherapy: A Review and an Integrative Framework for the Therapeutic Alliance. Front Psychol 2016; 7:862. PMID: 27378968. PMCID: PMC4907088. DOI: 10.3389/fpsyg.2016.00862.
Abstract
During psychotherapy, patient and therapist tend to spontaneously synchronize their vocal pitch, bodily movements, and even their physiological processes. In the present article, we consider how this pervasive phenomenon may shed new light on the therapeutic relationship, or alliance, and its role within psychotherapy. We first review clinical research on the alliance and the multidisciplinary area of interpersonal synchrony. We then integrate both literatures in the Interpersonal Synchrony (In-Sync) model of psychotherapy. According to the model, the alliance is grounded in the coupling of the patient's and therapist's brains. Because brains do not interact directly, movement synchrony may help to establish inter-brain coupling. Inter-brain coupling may provide patient and therapist with access to one another's internal states, which facilitates common understanding and emotional sharing. Over time, these interpersonal exchanges may improve patients' emotion-regulatory capacities and related therapeutic outcomes. We discuss the empirical assessment of interpersonal synchrony and review preliminary research on synchrony in psychotherapy. Finally, we summarize our main conclusions and consider the broader implications of viewing psychotherapy as the product of two interacting brains.
15. Gupta R, Bone D, Lee S, Narayanan S. Analysis of engagement behavior in children during dyadic interactions using prosodic cues. Comput Speech Lang 2016; 37:47-66. PMID: 28713198. DOI: 10.1016/j.csl.2015.09.003.
Abstract
Child engagement is defined as the interaction of a child with his/her environment in a contextually appropriate manner. Engagement behavior in children is linked to assessment of socio-emotional and cognitive state, with enhanced engagement associated with improved skills. The vast majority of studies, however, rely solely, and often implicitly, on subjective perceptual measures of engagement. Access to automatic quantification could assist researchers/clinicians in objectively interpreting engagement with respect to a target behavior or condition, and could furthermore inform mechanisms for improving engagement in various settings. In this paper, we present an engagement prediction system based exclusively on vocal cues observed during structured interaction between a child and a psychologist involving several tasks. Specifically, we derive prosodic cues that capture engagement levels across the various tasks. Our experiments suggest that a child's engagement is reflected not only in the child's vocalizations, but also in the speech of the interacting psychologist. Moreover, we show that prosodic cues are informative of the engagement phenomenon not only as characterized over an entire task (i.e., global cues), but also in short-term patterns (i.e., local cues). We perform a classification experiment assigning the engagement of a child to one of three discrete levels, achieving an unweighted average recall of 55.8% (chance is 33.3%). While the systems using global cues and local cues are each statistically significant in predicting engagement, we obtain the best results after fusing these two components. We perform further analysis of the cues at the local and global levels to gain insights linking specific prosodic patterns to the engagement phenomenon. We observe that while the performance of our model varies with task setting and interacting psychologist, there exist universal prosodic patterns reflective of engagement.
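The unweighted average recall (UAR) reported above is the mean of per-class recalls, a standard metric for imbalanced data; for three engagement levels, chance is 1/3 regardless of how skewed the class counts are. A minimal sketch with invented labels:

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; unlike plain accuracy, it is not
    inflated by a majority class, so chance level is 1/num_classes."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        indices = [i for i, label in enumerate(y_true) if label == c]
        hits = sum(1 for i in indices if y_pred[i] == c)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(classes)

# Invented labels for three engagement levels (0 = low, 1 = mid, 2 = high)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
uar = unweighted_average_recall(y_true, y_pred)
```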
Affiliation(s)
- Rahul Gupta: Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, 3710 McClintock Avenue, Los Angeles, CA 90089, USA
- Daniel Bone: Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, 3710 McClintock Avenue, Los Angeles, CA 90089, USA
- Sungbok Lee: Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, 3710 McClintock Avenue, Los Angeles, CA 90089, USA
- Shrikanth Narayanan: Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, 3710 McClintock Avenue, Los Angeles, CA 90089, USA
16. Xiao B, Huang C, Imel ZE, Atkins DC, Georgiou P, Narayanan SS. A technology prototype system for rating therapist empathy from audio recordings in addiction counseling. PeerJ Comput Sci 2016; 2:e59. PMID: 28286867. PMCID: PMC5344199. DOI: 10.7717/peerj-cs.59.
Abstract
Scaling up psychotherapy services, such as addiction counseling, is a critical societal need. One challenge is ensuring quality of therapy, given the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy, a key therapy quality index, from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, plus an automatic speech recognizer and a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a lattice rescoring method to characterize high vs. low empathic language. We estimated session-level empathy codes using utterance-level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert-annotated empathy codes and machine-derived estimates, and an accuracy of 81% in classifying high vs. low empathy, compared with a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training.
Affiliation(s)
- Bo Xiao
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, United States
- Chewei Huang
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, United States
- Zac E. Imel
- Department of Educational Psychology, University of Utah, Salt Lake City, UT, United States
- David C. Atkins
- Department of Psychiatry & Behavioral Sciences, University of Washington, Seattle, WA, United States
- Panayiotis Georgiou
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, United States
- Shrikanth S. Narayanan
- Department of Electrical Engineering, University of Southern California, Los Angeles, CA, United States
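The core idea in this entry, estimating a session-level empathy code from utterance-level language-model evidence, can be sketched in miniature. This is an illustrative toy, not the paper's system: it substitutes add-alpha smoothed unigram models for the Maximum Entropy models and lattice rescoring described in the abstract, and the training utterances are invented stand-ins for therapist transcripts.

```python
import math
from collections import Counter

def train_unigram(utterances, vocab, alpha=1.0):
    """Train an add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for u in utterances for w in u.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def session_empathy_score(session_utterances, high_lm, low_lm):
    """Aggregate utterance-level log-likelihood ratios into one session
    score; positive values favor the 'high empathy' model."""
    score = 0.0
    for u in session_utterances:
        for w in u.split():
            if w in high_lm:  # skip out-of-vocabulary words
                score += math.log(high_lm[w]) - math.log(low_lm[w])
    return score

# Hypothetical training data standing in for rated therapist transcripts.
high_utts = ["that sounds really hard", "i hear what you are saying"]
low_utts = ["you should just stop", "do what i told you"]
vocab = {w for u in high_utts + low_utts for w in u.split()}
high_lm = train_unigram(high_utts, vocab)
low_lm = train_unigram(low_utts, vocab)
```

With these toy models, `session_empathy_score(["that sounds hard"], high_lm, low_lm)` comes out positive, mirroring the paper's utterance-to-session aggregation step in the simplest possible form.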
17
He S, Zheng X, Zeng D, Luo C, Zhang Z. Exploring Entrainment Patterns of Human Emotion in Social Media. PLoS One 2016;11:e0150630. [PMID: 26953692] [PMCID: PMC4782991] [DOI: 10.1371/journal.pone.0150630]
Abstract
Emotion entrainment, generally defined as the synchronous convergence of human emotions, performs many important social functions. However, the specific mechanisms of emotion entrainment beyond in-person interactions, and the ways human emotions evolve under different entrainment patterns in large-scale social communities, remain unknown. In this paper, we examine massive emotion entrainment patterns and the underlying mechanisms in the context of social media. As modeling emotion dynamics on a large scale is often challenging, we elaborate a pragmatic framework to characterize and quantify the entrainment phenomenon. Applying this framework to datasets from two large-scale social media platforms, we find that the emotions of online users entrain through social networks. We further uncover that online users often form their relations via dual entrainment, while maintaining them through single entrainment. Remarkably, the emotions of online users are more convergent under nonreciprocal entrainment. Building on these findings, we develop an entrainment-augmented model for emotion prediction. Experimental results suggest that entrainment patterns inform emotion proximity in dyads, and that encoding their associations improves emotion prediction. This work can further help us understand the underlying dynamic process of large-scale online interactions and support more reasonable decisions regarding emergency situations, epidemic diseases, and political campaigns in cyberspace.
Affiliation(s)
- Saike He
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Xiaolong Zheng
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Daniel Zeng
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Department of Management Information Systems, University of Arizona, Tucson, Arizona, 85721, United States of America
- Chuan Luo
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Zhu Zhang
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
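"Synchronous convergence of human emotions" can be made concrete with a minimal quantity: the trend of the absolute gap between two users' emotion-valence series. A negative trend (gap shrinking over time) indicates convergence. This is an illustrative measure only, not the paper's entrainment-augmented model; the valence series below are invented.

```python
def convergence(series_a, series_b):
    """Least-squares slope of the absolute valence gap over time.
    A negative slope means the two series are converging (entraining).
    Assumes at least two aligned time steps."""
    gaps = [abs(a - b) for a, b in zip(series_a, series_b)]
    n = len(gaps)
    t_mean = (n - 1) / 2
    g_mean = sum(gaps) / n
    num = sum((t - t_mean) * (g - g_mean) for t, g in enumerate(gaps))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den
```

For example, `convergence([0.9, 0.7, 0.6, 0.5], [0.1, 0.3, 0.45, 0.5])` is negative: the two toy users' valences drift together, the simplest picture of the entrainment the paper detects at network scale.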
18
Xiao B, Georgiou P, Baucom B, Narayanan SS. Head Motion Modeling for Human Behavior Analysis in Dyadic Interaction. IEEE Trans Multimedia 2015;17:1107-1119. [PMID: 26557047] [PMCID: PMC4636041] [DOI: 10.1109/tmm.2015.2432671]
Abstract
This paper presents a computational study of head motion in human interaction, notably of its role in conveying interlocutors' behavioral characteristics. Head motion is physically complex and carries rich information; current modeling approaches based on visual signals, however, are still limited in their ability to adequately capture these important properties. Guided by the methodology of kinesics, we propose a data-driven approach to identify typical head motion patterns. The approach follows the steps of first segmenting motion events, then parametrically representing the motion by linear predictive features, and finally generalizing the motion types using Gaussian mixture models. The proposed approach is experimentally validated using video recordings of communication sessions from real couples involved in a couples therapy study. In particular, we use the head motion model to classify binarized expert judgments of the interactants' specific behavioral characteristics where entrainment in head motion is hypothesized to play a role: Acceptance, Blame, Positive, and Negative behavior. We achieve accuracies in the range of 60% to 70% across the various experimental settings and conditions. In addition, we describe a measure of motion similarity between the interaction partners based on the proposed model. We show that the relative change of head motion similarity during the interaction significantly correlates with the expert judgments of the interactants' behavioral characteristics. These findings demonstrate the effectiveness of the proposed head motion model, and underscore the promise of analyzing human behavioral characteristics through signal processing methods.
Affiliation(s)
- Bo Xiao
- Signal and Image Processing Institute, Department of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089 USA
- Panayiotis Georgiou
- Signal and Image Processing Institute, Department of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089 USA
- Brian Baucom
- Department of Psychology, University of Utah, Salt Lake City, UT, 84112 USA
- Shrikanth S Narayanan
- Signal and Image Processing Institute, Department of Electrical Engineering, University of Southern California, Los Angeles, CA, 90089 USA
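The middle step of the pipeline above, "parametrically representing the motion by linear predictive features," has a standard recipe: fit linear predictive coefficients (LPC) to a motion segment via the autocorrelation method with the Levinson-Durbin recursion. The sketch below is a generic LPC implementation under that assumption, not code from the paper; the input would be one segmented head-motion trajectory.

```python
def lpc(signal, order):
    """Linear predictive coefficients a[0..order] (a[0] == 1) via the
    autocorrelation method and the Levinson-Durbin recursion."""
    x = [float(v) for v in signal]
    n = len(x)
    # Biased autocorrelation at lags 0..order.
    r = [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(order + 1)]
    a = [1.0]
    err = r[0]  # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        a = [1.0] + [a[j] + k * a[i - j] for j in range(1, i)] + [k]
        err *= 1.0 - k * k
    return a
```

As a sanity check, a noiseless first-order decay x[t] = 0.5 * x[t-1] yields an order-1 model with a[1] close to -0.5, i.e. the predictor recovers the generating dynamics.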
19
Bone D, Lee CC, Black MP, Williams ME, Lee S, Levitt P, Narayanan S. The psychologist as an interlocutor in autism spectrum disorder assessment: insights from a study of spontaneous prosody. J Speech Lang Hear Res 2014;57:1162-1177. [PMID: 24686340] [PMCID: PMC4326041] [DOI: 10.1044/2014_jslhr-s-13-0062]
Abstract
PURPOSE The purpose of this study was to examine relationships between prosodic speech cues and autism spectrum disorder (ASD) severity, hypothesizing a mutually interactive relationship between the speech characteristics of the psychologist and the child. The authors objectively quantified acoustic-prosodic cues of the psychologist and of the child with ASD during spontaneous interaction, establishing a methodology for future large-sample analysis. METHOD Speech acoustic-prosodic features were semiautomatically derived from segments of semistructured interviews (Autism Diagnostic Observation Schedule, ADOS; Lord, Rutter, DiLavore, & Risi, 1999; Lord et al., 2012) with 28 children who had previously been diagnosed with ASD. Prosody was quantified in terms of intonation, volume, rate, and voice quality. Research hypotheses were tested via correlation as well as hierarchical and predictive regression between ADOS severity and prosodic cues. RESULTS Automatically extracted speech features demonstrated prosodic characteristics of dyadic interactions. As rated ASD severity increased, both the psychologist and the child demonstrated effects for turn-end pitch slope, and both spoke with atypical voice quality. The psychologist's acoustic cues predicted the child's symptom severity better than did the child's acoustic cues. CONCLUSION The psychologist, acting as evaluator and interlocutor, was shown to adjust his or her behavior in predictable ways based on the child's social-communicative impairments. The results support future study of speech prosody of both interaction partners during spontaneous conversation, while using automatic computational methods that allow for scalable analysis on much larger corpora.
Affiliation(s)
- Daniel Bone
- Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Chi-Chun Lee
- Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Matthew P. Black
- Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Marian E. Williams
- University Center for Excellence in Developmental Disabilities, Keck School of Medicine of University of Southern California and Children’s Hospital Los Angeles
- Sungbok Lee
- Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
- Pat Levitt
- Keck School of Medicine of University of Southern California
- Children’s Hospital Los Angeles
- Shrikanth Narayanan
- Signal Analysis & Interpretation Laboratory (SAIL), University of Southern California, Los Angeles
20
Narayanan S, Georgiou PG. Behavioral Signal Processing: Deriving Human Behavioral Informatics From Speech and Language: computational techniques are presented to analyze and model expressed and perceived human behavior, variedly characterized as typical, atypical, distressed, and disordered, from speech and language cues and their applications in health, commerce, education, and beyond. Proc IEEE 2013;101:1203-1233. [PMID: 24039277] [PMCID: PMC3769794] [DOI: 10.1109/jproc.2012.2236291]
Abstract
The expression and experience of human behavior are complex and multimodal and characterized by individual and contextual heterogeneity and variability. Speech and spoken language communication cues offer an important means for measuring and modeling human behavior. Observational research and practice across a variety of domains from commerce to healthcare rely on speech- and language-based informatics for crucial assessment and diagnostic information and for planning and tracking response to an intervention. In this paper, we describe some of the opportunities as well as emerging methodologies and applications of human behavioral signal processing (BSP) technology and algorithms for quantitatively understanding and modeling typical, atypical, and distressed human behavior with a specific focus on speech- and language-based communicative, affective, and social behavior. We describe the three important BSP components of acquiring behavioral data in an ecologically valid manner across laboratory to real-world settings, extracting and analyzing behavioral cues from measured data, and developing models offering predictive and decision-making support. We highlight both the foundational speech and language processing building blocks and the novel processing and modeling opportunities. Using examples drawn from specific real-world applications ranging from literacy assessment and autism diagnostics to psychotherapy for addiction and marital well-being, we illustrate behavioral informatics applications of these signal processing techniques that contribute to quantifying higher level, often subjectively described, human behavior in a domain-sensitive fashion.
Affiliation(s)
- Shrikanth Narayanan
- Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Panayiotis G. Georgiou
- Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089, USA
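The three BSP components named in this abstract (acquiring behavioral data, extracting behavioral cues, and building models for decision support) form a pipeline that can be sketched end to end. Everything below is hypothetical scaffolding to show the shape of that pipeline: the cue statistics, the threshold, and the labels are invented, not drawn from the paper.

```python
import statistics

def extract_cues(samples):
    """Cue extraction: summarize a measured behavioral signal with simple
    statistics, standing in for acoustic-prosodic feature extraction."""
    return {"mean": statistics.mean(samples),
            "spread": statistics.pstdev(samples)}

def predict(cues, spread_threshold=1.0):
    """Decision-support model: flag sessions whose cue spread exceeds a
    hypothetical threshold for human review."""
    return "review" if cues["spread"] > spread_threshold else "typical"

# Synthetic 'measured data' standing in for an ecologically valid recording.
session = [0.2, 0.1, 3.5, -2.0, 0.3]
print(predict(extract_cues(session)))
```

The point of the sketch is the separation of concerns the paper argues for: data acquisition, cue extraction, and modeling are independent stages, so each can be improved (or swapped for a learned component) without disturbing the others.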