1
Abstract
Although still-face effects are well-studied, little is known about the degree to which the Face-to-Face/Still-Face (FFSF) is associated with the production of intense affective displays. Duchenne smiling expresses more intense positive affect than non-Duchenne smiling, while Duchenne cry-faces express more intense negative affect than non-Duchenne cry-faces. Forty 4-month-old infants and their mothers completed the FFSF, and key affect-indexing facial Action Units (AUs) were coded by expert Facial Action Coding System coders for the first 30 s of each FFSF episode. Computer vision software, automated facial affect recognition (AFAR), identified AUs for the entire 2-min episodes. Expert coding and AFAR produced similar infant and mother Duchenne and non-Duchenne FFSF effects, highlighting the convergent validity of automated measurement. Substantive AFAR analyses indicated that both infant Duchenne and non-Duchenne smiling declined from the face-to-face (FF) episode to the still-face (SF) episode, but only Duchenne smiling increased from the SF to the reunion (RE) episode. In similar fashion, the magnitude of mother Duchenne smiling changes over the FFSF was 2-4 times greater than that of non-Duchenne smiling changes. Duchenne expressions appear to be a sensitive index of intense infant and mother affective valence that is accessible to automated measurement and may be a target for future FFSF research.
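For readers unfamiliar with how Duchenne and non-Duchenne displays are operationalized from facial Action Units, the sketch below shows the standard logic in minimal form. It is an illustration rather than the study's pipeline: the AU assignments (AU12 for smiling, AU20 for the cry-face, AU6 as the Duchenne marker) and the frame data are assumptions made for the example.

```python
# Illustrative sketch (not the authors' code): labeling frame-level AU
# detections as Duchenne / non-Duchenne smiles and cry-faces.
# Assumes AU12 (lip corner puller) indexes smiling, AU20 (lip stretcher)
# indexes the cry-face, and AU6 (cheek raiser) is the Duchenne marker.
from typing import Dict

def classify_expression(aus: Dict[str, bool]) -> str:
    """Map one frame's binary AU detections to an expression label."""
    smile = aus.get("AU12", False)
    cry_face = aus.get("AU20", False)
    duchenne = aus.get("AU6", False)
    if smile:
        return "Duchenne smile" if duchenne else "non-Duchenne smile"
    if cry_face:
        return "Duchenne cry-face" if duchenne else "non-Duchenne cry-face"
    return "neutral/other"

# Example: proportion of each expression type across a short episode.
frames = [
    {"AU6": True, "AU12": True},   # Duchenne smile
    {"AU12": True},                # non-Duchenne smile
    {"AU6": True, "AU20": True},   # Duchenne cry-face
    {},                            # neutral
]
labels = [classify_expression(f) for f in frames]
for label in sorted(set(labels)):
    print(label, labels.count(label) / len(labels))
```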
Affiliation(s)
- Yeojin Amy Ahn
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Itir Önal Ertuğrul
- Department of Information and Computing Sciences, Utrecht University, Utrecht, Netherlands
- Sy-Miin Chow
- Department of Human Development and Family Studies, Pennsylvania State University, State College, Pennsylvania, USA
- Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Daniel S. Messinger
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, Florida, USA
- Departments of Pediatrics and Music Engineering, University of Miami, Coral Gables, Florida, USA
2
Auerbach RP, Lan R, Galfalvy H, Alqueza KL, Cohn JF, Crowley RN, Durham K, Joyce KJ, Kahn LE, Kamath RA, Morency LP, Porta G, Srinivasan A, Zelazny J, Brent DA, Allen NB. Intensive Longitudinal Assessment of Adolescents to Predict Suicidal Thoughts and Behaviors. J Am Acad Child Adolesc Psychiatry 2023; 62:1010-1020. [PMID: 37182586] [PMCID: PMC10524866] [DOI: 10.1016/j.jaac.2023.03.018]
Abstract
OBJECTIVE Suicide is a leading cause of death among adolescents. However, there are no clinical tools to detect proximal risk for suicide. METHOD Participants included 13- to 18-year-old adolescents (N = 103) reporting a current depressive, anxiety, and/or substance use disorder who owned a smartphone; 62% reported current suicidal ideation, with 25% indicating a past-year attempt. At baseline, participants were administered clinical interviews to assess lifetime disorders and suicidal thoughts and behaviors (STBs). Self-reports assessing symptoms and suicide risk factors also were obtained. In addition, the Effortless Assessment of Risk States (EARS) app was installed on adolescent smartphones to acquire daily mood and weekly suicidal ideation severity during the 6-month follow-up period. Adolescents completed STB and psychiatric service use interviews at the 1-, 3-, and 6-month follow-up assessments. RESULTS K-means clustering based on aggregates of weekly suicidal ideation scores resulted in a 3-group solution reflecting high-risk (n = 26), medium-risk (n = 47), and low-risk (n = 30) groups. Of the high-risk group, 58% reported suicidal events (ie, suicide attempts, psychiatric hospitalizations, emergency department visits, ideation severity requiring an intervention) during the 6-month follow-up period. For participants in the high-risk and medium-risk groups (n = 73), mood disturbances in the preceding 7 days predicted clinically significant ideation, with a 1-SD decrease in mood doubling participants' likelihood of reporting clinically significant ideation on a given week. CONCLUSION Intensive longitudinal assessment through use of personal smartphones offers a feasible method to assess variability in adolescents' emotional experiences and suicide risk. Translating these tools into clinical practice may help to reduce the needless loss of life among adolescents.
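The clustering step described above can be sketched in a few lines. The snippet below is a generic illustration on synthetic data; the score summaries, cluster sizes, and random seed are placeholders, not the study's analysis.

```python
# Illustrative sketch: k-means grouping of participants by aggregated weekly
# suicidal-ideation scores. All data below are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# One row per participant: [mean weekly ideation, max weekly ideation]
scores = np.vstack([
    rng.normal([1.0, 2.0], 0.5, size=(30, 2)),   # low-severity-like
    rng.normal([3.0, 4.5], 0.5, size=(47, 2)),   # medium-severity-like
    rng.normal([6.0, 8.0], 0.5, size=(26, 2)),   # high-severity-like
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scores)
for k in range(3):
    members = scores[kmeans.labels_ == k]
    print(f"cluster {k}: n={len(members)}, mean weekly ideation={members[:, 0].mean():.2f}")
```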
Affiliation(s)
- Randy P Auerbach
- Columbia University, New York, and New York State Psychiatric Institute, New York; Sackler Institute, New York.
- Ranqing Lan
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Hanga Galfalvy
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Kira L Alqueza
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Katherine Durham
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Karla J Joyce
- University Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Rahil A Kamath
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Giovanna Porta
- University Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Apoorva Srinivasan
- Columbia University, New York, and New York State Psychiatric Institute, New York
- Jamie Zelazny
- University Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- David A Brent
- University Pittsburgh Medical Center, Pittsburgh, Pennsylvania
3
Demchenko I, Desai N, Iwasa SN, Gholamali Nezhad F, Zariffa J, Kennedy SH, Rule NO, Cohn JF, Popovic MR, Mulsant BH, Bhat V. Manipulating facial musculature with functional electrical stimulation as an intervention for major depressive disorder: a focused search of literature for a proposal. J Neuroeng Rehabil 2023; 20:64. [PMID: 37193985] [DOI: 10.1186/s12984-023-01187-8]
Abstract
BACKGROUND Major Depressive Disorder (MDD) is associated with interoceptive deficits expressed throughout the body, particularly the facial musculature. According to the facial feedback hypothesis, afferent feedback from the facial muscles suffices to alter the emotional experience. Thus, manipulating the facial muscles could provide a new "mind-body" intervention for MDD. This article provides a conceptual overview of functional electrical stimulation (FES), a novel neuromodulation-based treatment modality that can be potentially used in the treatment of disorders of disrupted brain connectivity, such as MDD. METHODS A focused literature search was performed for clinical studies of FES as a modulatory treatment for mood symptoms. The literature is reviewed in a narrative format, integrating theories of emotion, facial expression, and MDD. RESULTS A rich body of literature on FES supports the notion that peripheral muscle manipulation in patients with stroke or spinal cord injury may enhance central neuroplasticity, restoring lost sensorimotor function. These neuroplastic effects suggest that FES may be a promising innovative intervention for psychiatric disorders of disrupted brain connectivity, such as MDD. Recent pilot data on repetitive FES applied to the facial muscles in healthy participants and patients with MDD show early promise, suggesting that FES may attenuate the negative interoceptive bias associated with MDD by enhancing positive facial feedback. Neurobiologically, the amygdala and nodes of the emotion-to-motor transformation loop may serve as potential neural targets for facial FES in MDD, as they integrate proprioceptive and interoceptive inputs from muscles of facial expression and fine-tune their motor output in line with socio-emotional context. CONCLUSIONS Manipulating facial muscles may represent a mechanistically novel treatment strategy for MDD and other disorders of disrupted brain connectivity that is worthy of investigation in phase II/III trials.
Affiliation(s)
- Ilya Demchenko
- Interventional Psychiatry Program, Mental Health and Addictions Service, St. Michael's Hospital - Unity Health Toronto, Toronto, ON, M5B 1M4, Canada
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Naaz Desai
- Krembil Research Institute - University Health Network, Toronto, ON, M5T 0S8, Canada
- KITE, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, M5G 2A2, Canada
- Stephanie N Iwasa
- KITE, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, M5G 2A2, Canada
- CRANIA, University Health Network, Toronto, ON, M5G 2C4, Canada
- Fatemeh Gholamali Nezhad
- Interventional Psychiatry Program, Mental Health and Addictions Service, St. Michael's Hospital - Unity Health Toronto, Toronto, ON, M5B 1M4, Canada
- José Zariffa
- KITE, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, M5G 2A2, Canada
- CRANIA, University Health Network, Toronto, ON, M5G 2C4, Canada
- Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5G 1V7, Canada
- Institute of Biomedical Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, M5S 3E2, Canada
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, M5S 3G8, Canada
- Sidney H Kennedy
- Interventional Psychiatry Program, Mental Health and Addictions Service, St. Michael's Hospital - Unity Health Toronto, Toronto, ON, M5B 1M4, Canada
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5T 1R8, Canada
- Nicholas O Rule
- Department of Psychology, Faculty of Arts & Science, University of Toronto, Toronto, ON, M5S 3G3, Canada
- Jeffrey F Cohn
- Department of Psychology, Kenneth P. Dietrich School of Arts & Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Milos R Popovic
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5S 1A8, Canada
- KITE, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, M5G 2A2, Canada
- CRANIA, University Health Network, Toronto, ON, M5G 2C4, Canada
- Institute of Biomedical Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, M5S 3E2, Canada
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, M5S 3G8, Canada
- Benoit H Mulsant
- Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5T 1R8, Canada
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, M6J 1H4, Canada
- Venkat Bhat
- Interventional Psychiatry Program, Mental Health and Addictions Service, St. Michael's Hospital - Unity Health Toronto, Toronto, ON, M5B 1M4, Canada.
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5S 1A8, Canada.
- Krembil Research Institute - University Health Network, Toronto, ON, M5T 0S8, Canada.
- KITE, Toronto Rehabilitation Institute - University Health Network, Toronto, ON, M5G 2A2, Canada.
- CRANIA, University Health Network, Toronto, ON, M5G 2C4, Canada.
- Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, M5T 1R8, Canada.
4
Swartz HA, Bylsma LM, Fournier JC, Girard JM, Spotts C, Cohn JF, Morency LP. Randomized trial of brief interpersonal psychotherapy and cognitive behavioral therapy for depression delivered both in-person and by telehealth. J Affect Disord 2023; 333:543-552. [PMID: 37121279] [DOI: 10.1016/j.jad.2023.04.092]
Abstract
BACKGROUND Expert consensus guidelines recommend Cognitive Behavioral Therapy (CBT) and Interpersonal Psychotherapy (IPT), interventions that were historically delivered face-to-face, as first-line treatments for Major Depressive Disorder (MDD). Despite ubiquity of telehealth following the COVID-19 pandemic, little is known about differential outcomes with CBT versus IPT delivered in-person (IP) or via telehealth (TH) or whether working alliance is affected. METHODS Adults meeting DSM-5 criteria for MDD were randomly assigned to either 8 sessions of IPT or CBT (group). Mid-trial, COVID-19 forced a change of therapy delivery from IP to TH (study phase). We compared changes in Hamilton Rating Scale for Depression (HRSD-17) and Working Alliance Inventory (WAI) scores for individuals by group and phase: CBT-IP (n = 24), CBT-TH (n = 11), IPT-IP (n = 25) and IPT-TH (n = 17). RESULTS HRSD-17 scores declined significantly from pre to post treatment (pre: M = 17.7, SD = 4.4 vs. post: M = 11.7, SD = 5.9; p < .001; d = 1.45) without significant group or phase effects. WAI scores did not differ by group or phase. Number of completed therapy sessions was greater for TH (M = 7.8, SD = 1.2) relative to IP (M = 7.2, SD = 1.6) (Mann-Whitney U = 387.50, z = -2.24, p = .025). LIMITATIONS Participants were not randomly assigned to IP versus TH. Sample size is small. CONCLUSIONS This study provides preliminary evidence supporting the efficacy of both brief IPT and CBT, delivered by either TH or IP, for depression. It showed that working alliance is preserved in TH, and delivery via TH may improve therapy adherence. Prospective, randomized controlled trials are needed to definitively test efficacy of brief IPT and CBT delivered via TH versus IP.
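For illustration only, the two kinds of comparisons reported above (pre/post symptom change and a Mann-Whitney U test on completed sessions) can be reproduced on synthetic data as below; the generated numbers are placeholders, not the trial data.

```python
# Illustrative sketch (synthetic data): paired pre/post comparison of HRSD-17
# scores and a Mann-Whitney U test on number of completed sessions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
hrsd_pre = rng.normal(17.7, 4.4, size=77)
hrsd_post = hrsd_pre - rng.normal(6.0, 4.0, size=77)   # simulated improvement

t, p = stats.ttest_rel(hrsd_pre, hrsd_post)
diff = hrsd_pre - hrsd_post
d = diff.mean() / diff.std(ddof=1)                     # effect size on change scores
print(f"pre/post HRSD-17: t={t:.2f}, p={p:.3g}, d={d:.2f}")

sessions_th = rng.integers(6, 9, size=28)              # telehealth
sessions_ip = rng.integers(5, 9, size=49)              # in-person
u, p = stats.mannwhitneyu(sessions_th, sessions_ip)
print(f"sessions TH vs IP: U={u:.1f}, p={p:.3g}")
```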
Affiliation(s)
- Holly A Swartz
- University of Pittsburgh, Pittsburgh, PA, United States of America; The Ohio State University, Columbus, OH, United States of America.
- Lauren M Bylsma
- University of Pittsburgh, Pittsburgh, PA, United States of America
- Jay C Fournier
- University of Kansas, Lawrence, KS, United States of America
- Crystal Spotts
- University of Pittsburgh, Pittsburgh, PA, United States of America
- Jeffrey F Cohn
- University of Pittsburgh, Pittsburgh, PA, United States of America
5
Onal Ertugrul I, Ahn YA, Bilalpur M, Messinger DS, Speltz ML, Cohn JF. Infant AFAR: Automated facial action recognition in infants. Behav Res Methods 2023; 55:1024-1035. [PMID: 35538295] [PMCID: PMC9646921] [DOI: 10.3758/s13428-022-01863-y]
Abstract
Automated detection of facial action units in infants is challenging. Infant faces have different proportions, less texture, fewer wrinkles and furrows, and unique facial actions relative to adults. For these and related reasons, action unit (AU) detectors that are trained on adult faces may generalize poorly to infant faces. To train and test AU detectors for infant faces, we trained convolutional neural networks (CNN) in adult video databases and fine-tuned these networks in two large, manually annotated, infant video databases that differ in context, head pose, illumination, video resolution, and infant age. AUs were those central to expression of positive and negative emotion. AU detectors trained in infants greatly outperformed ones trained previously in adults. Training AU detectors across infant databases afforded greater robustness to between-database differences than did training database specific AU detectors and outperformed previous state-of-the-art in infant AU detection. The resulting AU detection system, which we refer to as Infant AFAR (Automated Facial Action Recognition), is available to the research community for further testing and applications in infant emotion, social interaction, and related topics.
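The fine-tuning strategy described above, adapting AU detectors trained on adult faces to infant faces, might be sketched as follows. This is a hypothetical illustration: the backbone, number of AUs, frozen layers, and hyperparameters are assumptions, not the Infant AFAR configuration.

```python
# Illustrative sketch: fine-tune a CNN backbone for multi-label AU detection.
# In practice the starting weights would come from adult-face AU training.
import torch
import torch.nn as nn
from torchvision import models

NUM_AUS = 6  # placeholder for the AUs of interest

backbone = models.resnet18(weights=None)                  # stand-in backbone
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_AUS)

# Freeze early layers; fine-tune the last block and the new AU head.
for name, param in backbone.named_parameters():
    param.requires_grad = name.startswith(("layer4", "fc"))

criterion = nn.BCEWithLogitsLoss()                         # multi-label occurrence
optimizer = torch.optim.Adam(
    [p for p in backbone.parameters() if p.requires_grad], lr=1e-4)

# One synthetic training step to show the loop shape.
frames = torch.randn(8, 3, 224, 224)                       # batch of face crops
au_labels = torch.randint(0, 2, (8, NUM_AUS)).float()
loss = criterion(backbone(frames), au_labels)
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.3f}")
```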
6
Alghowinem S, Gedeon T, Goecke R, Cohn JF, Parker G. Interpretation of Depression Detection Models via Feature Selection Methods. IEEE Trans Affect Comput 2023; 14:133-152. [PMID: 36938342] [PMCID: PMC10019578] [DOI: 10.1109/taffc.2020.3035535]
Abstract
Given the prevalence of depression worldwide and its major impact on society, several studies have employed artificial intelligence modelling to automatically detect and assess depression. However, interpretation of these models and cues is rarely discussed in detail in the AI community, although it has received increased attention lately. In this study, we aim to analyse the commonly selected features using a proposed framework of several feature selection methods and their effect on the classification results, which will provide an interpretation of the depression detection model. The developed framework aggregates and selects the most promising features for modelling depression detection from 38 feature selection algorithms of different categories. Using three real-world depression datasets, 902 behavioural cues were extracted from speech behaviour, speech prosody, eye movement and head pose. To verify the generalisability of the proposed framework, we applied the entire process to the depression datasets individually and when combined. The results from the proposed framework showed that speech behaviour features (e.g. pauses) are the most distinctive features of the depression detection model. From the speech prosody modality, the strongest feature groups were F0, HNR, formants, and MFCC; for the eye activity modality they were left-right eye movement and gaze direction; and for the head modality it was yaw head movement. Modelling depression detection using the selected features (even though there were only 9 features) outperformed using all features in all the individual and combined datasets. Our feature selection framework not only provided an interpretation of the model but also produced higher accuracy of depression detection with a small number of features across varied datasets. This could help to reduce the processing time needed to extract features and create the model.
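A minimal sketch of the aggregation idea follows, with three off-the-shelf selectors standing in for the 38 methods used in the paper; the synthetic data and the simple voting rule are illustrative assumptions.

```python
# Illustrative sketch: aggregate rankings from several feature-selection
# methods and keep the features chosen most consistently.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif

X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=0)

rankings = {
    "anova_f": np.argsort(-f_classif(X, y)[0]),
    "mutual_info": np.argsort(-mutual_info_classif(X, y, random_state=0)),
    "rf_importance": np.argsort(
        -RandomForestClassifier(random_state=0).fit(X, y).feature_importances_),
}

TOP_K = 9
votes = np.zeros(X.shape[1], dtype=int)
for order in rankings.values():
    votes[order[:TOP_K]] += 1          # one vote per selector's top-k list

selected = np.argsort(-votes)[:TOP_K]
print("features selected by consensus:", sorted(selected.tolist()))
```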
Affiliation(s)
- Sharifa Alghowinem
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA, with Prince Sultan University, Riyadh, Saudi Arabia and with the Australian National University, Canberra, Australia
- Tom Gedeon
- Australian National University, Canberra, Australia
7
Vail AK, Girard JM, Bylsma LM, Cohn JF, Fournier J, Swartz HA, Morency LP. Toward Causal Understanding of Therapist-Client Relationships: A Study of Language Modality and Social Entrainment. Proc ACM Int Conf Multimodal Interact 2022; 2022:487-494. [PMID: 36913231] [PMCID: PMC9999472] [DOI: 10.1145/3536221.3556616]
Abstract
The relationship between a therapist and their client is one of the most critical determinants of successful therapy. The working alliance is a multifaceted concept capturing the collaborative aspect of the therapist-client relationship; a strong working alliance has been extensively linked to many positive therapeutic outcomes. Although therapy sessions are decidedly multimodal interactions, the language modality is of particular interest given its recognized relationship to similar dyadic concepts such as rapport, cooperation, and affiliation. Specifically, in this work we study language entrainment, which measures how much the therapist and client adapt toward each other's use of language over time. Despite the growing body of work in this area, however, relatively few studies examine causal relationships between human behavior and these relationship metrics: does an individual's perception of their partner affect how they speak, or does how they speak affect their perception? We explore these questions in this work through the use of structural equation modeling (SEM) techniques, which allow for both multilevel and temporal modeling of the relationship between the quality of the therapist-client working alliance and the participants' language entrainment. In our first experiment, we demonstrate that these techniques perform well in comparison to other common machine learning models, with the added benefits of interpretability and causal analysis. In our second analysis, we interpret the learned models to examine the relationship between working alliance and language entrainment and address our exploratory research questions. The results reveal that a therapist's language entrainment can have a significant impact on the client's perception of the working alliance, and that the client's language entrainment is a strong indicator of their perception of the working alliance. We discuss the implications of these results and consider several directions for future work in multimodality.
8
Sheth SA, Bijanki KR, Metzger B, Allawala A, Pirtle V, Adkinson JA, Myers J, Mathura RK, Oswalt D, Tsolaki E, Xiao J, Noecker A, Strutt AM, Cohn JF, McIntyre CC, Mathew SJ, Borton D, Goodman W, Pouratian N. Deep Brain Stimulation for Depression Informed by Intracranial Recordings. Biol Psychiatry 2022; 92:246-251. [PMID: 35063186] [PMCID: PMC9124238] [DOI: 10.1016/j.biopsych.2021.11.007]
Abstract
The success of deep brain stimulation (DBS) for treating Parkinson's disease has led to its application to several other disorders, including treatment-resistant depression. Results with DBS for treatment-resistant depression have been heterogeneous, with inconsistencies largely driven by incomplete understanding of the brain networks regulating mood, especially on an individual basis. We report results from the first subject treated with DBS for treatment-resistant depression using an approach that incorporates intracranial recordings to personalize understanding of network behavior and its response to stimulation. These recordings enabled calculation of individually optimized DBS stimulation parameters using a novel inverse solution approach. In the ensuing double-blind, randomized phase incorporating these bespoke parameter sets, DBS led to remission of symptoms and dramatic improvement in quality of life. Results from this initial case demonstrate the feasibility of this personalized platform, which may be used to improve surgical neuromodulation for a vast array of neurologic and psychiatric disorders.
Affiliation(s)
- Sameer A. Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX 77030, USA. Corresponding author: Sameer A. Sheth, MD, PhD, 7200 Cambridge Street, Suite 9B, Houston, TX 77030; 310-922-2596.
- Kelly R. Bijanki
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Brian Metzger
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Anusha Allawala
- Department of Engineering, Brown University, Providence, RI, 02912 USA
- Victoria Pirtle
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Josh A. Adkinson
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- John Myers
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Raissa K. Mathura
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Denise Oswalt
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Evangelia Tsolaki
- Department of Neurosurgery, University of California, Los Angeles, Los Angeles, CA, 90095 USA
- Jiayang Xiao
- Department of Neurosurgery, Baylor College of Medicine, Houston TX, 77030 USA
- Angela Noecker
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106 USA
- Adriana M. Strutt
- Department of Neurology, Baylor College of Medicine, Houston TX, 77030 USA
- Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, 19104 USA
- Cameron C. McIntyre
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, 44106 USA
- Sanjay J. Mathew
- Department of Psychiatry, Baylor College of Medicine, Houston TX, 77030 USA
- David Borton
- Department of Engineering, Brown University, Providence, RI, 02912 USA
- Wayne Goodman
- Department of Psychiatry, Baylor College of Medicine, Houston TX, 77030 USA
- Nader Pouratian
- Department of Neurosurgery, University of California, Los Angeles, Los Angeles, CA, 90095 USA
9
Provenza NR, Sheth SA, Dastin-van Rijn EM, Mathura RK, Ding Y, Vogt GS, Avendano-Ortega M, Ramakrishnan N, Peled N, Gelin LFF, Xing D, Jeni LA, Ertugrul IO, Barrios-Anderson A, Matteson E, Wiese AD, Xu J, Viswanathan A, Harrison MT, Bijanki KR, Storch EA, Cohn JF, Goodman WK, Borton DA. Long-term ecological assessment of intracranial electrophysiology synchronized to behavioral markers in obsessive-compulsive disorder. Nat Med 2021; 27:2154-2164. [PMID: 34887577] [PMCID: PMC8800455] [DOI: 10.1038/s41591-021-01550-z]
Abstract
Detection of neural signatures related to pathological behavioral states could enable adaptive deep brain stimulation (DBS), a potential strategy for improving efficacy of DBS for neurological and psychiatric disorders. This approach requires identifying neural biomarkers of relevant behavioral states, a task best performed in ecologically valid environments. Here, in human participants with obsessive-compulsive disorder (OCD) implanted with recording-capable DBS devices, we synchronized chronic ventral striatum local field potentials with relevant, disease-specific behaviors. We captured over 1,000 h of local field potentials in the clinic and at home during unstructured activity, as well as during DBS and exposure therapy. The wide range of symptom severity over which the data were captured allowed us to identify candidate neural biomarkers of OCD symptom intensity. This work demonstrates the feasibility and utility of capturing chronic intracranial electrophysiology during daily symptom fluctuations to enable neural biomarker identification, a prerequisite for future development of adaptive DBS for OCD and other psychiatric disorders.
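As a conceptual illustration of the biomarker-identification step, the sketch below computes a band-power feature from synthetic "LFP" windows and correlates it with a symptom score. The sampling rate, frequency band, and correlation analysis are assumptions made for the example, not the study's methods.

```python
# Illustrative sketch (synthetic signal): band power per window vs. a
# concurrently rated symptom-severity score.
import numpy as np
from scipy import signal, stats

FS = 250                                        # sampling rate (Hz), placeholder
rng = np.random.default_rng(2)
n_windows = 120
symptom = rng.uniform(0, 10, size=n_windows)    # placeholder severity ratings

band_power = []
for s in symptom:
    lfp = rng.normal(size=FS * 10)              # 10-s window of simulated LFP
    lfp += 0.1 * s * np.sin(2 * np.pi * 9 * np.arange(FS * 10) / FS)
    freqs, psd = signal.welch(lfp, fs=FS, nperseg=FS * 2)
    band_power.append(psd[(freqs >= 8) & (freqs <= 12)].mean())

r, p = stats.pearsonr(band_power, symptom)
print(f"band power vs. symptom severity: r={r:.2f}, p={p:.3g}")
```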
Affiliation(s)
- Nicole R Provenza
- Brown University School of Engineering, Providence, RI, USA
- Charles Stark Draper Laboratory, Cambridge, MA, USA
- Sameer A Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Raissa K Mathura
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Yaohan Ding
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
- Gregory S Vogt
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Michelle Avendano-Ortega
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Nithya Ramakrishnan
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Noam Peled
- MGH/HST Martinos Center for Biomedical Imaging, Charlestown, MA, USA
- Harvard Medical School, Cambridge, MA, USA
- David Xing
- Brown University School of Engineering, Providence, RI, USA
- Laszlo A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Itir Onal Ertugrul
- Department of Cognitive Science and Artificial Intelligence, Tilburg University, Tilburg, the Netherlands
- Evan Matteson
- Brown University School of Engineering, Providence, RI, USA
- Andrew D Wiese
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Department of Psychology, University of Missouri-Kansas City, Kansas City, MO, USA
- Junqian Xu
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Department of Radiology, Baylor College of Medicine, Houston, TX, USA
- Ashwin Viswanathan
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Kelly R Bijanki
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Eric A Storch
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- Wayne K Goodman
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
- David A Borton
- Brown University School of Engineering, Providence, RI, USA.
- Carney Institute for Brain Science, Brown University, Providence, RI, USA.
- Center for Neurorestoration and Neurotechnology, Rehabilitation R&D Service, Department of Veterans Affairs, Providence, RI, USA.
10
Wörtwein T, Sheeber LB, Allen N, Cohn JF, Morency LP. Human-Guided Modality Informativeness for Affective States. Proc ACM Int Conf Multimodal Interact 2021; 2021:728-734. [PMID: 35128550] [PMCID: PMC8812829] [DOI: 10.1145/3462244.3481004]
Abstract
This paper studies the hypothesis that not all modalities are always needed to predict affective states. We explore this hypothesis in the context of recognizing three affective states that have shown a relation to a future onset of depression: positive, aggressive, and dysphoric. In particular, we investigate three important modalities for face-to-face conversations: vision, language, and acoustic modality. We first perform a human study to better understand which subset of modalities people find informative, when recognizing three affective states. As a second contribution, we explore how these human annotations can guide automatic affect recognition systems to be more interpretable while not degrading their predictive performance. Our studies show that humans can reliably annotate modality informativeness. Further, we observe that guided models significantly improve interpretability, i.e., they attend to modalities similarly to how humans rate the modality informativeness, while at the same time showing a slight increase in predictive performance.
Affiliation(s)
- Torsten Wörtwein
- Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Nicholas Allen
- Department of Psychology, University of Oregon, Eugene, OR, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
11
Chen M, Chow SM, Hammal Z, Messinger DS, Cohn JF. A Person- and Time-Varying Vector Autoregressive Model to Capture Interactive Infant-Mother Head Movement Dynamics. Multivariate Behav Res 2021; 56:739-767. [PMID: 32530313] [PMCID: PMC8763288] [DOI: 10.1080/00273171.2020.1762065]
Abstract
Head movement is an important but often overlooked component of emotion and social interaction. Examination of regularity and differences in head movements of infant-mother dyads over time and across dyads can shed light on whether and how mothers and infants alter their dynamics over the course of an interaction to adapt to each other. One way to study these emergent differences in dynamics is to allow parameters that govern the patterns of interactions to change over time, and according to person- and dyad-specific characteristics. Using two estimation approaches to implement variations of a vector-autoregressive model with time-varying coefficients, we investigated the dynamics of automatically tracked head movements in mothers and infants during the Face-to-Face/Still-Face Procedure (SFP) with 24 infant-mother dyads. The first approach requires specification of a confirmatory model for the time-varying parameters as part of a state-space model, whereas the second approach handles the time-varying parameters in a semi-parametric ("mostly" model-free) fashion within a generalized additive modeling framework. Results suggested that infant-mother head movement dynamics varied in time both within and across episodes of the SFP, and varied based on infants' subsequently assessed attachment security. Code for implementing the time-varying vector-autoregressive model using two R packages, dynr and mgcv, is provided.
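The paper provides R code (dynr, mgcv) for the full models. As a rough conceptual stand-in, a rolling-window least-squares VAR(1) fit shows how coefficients can be allowed to differ across segments of an interaction; everything below (data, window length, coupling values) is synthetic and illustrative.

```python
# Illustrative sketch: rolling-window VAR(1) fit to a bivariate head-movement
# series, with cross-coupling that changes halfway through the simulation.
import numpy as np

rng = np.random.default_rng(3)
T = 600
y = np.zeros((T, 2))                          # columns: infant, mother
for t in range(1, T):
    a = 0.15 + 0.25 * (t > T // 2)            # coupling strengthens mid-series
    A = np.array([[0.5, a], [a, 0.5]])
    y[t] = A @ y[t - 1] + rng.normal(scale=0.1, size=2)

def var1_fit(segment: np.ndarray) -> np.ndarray:
    """Least-squares VAR(1): y_t = A @ y_{t-1} + noise."""
    X, Y = segment[:-1], segment[1:]
    return np.linalg.lstsq(X, Y, rcond=None)[0].T

WINDOW = 150
for start in range(0, T - WINDOW + 1, WINDOW):
    A_hat = var1_fit(y[start:start + WINDOW])
    print(f"t={start:3d}-{start + WINDOW:3d}: infant<-mother coefficient = {A_hat[0, 1]:.2f}")
```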
Affiliation(s)
- Zakia Hammal
- The Robotics Institute, Carnegie Mellon University
- Jeffrey F Cohn
- The Robotics Institute, Carnegie Mellon University
- University of Pittsburgh
12
Allawala A, Bijanki KR, Goodman W, Cohn JF, Viswanathan A, Yoshor D, Borton DA, Pouratian N, Sheth SA. In Reply: A Novel Framework for Network-Targeted Neuropsychiatric Deep Brain Stimulation. Neurosurgery 2021; 89:E283. [PMID: 34383050] [DOI: 10.1093/neuros/nyab308]
Affiliation(s)
- Anusha Allawala
- School of Engineering Brown University Providence, Rhode Island, USA
- Kelly R Bijanki
- Department of Neurosurgery Baylor College of Medicine Houston, Texas, USA
- Wayne Goodman
- Menninger Department of Psychiatry and Behavioral Sciences Baylor College of Medicine Houston, Texas, USA
- Jeffrey F Cohn
- Department of Psychology University of Pittsburgh Pittsburgh, Pennsylvania, USA
- Ashwin Viswanathan
- Department of Neurosurgery Baylor College of Medicine Houston, Texas, USA
- Daniel Yoshor
- Department of Neurosurgery University of Pennsylvania Philadelphia, Pennsylvania, USA
- David A Borton
- School of Engineering Brown University Providence, Rhode Island, USA
- Nader Pouratian
- Department of Neurological Surgery University of Texas, Southwestern Dallas, Texas, USA
- Sameer A Sheth
- Department of Neurosurgery Baylor College of Medicine Houston, Texas, USA
13
Niinuma K, Onal Ertugrul I, Cohn JF, Jeni LA. Systematic Evaluation of Design Choices for Deep Facial Action Coding Across Pose. Front Comput Sci 2021. [DOI: 10.3389/fcomp.2021.636094]
Abstract
The performance of automated facial expression coding is improving steadily. Advances in deep learning techniques have been key to this success. While the advantage of modern deep learning techniques is clear, the contribution of critical design choices remains largely unknown, especially for facial action unit occurrence and intensity across pose. Using the Facial Expression Recognition and Analysis 2017 (FERA 2017) database, which provides a common protocol to evaluate robustness to pose variation, we systematically evaluated design choices in pre-training, feature alignment, model size selection, and optimizer details. Informed by the findings, we developed an architecture that exceeds state-of-the-art on FERA 2017. The architecture achieved a 3.5% increase in F1 score for occurrence detection and a 5.8% increase in Intraclass Correlation (ICC) for intensity estimation. To evaluate the generalizability of the architecture to unseen poses and new dataset domains, we performed experiments across pose in FERA 2017 and across domains in the Denver Intensity of Spontaneous Facial Action (DISFA) database and the UNBC Pain Archive.
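For reference, the two evaluation metrics reported above can be computed as in the sketch below on synthetic predictions; the consistency-type ICC shown is an assumption about the exact formulation, not a statement of the paper's implementation.

```python
# Illustrative sketch: F1 for AU occurrence and a consistency-type ICC for
# AU intensity, both on synthetic labels/predictions.
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(4)
true_occ = rng.integers(0, 2, size=500)
pred_occ = np.where(rng.random(500) < 0.85, true_occ, 1 - true_occ)
print(f"occurrence F1: {f1_score(true_occ, pred_occ):.3f}")

def icc_consistency(x: np.ndarray, y: np.ndarray) -> float:
    """ICC(3,1)-style consistency between two sets of ratings."""
    data = np.stack([x, y], axis=1).astype(float)
    n, k = data.shape
    row_means = data.mean(axis=1, keepdims=True)
    col_means = data.mean(axis=0, keepdims=True)
    grand = data.mean()
    ms_rows = k * ((row_means - grand) ** 2).sum() / (n - 1)
    ms_err = ((data - row_means - col_means + grand) ** 2).sum() / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

true_int = rng.uniform(0, 5, size=500)
pred_int = true_int + rng.normal(scale=0.7, size=500)
print(f"intensity ICC: {icc_consistency(true_int, pred_int):.3f}")
```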
14
Allawala A, Bijanki KR, Goodman W, Cohn JF, Viswanathan A, Yoshor D, Borton DA, Pouratian N, Sheth SA. A Novel Framework for Network-Targeted Neuropsychiatric Deep Brain Stimulation. Neurosurgery 2021; 89:E116-E121. [PMID: 33913499] [PMCID: PMC8279838] [DOI: 10.1093/neuros/nyab112]
Abstract
Deep brain stimulation (DBS) has emerged as a promising therapy for neuropsychiatric illnesses, including depression and obsessive-compulsive disorder, but has shown inconsistent results in prior clinical trials. We propose a shift away from the empirical paradigm for developing new DBS applications, traditionally based on testing brain targets with conventional stimulation paradigms. Instead, we propose a multimodal approach centered on an individualized intracranial investigation adapted from the epilepsy monitoring experience, which integrates comprehensive behavioral assessment, such as the Research Domain Criteria proposed by the National Institute of Mental Health. In this paradigm-shifting approach, we combine readouts obtained from neurophysiology, behavioral assessments, and self-report during broad exploration of stimulation parameters and behavioral tasks to inform the selection of ideal DBS parameters. Such an approach not only provides a foundational understanding of dysfunctional circuits underlying symptom domains in neuropsychiatric conditions but also aims to identify generalizable principles that can ultimately enable individualization and optimization of therapy without intracranial monitoring.
Affiliation(s)
- Anusha Allawala
- School of Engineering, Brown University, Providence, Rhode Island, USA
- Kelly R Bijanki
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas, USA
- Wayne Goodman
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, Texas, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Ashwin Viswanathan
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas, USA
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas, USA; Department of Neurosurgery, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- David A Borton
- School of Engineering, Brown University, Providence, Rhode Island, USA; Carney Institute for Brain Science, Brown University, Providence, Rhode Island, USA; Department of Veterans Affairs, Providence VA Medical Center for Neurorestoration and Neurotechnology, Providence, Rhode Island, USA
- Nader Pouratian
- Department of Neurological Surgery, UT Southwestern Medical Center, Dallas, Texas, USA
- Sameer A Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas, USA
15
Girard JM, Cohn JF, Yin L, Morency LP. Reconsidering the Duchenne Smile: Formalizing and Testing Hypotheses about Eye Constriction and Positive Emotion. Affect Sci 2021; 2:32-47. [PMID: 34337430] [DOI: 10.1007/s42761-020-00030-w]
Abstract
The common view of emotional expressions is that certain configurations of facial-muscle movements reliably reveal certain categories of emotion. The principal exemplar of this view is the Duchenne smile, a configuration of facial-muscle movements (i.e., smiling with eye constriction) that has been argued to reliably reveal genuine positive emotion. In this paper, we formalized a list of hypotheses that have been proposed regarding the Duchenne smile, briefly reviewed the literature weighing on these hypotheses, identified limitations and unanswered questions, and conducted two empirical studies to begin addressing these limitations and answering these questions. Both studies analyzed a database of 751 smiles observed while 136 participants completed experimental tasks designed to elicit amusement, embarrassment, fear, and physical pain. Study 1 focused on participants' self-reported positive emotion and Study 2 focused on how third-party observers would perceive videos of these smiles. Most of the hypotheses that have been proposed about the Duchenne smile were either contradicted by or only weakly supported by our data. Eye constriction did provide some information about experienced positive emotion, but this information was lacking in specificity, already provided by other smile characteristics, and highly dependent on context. Eye constriction provided more information about perceived positive emotion, including some unique information over other smile characteristics, but context was also important here as well. Overall, our results suggest that accurately inferring positive emotion from a smile requires more sophisticated methods than simply looking for the presence/absence (or even the intensity) of eye constriction.
Affiliation(s)
- Lijun Yin
- Binghamton University, Binghamton, NY, USA
16
Niinuma K, Ertugrul IO, Cohn JF, Jeni LA. Synthetic Expressions are Better Than Real for Learning to Detect Facial Actions. IEEE Winter Conf Appl Comput Vis 2021; 2021:1247-1256. [PMID: 38250021] [PMCID: PMC10798354] [DOI: 10.1109/wacv48630.2021.00129]
Abstract
Critical obstacles in training classifiers to detect facial actions are the limited sizes of annotated video databases and the relatively low frequencies of occurrence of many actions. To address these problems, we propose an approach that makes use of facial expression generation. Our approach reconstructs the 3D shape of the face from each video frame, aligns the 3D mesh to a canonical view, and then trains a GAN-based network to synthesize novel images with facial action units of interest. To evaluate this approach, a deep neural network was trained on two separate datasets: One network was trained on video of synthesized facial expressions generated from FERA17; the other network was trained on unaltered video from the same database. Both networks used the same train and validation partitions and were tested on the test partition of actual video from FERA17. The network trained on synthesized facial expressions outperformed the one trained on actual facial expressions and surpassed current state-of-the-art approaches.
17
Ertugrul IO, Cohn JF, Jeni LA, Zhang Z, Yin L, Ji Q. Crossing Domains for AU Coding: Perspectives, Approaches, and Measures. IEEE Trans Biom Behav Identity Sci 2020; 2:158-171. [PMID: 32377637] [PMCID: PMC7202467] [DOI: 10.1109/tbiom.2020.2977225]
Abstract
Facial action unit (AU) detectors have performed well when trained and tested within the same domain. How well do AU detectors transfer to domains in which they have not been trained? We review the literature on cross-domain transfer and conduct experiments to address limitations of prior research. We evaluate generalizability in four publicly available databases: EB+ (an expanded version of BP4D+), Sayette GFT, DISFA, and UNBC Shoulder Pain (SP). The databases differ in observational scenarios, context, participant diversity, range of head pose, video resolution, and AU base rates. In most cases performance decreased with change in domain, often to below the threshold needed for behavioral research. However, exceptions were noted. Deep and shallow approaches generally performed similarly, and average results were slightly better for the deep model than for the shallow one. Occlusion sensitivity maps revealed that local specificity was greater for AU detection within than across domains. The findings suggest that more varied domains and deep learning approaches may be better suited for generalizability and suggest the need for more attention to characteristics that vary between domains. Until further improvement is realized, caution is warranted when applying AU classifiers from one domain to another.
Affiliation(s)
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Zheng Zhang
- Department of Computer Science, State University of New York at Binghamton, USA
- Lijun Yin
- Department of Computer Science, State University of New York at Binghamton, USA
- Qiang Ji
- Rensselaer Polytechnic Institute, Troy, NY, USA
18
Affiliation(s)
- Wayne K Goodman, Eric A Storch, Jeffrey F Cohn, Sameer A Sheth
- From the Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston (Goodman, Storch); the Department of Psychology, University of Pittsburgh (Cohn); and the Department of Neurosurgery, Baylor College of Medicine, Houston (Sheth)
19
Girard JM, Shandar G, Liu Z, Cohn JF, Yin L, Morency LP. Reconsidering the Duchenne Smile: Indicator of Positive Emotion or Artifact of Smile Intensity? Int Conf Affect Comput Intell Interact Workshops 2019; 2019:594-599. [PMID: 32363090] [DOI: 10.1109/acii.2019.8925535]
Abstract
The Duchenne smile hypothesis is that smiles that include eye constriction (AU6) are the product of genuine positive emotion, whereas smiles that do not are either falsified or related to negative emotion. This hypothesis has become very influential and is often used in scientific and applied settings to justify the inference that a smile is either true or false. However, empirical support for this hypothesis has been equivocal and some researchers have proposed that, rather than being a reliable indicator of positive emotion, AU6 may just be an artifact produced by intense smiles. Initial support for this proposal has been found when comparing smiles related to genuine and feigned positive emotion; however, it has not yet been examined when comparing smiles related to genuine positive and negative emotion. The current study addressed this gap in the literature by examining spontaneous smiles from 136 participants during the elicitation of amusement, embarrassment, fear, and pain (from the BP4D+ dataset). Bayesian multilevel regression models were used to quantify the associations between AU6 and self-reported amusement while controlling for smile intensity. Models were estimated to infer amusement from AU6 and to explain the intensity of AU6 using amusement. In both cases, controlling for smile intensity substantially reduced the hypothesized association, whereas the effect of smile intensity itself was quite large and reliable. These results provide further evidence that the Duchenne smile is likely an artifact of smile intensity rather than a reliable and unique indicator of genuine positive emotion.
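The modeling idea, regressing self-reported amusement on eye constriction while controlling for smile intensity with participant-level random effects, can be sketched as follows. The paper fits Bayesian multilevel models; this frequentist mixed model on simulated data is only a conceptual stand-in, and the variable names are placeholders.

```python
# Illustrative sketch: mixed-effects regression of amusement on AU6 intensity,
# controlling for AU12 (smile) intensity, with a random intercept per person.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n_participants, smiles_per = 50, 10
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_participants), smiles_per),
    "au12_intensity": rng.uniform(0, 5, n_participants * smiles_per),
})
# Simulate AU6 as largely a byproduct of smile intensity, plus noise.
df["au6_intensity"] = 0.8 * df["au12_intensity"] + rng.normal(0, 0.5, len(df))
df["amusement"] = (1.0 * df["au12_intensity"] + 0.1 * df["au6_intensity"]
                   + rng.normal(0, 1.0, len(df)))

model = smf.mixedlm("amusement ~ au6_intensity + au12_intensity",
                    data=df, groups=df["participant"]).fit()
print(model.summary())
```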
Affiliation(s)
- Jeffrey M Girard
- Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA
- Gayatri Shandar
- Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA
- Zhun Liu
- Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA
- Lijun Yin
- Department of Computer Science, Binghamton University, Binghamton, NY
20
Bhatia S, Goecke R, Hammal Z, Cohn JF. Automated Measurement of Head Movement Synchrony during Dyadic Depression Severity Interviews. Proc Int Conf Autom Face Gesture Recognit 2019. [PMID: 31745390] [DOI: 10.1109/fg.2019.8756509]
Abstract
With few exceptions, most research in automated assessment of depression has considered only the patient's behavior to the exclusion of the therapist's behavior. We investigated the interpersonal coordination (synchrony) of head movement during patient-therapist clinical interviews. Our sample consisted of patients diagnosed with major depressive disorder. They were recorded in clinical interviews (Hamilton Rating Scale for Depression, HRSD) at 7-week intervals over a period of 21 weeks. For each session, patient and therapist 3D head movement was tracked from 2D videos. Head angles in the horizontal (pitch) and vertical (yaw) axes were used to measure head movement. Interpersonal coordination of head movement between patients and therapists was measured using windowed cross-correlation. Patterns of coordination in head movement were investigated using the peak picking algorithm. Changes in head movement coordination over the course of treatment were measured using a hierarchical linear model (HLM). The results indicated a strong effect for patient-therapist head movement synchrony. Within-dyad variability in head movement coordination was found to be higher than between-dyad variability, meaning that differences over time in a dyad were higher as compared to the differences between dyads. Head movement synchrony did not change over the course of treatment with change in depression severity. To the best of our knowledge, this study is the first attempt to analyze the mutual influence of patient-therapist head movement in relation to depression severity.
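Windowed cross-correlation itself is straightforward to sketch. The code below illustrates the general idea on synthetic head-angle series; window length, step, and maximum lag are placeholders rather than the parameters used in the study.

```python
# Illustrative sketch: peak cross-correlation (and its lag) between two head
# movement series within sliding windows. All data are synthetic.
import numpy as np

rng = np.random.default_rng(6)
T, TRUE_LAG = 1000, 15
therapist = np.cumsum(rng.normal(size=T))
patient = np.roll(therapist, TRUE_LAG) + rng.normal(scale=2.0, size=T)

def windowed_xcorr(x, y, window=200, step=100, max_lag=30):
    """Return (window start, best lag, peak correlation) per window."""
    results = []
    for start in range(0, len(x) - window + 1, step):
        xs, ys = x[start:start + window], y[start:start + window]
        best_r, best_lag = -2.0, 0
        for lag in range(-max_lag, max_lag + 1):
            xa = xs[max(0, -lag):window - max(0, lag)]
            ya = ys[max(0, lag):window - max(0, -lag)]
            r = np.corrcoef(xa, ya)[0, 1]
            if r > best_r:
                best_r, best_lag = r, lag
        results.append((start, best_lag, best_r))
    return results

for start, lag, r in windowed_xcorr(patient, therapist):
    print(f"window at t={start:4d}: peak r={r:.2f} at lag {lag:+d}")
```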
Affiliation(s)
- Shalini Bhatia
- Human-Centred Technology Research Centre, University of Canberra, Canberra, Australia
- Roland Goecke
- Human-Centred Technology Research Centre, University of Canberra, Canberra, Australia
- Zakia Hammal
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
21
Abstract
Facial action units (AUs) relate to specific local facial regions. Recent efforts in automated AU detection have focused on learning facial patch representations to detect specific AUs. These efforts have encountered three hurdles. First, they implicitly assume that facial patches are robust to head rotation; yet non-frontal rotation is common. Second, mappings between AUs and patches are defined a priori, which ignores co-occurrences among AUs. And third, the dynamics of AUs are either ignored or modeled sequentially rather than simultaneously, as in human perception. Inspired by recent advances in human perception, we propose a dynamic patch-attentive deep network, called D-PAttNet, for AU detection that (i) controls for 3D head and face rotation, (ii) learns mappings of patches to AUs, and (iii) models spatiotemporal dynamics. The D-PAttNet approach significantly improves upon the existing state of the art.
Affiliation(s)
- Itir Onal Ertugrul
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
- Le Yang
- School of Computer Science, Northwestern Polytechnical University, Xian, China
- László A. Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
- Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, United States
22
Niinuma K, Jeni LA, Ertugrul IO, Cohn JF. Unmasking the Devil in the Details: What Works for Deep Facial Action Coding? BMVC 2019; 2019:4. [PMID: 32510058] [PMCID: PMC7274256]
Abstract
The performance of automated facial expression coding has improved steadily, as evidenced by results of the latest Facial Expression Recognition and Analysis (FERA 2017) Challenge. Advances in deep learning techniques have been key to this success. Yet the contribution of critical design choices remains largely unknown. Using the FERA 2017 database, we systematically evaluated design choices in pre-training, feature alignment, model size selection, and optimizer details. Our findings vary from the counter-intuitive (e.g., generic pre-training outperformed face-specific models) to best practices in tuning optimizers. Informed by what we found, we developed an architecture that exceeded state-of-the-art on FERA 2017. We achieved a 3.5% increase in F1 score for occurrence detection and a 5.8% increase in ICC for intensity estimation.
Collapse
Affiliation(s)
| | - Laszlo A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | | | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| |
Collapse
|
23
|
Ertugrul IO, Jeni LA, Ding W, Cohn JF. AFAR: A Deep Learning Based Tool for Automated Facial Affect Recognition. Proc Int Conf Autom Face Gesture Recognit 2019; 2019. [PMID: 31762712 DOI: 10.1109/fg.2019.8756623] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Affiliation(s)
| | - László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Wanqiao Ding
- Department of Psychology, University of Pittsburgh, PA, USA
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, PA, USA
| |
Collapse
|
24
|
Ertugrul IO, Cohn JF, Jeni LA, Zhang Z, Yin L, Ji Q. Cross-domain AU Detection: Domains, Learning Approaches, and Measures. Proc Int Conf Autom Face Gesture Recognit 2019; 2019. [PMID: 31749665 DOI: 10.1109/fg.2019.8756543] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Facial action unit (AU) detectors have performed well when trained and tested within the same domain. Do AU detectors transfer to new domains in which they have not been trained? To answer this question, we review literature on cross-domain transfer and conduct experiments to address limitations of prior research. We evaluate both deep and shallow approaches to AU detection (CNN and SVM, respectively) in two large, well-annotated, publicly available databases, Expanded BP4D+ and GFT. The databases differ in observational scenarios, participant characteristics, range of head pose, video resolution, and AU base rates. For both approaches and databases, performance decreased with change in domain, often to below the threshold needed for behavioral research. Decreases were not uniform, however. They were more pronounced for GFT than for Expanded BP4D+ and for shallow relative to deep learning. These findings suggest that more varied domains and deep learning approaches may be better suited for promoting generalizability. Until further improvement is realized, caution is warranted when applying AU classifiers from one domain to another.
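The within- versus cross-domain comparison can be organized as in the sketch below, shown here for the shallow (linear SVM) case on precomputed features. The feature matrices and labels are synthetic placeholders, not BP4D+ or GFT data, so the printed scores are meaningless; only the train-on-one-domain, test-on-the-other structure is the point.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score

def evaluate(train_X, train_y, test_X, test_y):
    """Train a linear SVM AU detector on one domain and score it on another."""
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
    clf.fit(train_X, train_y)
    return f1_score(test_y, clf.predict(test_X))

# Hypothetical precomputed appearance features and binary labels for one AU
rng = np.random.default_rng(0)
domain_a_X, domain_a_y = rng.normal(size=(500, 64)), rng.integers(0, 2, 500)
domain_b_X, domain_b_y = rng.normal(size=(500, 64)) + 0.5, rng.integers(0, 2, 500)

within = evaluate(domain_a_X[:400], domain_a_y[:400], domain_a_X[400:], domain_a_y[400:])
cross = evaluate(domain_a_X, domain_a_y, domain_b_X, domain_b_y)
print(f"within-domain F1 = {within:.2f}, cross-domain F1 = {cross:.2f}")
```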
Collapse
Affiliation(s)
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | - László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Zheng Zhang
- Department of Computer Science, State University of New York at Binghamton, USA
| | - Lijun Yin
- Department of Computer Science, State University of New York at Binghamton, USA
| | - Qiang Ji
- Rensselaer Polytechnic Institute, Troy, NY, USA
| |
Collapse
|
25
|
Provenza NR, Matteson ER, Allawala AB, Barrios-Anderson A, Sheth SA, Viswanathan A, McIngvale E, Storch EA, Frank MJ, McLaughlin NCR, Cohn JF, Goodman WK, Borton DA. The Case for Adaptive Neuromodulation to Treat Severe Intractable Mental Disorders. Front Neurosci 2019; 13:152. [PMID: 30890909 PMCID: PMC6412779 DOI: 10.3389/fnins.2019.00152] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2018] [Accepted: 02/11/2019] [Indexed: 12/20/2022] Open
Abstract
Mental disorders are a leading cause of disability worldwide, and available treatments have limited efficacy for severe cases unresponsive to conventional therapies. Neurosurgical interventions, such as lesioning procedures, have shown success in treating refractory cases of mental illness, but may have irreversible side effects. Neuromodulation therapies, specifically Deep Brain Stimulation (DBS), may offer similar therapeutic benefits using a reversible (explantable) and adjustable platform. Early DBS trials have been promising; however, pivotal clinical trials have failed to date. These failures may be attributed to targeting, patient selection, or the “open-loop” nature of DBS, where stimulation parameters are chosen ad hoc during infrequent visits to the clinician’s office that take place weeks to months apart. Further, the tonic continuous stimulation fails to address the dynamic nature of mental illness; symptoms often fluctuate over minutes to days. Additionally, stimulation-based interventions can cause undesirable effects if applied when not needed. A responsive, adaptive DBS (aDBS) system may improve efficacy by titrating stimulation parameters in response to neural signatures (i.e., biomarkers) related to symptoms and side effects. Here, we present the rationale for the development of a responsive DBS system for treatment of refractory mental illness, detail a strategic approach for identification of electrophysiological and behavioral biomarkers of mental illness, and discuss opportunities for future technological developments that may harness aDBS to deliver improved therapy.
Collapse
Affiliation(s)
- Nicole R Provenza
- Brown University School of Engineering, Providence, RI, United States; Charles Stark Draper Laboratory, Cambridge, MA, United States
| | - Evan R Matteson
- Brown University School of Engineering, Providence, RI, United States
| | - Anusha B Allawala
- Brown University School of Engineering, Providence, RI, United States
| | - Adriel Barrios-Anderson
- Psychiatric Neurosurgery Program at Butler Hospital, The Warren Alpert Medical School of Brown University, Providence, RI, United States
| | - Sameer A Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
| | - Ashwin Viswanathan
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
| | - Elizabeth McIngvale
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, United States
| | - Eric A Storch
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, United States
| | - Michael J Frank
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, United States; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, United States
| | - Nicole C R McLaughlin
- Psychiatric Neurosurgery Program at Butler Hospital, The Warren Alpert Medical School of Brown University, Providence, RI, United States
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, United States
| | - Wayne K Goodman
- Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, United States
| | - David A Borton
- Brown University School of Engineering, Providence, RI, United States; Carney Institute for Brain Science, Brown University, Providence, RI, United States; Department of Veterans Affairs, Providence Medical Center, Center for Neurorestoration and Neurotechnology, Providence, RI, United States
| |
Collapse
|
26
|
Abstract
Facial action units (AUs) may be represented spatially, temporally, and in terms of their correlation. Previous research focuses on one or another of these aspects or addresses them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations, and a Long Short-Term Memory (LSTM) to model temporal dependencies among them. The outputs of CNNs and LSTMs are aggregated into a fusion network to produce per-frame prediction of multiple AUs. The hybrid network was compared to previous state-of-the-art approaches in two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and obtained increased accuracy for AU detection. To address class imbalance within and between batches when training the network, we introduce multi-labeling sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualization of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs.
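A minimal PyTorch sketch of the spatial-plus-temporal fusion idea is shown below: a toy CNN encodes each frame, an LSTM models temporal dependencies over the encoded sequence, and a linear fusion head produces per-frame multi-label AU probabilities. Layer sizes, input resolution, and the number of AUs are illustrative assumptions, not the architecture evaluated above.

```python
import torch
import torch.nn as nn

class HybridAUNet(nn.Module):
    """Toy CNN + LSTM hybrid for per-frame multi-label AU prediction."""
    def __init__(self, n_aus=12, feat_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame spatial encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, feat_dim), nn.ReLU(),
        )
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)  # temporal stream
        self.fusion = nn.Linear(feat_dim * 2, n_aus)    # fuse spatial + temporal streams

    def forward(self, frames):                          # frames: (B, T, 1, H, W)
        b, t = frames.shape[:2]
        spatial = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        temporal, _ = self.lstm(spatial)
        logits = self.fusion(torch.cat([spatial, temporal], dim=-1))
        return torch.sigmoid(logits)                    # per-frame AU probabilities

model = HybridAUNet()
clip = torch.randn(2, 8, 1, 64, 64)                     # 2 clips of 8 grayscale frames
print(model(clip).shape)                                # torch.Size([2, 8, 12])
```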
Collapse
Affiliation(s)
- Wen-Sheng Chu
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
| | | | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
| |
Collapse
|
27
|
Cohn JF, Okun MS, Jeni LA, Ertugrul IO, Borton D, Malone D, Goodman WK. Automated Affect Detection in Deep Brain Stimulation for Obsessive-Compulsive Disorder: A Pilot Study. Proc ACM Int Conf Multimodal Interact 2018; 2018:40-44. [PMID: 30511050 PMCID: PMC6271416 DOI: 10.1145/3242969.3243023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
Automated measurement of affective behavior in psychopathology has been limited primarily to screening and diagnosis. While useful, clinicians more often are concerned with whether patients are improving in response to treatment. Are symptoms abating, is affect becoming more positive, are unanticipated side effects emerging? When treatment includes neural implants, the need for objective, repeatable biometrics tied to neurophysiology becomes especially pressing. We used automated face analysis to assess treatment response to deep brain stimulation (DBS) in two patients with intractable obsessive-compulsive disorder (OCD). One was assessed intraoperatively following implantation and activation of the DBS device. The other was assessed three months post-implantation. Both were assessed during DBS on and off conditions. Positive and negative valence were quantified using a CNN trained on normative data of 160 non-OCD participants. Thus, a secondary goal was domain transfer of the classifiers. In both contexts, DBS-on resulted in marked positive affect. In response to DBS-off, affect flattened in both contexts and alternated with increased negative affect in the outpatient setting. Mean AUC for domain transfer was 0.87. These findings suggest that parametric variation of DBS is strongly related to affective behavior and may introduce vulnerability for negative affect in the event that DBS is discontinued.
Collapse
|
28
|
Abstract
Most approaches to face alignment treat the face as a 2D object, which fails to represent depth variation and is vulnerable to loss of shape consistency when the face rotates along a 3D axis. Because faces commonly rotate three dimensionally, 2D approaches are vulnerable to significant error. 3D morphable models, employed as a second step in 2D+3D approaches, are robust to face rotation but are computationally too expensive for many applications, and their ability to maintain viewpoint consistency is unknown. We present an alternative approach that estimates 3D face landmarks in a single face image. The method uses a regression forest-based algorithm that adds a third dimension to the common cascade pipeline. 3D face landmarks are estimated directly, which avoids fitting a 3D morphable model. The proposed method achieves viewpoint consistency in a computationally efficient manner that is robust to 3D face rotation. To train and test our approach, we introduce the Multi-PIE Viewpoint Consistent database. In empirical tests, the proposed method achieved simple yet effective head pose estimation and viewpoint consistency on multiple measures relative to alternative approaches.
Collapse
|
29
|
Ertugrul IO, Jeni LA, Cohn JF. FACSCaps: Pose-Independent Facial Action Coding with Capsules. Conf Comput Vis Pattern Recognit Workshops 2018; 2018:2211-2220. [PMID: 30944768 PMCID: PMC6443417 DOI: 10.1109/cvprw.2018.00287] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Most automated facial expression analysis methods treat the face as a 2D object, flat like a sheet of paper. That works well provided images are frontal or nearly so. In real-world conditions, moderate to large head rotation is common and expression recognition performance degrades. Multi-view Convolutional Neural Networks (CNNs) have been proposed to increase robustness to pose, but they require greater model sizes and may generalize poorly across views that are not included in the training set. We propose the FACSCaps architecture to handle multi-view and multi-label facial action unit (AU) detection within a single model that can generalize to novel views. Additionally, FACSCaps's ability to synthesize faces enables insights into what is learned by the model. FACSCaps models video frames using matrix capsules, where hierarchical pose relationships between face parts are built into internal representations. The model is trained by jointly optimizing a multi-label loss and the reconstruction accuracy. FACSCaps was evaluated using the FERA 2017 facial expression dataset that includes spontaneous facial expressions in a wide range of head orientations. FACSCaps outperformed both state-of-the-art CNNs and their temporal extensions.
Collapse
Affiliation(s)
| | - László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| |
Collapse
|
30
|
Hammal Z, Cohn JF, Wallace ER, Heike CL, Birgfeld CB, Oster H, Speltz ML. Facial Expressiveness in Infants With and Without Craniofacial Microsomia: Preliminary Findings. Cleft Palate Craniofac J 2018; 55:711-720. [PMID: 29377723 PMCID: PMC5936082 DOI: 10.1177/1055665617753481] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVE To compare facial expressiveness (FE) of infants with and without craniofacial microsomia (cases and controls, respectively) and to compare phenotypic variation among cases in relation to FE. DESIGN Positive and negative affect was elicited in response to standardized emotion inductions, video recorded, and manually coded from video using the Facial Action Coding System for Infants and Young Children. SETTING Five craniofacial centers: Children's Hospital of Los Angeles, Children's Hospital of Philadelphia, Seattle Children's Hospital, University of Illinois-Chicago, and University of North Carolina-Chapel Hill. PARTICIPANTS Eighty ethnically diverse 12- to 14-month-old infants. MAIN OUTCOME MEASURES FE was measured on a frame-by-frame basis as the sum of 9 observed facial action units (AUs) representative of positive and negative affect. RESULTS FE differed between conditions intended to elicit positive and negative affect (95% confidence interval = 0.09-0.66, P = .01). FE failed to differ between cases and controls (ES = -0.16 to -0.02, P = .47 to .92). Among cases, those with and without mandibular hypoplasia showed similar levels of FE (ES = -0.38 to 0.54, P = .10 to .66). CONCLUSIONS FE varied between positive and negative affect, and cases and controls responded similarly. Null findings for case/control differences may be attributable to a lower than anticipated prevalence of nerve palsy among cases, the selection of AUs, or the use of manual coding. In future research, we will reexamine group differences using an automated, computer vision approach that can cover a broader range of facial movements and their dynamics.
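Under the measure described above, frame-level facial expressiveness reduces to a row sum over coded AUs. The snippet below assumes a frames-by-AUs binary matrix of manual codes and uses random values purely as a stand-in.

```python
import numpy as np

# Hypothetical frame-by-frame binary codes for 9 affect-related AUs
# (rows = frames, columns = AUs); 1 = AU present in that frame.
rng = np.random.default_rng(1)
au_codes = rng.integers(0, 2, size=(300, 9))

fe_per_frame = au_codes.sum(axis=1)            # facial expressiveness per frame
print(fe_per_frame[:10], fe_per_frame.mean().round(2))   # per-frame FE and condition mean
```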
Collapse
Affiliation(s)
- Zakia Hammal
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Jeffrey F. Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | | | - Carrie L. Heike
- Seattle Children’s Research Institute, Seattle, WA, USA
- Seattle Children’s Hospital, Seattle, WA, USA
- University of Washington School of Medicine, Seattle, WA, USA
| | - Craig B. Birgfeld
- Seattle Children’s Research Institute, Seattle, WA, USA
- Seattle Children’s Hospital, Seattle, WA, USA
- University of Washington School of Medicine, Seattle, WA, USA
| | - Harriet Oster
- NYU School of Professional Studies, New York, NY, USA
| | - Matthew L. Speltz
- Seattle Children’s Research Institute, Seattle, WA, USA
- University of Washington School of Medicine, Seattle, WA, USA
| |
Collapse
|
31
|
Abstract
Depression is one of the most common psychiatric disorders worldwide, with over 350 million people affected. Current methods to screen for and assess depression depend almost entirely on clinical interviews and self-report scales. While useful, such measures lack objective, systematic, and efficient ways of incorporating behavioral observations that are strong indicators of depression presence and severity. Using dynamics of facial and head movement and vocalization, we trained classifiers to detect three levels of depression severity. Participants were a community sample diagnosed with major depressive disorder. They were recorded in clinical interviews (Hamilton Rating Scale for Depression, HRSD) at seven-week intervals over a period of 21 weeks. At each interview, they were scored by the HRSD as moderately to severely depressed, mildly depressed, or remitted. Logistic regression classifiers using leave-one-participant-out validation were compared for facial movement, head movement, and vocal prosody individually and in combination. Accuracy of depression severity measurement from facial movement dynamics was higher than that for head movement dynamics, and each was substantially higher than that for vocal prosody. Accuracy using all three modalities combined only marginally exceeded that of face and head combined. These findings suggest that automatic detection of depression severity from behavioral indicators in patients is feasible and that multimodal measures afford the most powerful detection.
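The leave-one-participant-out classification scheme can be sketched as below with scikit-learn, grouping sessions by participant so that no participant appears in both training and test folds. The features, labels, and session counts are synthetic placeholders, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
from sklearn.metrics import accuracy_score

# Hypothetical per-session behavioral features and 3-level severity labels;
# `groups` holds participant IDs so all sessions of one participant are held out together.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))
y = rng.integers(0, 3, size=120)          # 0 = remitted, 1 = mild, 2 = moderate/severe
groups = np.repeat(np.arange(40), 3)      # 40 participants x 3 sessions each

clf = LogisticRegression(max_iter=1000)
pred = cross_val_predict(clf, X, y, cv=LeaveOneGroupOut(), groups=groups)
print("leave-one-participant-out accuracy:", accuracy_score(y, pred))
```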
Collapse
|
32
|
Martin KB, Hammal Z, Ren G, Cohn JF, Cassell J, Ogihara M, Britton JC, Gutierrez A, Messinger DS. Objective measurement of head movement differences in children with and without autism spectrum disorder. Mol Autism 2018; 9:14. [PMID: 29492241 PMCID: PMC5828311 DOI: 10.1186/s13229-018-0198-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2016] [Accepted: 02/06/2018] [Indexed: 12/03/2022] Open
Abstract
Background Deficits in motor movement in children with autism spectrum disorder (ASD) have typically been characterized qualitatively by human observers. Although clinicians have noted the importance of atypical head positioning (e.g. social peering and repetitive head banging) when diagnosing children with ASD, a quantitative understanding of head movement in ASD is lacking. Here, we conduct a quantitative comparison of head movement dynamics in children with and without ASD using automated, person-independent computer-vision based head tracking (Zface). Because children with ASD often exhibit preferential attention to nonsocial versus social stimuli, we investigated whether children with and without ASD differed in their head movement dynamics depending on stimulus sociality. Methods The current study examined differences in head movement dynamics in children with (n = 21) and without ASD (n = 21). Children were video-recorded while watching a 16-min video of social and nonsocial stimuli. Three dimensions of rigid head movement—pitch (head nods), yaw (head turns), and roll (lateral head inclinations)—were tracked using Zface. The root mean square of pitch, yaw, and roll was calculated to index the magnitude of head angular displacement (quantity of head movement) and angular velocity (speed). Results Compared with children without ASD, children with ASD exhibited greater yaw displacement, indicating greater head turning, and greater velocity of yaw and roll, indicating faster head turning and inclination. Follow-up analyses indicated that differences in head movement dynamics were specific to the social rather than the nonsocial stimulus condition. Conclusions Head movement dynamics (displacement and velocity) were greater in children with ASD than in children without ASD, providing a quantitative foundation for previous clinical reports. Head movement differences were evident in lateral (yaw and roll) but not vertical (pitch) movement and were specific to a social rather than nonsocial condition. When presented with social stimuli, children with ASD had higher levels of head movement and moved their heads more quickly than children without ASD. Children with ASD may use head movement to modulate their perception of social scenes. Electronic supplementary material The online version of this article (10.1186/s13229-018-0198-4) contains supplementary material, which is available to authorized users.
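The displacement and velocity summaries described above can be approximated as in the sketch below, which assumes a frames-by-three array of pitch, yaw, and roll angles from a head tracker; centering displacement about the mean pose and a 30 fps frame rate are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def head_movement_dynamics(angles_deg, fps=30.0):
    """Root-mean-square angular displacement and velocity.

    angles_deg: (n_frames, 3) array of pitch, yaw, roll in degrees,
    e.g., as produced by a head tracker such as ZFace.
    """
    centered = angles_deg - angles_deg.mean(axis=0)          # displacement about the mean pose
    rms_displacement = np.sqrt((centered ** 2).mean(axis=0))
    velocity = np.diff(angles_deg, axis=0) * fps              # degrees per second
    rms_velocity = np.sqrt((velocity ** 2).mean(axis=0))
    return rms_displacement, rms_velocity

# Synthetic 16-minute recording at 30 fps
rng = np.random.default_rng(2)
angles = np.cumsum(rng.normal(scale=0.05, size=(16 * 60 * 30, 3)), axis=0)
disp, vel = head_movement_dynamics(angles)
print("RMS displacement (pitch, yaw, roll):", disp.round(2))
print("RMS velocity     (pitch, yaw, roll):", vel.round(2))
```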
Collapse
Affiliation(s)
- Katherine B Martin
- Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, FL 33146 USA
| | - Zakia Hammal
- Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
| | - Gang Ren
- Center for Computational Science, University of Miami, 1320 S Dixie Hwy, Miami, FL 33146 USA
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, 210 S. Bouquet St., Pittsburgh, PA 15260 USA
| | - Justine Cassell
- Human Computer Interaction, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213 USA
| | - Mitsunori Ogihara
- Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, FL 33146 USA
| | - Jennifer C Britton
- Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, FL 33146 USA
| | - Anibal Gutierrez
- Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, FL 33146 USA
| | - Daniel S Messinger
- Department of Psychology, University of Miami, 5665 Ponce de Leon Blvd, Coral Gables, FL 33146 USA
| |
Collapse
|
33
|
Slifer KJ, Pulbrook V, Amari A, Vona-Messersmith N, Cohn JF, Ambadar Z, Beck M, Piszczor R. Social Acceptance and Facial Behavior in Children with Oral Clefts. Cleft Palate Craniofac J 2017; 43:226-36. [PMID: 16526929 DOI: 10.1597/05-018.1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Objective To examine and compare social acceptance, social behavior, and facial movements of children with and without oral clefts in an experimental setting. Design Two groups of children (with and without oral clefts) were videotaped in a structured social interaction with a peer confederate, when listening to emotional stories, and when told to pose specific facial expressions. Participants Twenty-four children and adolescents ages 7 to 16½ years with oral clefts were group matched for gender, grade, and socioeconomic status with 25 noncleft controls. Main Outcome Measures Specific social and facial behaviors coded from videotapes; Harter Self-Perception Profile, Social Acceptance subscale. Results Significant between-group differences were obtained. Children in the cleft group more often displayed “Tongue Out,” “Eye Contact,” “Mimicry,” and “Initiates Conversation.” For the cleft group, “Gaze Avoidance” was significantly negatively correlated with social acceptance scores. The groups were comparable in their ability to pose and spontaneously express facial emotion. Conclusions When comparing children with and without oral clefts in an experimental setting, with a relatively small sample size, behavior analysis identified some significant differences in patterns of social behavior but not in the ability to express facial emotion. Results suggest that many children with oral clefts may have relatively typical social development. However, for those who do have social competence deficits, systematic behavioral observation of atypical social responses may help individualize social skills interventions.
Collapse
Affiliation(s)
- Keith J Slifer
- Pediatric Psychology, Department of Behavioral Psychology, Kennedy Krieger Institute, 707 N. Broadway, Baltimore, MD 21205, USA.
| | | | | | | | | | | | | | | |
Collapse
|
34
|
Hammal Z, Chu WS, Cohn JF, Heike C, Speltz ML. Automatic Action Unit Detection in Infants Using Convolutional Neural Network. Int Conf Affect Comput Intell Interact Workshops 2017; 2017:216-221. [PMID: 29862131 PMCID: PMC5976252 DOI: 10.1109/acii.2017.8273603] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Action unit detection in infants relative to adults presents unique challenges. Jaw contour is less distinct, facial texture is reduced, and rapid and unusual facial movements are common. To detect facial action units in spontaneous behavior of infants, we propose a multi-label Convolutional Neural Network (CNN). Eighty-six infants were recorded during tasks intended to elicit enjoyment and frustration. Using an extension of FACS for infants (Baby FACS), over 230,000 frames were manually coded for ground truth. To control for chance agreement, inter-observer agreement between Baby FACS coders was quantified using free-marginal kappa. Kappa coefficients ranged from 0.79 to 0.93, which represents high agreement. The multi-label CNN achieved comparable agreement with manual coding. Kappa ranged from 0.69 to 0.93. Importantly, the CNN-based AU detection revealed the same pattern of findings with respect to infant expressiveness between tasks. While further research is needed, these findings suggest that automatic AU detection in infants is a viable alternative to manual coding of infant facial expression.
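Free-marginal (Brennan-Prediger) kappa, which fixes chance agreement at 1/k rather than estimating it from coder marginals, can be computed as in the sketch below; the frame-level codes are invented examples, not Baby FACS data.

```python
import numpy as np

def free_marginal_kappa(coder_a, coder_b, n_categories=2):
    """Free-marginal (Brennan-Prediger) kappa for two coders.

    Chance agreement is fixed at 1/k rather than estimated from the
    coders' marginal distributions.
    """
    coder_a, coder_b = np.asarray(coder_a), np.asarray(coder_b)
    p_observed = (coder_a == coder_b).mean()
    p_chance = 1.0 / n_categories
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical frame-level AU present/absent codes from two coders
a = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]
b = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]
print(round(free_marginal_kappa(a, b), 2))   # 8/10 agreement -> kappa = 0.6
```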
Collapse
Affiliation(s)
- Zakia Hammal
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
| | - Wen-Sheng Chu
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
| | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
| | | | | |
Collapse
|
35
|
Valstar MF, Sánchez-Lozano E, Cohn JF, Jeni LA, Girard JM, Zhang Z, Yin L, Pantic M. FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge. Proc Int Conf Autom Face Gesture Recognit 2017; 2017:839-847. [PMID: 29606917 PMCID: PMC5876027 DOI: 10.1109/fg.2017.107] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The field of Automatic Facial Expression Analysis has grown rapidly in recent years. However, despite progress in new approaches as well as benchmarking efforts, most evaluations still focus on either posed expressions, near-frontal recordings, or both. This makes it hard to tell how existing expression recognition approaches perform under conditions where faces appear in a wide range of poses (or camera views), displaying ecologically valid expressions. The main obstacle for assessing this is the availability of suitable data, and the challenge proposed here addresses this limitation. The FG 2017 Facial Expression Recognition and Analysis challenge (FERA 2017) extends FERA 2015 to the estimation of Action Units occurrence and intensity under different camera views. In this paper we present the third challenge in automatic recognition of facial expressions, to be held in conjunction with the 12th IEEE conference on Face and Gesture Recognition, May 2017, in Washington, United States. Two sub-challenges are defined: the detection of AU occurrence, and the estimation of AU intensity. In this work we outline the evaluation protocol, the data used, and the results of a baseline method for both sub-challenges.
Collapse
Affiliation(s)
| | | | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
| | - László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
| | - Jeffrey M Girard
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
| | - Zheng Zhang
- Department of Computer Science, Binghamton University, Binghamton, USA
| | - Lijun Yin
- Department of Computer Science, Binghamton University, Binghamton, USA
| | - Maja Pantic
- Department of Computing, Imperial College London, London, UK
- Electrical Engineering, Mathematics and Computer Science, University of Twente, The Netherlands
| |
Collapse
|
36
|
Girard JM, Chu WS, Jeni LA, Cohn JF, De la Torre F, Sayette MA. Sayette Group Formation Task (GFT) Spontaneous Facial Expression Database. Proc Int Conf Autom Face Gesture Recognit 2017; 2017:581-588. [PMID: 29606916 PMCID: PMC5876025 DOI: 10.1109/fg.2017.144] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Despite the important role that facial expressions play in interpersonal communication and our knowledge that interpersonal behavior is influenced by social context, no currently available facial expression database includes multiple interacting participants. The Sayette Group Formation Task (GFT) database addresses the need for well-annotated video of multiple participants during unscripted interactions. The database includes 172,800 video frames from 96 participants in 32 three-person groups. To aid in the development of automated facial expression analysis systems, GFT includes expert annotations of FACS occurrence and intensity, facial landmark tracking, and baseline results for linear SVM, deep learning, active patch learning, and personalized classification. Baseline performance is quantified and compared using identical partitioning and a variety of metrics (including means and confidence intervals). The highest performance scores were found for the deep learning and active patch learning methods. Learn more at http://osf.io/7wcyz.
Collapse
Affiliation(s)
- Jeffrey M Girard
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA 15260
| | - Wen-Sheng Chu
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
| | - László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA 15260
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
| | | | - Michael A Sayette
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA 15260
| |
Collapse
|
37
|
Wen-Sheng Chu, De la Torre F, Cohn JF. Selective Transfer Machine for Personalized Facial Expression Analysis. IEEE Trans Pattern Anal Mach Intell 2017; 39:529-545. [PMID: 28113267 PMCID: PMC5400741 DOI: 10.1109/tpami.2016.2547397] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
Automatic facial action unit (AU) and expression detection from videos is a long-standing problem. The problem is challenging in part because classifiers must generalize to previously unknown subjects that differ markedly in behavior and facial morphology (e.g., heavy versus delicate brows, smooth versus deeply etched wrinkles) from those on which the classifiers are trained. While some progress has been achieved through improvements in choices of features and classifiers, the challenge occasioned by individual differences among people remains. Person-specific classifiers would be a possible solution, but sufficient training data for person-specific classifiers typically is unavailable. This paper addresses the problem of how to personalize a generic classifier without additional labels from the test subject. We propose a transductive learning method, which we refer to as a Selective Transfer Machine (STM), to personalize a generic classifier by attenuating person-specific mismatches. STM achieves this effect by simultaneously learning a classifier and re-weighting the training samples that are most relevant to the test subject. We compared STM to both generic classifiers and cross-domain learning methods on four benchmarks: CK+ [44], GEMEP-FERA [67], RUFACS [4] and GFT [57]. STM outperformed generic classifiers in all four benchmarks.
Collapse
|
38
|
Abstract
Event discovery aims to discover a temporal segment of interest, such as human behavior, actions or activities. Most approaches to event discovery within or between time series use supervised learning. This becomes problematic when some relevant event labels are unknown, are difficult to detect, or not all possible combinations of events have been anticipated. To overcome these problems, this paper explores Common Event Discovery (CED), a new problem that aims to discover common events of variable-length segments in an unsupervised manner. A potential solution to CED is searching over all possible pairs of segments, which would incur a prohibitive quartic cost. In this paper, we propose an efficient branch-and-bound (B&B) framework that avoids exhaustive search while guaranteeing a globally optimal solution. To this end, we derive novel bounding functions for various commonality measures and provide extensions to multiple commonality discovery and accelerated search. The B&B framework takes as input any multidimensional signal that can be quantified into histograms. A generalization of the framework can be readily applied to discover events at the same or different times (synchrony and event commonality, respectively). We consider extensions to video search and supervised event detection. The effectiveness of the B&B framework is evaluated in motion capture of deliberate behavior and in video of spontaneous facial behavior in diverse interpersonal contexts: interviews, small groups of young adults, and parent-infant face-to-face interaction.
Collapse
Affiliation(s)
| | | | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, USA
- Department of Psychology, University of Pittsburgh, USA
| | | |
Collapse
|
39
|
Abstract
To enable real-time, person-independent 3D registration from 2D video, we developed a 3D cascade regression approach in which facial landmarks remain invariant across pose over a range of approximately 60 degrees. From a single 2D image of a person's face, a dense 3D shape is registered in real time for each frame. The algorithm utilizes a fast cascade regression framework trained on high-resolution 3D face-scans of posed and spontaneous emotion expression. The algorithm first estimates the location of a dense set of landmarks and their visibility, then reconstructs face shapes by fitting a part-based 3D model. Because no assumptions are required about illumination or surface properties, the method can be applied to a wide range of imaging conditions that include 2D video and uncalibrated multi-view video. The method has been validated in a battery of experiments that evaluate its precision of 3D reconstruction, extension to multi-view reconstruction, temporal integration for videos and 3D head-pose estimation. Experimental findings strongly support the validity of real-time, 3D registration and reconstruction from 2D video. The software is available online at http://zface.org.
Collapse
Affiliation(s)
- László A. Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Jeffrey F. Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Takeo Kanade
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| |
Collapse
|
40
|
Fairbairn CE, Sayette MA, Wright AGC, Levine JM, Cohn JF, Creswell KG. Extraversion and the Rewarding Effects of Alcohol in a Social Context. J Abnorm Psychol 2016; 124:660-73. [PMID: 25844684 DOI: 10.1037/abn0000024] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The personality trait of extraversion has been linked to problematic drinking patterns. Researchers have long hypothesized that such associations are attributable to increased alcohol-reward sensitivity among extraverted individuals, and surveys suggest that individuals high in extraversion gain greater mood enhancement from alcohol than those low in extraversion. Surprisingly, however, alcohol administration studies have not found individuals high in extraversion to experience enhanced mood following alcohol consumption. Of note, prior studies have examined extraverted participants (individuals who self-identify as being highly social) consuming alcohol in isolation. In the present research, we used a group drinking paradigm to examine whether individuals high in extraversion gained greater reward from alcohol than did those low in extraversion and, further, whether a particular social mechanism (partners’ Duchenne smiling) might underlie alcohol reward sensitivity among extraverted individuals. Social drinkers (n = 720) consumed a moderate dose of alcohol, placebo, or control beverage in groups of 3 over the course of 36 min. This social interaction was video-recorded, and Duchenne smiling was coded using the Facial Action Coding System. Results indicated that participants high in extraversion reported significantly more mood enhancement from alcohol than did those low in extraversion. Further, mediated moderation analyses focusing on Duchenne smiling of group members indicated that social processes fully and uniquely accounted for alcohol reward sensitivity among individuals high in extraversion. Results provide initial experimental evidence that individuals high in extraversion experience increased mood enhancement from alcohol and further highlight the importance of considering social processes in the etiology of alcohol use disorder.
Collapse
|
41
|
Abstract
Facial action unit (AU) detection from video has been a long-standing problem in the automated facial expression analysis. While progress has been made, accurate detection of facial AUs remains challenging due to ubiquitous sources of errors, such as inter-personal variability, pose, and low-intensity AUs. In this paper, we refer to samples causing such errors as hard samples, and the remaining as easy samples. To address learning with the hard samples, we propose the confidence preserving machine (CPM), a novel two-stage learning framework that combines multiple classifiers following an "easy-to-hard" strategy. During the training stage, CPM learns two confident classifiers. Each classifier focuses on separating easy samples of one class from all else, and thus preserves confidence on predicting each class. During the test stage, the confident classifiers provide "virtual labels" for easy test samples. Given the virtual labels, we propose a quasi-semi-supervised (QSS) learning strategy to learn a person-specific classifier. The QSS strategy employs a spatio-temporal smoothness that encourages similar predictions for samples within a spatio-temporal neighborhood. In addition, to further improve detection performance, we introduce two CPM extensions: iterative CPM that iteratively augments training samples to train the confident classifiers, and kernel CPM that kernelizes the original CPM model to promote nonlinearity. Experiments on four spontaneous data sets GFT, BP4D, DISFA, and RU-FACS illustrate the benefits of the proposed CPM models over baseline methods and the state-of-the-art semi-supervised learning and transfer learning methods.
Collapse
|
42
|
Abstract
Observational measurement plays an integral role in a variety of scientific endeavors within biology, psychology, sociology, education, medicine, and marketing. The current article provides an interdisciplinary primer on observational measurement; in particular, it highlights recent advances in observational methodology and the challenges that accompany such growth. First, we detail the various types of instrument that can be used to standardize measurements across observers. Second, we argue for the importance of validity in observational measurement and provide several approaches to validation based on contemporary validity theory. Third, we outline the challenges currently faced by observational researchers pertaining to measurement drift, observer reactivity, reliability analysis, and time/expense. Fourth, we describe recent advances in computer-assisted measurement, fully automated measurement, and statistical data analysis. Finally, we identify several key directions for future observational research to explore.
Collapse
|
43
|
Corneanu CA, Simon MO, Cohn JF, Guerrero SE. Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications. IEEE Trans Pattern Anal Mach Intell 2016; 38:1548-68. [PMID: 26761193 PMCID: PMC7426891 DOI: 10.1109/tpami.2016.2515606] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state-of-the-art methods accordingly. We also present the important datasets and the benchmarking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.
Collapse
|
44
|
De la Torre F, Cohn JF. Joint Patch and Multi-label Learning for Facial Action Unit and Holistic Expression Recognition. IEEE Trans Image Process 2016; 25:3931-3946. [PMID: 28113424 DOI: 10.1109/tip.2016.2570550] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Most action unit (AU) detection methods use one-versus-all classifiers without considering dependences between features or AUs. In this paper, we introduce a joint patch and multi-label learning (JPML) framework that models the structured joint dependence behind features, AUs, and their interplay. In particular, JPML leverages group sparsity to identify important facial patches, and learns a multi-label classifier constrained by the likelihood of co-occurring AUs. To describe such likelihood, we derive two AU relations, positive correlation and negative competition, by statistically analyzing more than 350,000 video frames annotated with multiple AUs. To the best of our knowledge, this is the first work that jointly addresses patch learning and multi-label learning for AU detection. In addition, we show that JPML can be extended to recognize holistic expressions by learning common and specific patches, which afford a more compact representation than the standard expression recognition methods. We evaluate JPML on three benchmark datasets, CK+, BP4D, and GFT, using within- and cross-dataset scenarios. In four of five experiments, JPML achieved the highest averaged F1 scores in comparison with baseline and alternative methods that use either patch learning or multi-label learning alone.
Collapse
|
45
|
Chu WS, Zeng J, De la Torre F, Cohn JF, Messinger DS. Unsupervised Synchrony Discovery in Human Interaction. Proc IEEE Int Conf Comput Vis 2015; 2015:3146-3154. [PMID: 27346988 PMCID: PMC4918688 DOI: 10.1109/iccv.2015.360] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
People are inherently social. Social interaction plays an important and natural role in human behavior. Most computational methods focus on individuals alone rather than in social context. They also require labelled training data. We present an unsupervised approach to discover interpersonal synchrony, referred to as two or more persons performing common actions in overlapping video frames or segments. For computational efficiency, we develop a branch-and-bound (B&B) approach that avoids exhaustive search while guaranteeing a globally optimal solution. The proposed method is entirely general. It takes from two or more videos any multi-dimensional signal that can be represented as a histogram. We derive three novel bounding functions and provide efficient extensions, including multi-synchrony detection and accelerated search, using a warm-start strategy and parallelism. We evaluate the effectiveness of our approach in multiple databases, including human actions using the CMU Mocap dataset [1], spontaneous facial behaviors using the group-formation task dataset [37] and the parent-infant interaction dataset [28].
Collapse
Affiliation(s)
| | | | | | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University; University of Pittsburgh, USA
| | | |
Collapse
|
46
|
Abstract
Methods to assess individual facial actions have potential to shed light on important behavioral phenomena ranging from emotion and social interaction to psychological disorders and health. However, manual coding of such actions is labor intensive and requires extensive training. To date, establishing reliable automated coding of unscripted facial actions has been a daunting challenge impeding development of psychological theories and applications requiring facial expression assessment. It is therefore essential that automated coding systems be developed with enough precision and robustness to ease the burden of manual coding in challenging data involving variation in participant gender, ethnicity, head pose, speech, and occlusion. We report a major advance in automated coding of spontaneous facial actions during an unscripted social interaction involving three strangers. For each participant (n = 80, 47% women, 15% Nonwhite), 25 facial action units (AUs) were manually coded from video using the Facial Action Coding System. Twelve AUs occurred more than 3% of the time and were processed using automated FACS coding. Automated coding showed very strong reliability for the proportion of time that each AU occurred (mean intraclass correlation = 0.89), and the more stringent criterion of frame-by-frame reliability was moderate to strong (mean Matthews correlation = 0.61). With few exceptions, differences in AU detection related to gender, ethnicity, pose, and average pixel intensity were small. Fewer than 6% of frames could be coded manually but not automatically. These findings suggest automated FACS coding has progressed sufficiently to be applied to observational research in emotion and related areas of study.
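Frame-level and session-level reliability of the kind reported above can be sketched as below. The Matthews correlation is computed with scikit-learn; for the session-level proportions a Pearson correlation is shown for brevity, whereas the study used intraclass correlation. All numbers are invented placeholders.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

# Hypothetical frame-level codes for one AU: manual FACS vs. automated coding
manual    = np.array([0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0])
automated = np.array([0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1])

# Frame-by-frame reliability (the stricter criterion)
print("MCC:", round(matthews_corrcoef(manual, automated), 2))

# Session-level reliability: proportion of frames in which the AU occurred,
# compared across hypothetical sessions
manual_props    = np.array([0.12, 0.40, 0.08, 0.33, 0.25])
automated_props = np.array([0.10, 0.44, 0.09, 0.30, 0.28])
print("r over session proportions:",
      round(np.corrcoef(manual_props, automated_props)[0, 1], 2))
```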
Collapse
Affiliation(s)
- Jeffrey M Girard
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
| | - Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Laszlo A Jeni
- The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Michael A Sayette
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | | |
Collapse
|
47
|
Abstract
Both the occurrence and intensity of facial expressions are critical to what the face reveals. While much progress has been made towards the automatic detection of facial expression occurrence, controversy exists about how to estimate expression intensity. The most straightforward approach is to train multiclass or regression models using intensity ground truth. However, collecting intensity ground truth is even more time consuming and expensive than collecting binary ground truth. As a shortcut, some researchers have proposed using the decision values of binary-trained maximum margin classifiers as a proxy for expression intensity. We provide empirical evidence that this heuristic is flawed in practice as well as in theory. Unfortunately, there are no shortcuts when it comes to estimating smile intensity: researchers must take the time to collect and train on intensity ground truth. However, if they do so, high reliability with expert human coders can be achieved. Intensity-trained multiclass and regression models outperformed binary-trained classifier decision values on smile intensity estimation across multiple databases and methods for feature extraction and dimensionality reduction. Multiclass models even outperformed binary-trained classifiers on smile occurrence detection.
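The comparison at issue, decision values of a binary-trained classifier used as an intensity proxy versus a model trained on intensity ground truth, can be set up as in the sketch below. The features and labels are synthetic, so the printed correlations illustrate only the procedure, not the paper's finding.

```python
import numpy as np
from sklearn.svm import SVC, SVR

# Synthetic features and smile intensity labels (0 = absent .. 5 = maximal);
# purely illustrative, not the study's data or features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
intensity = np.clip(X[:, 0] * 1.5 + rng.normal(scale=0.5, size=400) + 2.5, 0, 5).round()
occurrence = (intensity > 0).astype(int)
train, test = slice(0, 300), slice(300, 400)

# Proxy: binary-trained classifier, decision values treated as intensity
proxy = SVC(kernel="linear").fit(X[train], occurrence[train]).decision_function(X[test])

# Intensity-trained: regression on intensity ground truth
trained = SVR(kernel="linear").fit(X[train], intensity[train]).predict(X[test])

print("r(proxy, truth)   =", round(np.corrcoef(proxy, intensity[test])[0, 1], 2))
print("r(trained, truth) =", round(np.corrcoef(trained, intensity[test])[0, 1], 2))
```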
Collapse
Affiliation(s)
- Jeffrey M. Girard
- Department of Psychology, University of Pittsburgh, 4322 Sennott Square, Pittsburgh, PA, USA 15260
| | - Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh, 4322 Sennott Square, Pittsburgh, PA, USA 15260
- The Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, USA 15213
| | - Fernando De la Torre
- The Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, USA 15213
| |
Collapse
|
48
|
Dibeklioğlu H, Hammal Z, Yang Y, Cohn JF. Multimodal Detection of Depression in Clinical Interviews. Proc ACM Int Conf Multimodal Interact 2015; 2015:307-310. [PMID: 27213186 DOI: 10.1145/2818346.2820776] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
Current methods for depression assessment depend almost entirely on clinical interview or self-report ratings. Such measures lack systematic and efficient ways of incorporating behavioral observations that are strong indicators of psychological disorder. We compared a clinical interview of depression severity with automatic measurement in 48 participants undergoing treatment for depression. Interviews were obtained at 7-week intervals on up to four occasions. Following standard cut-offs, participants at each session were classified as remitted, intermediate, or depressed. Logistic regression classifiers using leave-one-out validation were compared for facial movement dynamics, head movement dynamics, and vocal prosody individually and in combination. Accuracy (remitted versus depressed) for facial movement dynamics was higher than that for head movement dynamics, and each was substantially higher than that for vocal prosody. Accuracy for all three modalities together reached 88.93%, exceeding that for any single modality or pair of modalities. These findings suggest that automatic detection of depression from behavioral indicators is feasible and that multimodal measures afford the most powerful detection.
Collapse
Affiliation(s)
- Hamdi Dibeklioğlu
- Pattern Recognition and Bioinformatics Group, Delft University of Technology, Delft, The Netherlands
| | - Zakia Hammal
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ying Yang
- Center for Cognitive Brain Imaging, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| |
Collapse
|
49
|
Abstract
We investigated the dynamics of head movement in mothers and infants during an age-appropriate, well-validated emotion induction, the Still Face paradigm. In this paradigm, mothers and infants play normally for 2 minutes (Play), followed by 2 minutes in which the mothers remain unresponsive (Still Face), and then 2 minutes in which they resume normal behavior (Reunion). Participants were 42 ethnically diverse 4-month-old infants and their mothers. Mother and infant angular displacement and angular velocity were measured using the CSIRO head tracker. In male but not female infants, angular displacement increased from Play to Still Face and decreased from Still Face to Reunion. Infant angular velocity was higher during Still Face than Reunion, with no differences between male and female infants. Windowed cross-correlation suggested changes in how infant and mother head movements are associated, revealing dramatic changes in direction of association. Coordination between mother and infant head movement velocity was greater during Play compared with Reunion. Together, these findings suggest that angular displacement, angular velocity, and their coordination between mothers and infants are strongly related to age-appropriate emotion challenge. Attention to head movement can deepen our understanding of emotion communication.
Collapse
Affiliation(s)
- Zakia Hammal
- The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Jeffrey F Cohn
- the Robotics Institute, Carnegie Mellon University and the Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Daniel S Messinger
- the Department of Psychology at the University of Miami with secondary appointment in Pediatrics, Electrical and Computer Engineering, and Music Engineering, University of Miami, FL, USA
| |
Collapse
|
50
|
Abstract
Analysis of observable behavior in depression primarily relies on subjective measures. New computational approaches make possible automated audiovisual measurement of behaviors that humans struggle to quantify (e.g., movement velocity and voice inflection). These tools have the potential to improve screening and diagnosis, identify new behavioral indicators of depression, measure response to clinical intervention, and test clinical theories about underlying mechanisms. Highlights include a study that measured the temporal coordination of vocal tract and facial movements, a study that predicted which adolescents would go on to develop depression based on their voice qualities, and a study that tested the behavioral predictions of clinical theories using automated measures of facial actions and head motion.
Collapse
Affiliation(s)
- Jeffrey M. Girard
- Department of Psychology, University of Pittsburgh Sennott Square, 210 S. Bouquet Street, Pittsburgh, PA, USA 15260
| | - Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh Sennott Square, 210 S. Bouquet Street, Pittsburgh, PA, USA 15260
| |
Collapse
|