1
Levkovich I, Rabin E, Farraj RH, Elyoseph Z. Attributional patterns toward students with and without learning disabilities: Artificial intelligence models vs. trainee teachers. Research in Developmental Disabilities 2025; 160:104970. [PMID: 40090118] [DOI: 10.1016/j.ridd.2025.104970]
Abstract
This study explored differences in the attributional patterns of four advanced artificial intelligence (AI) large language models (LLMs), namely ChatGPT3.5, ChatGPT4, Claude, and Gemini, by focusing on feedback, frustration, sympathy, and expectations of future failure among students with and without learning disabilities (LD). These findings were compared with responses from a sample of Australian and Chinese trainee teachers, comprising individuals nearing qualification with varied demographic and educational backgrounds. Eight vignettes depicting students with varying abilities and efforts were evaluated by the LLMs ten times each, resulting in 320 evaluations, with trainee teachers providing comparable ratings. For LD students, the LLMs exhibited lower frustration and higher sympathy than trainee teachers, while for non-LD students, LLMs similarly showed lower frustration, with ChatGPT3.5 aligning closely with Chinese teachers and ChatGPT4 demonstrating more sympathy than both teacher groups. Notably, LLMs expressed lower expectations of future academic failure for both LD and non-LD students compared to trainee teachers. Regarding feedback, the findings reflect ratings of the qualitative nature of feedback that LLMs and teachers would provide, rather than actual feedback text. The LLMs, particularly ChatGPT3.5 and Gemini, were rated as providing more negative feedback than trainee teachers, while ChatGPT4 provided more positive ratings for both LD and non-LD students, aligning with Chinese teachers in some cases. These findings suggest that LLMs may promote a positive and inclusive outlook for LD students by exhibiting lower judgmental tendencies and higher optimism. However, their tendency to rate feedback more negatively than trainee teachers highlights the need to recalibrate AI tools to better align with cultural and emotional nuances.
Affiliation(s)
- Inbar Levkovich: Faculty of Education, Tel Hai College, Upper Galilee, Israel.
- Eyal Rabin: Department of Psychology and Education, The Open University of Israel, Israel.
- Zohar Elyoseph: University of Haifa, Mount Carmel, Haifa, Israel; Department of Brain Sciences, Faculty of Medicine, Imperial College London, UK.
2
Dong X, Xie J, Gong H. A Meta-Analysis of Artificial Intelligence Technologies Use and Loneliness: Examining the Influence of Physical Embodiment, Age Differences, and Effect Direction. Cyberpsychology, Behavior and Social Networking 2025; 28:233-242. [PMID: 39905934] [DOI: 10.1089/cyber.2024.0468]
Abstract
Recent research has investigated the connection between artificial intelligence (AI) utilization and feelings of loneliness, yielding inconsistent outcomes. This meta-analysis aims to clarify this relationship by synthesizing data from 47 relevant studies across 21 publications. Findings indicate a generally significant positive correlation between AI use and loneliness (r = 0.163, p < 0.05). Specifically, interactions with physically embodied AI are marginally significantly associated with decreased loneliness (r = -0.266, p = 0.088), whereas engagement with physically disembodied AI is significantly linked to increased loneliness (r = 0.352, p < 0.001). Among older adults (aged 60 and above), AI use is significantly positively associated with loneliness (r = 0.352, p < 0.001), while no significant correlation is observed (r = 0.039, p = 0.659) in younger individuals (aged 35 and below). Furthermore, by incorporating positive attitudes toward AI, the study reveals that the influence of AI use in exacerbating loneliness outweighs the reverse impact, although both directions show significant positive relationships. These results enhance the understanding of how AI usage relates to loneliness and provide practical insights for addressing loneliness through AI technologies.
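For readers unfamiliar with how study-level correlations such as the pooled r = 0.163 above are aggregated, the sketch below shows a minimal fixed-effect Fisher-z average; the paper's random-effects procedure additionally models between-study variance. The correlations and sample sizes here are invented placeholders, not the 47 effect sizes analyzed in the meta-analysis.

```python
import numpy as np

def pool_correlations(r_values, n_values):
    """Pool study-level correlations with a fixed-effect Fisher-z average.

    A simplified illustration of meta-analytic aggregation; a random-effects
    model would also estimate between-study heterogeneity (tau^2).
    """
    r = np.asarray(r_values, dtype=float)
    n = np.asarray(n_values, dtype=float)
    z = np.arctanh(r)              # Fisher z transform of each correlation
    w = n - 3                      # inverse-variance weights: var(z) = 1/(n - 3)
    z_bar = np.sum(w * z) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    # Back-transform the pooled z and its 95% CI to the correlation scale
    ci = np.tanh([z_bar - 1.96 * se, z_bar + 1.96 * se])
    return np.tanh(z_bar), ci

# Hypothetical study correlations and sample sizes (not the paper's data)
r_pooled, ci = pool_correlations([0.35, -0.27, 0.10, 0.20], [120, 60, 200, 90])
print(f"pooled r = {r_pooled:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```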
Affiliation(s)
- Xu Dong: School of Journalism and Communication, Renmin University of China, Beijing, China
- Jun Xie: School of Journalism and Communication, Renmin University of China, Beijing, China
- He Gong: Research Center of Journalism and Social Development, School of Journalism and Communication, Renmin University of China, Beijing, China
3
Zhu Z, Wang Y, Qi Z, Hu W, Zhang X, Wagner SK, Wang Y, Ran AR, Ong J, Waisberg E, Masalkhi M, Suh A, Tham YC, Cheung CY, Yang X, Yu H, Ge Z, Wang W, Sheng B, Liu Y, Lee AG, Denniston AK, Wijngaarden PV, Keane PA, Cheng CY, He M, Wong TY. Oculomics: Current concepts and evidence. Prog Retin Eye Res 2025; 106:101350. [PMID: 40049544] [DOI: 10.1016/j.preteyeres.2025.101350]
Abstract
The eye provides novel insights into general health, as well as the pathogenesis and development of systemic diseases. In the past decade, growing evidence has demonstrated that the eye's structure and function mirror multiple systemic health conditions, especially cardiovascular diseases, neurodegenerative disorders, and kidney impairment. This has given rise to the field of oculomics: the application of ophthalmic biomarkers to understand mechanisms and to detect and predict disease. The development of this field has been accelerated by three major advances: 1) the availability and widespread clinical adoption of high-resolution, non-invasive ophthalmic imaging ("hardware"); 2) the availability of large studies in which to interrogate associations ("big data"); and 3) the development of novel analytical methods, including artificial intelligence (AI) ("software"), coupled with large-scale linkage datasets connecting ocular imaging data with systemic health data. Oculomics offers an opportunity to enhance our understanding of the interplay between the eye and the body, while supporting the development of innovative diagnostic, prognostic, and therapeutic tools. It enables the detection, screening, diagnosis, and monitoring of many systemic health conditions and, combined with AI, allows prediction of systemic disease risk, supporting risk stratification and opening new avenues for individualized prevention and personalized medicine. In this review, we summarise current concepts and evidence in the field of oculomics, highlighting the progress that has been made, the remaining challenges, and the opportunities for future research.
Affiliation(s)
- Zhuoting Zhu: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia.
- Yueye Wang: School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- Ziyi Qi: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia; Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai, China
- Wenyi Hu: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia
- Xiayin Zhang: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Siegfried K Wagner: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Yujie Wang: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia
- An Ran Ran: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Joshua Ong: Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, USA
- Ethan Waisberg: Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Mouayad Masalkhi: University College Dublin School of Medicine, Belfield, Dublin, Ireland
- Alex Suh: Tulane University School of Medicine, New Orleans, LA, USA
- Yih Chung Tham: Department of Ophthalmology and Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Xiaohong Yang: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Honghua Yu: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Zongyuan Ge: Monash e-Research Center, Faculty of Engineering, Airdoc Research, Nvidia AI Technology Research Center, Monash University, Melbourne, VIC, Australia
- Wei Wang: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Bin Sheng: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yun Liu: Google Research, Mountain View, CA, USA
- Andrew G Lee: Center for Space Medicine and the Department of Ophthalmology, Baylor College of Medicine, Houston, USA; Department of Ophthalmology, Blanton Eye Institute, Houston Methodist Hospital, Houston, USA; The Houston Methodist Research Institute, Houston Methodist Hospital, Houston, USA; Departments of Ophthalmology, Neurology, and Neurosurgery, Weill Cornell Medicine, New York, USA; Department of Ophthalmology, University of Texas Medical Branch, Galveston, USA; University of Texas MD Anderson Cancer Center, Houston, USA; Texas A&M College of Medicine, Bryan, USA; Department of Ophthalmology, The University of Iowa Hospitals and Clinics, Iowa City, USA
- Alastair K Denniston: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK; National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre (BRC), University Hospital Birmingham and University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK; Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- Peter van Wijngaarden: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia; Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, VIC, Australia
- Pearse A Keane: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Ching-Yu Cheng: Department of Ophthalmology and Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore
- Mingguang He: School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China; Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong, China; Centre for Eye and Vision Research (CEVR), 17W Hong Kong Science Park, Hong Kong, China
- Tien Yin Wong: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; School of Clinical Medicine, Beijing Tsinghua Changgung Hospital, Tsinghua Medicine, Tsinghua University, Beijing, China.
4
Omar M, Sorin V, Agbareia R, Apakama DU, Soroush A, Sakhuja A, Freeman R, Horowitz CR, Richardson LD, Nadkarni GN, Klang E. Evaluating and addressing demographic disparities in medical large language models: a systematic review. Int J Equity Health 2025; 24:57. [PMID: 40011901] [DOI: 10.1186/s12939-025-02419-0]
Abstract
BACKGROUND Large language models are increasingly evaluated for use in healthcare. However, concerns about their impact on disparities persist. This study reviews current research on demographic biases in large language models to identify prevalent bias types, assess measurement methods, and evaluate mitigation strategies. METHODS We conducted a systematic review, searching publications from January 2018 to July 2024 across five databases. We included peer-reviewed studies evaluating demographic biases in large language models, focusing on gender, race, ethnicity, age, and other factors. Study quality was assessed using the Joanna Briggs Institute Critical Appraisal Tools. RESULTS Our review included 24 studies. Of these, 22 (91.7%) identified biases. Gender bias was the most prevalent, reported in 15 of 16 studies (93.7%). Racial or ethnic biases were observed in 10 of 11 studies (90.9%). Only two studies found minimal or no bias in certain contexts. Mitigation strategies mainly included prompt engineering, with varying effectiveness. However, these findings are tempered by a potential publication bias, as studies with negative results are less frequently published. CONCLUSION Biases are observed in large language models across various medical domains. While bias detection is improving, effective mitigation strategies are still developing. As LLMs increasingly influence critical decisions, addressing these biases and their resultant disparities is essential for ensuring fair artificial intelligence systems. Future research should focus on a wider range of demographic factors, intersectional analyses, and non-Western cultural contexts.
Affiliation(s)
- Mahmud Omar: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Vera Sorin: Diagnostic Radiology, Mayo Clinic, Rochester, MN, USA
- Reem Agbareia: Ophthalmology Department, Hadassah Medical Center, Jerusalem, Israel
- Donald U Apakama: The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ali Soroush: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ankit Sakhuja: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Robert Freeman: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Carol R Horowitz: Institute for Health Equity Research, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lynne D Richardson: Institute for Health Equity Research, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Girish N Nadkarni: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eyal Klang: The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
5
Nagra H, Mines RA, Dana Z. Exploring the Impact of Digital Peer Support Services on Meeting Unmet Needs Within an Employee Assistance Program: Retrospective Cohort Study. JMIR Hum Factors 2025; 12:e68221. [PMID: 39998863] [PMCID: PMC11897672] [DOI: 10.2196/68221]
Abstract
BACKGROUND The World Health Organization estimates that 1 in 4 people worldwide will experience a mental disorder in their lifetime, highlighting the need for accessible support. OBJECTIVE This study evaluates the integration of digital peer support (DPS) into an employee assistance program (EAP), testing 3 hypotheses: (1) DPS may be associated with changes in EAP counseling utilization within a 5-session model; (2) DPS users experience reduced sadness, loneliness, and stress; and (3) DPS integration generates a positive social return on investment (SROI). METHODS The study analyzed EAP utilization within a 5-session model using pre-post analysis, sentiment changes during DPS chats via natural language processing models, and SROI outcomes. RESULTS Among 587 DPS chats, 432 (73.6%) occurred after business hours, emphasizing the importance of 24/7 availability. A matched cohort analysis (n=72) showed that DPS reduced therapy sessions by 2.07 per participant (P<.001; Cohen d=1.77). Users' messages were evaluated for sentiments of sadness, loneliness, and stress on a 1-10 scale. Significant reductions were observed: loneliness decreased by 55.04% (6.91 to 3.11), sadness by 57.5% (6.84 to 2.91), and stress by 56.57% (6.78 to 2.95). SROI analysis demonstrated value-to-investment ratios of US $1.66 (loneliness), US $2.50 (stress), and US $2.58 (sadness) per dollar invested. CONCLUSIONS Integrating DPS into EAPs provides significant benefits, including increased access, improved emotional outcomes, and a high SROI, reinforcing its value within emotional health support ecosystems.
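The percentage reductions and value-to-investment ratios reported above follow simple arithmetic. The sketch below illustrates how such figures can be reproduced from pre/post sentiment means; the monetized value and program cost in the SROI line are placeholders, not the study's underlying data.

```python
def percent_reduction(pre: float, post: float) -> float:
    """Percentage drop from a pre-chat to a post-chat sentiment score (1-10 scale)."""
    return (pre - post) / pre * 100

def sroi_ratio(social_value: float, investment: float) -> float:
    """Social return on investment: value generated per dollar invested."""
    return social_value / investment

# Pre/post sentiment means reported in the abstract
print(f"loneliness reduction: {percent_reduction(6.91, 3.11):.1f}%")  # ~55%
print(f"sadness reduction:    {percent_reduction(6.84, 2.91):.1f}%")  # ~57.5%
print(f"stress reduction:     {percent_reduction(6.78, 2.95):.1f}%")  # ~56.5%

# Hypothetical monetized value and program cost (placeholders, not study data)
print(f"SROI: ${sroi_ratio(25_800, 10_000):.2f} per $1 invested")
```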
Affiliation(s)
- Zara Dana: Supportiv, Berkeley, CA, United States
6
Hadar-Shoval D, Lvovsky M, Asraf K, Shimoni Y, Elyoseph Z. The Feasibility of Large Language Models in Verbal Comprehension Assessment: Mixed Methods Feasibility Study. JMIR Form Res 2025; 9:e68347. [PMID: 39993720] [PMCID: PMC11894350] [DOI: 10.2196/68347]
Abstract
BACKGROUND Cognitive assessment is an important component of applied psychology, but limited access and high costs make these evaluations challenging. OBJECTIVE This study aimed to examine the feasibility of using large language models (LLMs) to create personalized artificial intelligence-based verbal comprehension tests (AI-BVCTs) for assessing verbal intelligence, in contrast with traditional assessment methods based on standardized norms. METHODS We used a within-participants design, comparing scores obtained from AI-BVCTs with those from the Wechsler Adult Intelligence Scale (WAIS-III) verbal comprehension index (VCI). In total, 8 Hebrew-speaking participants completed both the VCI and AI-BVCT, the latter being generated using the LLM Claude. RESULTS The concordance correlation coefficient (CCC) demonstrated strong agreement between AI-BVCT and VCI scores (Claude: CCC=.75, 90% CI 0.266-0.933; GPT-4: CCC=.73, 90% CI 0.170-0.935). Pearson correlations further supported these findings, showing strong associations between VCI and AI-BVCT scores (Claude: r=.84, P<.001; GPT-4: r=.77, P=.02). No statistically significant differences were found between AI-BVCT and VCI scores (P>.05). CONCLUSIONS These findings support the potential of LLMs to assess verbal intelligence. The study attests to the promise of AI-based cognitive tests in increasing the accessibility and affordability of assessment processes, enabling personalized testing. The research also raises ethical concerns regarding privacy and overreliance on AI in clinical work. Further research with larger and more diverse samples is needed to establish the validity and reliability of this approach and develop more accurate scoring procedures.
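The concordance correlation coefficient (CCC) used above to quantify agreement between AI-BVCT and VCI scores has a closed form (Lin's CCC). Below is a minimal sketch of that computation; the example score vectors are invented for illustration and are not the study's data.

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two paired score vectors.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    Unlike Pearson's r, it penalizes both location and scale shifts,
    which is why it is used to assess agreement rather than mere association.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()          # population (biased) variances
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Invented WAIS VCI scores vs. AI-based test scores, for illustration only
vci = [95, 102, 110, 88, 120, 105, 99, 115]
ai_bvct = [97, 100, 113, 90, 117, 108, 96, 118]
print(f"CCC = {concordance_correlation(vci, ai_bvct):.2f}")
```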
Affiliation(s)
- Dorit Hadar-Shoval: Department of Psychology, Max Stern Academic College of Emek Yezreel, Afula, Israel
- Maya Lvovsky: Department of Psychology, Max Stern Academic College of Emek Yezreel, Afula, Israel
- Kfir Asraf: Department of Psychology, Max Stern Academic College of Emek Yezreel, Afula, Israel
- Yoav Shimoni: School of Psychology, University of Haifa, Haifa, Israel; Department of Education and Psychology, Open University of Israel, Raanana, Israel
- Zohar Elyoseph: Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom; School of Counseling and Human Development, Faculty of Education, University of Haifa, Haifa, Israel
7
Garcia-Rudolph A, Sánchez-Pinsach D, Gilabert A, Saurí J, Soler MD, Opisso E. Building Trust with AI: How Essential is Validating AI Models in the Therapeutic Triad of Therapist, Patient, and Artificial Third? Comment on "What is the Current and Future Status of Digital Mental Health Interventions?". The Spanish Journal of Psychology 2025; 28:e3. [PMID: 39988913] [DOI: 10.1017/sjp.2024.32]
Abstract
Since the publication of "What is the Current and Future Status of Digital Mental Health Interventions?", the exponential growth and widespread adoption of ChatGPT have underscored the importance of reassessing its utility in digital mental health interventions. This review critically examined the potential of ChatGPT, focusing on its application within clinical psychology settings as the technology continued evolving through 2023 and 2024. Alongside this, our literature review spanned US Medical Licensing Examination (USMLE) validations, assessments of the capacity to interpret human emotions, and analyses concerning the identification of depression and its determinants at treatment initiation, and we report these findings. Our review evaluated the capabilities of GPT-3.5 and GPT-4.0 separately in clinical psychology settings, highlighting the potential of conversational AI to overcome traditional barriers such as stigma and accessibility in mental health treatment. Each model displayed different levels of proficiency, indicating a promising yet cautious pathway for integrating AI into mental health practices.
Affiliation(s)
- Alejandro Garcia-Rudolph: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
- David Sánchez-Pinsach: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
- Anna Gilabert: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
- Joan Saurí: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
- Maria Dolors Soler: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
- Eloy Opisso: Universitat Autònoma de Barcelona, Spain; Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Spain
8
Haber Y, Hadar Shoval D, Levkovich I, Yinon D, Gigi K, Pen O, Angert T, Elyoseph Z. The externalization of internal experiences in psychotherapy through generative artificial intelligence: a theoretical, clinical, and ethical analysis. Front Digit Health 2025; 7:1512273. [PMID: 39968063] [PMCID: PMC11832678] [DOI: 10.3389/fdgth.2025.1512273]
Abstract
Introduction Externalization techniques are well established in psychotherapy approaches, including narrative therapy and cognitive behavioral therapy. These methods elicit internal experiences such as emotions and make them tangible through external representations. Recent advances in generative artificial intelligence (GenAI), specifically large language models (LLMs), present new possibilities for therapeutic interventions; however, their integration into core psychotherapy practices remains largely unexplored. This study aimed to examine the clinical, ethical, and theoretical implications of integrating GenAI into the therapeutic space through a proof-of-concept (POC) of AI-driven externalization techniques, while emphasizing the essential role of the human therapist. Methods To this end, we developed two customized GPT agents: VIVI (visual externalization), which uses DALL-E 3 to create images reflecting patients' internal experiences (e.g., depression or hope), and DIVI (dialogic role-play-based externalization), which simulates conversations with aspects of patients' internal content. These tools were implemented and evaluated through a clinical case study under professional psychological guidance. Results The integration of VIVI and DIVI demonstrated that GenAI can serve as an "artificial third", creating a Winnicottian playful space that enhances, rather than supplants, the dyadic therapist-patient relationship. The tools successfully externalized complex internal dynamics, offering new therapeutic avenues, while also revealing challenges such as empathic failures and cultural biases. Discussion These findings highlight both the promise and the ethical complexities of AI-enhanced therapy, including concerns about data security, representation accuracy, and the balance of clinical authority. To address these challenges, we propose the SAFE-AI protocol, offering clinicians structured guidelines for responsible AI integration in therapy. Future research should systematically evaluate the generalizability, efficacy, and ethical implications of these tools across diverse populations and therapeutic contexts.
Affiliation(s)
- Yuval Haber: The Program of Hermeneutics and Cultural Studies, Interdisciplinary Studies Unit, Bar-Ilan University, Jerusalem, Israel
- Dorit Hadar Shoval: Department of Psychology, Max Stern Academic College of Emek Yezreel, Yezreel Valley, Israel
- Inbar Levkovich: Faculty of Education, Tel-Hai Academic College, Kiryat Shmona, Israel
- Dror Yinon: The Program of Hermeneutics and Cultural Studies, Interdisciplinary Studies Unit, Bar-Ilan University, Jerusalem, Israel
- Karny Gigi: Department of Counseling and Human Development, Faculty of Education, University of Haifa, Haifa, Israel
- Oori Pen: Department of Counseling and Human Development, Faculty of Education, University of Haifa, Haifa, Israel
- Tal Angert: Department of Counseling and Human Development, Faculty of Education, University of Haifa, Haifa, Israel
- Zohar Elyoseph: Department of Counseling and Human Development, Faculty of Education, University of Haifa, Haifa, Israel; Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
9
Maaz S, Palaganas JC, Palaganas G, Bajwa M. A guide to prompt design: foundations and applications for healthcare simulationists. Front Med (Lausanne) 2025; 11:1504532. [PMID: 39980724] [PMCID: PMC11841430] [DOI: 10.3389/fmed.2024.1504532]
Abstract
As large language models (LLMs) like ChatGPT, Gemini, and Claude gain traction in healthcare simulation, this paper offers simulationists a practical guide to effective prompt design. Grounded in a structured literature review and iterative prompt testing, this paper proposes best practices for developing calibrated prompts, explores various prompt types and techniques with use cases, and addresses the challenges, including ethical considerations, of using LLMs in healthcare simulation. This guide helps bridge the knowledge gap for simulationists on LLM use in simulation-based education, offering tailored guidance on prompt design. Examples were created through iterative testing to ensure alignment with simulation objectives, covering use cases such as clinical scenario development, OSCE station creation, simulated person scripting, and debriefing facilitation. These use cases provide easy-to-apply methods to enhance realism, engagement, and educational alignment in simulations. Key challenges associated with LLM integration, including bias, privacy concerns, hallucinations, lack of transparency, and the need for robust oversight and evaluation, are discussed alongside ethical considerations unique to healthcare education. Recommendations are provided to help simulationists craft prompts that align with educational objectives while mitigating these challenges. By offering these insights, this paper contributes valuable, timely knowledge for simulationists seeking to leverage generative AI's capabilities in healthcare education responsibly.
Affiliation(s)
- Sara Maaz: Department of Clinical Skills, College of Medicine, Alfaisal University, Riyadh, Saudi Arabia; Department of Health Professions Education, MGH Institute of Health Professions, Boston, MA, United States
- Janice C. Palaganas: Department of Clinical Skills, College of Medicine, Alfaisal University, Riyadh, Saudi Arabia
- Gerry Palaganas: Director of Technology, AAXIS Group Corporation, Los Angeles, CA, United States
- Maria Bajwa: Department of Clinical Skills, College of Medicine, Alfaisal University, Riyadh, Saudi Arabia
10
Asman O, Torous J, Tal A. Responsible Design, Integration, and Use of Generative AI in Mental Health. JMIR Ment Health 2025; 12:e70439. [PMID: 39864170] [PMCID: PMC11769776] [DOI: 10.2196/70439]
Abstract
Generative artificial intelligence (GenAI) shows potential for personalized care, psychoeducation, and even crisis prediction in mental health, yet responsible use requires ethical consideration and deliberation, and perhaps even governance. This is the first published theme issue focused on responsible GenAI in mental health. It brings together evidence and insights on GenAI's capabilities, such as emotion recognition, therapy-session summarization, and risk assessment, while highlighting the sensitive nature of mental health data and the need for rigorous validation. Contributors discuss how bias, alignment with human values, transparency, and empathy must be carefully addressed to ensure ethically grounded, artificial intelligence-assisted care. By proposing conceptual frameworks, best practices, and regulatory approaches, including ethics of care and the preservation of socially important humanistic elements, this theme issue underscores that GenAI can complement, rather than replace, the vital role of human empathy in clinical settings. To achieve this, ongoing collaboration between researchers, clinicians, policy makers, and technologists is essential.
Affiliation(s)
- Oren Asman: Department of Nursing, Faculty of Medical and Health Sciences, Tel Aviv University, P.O.B 39040, Ramat Aviv, Tel Aviv, 6997801, Israel; The Samueli Initiative for Responsible AI in Medicine, Tel Aviv University, Tel Aviv, Israel
- John Torous: Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Cambridge, United States
- Amir Tal: The Samueli Initiative for Responsible AI in Medicine, Tel Aviv University, Tel Aviv, Israel; Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
11
Kramer RSS. Face to face: Comparing ChatGPT with human performance on face matching. Perception 2025; 54:65-68. [PMID: 39497555] [DOI: 10.1177/03010066241295992]
Abstract
ChatGPT's large language model, GPT-4V, has been trained on vast numbers of image-text pairs and is therefore capable of processing visual input. This model operates very differently from current state-of-the-art neural networks designed specifically for face perception and so I chose to investigate whether ChatGPT could also be applied to this domain. With this aim, I focussed on the task of face matching, that is, deciding whether two photographs showed the same person or not. Across six different tests, ChatGPT demonstrated performance that was comparable with human accuracies despite being a domain-general 'virtual assistant' rather than a specialised tool for face processing. This perhaps surprising result identifies a new avenue for exploration in this field, while further research should explore the boundaries of ChatGPT's ability, along with how its errors may relate to those made by humans.
12
Selim R, Basu A, Anto A, Foscht T, Eisingerich AB. Effects of Large Language Model-Based Offerings on the Well-Being of Students: Qualitative Study. JMIR Form Res 2024; 8:e64081. [PMID: 39729617] [PMCID: PMC11724218] [DOI: 10.2196/64081]
Abstract
BACKGROUND In recent years, the adoption of large language model (LLM) applications, such as ChatGPT, has seen a significant surge, particularly among students. These artificial intelligence-driven tools offer unprecedented access to information and conversational assistance, which is reshaping the way students engage with academic content and manage the learning process. Despite the growing prevalence of LLMs and reliance on these technologies, there remains a notable gap in qualitative in-depth research examining the emotional and psychological effects of LLMs on users' mental well-being. OBJECTIVE In order to address these emerging and critical issues, this study explores the role of LLM-based offerings, such as ChatGPT, in students' lives, namely, how postgraduate students use such offerings and how they make students feel, and examines the impact on students' well-being. METHODS To address the aims of this study, we employed an exploratory approach, using in-depth, semistructured, qualitative, face-to-face interviews with 23 users (13 female and 10 male users; mean age 23 years, SD 1.55 years) of ChatGPT-4o, who were also university students at the time (inclusion criteria). Interviewees were invited to reflect upon how they use ChatGPT, how it makes them feel, and how it may influence their lives. RESULTS The current findings from the exploratory qualitative interviews showed that users appreciate the functional support (8/23, 35%), escapism (8/23, 35%), and fantasy fulfillment (7/23, 30%) they receive from LLM-based offerings, such as ChatGPT, but at the same time, such usage is seen as a "double-edged sword," with respondents indicating anxiety (8/23, 35%), dependence (11/23, 48%), concerns about deskilling (12/23, 52%), and angst or pessimism about the future (11/23, 48%). CONCLUSIONS This study employed exploratory in-depth interviews to examine how the usage of LLM-based offerings, such as ChatGPT, makes users feel and assess the effects of using LLM-based offerings on mental well-being. The findings of this study show that students used ChatGPT to make their lives easier and felt a sense of cognitive escapism and even fantasy fulfillment, but this came at the cost of feeling anxious and pessimistic about the future.
Affiliation(s)
- Rania Selim: Faculty of Medicine, Imperial College London, London, United Kingdom
- Arunima Basu: Faculty of Medicine, Imperial College London, London, United Kingdom
- Ailin Anto: Faculty of Medicine, Imperial College London, London, United Kingdom
- Thomas Foscht: Department of Marketing, University of Graz, Styria, Austria
13
Liu Y, Kauttonen J, Zhao B, Li X, Peng W. Editorial: Towards Emotion AI to next generation healthcare and education. Front Psychol 2024; 15:1533053. [PMID: 39749281] [PMCID: PMC11694222] [DOI: 10.3389/fpsyg.2024.1533053]
Affiliation(s)
- Yang Liu: Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
- Janne Kauttonen: RDI & Competences, Haaga-Helia University of Applied Sciences, Helsinki, Finland
- Bowen Zhao: Guangzhou Institute of Technology, Xidian University, Guangzhou, China
- Xiaobai Li: School of Cyber Science and Technology, Zhejiang University, Hangzhou, China
- Wei Peng: Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States
14
Kolding S, Lundin RM, Hansen L, Østergaard SD. Use of generative artificial intelligence (AI) in psychiatry and mental health care: a systematic review. Acta Neuropsychiatr 2024; 37:e37. [PMID: 39523628] [DOI: 10.1017/neu.2024.50]
Abstract
OBJECTIVES Tools based on generative artificial intelligence (AI) such as ChatGPT have the potential to transform modern society, including the field of medicine. Due to the prominent role of language in psychiatry, e.g., for diagnostic assessment and psychotherapy, these tools may be particularly useful within this medical field. Therefore, the aim of this study was to systematically review the literature on generative AI applications in psychiatry and mental health. METHODS We conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search was conducted across three databases, and the resulting articles were screened independently by two researchers. The content, themes, and findings of the articles were qualitatively assessed. RESULTS The search and screening process resulted in the inclusion of 40 studies. The median year of publication was 2023. The themes covered in the articles were mainly mental health and well-being in general - with less emphasis on specific mental disorders (substance use disorder being the most prevalent). The majority of studies were conducted as prompt experiments, with the remaining studies comprising surveys, pilot studies, and case reports. Most studies focused on models that generate language, ChatGPT in particular. CONCLUSIONS Generative AI in psychiatry and mental health is a nascent but quickly expanding field. The literature mainly focuses on applications of ChatGPT, and finds that generative AI performs well, but notes that it is limited by significant safety and ethical concerns. Future research should strive to enhance transparency of methods, use experimental designs, ensure clinical relevance, and involve users/patients in the design phase.
Affiliation(s)
- Sara Kolding: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Robert M Lundin: Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Geelong, VIC, Australia; Mildura Base Public Hospital, Mental Health Services, Alcohol and Other Drugs Integrated Treatment Team, Mildura, VIC, Australia; Barwon Health, Change to Improve Mental Health (CHIME), Mental Health Drugs and Alcohol Services, Geelong, VIC, Australia
- Lasse Hansen: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Søren Dinesen Østergaard: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
15
Chen D, Liu Y, Guo Y, Zhang Y. The revolution of generative artificial intelligence in psychology: The interweaving of behavior, consciousness, and ethics. Acta Psychol (Amst) 2024; 251:104593. [PMID: 39522296] [DOI: 10.1016/j.actpsy.2024.104593]
Abstract
In recent years, the swift advancement of generative artificial intelligence (AI) in natural language processing, exemplified by ChatGPT, has created unparalleled prospects for psychological research. This review article examines the uses and effects of generative AI in psychology. We employed a systematic selection process, encompassing papers published between 2015 and 2024 from databases such as Google Scholar, PubMed, and IEEE Xplore, using keywords such as "Generative AI in psychology", "ChatGPT and behavior modeling", and "AI in mental health". First, the paper reviews the fundamental ideas of generative AI and lists its uses in data analysis, behavior modeling, and social interaction simulation, including a detailed comparison table contrasting conventional research methodologies with GenAI-based approaches in psychology studies. Next, it analyzes the theoretical and ethical issues that generative AI raises for psychological research, highlighting how crucial it is to develop a coherent theoretical framework. By contrasting traditional research methods with AI-driven methodologies, the review illustrates the benefits of generative AI in handling vast amounts of data and increasing research efficiency. Regarding particular uses, the study explores how generative AI might be used to simulate social interactions, analyze massive amounts of text, and study cognitive processes, and it discusses political, geographic, and other biases. In conclusion, the paper looks ahead to the future development of generative AI in psychology research and suggests techniques for improving it, including methodological solutions such as retrieval-augmented generation (RAG) and human-in-the-loop systems, as well as data privacy solutions such as open-source local LLMs. In summary, generative AI has the potential to revolutionize psychological research, but to maintain the moral and scientific integrity of the field, ethical and theoretical concerns must be carefully considered before applying the technology.
Affiliation(s)
- Dian Chen: School of Economics and Management, Southeast University, Nanjing, China
- Ying Liu: Peking Union Medical College, Beijing, China
- Yiting Guo: School of Economics and Management, Southeast University, Nanjing, China.
- Yulin Zhang: School of Economics and Management, Southeast University, Nanjing, China
16
Dana Z, Nagra H, Kilby K. Role of Synchronous, Moderated, and Anonymous Peer Support Chats on Reducing Momentary Loneliness in Older Adults: Retrospective Observational Study. JMIR Form Res 2024; 8:e59501. [PMID: 39453688] [PMCID: PMC11549579] [DOI: 10.2196/59501]
Abstract
BACKGROUND Older adults have a high rate of loneliness, which contributes to increased psychosocial risk, medical morbidity, and mortality. Digital emotional support interventions provide a convenient and rapid avenue for additional support. Digital peer support interventions for emotional struggles contrast the usual provider-based clinical care models because they offer more accessible, direct support for empowerment, highlighting the users' autonomy, competence, and relatedness. OBJECTIVE This study aims to examine a novel anonymous and synchronous peer-to-peer digital chat service facilitated by trained human moderators. The experience of a cohort of 699 adults aged ≥65 years was analyzed to determine (1) whether participation alone led to measurable aggregate change in momentary loneliness and optimism and (2) the impact of peers on momentary loneliness and optimism. METHODS Participants were each prompted with a single question: "What's your struggle?" Using a proprietary artificial intelligence model, each free-text response was used to automatically match the respondent, based on their self-expressed emotional struggle, to peers and a chat moderator. Exchanged messages were analyzed to quantitatively measure the change in momentary loneliness and optimism using a third-party, public, natural language processing model (GPT-4 [OpenAI]). The sentiment change analysis was initially performed at the individual level and then averaged across all users with similar emotion types to produce a statistically significant (P<.05) collective trend per emotion. To evaluate the peer impact on momentary loneliness and optimism, we performed propensity matching to align the moderator+single user and moderator+small group chat cohorts and then compared the emotion trends between the matched cohorts. RESULTS Loneliness and optimism trends significantly improved after 8 (P=.02) to 9 minutes (P=.03) into the chat. We observed a significant improvement in the momentary loneliness and optimism trends in the moderator+small group compared to the moderator+single user chat cohort after 19 (P=.049) and 21 minutes (P=.04) for optimism and loneliness, respectively. CONCLUSIONS Chat-based peer support may be a viable intervention to help address momentary loneliness in older adults and present an alternative to traditional care. The promising results support the need for further study to expand the evidence for such cost-effective options.
Affiliation(s)
- Zara Dana: Supportiv, Berkeley, CA, United States
17
Artsi Y, Sorin V, Glicksberg BS, Nadkarni GN, Klang E. Advancing Clinical Practice: The Potential of Multimodal Technology in Modern Medicine. J Clin Med 2024; 13:6246. [PMID: 39458196] [PMCID: PMC11508674] [DOI: 10.3390/jcm13206246]
Abstract
Multimodal technology is poised to revolutionize clinical practice by integrating artificial intelligence with traditional diagnostic modalities. This evolution traces its roots from Hippocrates' humoral theory to the use of sophisticated AI-driven platforms that synthesize data across multiple sensory channels. The interplay between historical medical practices and modern technology challenges conventional patient-clinician interactions and redefines diagnostic accuracy. Highlighting applications from neurology to radiology, the potential of multimodal technology emerges, suggesting a future where AI not only supports but enhances human sensory inputs in medical diagnostics. This shift invites the medical community to navigate the ethical, practical, and technological changes reshaping the landscape of clinical medicine.
Affiliation(s)
- Yaara Artsi: Azrieli Faculty of Medicine, Bar-Ilan University, Zefat 1311502, Israel
- Vera Sorin: Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Benjamin S. Glicksberg: Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Girish N. Nadkarni: Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Eyal Klang: Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
18
Elyoseph Z, Gur T, Haber Y, Simon T, Angert T, Navon Y, Tal A, Asman O. An Ethical Perspective on the Democratization of Mental Health With Generative AI. JMIR Ment Health 2024; 11:e58011. [PMID: 39417792] [PMCID: PMC11500620] [DOI: 10.2196/58011]
Abstract
Knowledge has become more open and accessible to a large audience with the "democratization of information" facilitated by technology. This paper provides a sociohistorical perspective for the theme issue "Responsible Design, Integration, and Use of Generative AI in Mental Health." It evaluates ethical considerations in using generative artificial intelligence (GenAI) for the democratization of mental health knowledge and practice. It explores the historical context of democratizing information, transitioning from restricted access to widespread availability due to the internet, open-source movements, and most recently, GenAI technologies such as large language models. The paper highlights why GenAI technologies represent a new phase in the democratization movement, offering unparalleled access to highly advanced technology as well as information. In the realm of mental health, this requires delicate and nuanced ethical deliberation. Including GenAI in mental health may allow, among other things, improved accessibility to mental health care, personalized responses, and conceptual flexibility, and could facilitate a flattening of traditional hierarchies between health care providers and patients. At the same time, it also entails significant risks and challenges that must be carefully addressed. To navigate these complexities, the paper proposes a strategic questionnaire for assessing artificial intelligence-based mental health applications. This tool evaluates both the benefits and the risks, emphasizing the need for a balanced and ethical approach to GenAI integration in mental health. The paper calls for a cautious yet positive approach to GenAI in mental health, advocating for the active engagement of mental health professionals in guiding GenAI development. It emphasizes the importance of ensuring that GenAI advancements are not only technologically sound but also ethically grounded and patient-centered.
Affiliation(s)
- Zohar Elyoseph: Department of Brain Sciences, Faculty of Medicine, Imperial College, Fulham Palace Rd, London, W6 8RF, United Kingdom; Faculty of Education, University of Haifa, Haifa, Israel
- Tamar Gur: The Adelson School of Entrepreneurship, Reichman University, Herzliya, Israel
- Yuval Haber: The PhD Program of Hermeneutics & Cultural Studies, Bar-Ilan University, Ramat Gan, Israel
- Tomer Simon: Microsoft Israel R&D Center, Tel Aviv, Israel
- Tal Angert: Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Yuval Navon: Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Amir Tal: Samueli Initiative for Responsible AI in Medicine, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Oren Asman: Samueli Initiative for Responsible AI in Medicine, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel; Department of Nursing, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
19
Park C, Kim J. Exploring Affective Representations in Emotional Narratives: An Exploratory Study Comparing ChatGPT and Human Responses. Cyberpsychology, Behavior and Social Networking 2024; 27:736-741. [PMID: 39229675] [DOI: 10.1089/cyber.2024.0100]
Abstract
While artificial intelligence (AI) has made significant advancements, its seeming lack of emotional ability has hindered effective communication with humans. This study explores how ChatGPT (ChatGPT-3.5, Mar 23, 2023 version) represents affective responses to emotional narratives and compares these responses to human responses. Thirty-four participants read affect-eliciting short stories and rated their emotional responses, and 10 recorded ChatGPT sessions generated responses to the same stories. Classification analyses revealed the successful identification of affective categories of stories, valence, and arousal within and across sessions for ChatGPT. Classification accuracies predicting affective categories of stories, valence, and arousal of humans based on the affective ratings of ChatGPT, and vice versa, were not significant, indicating differences in the way the affective states were represented. These findings suggest that ChatGPT can distinguish emotional states and generate affective responses consistently, but that there are differences in how affective states are represented between ChatGPT and humans. Understanding these mechanisms is crucial for improving emotional interactions with AI.
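As a rough illustration of the classification analyses described above (decoding a story's affective category from ratings within one rater type and across rater types), the sketch below runs a simple leave-one-out classifier on synthetic rating matrices. The data, dimensions, and results are invented and do not reproduce the study's analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)

# Synthetic affective ratings (rows = story presentations, columns = rating scales)
# for two rater types; labels are the stories' affective categories (0/1/2).
labels = np.repeat([0, 1, 2], 10)
human_ratings = rng.normal(labels[:, None], 1.0, size=(30, 4))
chatgpt_ratings = rng.normal(labels[:, None], 1.0, size=(30, 4))

clf = LogisticRegression(max_iter=1000)

# Within-rater classification: can category be decoded from one rater type's ratings?
within = cross_val_score(clf, chatgpt_ratings, labels, cv=LeaveOneOut()).mean()

# Cross-rater classification: train on ChatGPT ratings, test on human ratings.
cross = clf.fit(chatgpt_ratings, labels).score(human_ratings, labels)

print(f"within-ChatGPT accuracy: {within:.2f}")
print(f"ChatGPT -> human transfer accuracy: {cross:.2f}")
```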
Collapse
Affiliation(s)
- Chaery Park
- Department of Psychology, Jeonbuk National University, Jeonju, Republic of Korea
| | - Jongwan Kim
- Department of Psychology, Jeonbuk National University, Jeonju, Republic of Korea
| |
Collapse
|
20
|
Hadar-Shoval D, Asraf K, Shinan-Altman S, Elyoseph Z, Levkovich I. Embedded values-like shape ethical reasoning of large language models on primary care ethical dilemmas. Heliyon 2024; 10:e38056. [PMID: 39381244 PMCID: PMC11458949 DOI: 10.1016/j.heliyon.2024.e38056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2024] [Accepted: 09/17/2024] [Indexed: 10/10/2024] Open
Abstract
Objective This article uses the framework of Schwartz's values theory to examine whether the embedded values-like profiles within large language models (LLMs) impact ethical decision-making in dilemmas faced in primary care. It specifically aims to evaluate whether each LLM exhibits a distinct values-like profile, assess its alignment with general population values, and determine whether latent values influence clinical recommendations. Methods The Portrait Values Questionnaire-Revised (PVQ-RR) was submitted to each LLM (Claude, Bard, GPT-3.5, and GPT-4) 20 times to ensure reliable and valid responses. Their responses were compared to a benchmark derived from an international sample of over 53,000 culturally diverse respondents who completed the PVQ-RR. Four vignettes depicting prototypical professional quandaries involving conflicts between competing values were presented to the LLMs. The option selected by each LLM and the strength of its recommendation were evaluated to determine whether the underlying values-like profiles impact output. Results Each LLM demonstrated a unique values-like profile. Universalism and self-direction were prioritized, while power and tradition were assigned less importance than population benchmarks, suggesting potential Western-centric biases. Preliminary indications suggested that the embedded values-like profiles influence recommendations. Significant variances in confidence strength regarding chosen recommendations emerged between models, suggesting that further vetting is required before the LLMs can be relied on as judgment aids. However, the overall selection of preferences aligned with intrinsic value hierarchies. Conclusion The distinct intrinsic values-like profiles embedded within LLMs shape ethical decision-making, which carries implications for their integration in primary care settings serving diverse populations. For context-appropriate, equitable delivery of AI-assisted healthcare globally, it is essential that LLMs are tailored to align with cultural outlooks.
Collapse
Affiliation(s)
- Dorit Hadar-Shoval
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, Max Stern Yezreel Valley College, Israel
| | - Kfir Asraf
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, Max Stern Yezreel Valley College, Israel
| | - Shiri Shinan-Altman
- The Louis and Gabi Weisfeld School of Social Work, Bar-Ilan University, Ramat Gan, Israel
| | - Zohar Elyoseph
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, Max Stern Yezreel Valley College, Israel
- Department of Brain Sciences, Faculty of Medicine, Imperial College London, England
- Department of Counseling and Human Development, Department of Education, University of Haifa, Israel
| | | |
Collapse
|
21
|
Vzorin GD, Bukinich AM, Sedykh AV, Vetrova II, Sergienko EA. The Emotional Intelligence of the GPT-4 Large Language Model. PSYCHOLOGY IN RUSSIA: STATE OF ART 2024; 17:85-99. [PMID: 39552777 PMCID: PMC11562005 DOI: 10.11621/pir.2024.0206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2024] [Accepted: 06/13/2024] [Indexed: 11/19/2024] Open
Abstract
Background Advanced AI models such as the large language model GPT-4 demonstrate sophisticated intellectual capabilities, sometimes exceeding human intellectual performance. However, the emotional competency of these models, along with their underlying mechanisms, has not been sufficiently evaluated. Objective Our research aimed to explore different emotional intelligence domains in GPT-4 according to the Mayer-Salovey-Caruso model. We also tried to find out whether GPT-4's answer accuracy is consistent with its explanation of the answer. Design Sections of the Russian version of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) were used in this research, with questions presented as text prompts in separate, independent ChatGPT chats, three times each. Results The GPT-4 large language model achieved high scores on the Understanding Emotions scale (117, 124, and 128 across the three runs) and the Strategic Emotional Intelligence scale (118, 121, and 122). Average scores were obtained on the Managing Emotions scale (103, 108, and 110 points). However, the Using Emotions to Facilitate Thought scale yielded low and less reliable scores (85, 86, and 88 points). Four types of explanations for the answer choices were identified: Meaningless sentences; Relation declaration; Implicit logic; and Explicit logic. Correct answers were accompanied by all types of explanations, whereas incorrect answers were followed only by Meaningless sentences or Explicit logic. This distribution aligns with patterns observed in children as they explore and elucidate mental states. Conclusion GPT-4 is capable of identifying and managing emotions, but it lacks deep reflexive analysis of emotional experience and the motivational aspect of emotions.
Collapse
Affiliation(s)
- Gleb D. Vzorin
- Lomonosov Moscow State University, Russia
- Institute of Psychology of Russian Academy of Sciences, Moscow, Russia
| | - Alexey M. Bukinich
- Lomonosov Moscow State University, Russia
- Federal Scientific Center of Psychological and Interdisciplinary Research, Moscow, Russia
| | | | - Irina I. Vetrova
- Institute of Psychology of Russian Academy of Sciences, Moscow, Russia
| | | |
Collapse
|
22
|
Shinan-Altman S, Elyoseph Z, Levkovich I. The impact of history of depression and access to weapons on suicide risk assessment: a comparison of ChatGPT-3.5 and ChatGPT-4. PeerJ 2024; 12:e17468. [PMID: 38827287 PMCID: PMC11143969 DOI: 10.7717/peerj.17468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2023] [Accepted: 05/05/2024] [Indexed: 06/04/2024] Open
Abstract
The aim of this study was to evaluate the effectiveness of ChatGPT-3.5 and ChatGPT-4 in incorporating critical risk factors, namely history of depression and access to weapons, into suicide risk assessments. Both models assessed suicide risk using scenarios that featured individuals with and without a history of depression and access to weapons. The models estimated the likelihood of suicidal thoughts, suicide attempts, serious suicide attempts, and suicide-related mortality on a Likert scale. A multivariate three-way ANOVA with Bonferroni post hoc tests was conducted to examine the impact of the aforementioned independent factors (history of depression and access to weapons) on these outcome variables. Both models identified history of depression as a significant suicide risk factor. ChatGPT-4 demonstrated a more nuanced understanding of the relationship between depression, access to weapons, and suicide risk. In contrast, ChatGPT-3.5 displayed limited insight into this complex relationship. ChatGPT-4 consistently assigned higher severity ratings to suicide-related variables than did ChatGPT-3.5. The study highlights the potential of these two models, particularly ChatGPT-4, to enhance suicide risk assessment by considering complex risk factors.
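As a rough illustration of the analysis described, the sketch below runs a factorial ANOVA on simulated Likert-style risk ratings with history of depression, access to weapons, and model version as crossed factors, followed by Bonferroni-corrected pairwise comparisons. The simulated data, factor coding, and column names are assumptions; this is not the authors' code.

```python
# Illustrative sketch only: three-way factorial ANOVA on simulated risk ratings
# with Bonferroni-corrected pairwise comparisons; data and names are assumed.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import MultiComparison
from scipy import stats

rng = np.random.default_rng(1)

# Fully crossed design: depression history x weapon access x model, 10 reps per cell.
cells = pd.MultiIndex.from_product(
    [["yes", "no"], ["yes", "no"], ["gpt35", "gpt4"]],
    names=["depression", "weapons", "model"],
).to_frame(index=False)
df = cells.loc[cells.index.repeat(10)].reset_index(drop=True)
df["rating"] = rng.integers(1, 8, size=len(df))   # simulated 1-7 Likert ratings

# Three-way ANOVA (type II sums of squares).
fit = ols("rating ~ C(depression) * C(weapons) * C(model)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))

# Bonferroni-corrected pairwise t tests across the 8 factor combinations.
df["cell"] = df["depression"] + "/" + df["weapons"] + "/" + df["model"]
res = MultiComparison(df["rating"], df["cell"]).allpairtest(stats.ttest_ind, method="bonf")
print(res[0])   # corrected pairwise comparison table
```

In a design like this, a significant depression-by-weapons interaction (rather than two isolated main effects) is what would correspond to the "more nuanced understanding" attributed to ChatGPT-4 in the abstract.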
Collapse
Affiliation(s)
| | - Zohar Elyoseph
- Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, England, United Kingdom
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, Max Stern Yezreel Valley College, Emek Yezreel, Israel
| | - Inbar Levkovich
- Faculty of Graduate Studies, Oranim Academic College of Education, Kiryat Tiv’on, Israel
| |
Collapse
|
23
|
Haber Y, Levkovich I, Hadar-Shoval D, Elyoseph Z. The Artificial Third: A Broad View of the Effects of Introducing Generative Artificial Intelligence on Psychotherapy. JMIR Ment Health 2024; 11:e54781. [PMID: 38787297 PMCID: PMC11137430 DOI: 10.2196/54781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/22/2023] [Revised: 03/24/2024] [Accepted: 04/18/2024] [Indexed: 05/25/2024] Open
Abstract
Unlabelled This paper explores a significant shift in the field of mental health in general and psychotherapy in particular following generative artificial intelligence's new capabilities in processing and generating humanlike language. Following Freud, this lingo-technological development is conceptualized as the "fourth narcissistic blow" that science inflicts on humanity. We argue that this narcissistic blow has a potentially dramatic influence on perceptions of human society, interrelationships, and the self. We should, accordingly, expect dramatic changes in perceptions of the therapeutic act following the emergence of what we term the artificial third in the field of psychotherapy. The introduction of an artificial third marks a critical juncture, prompting us to ask the following important core questions that address two basic elements of critical thinking, namely, transparency and autonomy: (1) What is this new artificial presence in therapy relationships? (2) How does it reshape our perception of ourselves and our interpersonal dynamics? and (3) What remains of the irreplaceable human elements at the core of therapy? Given the ethical implications that arise from these questions, this paper proposes that the artificial third can be a valuable asset when applied with insight and ethical consideration, enhancing but not replacing the human touch in therapy.
Collapse
Affiliation(s)
- Yuval Haber
- The PhD Program of Hermeneutics and Cultural Studies, Interdisciplinary Studies Unit, Bar-Ilan University, Ramat Gan, Israel
| | | | - Dorit Hadar-Shoval
- Department of Psychology and Educational Counseling, The Max Stern Yezreel Valley College, Emek Yezreel, Israel
| | - Zohar Elyoseph
- Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, The Max Stern Yezreel Valley College, Emek Yezreel, Israel
| |
Collapse
|
24
|
Hadar-Shoval D, Asraf K, Mizrachi Y, Haber Y, Elyoseph Z. Assessing the Alignment of Large Language Models With Human Values for Mental Health Integration: Cross-Sectional Study Using Schwartz's Theory of Basic Values. JMIR Ment Health 2024; 11:e55988. [PMID: 38593424 PMCID: PMC11040439 DOI: 10.2196/55988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/02/2024] [Revised: 03/01/2024] [Accepted: 03/08/2024] [Indexed: 04/11/2024] Open
Abstract
BACKGROUND Large language models (LLMs) hold potential for mental health applications. However, their opaque alignment processes may embed biases that shape problematic perspectives. Evaluating the values embedded within LLMs that guide their decision-making has ethical importance. Schwartz's theory of basic values (STBV) provides a framework for quantifying cultural value orientations and has shown utility for examining values in mental health contexts, including cultural, diagnostic, and therapist-client dynamics. OBJECTIVE This study aimed to (1) evaluate whether the STBV can measure value-like constructs within leading LLMs and (2) determine whether LLMs exhibit distinct value-like patterns from humans and each other. METHODS In total, 4 LLMs (Bard, Claude 2, Generative Pretrained Transformer [GPT]-3.5, GPT-4) were anthropomorphized and instructed to complete the Portrait Values Questionnaire-Revised (PVQ-RR) to assess value-like constructs. Their responses over 10 trials were analyzed for reliability and validity. To benchmark the LLMs' value profiles, their results were compared to published data from a diverse sample of 53,472 individuals across 49 nations who had completed the PVQ-RR. This allowed us to assess whether the LLMs diverged from established human value patterns across cultural groups. Value profiles were also compared between models via statistical tests. RESULTS The PVQ-RR showed good reliability and validity for quantifying value-like infrastructure within the LLMs. However, substantial divergence emerged between the LLMs' value profiles and population data. The models lacked consensus and exhibited distinct motivational biases, reflecting opaque alignment processes. For example, all models prioritized universalism and self-direction, while de-emphasizing achievement, power, and security relative to humans. Successful discriminant analysis differentiated the 4 LLMs' distinct value profiles. Further examination found that the biased value profiles strongly predicted the LLMs' responses when presented with mental health dilemmas requiring a choice between opposing values. This provided further validation that the models embed distinct motivational value-like constructs that shape their decision-making. CONCLUSIONS This study leveraged the STBV to map the motivational value-like infrastructure underpinning leading LLMs. Although the study demonstrated that the STBV can effectively characterize value-like infrastructure within LLMs, substantial divergence from human values raises ethical concerns about aligning these models with mental health applications. The biases toward certain cultural value sets pose risks if integrated without proper safeguards. For example, prioritizing universalism could promote unconditional acceptance even when clinically unwise. Furthermore, the differences between the LLMs underscore the need to standardize alignment processes to capture true cultural diversity. Thus, any responsible integration of LLMs into mental health care must account for their embedded biases and motivation mismatches to ensure equitable delivery across diverse populations. Achieving this will require transparency and refinement of alignment techniques to instill comprehensive human values.
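A minimal sketch of the discriminant-analysis step mentioned in the results, assuming each model contributes 10 trial-level PVQ-RR value profiles: the profiles below are simulated, and the feature count, model names, and use of scikit-learn's LinearDiscriminantAnalysis are illustrative assumptions rather than the study's actual data or code.

```python
# Illustrative sketch only: discriminant analysis separating four LLMs by
# simulated PVQ-RR value profiles; no real study data are used here.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(2)
models = ["Bard", "Claude 2", "GPT-3.5", "GPT-4"]
n_trials, n_values = 10, 19           # 10 trials per model, 19 PVQ-RR value scores

# Give each model its own mean profile so the toy groups are separable.
profiles, labels = [], []
for i, name in enumerate(models):
    center = rng.normal(loc=0.5 * i, scale=1.0, size=n_values)
    profiles.append(center + rng.normal(scale=0.3, size=(n_trials, n_values)))
    labels += [name] * n_trials
X, y = np.vstack(profiles), np.array(labels)

lda = LinearDiscriminantAnalysis()
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
acc = cross_val_score(lda, X, y, cv=cv).mean()
print(f"cross-validated accuracy discriminating the four models: {acc:.2f}")
```

High cross-validated accuracy in this setting is what a "successful discriminant analysis" amounts to: the four models occupy distinguishable regions of value space rather than sharing a common profile.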
Collapse
Affiliation(s)
- Dorit Hadar-Shoval
- The Psychology Department, Max Stern Yezreel Valley College, Tel Adashim, Israel
| | - Kfir Asraf
- The Psychology Department, Max Stern Yezreel Valley College, Tel Adashim, Israel
| | - Yonathan Mizrachi
- The Jane Goodall Institute, Max Stern Yezreel Valley College, Tel Adashim, Israel
- The Laboratory for AI, Machine Learning, Business & Data Analytics, Tel-Aviv University, Tel Aviv, Israel
| | - Yuval Haber
- The PhD Program of Hermeneutics and Cultural Studies, Interdisciplinary Studies Unit, Bar-Ilan University, Ramat Gan, Israel
| | - Zohar Elyoseph
- The Psychology Department, Center for Psychobiological Research, Max Stern Yezreel Valley College, Tel Adashim, Israel
- Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
| |
Collapse
|
25
|
Elyoseph Z, Levkovich I. Comparing the Perspectives of Generative AI, Mental Health Experts, and the General Public on Schizophrenia Recovery: Case Vignette Study. JMIR Ment Health 2024; 11:e53043. [PMID: 38533615 PMCID: PMC11004608 DOI: 10.2196/53043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/24/2023] [Revised: 01/24/2024] [Accepted: 02/11/2024] [Indexed: 03/28/2024] Open
Abstract
Background The current paradigm in mental health care focuses on clinical recovery and symptom remission. This model's efficacy is influenced by therapist trust in patient recovery potential and the depth of the therapeutic relationship. Schizophrenia is a chronic illness with severe symptoms where the possibility of recovery is a matter of debate. As artificial intelligence (AI) becomes integrated into the health care field, it is important to examine its ability to assess recovery potential in major psychiatric disorders such as schizophrenia. Objective This study aimed to evaluate the ability of large language models (LLMs), in comparison to mental health professionals, to assess the prognosis of schizophrenia with and without professional treatment and the long-term positive and negative outcomes. Methods Vignettes were input into the LLM interfaces and assessed 10 times each by 4 AI platforms: ChatGPT-3.5, ChatGPT-4, Google Bard, and Claude. A total of 80 evaluations were collected and benchmarked against existing norms capturing what mental health professionals (general practitioners, psychiatrists, clinical psychologists, and mental health nurses) and the general public think about schizophrenia prognosis with and without professional treatment and about the positive and negative long-term outcomes of schizophrenia interventions. Results For the prognosis of schizophrenia with professional treatment, ChatGPT-3.5 was notably pessimistic, whereas ChatGPT-4, Claude, and Bard aligned with professional views but differed from the general public. All LLMs predicted that schizophrenia would remain static or worsen without professional treatment. For long-term outcomes, ChatGPT-4 and Claude predicted more negative outcomes than Bard and ChatGPT-3.5. For positive outcomes, ChatGPT-3.5 and Claude were more pessimistic than Bard and ChatGPT-4. Conclusions The finding that 3 out of the 4 LLMs aligned closely with the predictions of mental health professionals under the "with treatment" condition demonstrates the potential of this technology to provide professional clinical prognoses. The pessimistic assessment by ChatGPT-3.5 is a disturbing finding, since it may reduce the motivation of patients to start or persist with treatment for schizophrenia. Overall, although LLMs hold promise in augmenting health care, their application necessitates rigorous validation and a harmonious blend with human expertise.
Collapse
Affiliation(s)
- Zohar Elyoseph
- Department of Brain Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
- The Center for Psychobiological Research, Department of Psychology and Educational Counseling, Max Stern Yezreel Valley College, Emek Yezreel, Israel
| | - Inbar Levkovich
- Faculty of Graduate Studies, Oranim Academic College, Kiryat Tiv'on, Israel
| |
Collapse
|