1.
Davis J, Van Bulck L, Durieux BN, Lindvall C. The Temperature Feature of ChatGPT: Modifying Creativity for Clinical Research. JMIR Hum Factors 2024;11:e53559. PMID: 38457221; PMCID: PMC10960206; DOI: 10.2196/53559.
Abstract
More clinicians and researchers are exploring uses for large language model chatbots, such as ChatGPT, for research, dissemination, and educational purposes. Therefore, it becomes increasingly relevant to consider the full potential of this tool, including the special features that are currently available through the application programming interface. One of these features is a variable called temperature, which changes the degree to which randomness is involved in the model's generated output. This is of particular interest to clinicians and researchers. By lowering this variable, one can generate more consistent outputs; by increasing it, one can receive more creative responses. For clinicians and researchers who are exploring these tools for a variety of tasks, the ability to tailor outputs to be less creative may be beneficial for work that demands consistency. Additionally, access to more creative text generation may enable scientific authors to describe their research in more general language and potentially connect with a broader public through social media. In this viewpoint, we present the temperature feature, discuss potential uses, and provide some examples.
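The temperature mechanism described above can be illustrated with a short, self-contained sketch (not the authors' code, and independent of any particular API): temperature divides the model's logits before the softmax, so low values sharpen the next-token distribution toward consistent output and high values flatten it toward more varied output.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax.
    Lower temperature -> sharper (more deterministic) distribution;
    higher temperature -> flatter (more varied) distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits, for illustration only
logits = [2.0, 1.0, 0.2]
cold = softmax_with_temperature(logits, 0.2)  # near-greedy sampling
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
```

With these example logits, the top token's probability is far higher at temperature 0.2 than at 2.0, which is why low settings suit tasks demanding consistency and high settings suit creative rephrasing.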
Affiliation(s)
- Joshua Davis
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, MA, United States
- Albany Medical College, Albany, NY, United States
- Liesbet Van Bulck
- KU Leuven Department of Public Health and Primary Care, KU Leuven-University of Leuven, Leuven, Belgium
- Research Foundation Flanders (FWO), Brussels, Belgium
- Brigitte N Durieux
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, MA, United States
- Charlotta Lindvall
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, MA, United States
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Harvard Medical School, Harvard University, Boston, MA, United States
2.
Chen H, Cohen E, Wilson D, Alfred M. A Machine Learning Approach with Human-AI Collaboration for Automated Classification of Patient Safety Event Reports: Algorithm Development and Validation Study. JMIR Hum Factors 2024;11:e53378. PMID: 38271086; PMCID: PMC10853856; DOI: 10.2196/53378.
Abstract
BACKGROUND: Adverse events refer to incidents with potential or actual harm to patients in hospitals. These events are typically documented through patient safety event (PSE) reports, which consist of detailed narratives providing contextual information on the occurrences. Accurate classification of PSE reports is crucial for patient safety monitoring. However, this process faces challenges due to inconsistencies in classifications and the sheer volume of reports. Recent advancements in text representation, particularly contextual text representation derived from transformer-based language models, offer a promising solution for more precise PSE report classification. Integrating the machine learning (ML) classifier necessitates a balance between human expertise and artificial intelligence (AI). Central to this integration is the concept of explainability, which is crucial for building trust and ensuring effective human-AI collaboration.
OBJECTIVE: This study aims to investigate the efficacy of ML classifiers trained using contextual text representation in automatically classifying PSE reports. Furthermore, the study presents an interface that integrates the ML classifier with the explainability technique to facilitate human-AI collaboration for PSE report classification.
METHODS: This study used a data set of 861 PSE reports from a large academic hospital's maternity units in the Southeastern United States. Various ML classifiers were trained with both static and contextual text representations of PSE reports. The trained ML classifiers were evaluated with multiclass classification metrics and the confusion matrix. The local interpretable model-agnostic explanations (LIME) technique was used to provide the rationale for the ML classifier's predictions. An interface that integrates the ML classifier with the LIME technique was designed for incident reporting systems.
RESULTS: The top-performing classifier using contextual representation was able to obtain an accuracy of 75.4% (95/126) compared to an accuracy of 66.7% (84/126) by the top-performing classifier trained using static text representation. A PSE reporting interface has been designed to facilitate human-AI collaboration in PSE report classification. In this design, the ML classifier recommends the top 2 most probable event types, along with the explanations for the prediction, enabling PSE reporters and patient safety analysts to choose the most suitable one. The LIME technique showed that the classifier occasionally relies on arbitrary words for classification, emphasizing the necessity of human oversight.
CONCLUSIONS: This study demonstrates that training ML classifiers with contextual text representations can significantly enhance the accuracy of PSE report classification. The interface designed in this study lays the foundation for human-AI collaboration in the classification of PSE reports. The insights gained from this research enhance the decision-making process in PSE report classification, enabling hospitals to more efficiently identify potential risks and hazards and enabling patient safety analysts to take timely actions to prevent patient harm.
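The perturbation idea behind LIME can be sketched in a few lines. The example below is a simplified, deterministic word-occlusion variant of that idea, not the study's actual LIME pipeline; the toy classifier and its keyword weights are invented purely for illustration.

```python
def toy_classifier(texts):
    # Hypothetical stand-in for a trained PSE-report classifier:
    # returns a pseudo-probability that each report describes a
    # medication-related event, based on invented keyword weights.
    keyword_weights = {"dose": 0.4, "medication": 0.4, "insulin": 0.2}
    probs = []
    for t in texts:
        words = t.lower().split()
        probs.append(min(1.0, sum(keyword_weights.get(w, 0.0) for w in words)))
    return probs

def occlusion_explanation(text, classifier_fn):
    """Score each word by how much the prediction drops when that word
    is removed -- a simplified, deterministic cousin of LIME's random
    perturbation sampling around a single input."""
    words = text.split()
    base = classifier_fn([text])[0]
    scores = {}
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        scores[w] = base - classifier_fn([perturbed])[0]
    return sorted(scores.items(), key=lambda kv: -kv[1])

report = "patient received wrong insulin dose at night"
explanation = occlusion_explanation(report, toy_classifier)
```

Surfacing the top-weighted words alongside the predicted event type is the kind of rationale the interface described above shows to reporters and analysts; it also exposes cases where the classifier leans on arbitrary words, which is why human oversight remains necessary.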
Affiliation(s)
- Hongbo Chen
- Department of Mechanical & Industrial Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, Canada
- Eldan Cohen
- Department of Mechanical & Industrial Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, Canada
- Dulaney Wilson
- Department of Public Health Sciences, College of Medicine, Medical University of South Carolina, Charleston, SC, United States
- Myrtede Alfred
- Department of Mechanical & Industrial Engineering, Faculty of Applied Science & Engineering, University of Toronto, Toronto, ON, Canada
3.
Cheng SL, Tsai SJ, Bai YM, Ko CH, Hsu CW, Yang FC, Tsai CK, Tu YK, Yang SN, Tseng PT, Hsu TW, Liang CS, Su KP. Comparisons of Quality, Correctness, and Similarity Between ChatGPT-Generated and Human-Written Abstracts for Basic Research: Cross-Sectional Study. J Med Internet Res 2023;25:e51229. PMID: 38145486; PMCID: PMC10760418; DOI: 10.2196/51229.
Abstract
BACKGROUND: ChatGPT may act as a research assistant to help organize the direction of thinking and summarize research findings. However, few studies have examined the quality, similarity (abstracts being similar to the original one), and accuracy of the abstracts generated by ChatGPT when researchers provide full-text basic research papers.
OBJECTIVE: We aimed to assess the applicability of an artificial intelligence (AI) model in generating abstracts for basic preclinical research.
METHODS: We selected 30 basic research papers from Nature, Genome Biology, and Biological Psychiatry. Excluding abstracts, we inputted the full text into ChatPDF, an application of a language model based on ChatGPT, and we prompted it to generate abstracts with the same style as used in the original papers. A total of 8 experts were invited to evaluate the quality of these abstracts (based on a Likert scale of 0-10) and identify which abstracts were generated by ChatPDF, using a blind approach. These abstracts were also evaluated for their similarity to the original abstracts and the accuracy of the AI content.
RESULTS: The quality of ChatGPT-generated abstracts was lower than that of the actual abstracts (10-point Likert scale: mean 4.72, SD 2.09 vs mean 8.09, SD 1.03; P<.001). The difference in quality was significant in the unstructured format (mean difference -4.33; 95% CI -4.79 to -3.86; P<.001) but minimal in the 4-subheading structured format (mean difference -2.33; 95% CI -2.79 to -1.86). Among the 30 ChatGPT-generated abstracts, 3 showed wrong conclusions, and 10 were identified as AI content. The mean percentage of similarity between the original and the generated abstracts was not high (2.10%-4.40%). The blinded reviewers achieved a 93% (224/240) accuracy rate in guessing which abstracts were written using ChatGPT.
CONCLUSIONS: Using ChatGPT to generate a scientific abstract may not lead to issues of similarity when using real full texts written by humans. However, the quality of the ChatGPT-generated abstracts was suboptimal, and their accuracy was not 100%.
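A similarity percentage of the kind reported above is commonly computed from overlapping word n-grams. The sketch below is illustrative only, assuming a simple trigram-overlap definition; it is not the similarity tool used in the study.

```python
def ngram_similarity(a, b, n=3):
    """Percentage of word n-grams in text `a` that also appear in text `b`.
    A rough, illustrative stand-in for plagiarism-style similarity scores;
    real checkers use more sophisticated matching."""
    def ngrams(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    grams_a, grams_b = ngrams(a), ngrams(b)
    if not grams_a:
        return 0.0
    return 100.0 * len(grams_a & grams_b) / len(grams_a)
```

Under this definition, a generated abstract that reuses almost no three-word phrases from the original scores in the low single digits, consistent with the 2.10%-4.40% range reported.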
Affiliation(s)
- Shu-Li Cheng
- Department of Nursing, Mackay Medical College, Taipei, Taiwan
- Shih-Jen Tsai
- Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
- Division of Psychiatry, School of Medicine, National Yang-Ming University, Taipei, Taiwan
- Ya-Mei Bai
- Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
- Division of Psychiatry, School of Medicine, National Yang-Ming University, Taipei, Taiwan
- Chih-Hung Ko
- Department of Psychiatry, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan
- Department of Psychiatry, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Department of Psychiatry, Kaohsiung Municipal Siaogang Hospital, Kaohsiung Medical University, Kaohsiung, Taiwan
- Chih-Wei Hsu
- Department of Psychiatry, Kaohsiung Chang Gung Memorial Hospital, Kaohsiung, Taiwan
- Fu-Chi Yang
- Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Chia-Kuang Tsai
- Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Yu-Kang Tu
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Department of Dentistry, National Taiwan University Hospital, Taipei, Taiwan
- Szu-Nian Yang
- Department of Psychiatry, Tri-service Hospital, Beitou branch, Taipei, Taiwan
- Department of Psychiatry, Armed Forces Taoyuan General Hospital, Taoyuan, Taiwan
- Graduate Institute of Health and Welfare Policy, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Ping-Tao Tseng
- Institute of Biomedical Sciences, Institute of Precision Medicine, National Sun Yat-sen University, Kaohsiung, Taiwan
- Department of Psychology, College of Medical and Health Science, Asia University, Taichung, Taiwan
- Prospect Clinic for Otorhinolaryngology and Neurology, Kaohsiung, Taiwan
- Tien-Wei Hsu
- Department of Psychiatry, E-Da Dachang Hospital, I-Shou University, Kaohsiung, Taiwan
- Department of Psychiatry, E-Da Hospital, I-Shou University, Kaohsiung, Taiwan
- Chih-Sung Liang
- Department of Psychiatry, Tri-service Hospital, Beitou branch, Taipei, Taiwan
- Department of Psychiatry, National Defense Medical Center, Taipei, Taiwan
- Kuan-Pin Su
- College of Medicine, China Medical University, Taichung, Taiwan
- Mind-Body Interface Laboratory, China Medical University and Hospital, Taichung, Taiwan
- An-Nan Hospital, China Medical University, Tainan, Taiwan