1
Meer A, Rahm P, Schwendinger M, Vock M, Grunder B, Demurtas J, Rutishauser J. Safety of patient self-triage: real-life prospective evaluation of a symptom-checker in adult patients visiting an interdisciplinary emergency care center. J Med Internet Res 2024. [PMID: 38809606 DOI: 10.2196/58157]
Abstract
BACKGROUND Symptom-checkers have become important tools for self-triage, helping patients determine the urgency of medical care. To be safe and effective, these tools must be validated, particularly to avoid potentially hazardous undertriage without causing inefficient overtriage. So far, only limited safety data from small-sample studies have been available. OBJECTIVE The objective of our study was to prospectively investigate the safety of patients' self-triage in a large patient sample. We used SMASS pathfinder, a symptom-checker based on a computerized transparent neural network. METHODS We recruited 2543 patients into this single-center, prospective clinical trial conducted at the Cantonal Hospital Baden, Switzerland. Patients with an Emergency Severity Index of 1-2 were treated by the emergency department (ED) team, while those with an index of 3-5 were seen by general physicians at the walk-in clinic (WIC). We compared the triage recommendation obtained by the patients' self-triage with the assessment of clinical urgency made by three successive interdisciplinary panels of physicians (Panels A, B, and C). Using a Clopper-Pearson confidence interval, we required that, to confirm the symptom-checker's safety, the upper confidence bound for the probability of a potentially hazardous undertriage lie below 1%. A potentially hazardous undertriage was defined as a triage in which either all (consensus criterion) or the majority (majority criterion) of the experts of the last panel (Panel C) rated the symptom-checker's triage as "rather likely" or "likely" life-threatening or harmful. RESULTS Of the 2543 patients, 1227 (48.3%) were female and 1316 (51.7%) male. None of the patients met the prespecified consensus criterion for a potentially hazardous undertriage, yielding an upper 95% confidence bound of 0.1184%. Four cases met the majority criterion, yielding an upper 95% confidence bound for the probability of a potentially hazardous undertriage of 0.3616%. The two-sided 95% Clopper-Pearson confidence interval for the probability of overtriage (450 cases, 17.7%) was 16.23% to 19.24%, considerably lower than figures reported in the literature. CONCLUSIONS The symptom-checker proved to be a safe triage tool, avoiding potentially hazardous undertriage in a real-life clinical setting of emergency consultations at a WIC/ED, without causing undesirable overtriage. Our data suggest the symptom-checker may be safely used in clinical routine. CLINICALTRIAL ClinicalTrials.gov identifier NCT04055298.
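The zero-event and four-event upper bounds quoted above follow from the exact binomial (Clopper-Pearson) construction. A minimal stdlib-only sketch of the one-sided upper bound via bisection on the binomial CDF (illustrative only, not the authors' code; the exact figure depends on the analysed denominator):

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); cheap for small k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def cp_upper(k: int, n: int, alpha: float = 0.05, tol: float = 1e-10) -> float:
    """One-sided Clopper-Pearson upper bound: the p solving P(X <= k; p) = alpha.
    The binomial CDF decreases in p, so plain bisection suffices."""
    lo, hi = k / n, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > alpha:
            lo = mid  # CDF still above alpha: true bound lies higher
        else:
            hi = mid
    return hi

# 0 undertriage events in 2543 patients -> bound well below the 1% safety margin
print(f"{cp_upper(0, 2543):.4%}")  # ≈ 0.1177% with n=2543; the paper reports 0.1184%
```

With 4 events the same routine gives roughly 0.36%, matching the order of the reported majority-criterion bound.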
Affiliation(s)
- Philipp Rahm
- Emergency Department, Cantonal Hospital Baden, Baden, Switzerland
- Michael Vock
- Institute of Mathematical Statistics and Actuarial Science, University of Berne, Berne, Switzerland
2
Augusto Duenhas Accorsi T, Tocci Moreira F, Aires Eduardo A, Albaladejo Morbeck R, Francine Köhler K, De Amicis Lima K, Henrique Sartorato Pedrotti C. Outcome After Self-Triage App Referral in Urgent Direct-to-Consumer Telemedicine Encounter. Telemed J E Health 2024. [PMID: 38805348 DOI: 10.1089/tmj.2024.0126]
Abstract
Background: The quantification of self-triage effectiveness, guided by mobile applications, in urgent direct-to-consumer telemedicine (TM) encounters requires further investigation. The objective of this study was to evaluate the outcomes of referral guidance provided by a symptom-based self-management mobile application decision algorithm in the context of remote urgent care assessments. Methods: An observational retrospective single-center study was conducted from May 2022 to December 2023. The inclusion criteria encompassed individuals aged >18 years who spontaneously sought virtual emergency care through the EINSTEIN CONECTA application. Patients experiencing connectivity issues that prevented completion of the encounter were excluded. The primary outcomes were the rate of patient concurrence with the algorithm's recommendation to seek in-person emergency care and the referral rate to face-to-face assessment among cases evaluated through TM. The application's algorithm employs symptom-based scientific evidence to recommend referrals to emergency departments (EDs). Results: Of 88,834 patients connected to the TM Center, self-triage obviated the need for virtual physician assessment in 53,302 (60%) encounters. A total of 35,532 patients were remotely evaluated by 316 on-duty physicians, resulting in 1,125 ICD-coded diagnoses. Among these, 21,722 (61.1%) were initially advised by self-triage to visit the ED, with subsequent medical assessment leading to in-person referrals in 6,354 (29.3%) of the evaluations. Of the 13,810 patients advised to continue with virtual care after self-triage, 157 (1.1%) were referred for in-person assessment. Conclusions: Self-triage effectively reduced the need for physician encounters in approximately three-fifths of TM consultations. Despite being based on scientific evidence, symptom-based referral algorithms demonstrated high sensitivity but poor correlation with physician decision-making.
3
Sellin J, Pantel JT, Börsch N, Conrad R, Mücke M. [Short paths to diagnosis with artificial intelligence: systematic literature review on diagnostic decision support systems]. Schmerz 2024; 38:19-27. [PMID: 38165492 DOI: 10.1007/s00482-023-00777-8]
Abstract
BACKGROUND Rare diseases are often recognized late. Their diagnosis is particularly challenging because of the diversity, complexity, and heterogeneity of clinical symptoms. Computer-based diagnostic aids, often referred to as diagnostic decision support systems (DDSS), are promising tools for shortening the time to diagnosis. Despite initial positive evaluations, DDSS are not yet widely used, partly because they are rarely integrated with existing clinical or practice information systems. OBJECTIVE This article provides an overview of existing diagnostic support systems that function without access to electronic patient records and require only information that is easily obtainable. MATERIALS AND METHODS A systematic literature search identified eight articles on DDSS that can assist in the diagnosis of rare diseases without needing access to electronic patient records or other information systems in practices and hospitals. The main advantages and disadvantages of the identified systems were extracted and summarized. RESULTS Symptom checkers and DDSS based on portrait photos and pain drawings already exist, with varying degrees of maturity. CONCLUSION DDSS still face a number of challenges, such as concerns about data protection and accuracy, and their acceptance and awareness remain rather low. On the other hand, they hold great potential for faster diagnosis, especially for rare diseases, which are easily overlooked owing to their large number and the low awareness of them. The use of DDSS should therefore be carefully considered by doctors on a case-by-case basis.
Affiliation(s)
- Julia Sellin
- Institut für Digitale Allgemeinmedizin, Universitätsklinikum RWTH Aachen, Aachen, Germany
- Zentrum für Seltene Erkrankungen Aachen (ZSEA), Universitätsklinikum RWTH Aachen, Aachen, Germany
- Jean Tori Pantel
- Institut für Digitale Allgemeinmedizin, Universitätsklinikum RWTH Aachen, Aachen, Germany
- Zentrum für Seltene Erkrankungen Aachen (ZSEA), Universitätsklinikum RWTH Aachen, Aachen, Germany
- Natalie Börsch
- Institut für Digitale Allgemeinmedizin, Universitätsklinikum RWTH Aachen, Aachen, Germany
- Zentrum für Seltene Erkrankungen Aachen (ZSEA), Universitätsklinikum RWTH Aachen, Aachen, Germany
- Rupert Conrad
- Klinik für Psychosomatische Medizin und Psychotherapie, Universitätsklinikum Münster, Münster, Germany
- Martin Mücke
- Institut für Digitale Allgemeinmedizin, Universitätsklinikum RWTH Aachen, Aachen, Germany
- Zentrum für Seltene Erkrankungen Aachen (ZSEA), Universitätsklinikum RWTH Aachen, Aachen, Germany
4
Aleksandra S, Robert K, Klaudia K, Dawid L, Mariusz S. Artificial Intelligence in Optimizing the Functioning of Emergency Departments; a Systematic Review of Current Solutions. Arch Acad Emerg Med 2024; 12:e22. [PMID: 38572221 PMCID: PMC10988184 DOI: 10.22037/aaem.v12i1.2110]
Abstract
Introduction The burgeoning burden on emergency departments is a global challenge that we have been confronting for many years. Emerging artificial intelligence (AI)-based solutions may constitute a critical component in the optimization of these units. This systematic review was conducted to thoroughly examine and summarize currently available AI solutions, assess the potential benefits of their implementation, and identify anticipated directions of further development in this rapidly evolving field. Methods This systematic review drew on three key scientific databases: PubMed (2045 publications), Scopus (877 publications), and Web of Science (2495 publications). After removal of duplicates, we analyzed 2052 articles, including 147 full-text papers, from which we selected the 51 most pertinent and representative publications for the review. Results Overall, the present research indicates that, given the high accuracy and sensitivity of machine learning (ML) models, it is reasonable to use AI to support doctors, as it can suggest potential diagnoses and thereby save time and resources. However, AI-generated diagnoses should be verified by a doctor, as AI is not infallible. Conclusions Currently available AI algorithms can analyze complex medical data with unprecedented precision and speed. Despite AI's vast potential, it is still a nascent technology, often perceived as complicated and challenging to implement. We propose that close collaboration between medical professionals and AI experts is pivotal to effectively harnessing this technology. Future research should focus on further refining AI algorithms, performing comprehensive validation, and introducing suitable legal regulations and standard procedures, thereby fully leveraging the potential of AI to enhance the quality and efficiency of healthcare delivery.
Affiliation(s)
- Szymczyk Aleksandra
- Department of Emergency Medicine, Medical University of Gdansk, Smoluchowskiego 17, 80-214 Gdansk, Poland
5
Welzel C, Cotte F, Wekenborg M, Vasey B, McCulloch P, Gilbert S. Holistic Human-Serving Digitization of Health Care Needs Integrated Automated System-Level Assessment Tools. J Med Internet Res 2023; 25:e50158. [PMID: 38117545 PMCID: PMC10765286 DOI: 10.2196/50158]
Abstract
Digital health tools, platforms, and artificial intelligence- or machine learning-based clinical decision support systems are increasingly part of health delivery approaches, with an ever-greater degree of system interaction. Critical to the successful deployment of these tools is their functional integration into existing clinical routines and workflows. This depends on system interoperability and on intuitive and safe user interface design. The importance of minimizing emergent workflow stress through human factors research and purposeful design for integration cannot be overstated. Usability of tools in practice is as important as algorithm quality. Regulatory and health technology assessment frameworks recognize the importance of these factors to a certain extent, but their focus remains mainly on the individual product rather than on emergent system and workflow effects. The measurement of performance and user experience has so far been performed in ad hoc, nonstandardized ways by individual actors using their own evaluation approaches. We propose that a standard framework for system-level and holistic evaluation could be built into interacting digital systems to enable systematic and standardized system-wide, multiproduct, postmarket surveillance and technology assessment. Such a system could be made available to developers through regulatory or assessment bodies as an application programming interface and could be a requirement for digital tool certification, just as interoperability is. This would enable health systems and tool developers to collect system-level data directly from real device use cases, enabling the controlled and safe delivery of systematic quality assessment or improvement studies suitable for the complexity and interconnectedness of clinical workflows using developing digital health technologies.
Affiliation(s)
- Cindy Welzel
- Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany
- Magdalena Wekenborg
- Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany
- Baptiste Vasey
- Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
- Peter McCulloch
- Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
- Stephen Gilbert
- Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany
6
Fraser H, Crossland D, Bacher I, Ranney M, Madsen T, Hilliard R. Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study. JMIR Mhealth Uhealth 2023; 11:e49995. [PMID: 37788063 PMCID: PMC10582809 DOI: 10.2196/49995]
Abstract
BACKGROUND Diagnosis is a core component of effective health care, but misdiagnosis is common and can put patients at risk. Diagnostic decision support systems can play a role in improving diagnosis by physicians and other health care workers. Symptom checkers (SCs) have been designed to improve diagnosis and triage (ie, which level of care to seek) by patients. OBJECTIVE The aim of this study was to evaluate the performance of the new large language model ChatGPT (versions 3.5 and 4.0), the widely used WebMD SC, and an SC developed by Ada Health in the diagnosis and triage of patients with urgent or emergent clinical problems compared with the final emergency department (ED) diagnoses and physician reviews. METHODS We used previously collected, deidentified, self-report data from 40 patients presenting to an ED for care who used the Ada SC to record their symptoms prior to seeing the ED physician. Deidentified data were entered into ChatGPT versions 3.5 and 4.0 and WebMD by a research assistant blinded to diagnoses and triage. Diagnoses from all 4 systems were compared with the previously abstracted final diagnoses in the ED as well as with diagnoses and triage recommendations from three independent board-certified ED physicians who had blindly reviewed the self-report clinical data from Ada. Diagnostic accuracy was calculated as the proportion of the diagnoses from ChatGPT, Ada SC, WebMD SC, and the independent physicians that matched at least one ED diagnosis (stratified as top 1 or top 3). Triage accuracy was calculated as the number of recommendations from ChatGPT, WebMD, or Ada that agreed with at least 2 of the independent physicians or were rated "unsafe" or "too cautious." RESULTS Overall, 30 and 37 cases had sufficient data for diagnostic and triage analysis, respectively. The rate of top-1 diagnosis matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 9 (30%), 12 (40%), 10 (33%), and 12 (40%), respectively, with a mean rate of 47% for the physicians. 
The rate of top-3 diagnostic matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 19 (63%), 19 (63%), 15 (50%), and 17 (57%), respectively, with a mean rate of 69% for physicians. The distribution of triage results for Ada was 62% (n=23) agree, 14% (n=5) unsafe, and 24% (n=9) too cautious; that for ChatGPT 3.5 was 59% (n=22) agree, 41% (n=15) unsafe, and 0% (n=0) too cautious; that for ChatGPT 4.0 was 76% (n=28) agree, 22% (n=8) unsafe, and 3% (n=1) too cautious; and that for WebMD was 70% (n=26) agree, 19% (n=7) unsafe, and 11% (n=4) too cautious. The unsafe triage rate for ChatGPT 3.5 (41%) was significantly higher (P=.009) than that of Ada (14%). CONCLUSIONS ChatGPT 3.5 had high diagnostic accuracy but a high unsafe triage rate. ChatGPT 4.0 had the poorest diagnostic accuracy, but a lower unsafe triage rate and the highest triage agreement with the physicians. The Ada and WebMD SCs performed better overall than ChatGPT. Unsupervised patient use of ChatGPT for diagnosis and triage is not recommended without improvements to triage accuracy and extensive clinical evaluation.
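The top-1 and top-3 match rates above are simple proportions: a case counts as a match if any of a system's first k ranked diagnoses appears among the final ED diagnoses. A hypothetical sketch (function name and toy data invented for illustration; real studies match diagnoses by clinical judgement, not exact strings):

```python
def top_k_match_rate(predictions, gold, k):
    """Fraction of cases where any of the first k ranked predicted diagnoses
    appears among that case's final ED diagnoses (exact string match here)."""
    hits = sum(
        1 for preds, final in zip(predictions, gold)
        if any(d in final for d in preds[:k])
    )
    return hits / len(gold)

# Toy data: 3 cases, each with a ranked prediction list and a set of ED diagnoses
preds = [["migraine", "tension headache"], ["appendicitis"], ["GERD", "angina"]]
final = [{"migraine"}, {"renal colic"}, {"angina"}]
print(top_k_match_rate(preds, final, k=1))  # 1/3 of cases match on the top diagnosis
```

Widening k from 1 to 3 can only raise the rate, which is why the top-3 figures above dominate the top-1 figures for every system.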
Affiliation(s)
- Hamish Fraser
- Brown Center for Biomedical Informatics, The Warren Alpert Medical School of Brown University, Providence, RI, United States
- Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, RI, United States
- Daven Crossland
- Brown Center for Biomedical Informatics, The Warren Alpert Medical School of Brown University, Providence, RI, United States
- Department of Epidemiology, Brown University School of Public Health, Providence, RI, United States
- Ian Bacher
- Brown Center for Biomedical Informatics, The Warren Alpert Medical School of Brown University, Providence, RI, United States
- Megan Ranney
- School of Public Health, Yale University, New Haven, CT, United States
- Tracy Madsen
- Department of Epidemiology, Brown University School of Public Health, Providence, RI, United States
- Department of Emergency Medicine, The Warren Alpert Medical School of Brown University, Providence, RI, United States
- Ross Hilliard
- Department of Internal Medicine, Maine Medical Center, Portland, ME, United States
7
Määttä J, Lindell R, Hayward N, Martikainen S, Honkanen K, Inkala M, Hirvonen P, Martikainen TJ. Diagnostic Performance, Triage Safety, and Usability of a Clinical Decision Support System Within a University Hospital Emergency Department: Algorithm Performance and Usability Study. JMIR Med Inform 2023; 11:e46760. [PMID: 37656018 PMCID: PMC10501486 DOI: 10.2196/46760]
Abstract
Background Computerized clinical decision support systems (CDSSs) are increasingly adopted in health care to optimize resources and streamline patient flow. However, they often lack scientific validation against standard medical care. Objective The purpose of this study was to assess the performance, safety, and usability of a CDSS in a university hospital emergency department setting in Kuopio, Finland. Methods Patients entering the emergency department were asked to voluntarily participate in this study. Patients aged 17 years or younger, patients with cognitive impairments, and patients who arrived by ambulance or needed immediate care were excluded. Patients completed the CDSS web-based form and usability questionnaire while waiting for the triage nurse's evaluation. The CDSS data were anonymized and did not affect the patients' usual evaluation or treatment. Retrospectively, 2 medical doctors evaluated the urgency of each patient's condition using the triage nurse's information, and urgent and nonurgent groups were created. International Statistical Classification of Diseases, Tenth Revision diagnoses were collected from the electronic health records. Usability was assessed using a positive version of the System Usability Scale questionnaire. Results In total, our analyses included 248 patients. For urgency, the mean sensitivities of the CDSS compared with the physicians' evaluations were 85% for urgent and 19% for nonurgent cases; between the two physicians, the mean sensitivities were 85% and 35%, respectively. The CDSS did not miss any case that the physicians judged to be an emergency: all such cases were classified by the CDSS as either urgent or emergency cases. In differential diagnosis, the CDSS had an exact match accuracy of 45.5% (97/213). The usability was good, with a mean System Usability Scale score of 78.2 (SD 16.8). Conclusions In a university hospital emergency department setting with a large real-world population, our CDSS was as sensitive as physicians for urgent patient cases and showed acceptable differential diagnosis accuracy, with good usability. These results suggest that this CDSS can be safely assessed further in a real-world setting. A CDSS could accelerate triage by providing patient-entered data ahead of the initial consultation and by categorizing cases as urgent or nonurgent upon arrival at the emergency department.
Affiliation(s)
- Rony Lindell
- Klinik Healthcare Solutions Oy, Helsinki, Finland
- Nick Hayward
- Klinik Healthcare Solutions Oy, Helsinki, Finland
- Susanna Martikainen
- Department of Health and Social Management, University of Eastern Finland, Kuopio, Finland
- Katri Honkanen
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
- Matias Inkala
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
- Tero J Martikainen
- Department of Emergency Care, Kuopio University Hospital, Kuopio, Finland
8
North F, Garrison GM, Jensen TB, Pecina J, Stroebel R. Hospitalization Risk Associated With Emergency Department Reasons for Visit and Patient Age: A Retrospective Evaluation of National Emergency Department Survey Data to Help Identify Potentially Avoidable Emergency Department Visits. Health Serv Res Manag Epidemiol 2023; 10:23333928231214169. [PMID: 38023369 PMCID: PMC10664417 DOI: 10.1177/23333928231214169]
Abstract
Background Patients often present to emergency departments (EDs) with concerns that do not require emergency care. Self-triage and other interventions may help some patients decide whether they should be seen in the ED. Symptoms associated with a low risk of hospitalization can be identified in national ED data and can inform the design of interventions to reduce avoidable ED visits. Methods We used the National Hospital Ambulatory Medical Care Survey (NHAMCS) data from the United States National Health Care Statistics (NHCS) division of the Centers for Disease Control and Prevention (CDC). The ED datasets from 2011 through 2020 were combined. Primary reasons for ED visit and the binary field for hospital admission from the ED were used to estimate the proportion of ED patients admitted to the hospital for each reason for visit and age category. Results There were 221,027 surveyed ED visits during the 10-year data collection, with 736 different primary reasons for visit and 23,228 hospitalizations. There were 145 million estimated hospitalizations from 1.37 billion estimated ED visits (10.6%). Inclusion criteria for this study were reasons for visit that had at least 30 ED visits in the sample; 396 separate reasons for visit met this criterion. Of these 396 reasons for visit, 97 had admission percentages below 2%, and another 52 had estimated admission percentages between 2% and 4%. However, within many ED reasons for visit, hospitalizations increased significantly among older adults. Conclusion Reasons for visit from national ED data can be ranked by hospitalization risk. Low-risk symptoms may help healthcare institutions identify potentially avoidable ED visits. Healthcare systems can use this information to manage potentially avoidable ED visits with interventions tailored to their patient population and healthcare access.
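The per-reason admission proportions described above reduce to a grouped tally with a minimum-sample cutoff. A stdlib-only sketch with invented records (illustration only; it ignores the NHAMCS survey weights that the national estimates require, and the reason strings are hypothetical):

```python
from collections import defaultdict

def admission_rates(visits, min_visits=30):
    """Per-reason admission proportion, keeping only reasons with at least
    min_visits sampled visits (mirrors the >=30-visit inclusion rule above)."""
    totals = defaultdict(int)   # visits per reason
    admits = defaultdict(int)   # admissions per reason
    for reason, admitted in visits:
        totals[reason] += 1
        admits[reason] += admitted
    return {
        reason: admits[reason] / totals[reason]
        for reason in totals
        if totals[reason] >= min_visits
    }

# Hypothetical records: (reason for visit, admitted 0/1)
visits = (
    [("sore throat", 0)] * 59 + [("sore throat", 1)] * 1
    + [("chest pain", 0)] * 28 + [("chest pain", 1)] * 12
)
rates = admission_rates(visits)
# Reasons under the 2% threshold are candidate "potentially avoidable" symptoms
low_risk = [r for r, p in rates.items() if p < 0.02]
```

Stratifying the same tally by age category would reproduce the paper's observation that some low-risk reasons stop being low-risk in older adults.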
Affiliation(s)
- Frederick North
- Community Internal Medicine, Geriatrics, and Palliative Care, Mayo Clinic, Rochester, MN, USA
- Teresa B Jensen
- Department of Family Medicine, Mayo Clinic, Rochester, MN, USA
- Jennifer Pecina
- Department of Family Medicine, Mayo Clinic, Rochester, MN, USA
- Robert Stroebel
- Community Internal Medicine, Geriatrics, and Palliative Care, Mayo Clinic, Rochester, MN, USA
9
Fraser HSF, Cohan G, Koehler C, Anderson J, Lawrence A, Pateña J, Bacher I, Ranney ML. Evaluation of Diagnostic and Triage Accuracy and Usability of a Symptom Checker in an Emergency Department: Observational Study. JMIR Mhealth Uhealth 2022; 10:e38364. [PMID: 36121688 PMCID: PMC9531004 DOI: 10.2196/38364]
Abstract
Background Symptom checkers are clinical decision support apps for patients, used by tens of millions of people annually. They are designed to provide diagnostic and triage advice and assist users in seeking the appropriate level of care. Little evidence is available regarding their diagnostic and triage accuracy with direct use by patients for urgent conditions. Objective The aim of this study is to determine the diagnostic and triage accuracy and usability of a symptom checker in use by patients presenting to an emergency department (ED). Methods We recruited a convenience sample of English-speaking patients presenting for care in an urban ED. Each consenting patient used a leading symptom checker from Ada Health before the ED evaluation. Diagnostic accuracy was evaluated by comparing the symptom checker’s diagnoses and those of 3 independent emergency physicians viewing the patient-entered symptom data, with the final diagnoses from the ED evaluation. The Ada diagnoses and triage were also critiqued by the independent physicians. The patients completed a usability survey based on the Technology Acceptance Model. Results A total of 40 (80%) of the 50 participants approached completed the symptom checker assessment and usability survey. Their mean age was 39.3 (SD 15.9; range 18-76) years, and they were 65% (26/40) female, 68% (27/40) White, 48% (19/40) Hispanic or Latino, and 13% (5/40) Black or African American. Some cases had missing data or a lack of a clear ED diagnosis; 75% (30/40) were included in the analysis of diagnosis, and 93% (37/40) for triage. The sensitivity for at least one of the final ED diagnoses by Ada (based on its top 5 diagnoses) was 70% (95% CI 54%-86%), close to the mean sensitivity for the 3 physicians (on their top 3 diagnoses) of 68.9%. The physicians rated the Ada triage decisions as 62% (23/37) fully agree and 24% (9/37) safe but too cautious. 
It was rated as unsafe and too risky in 22% (8/37) of cases by at least one physician, in 14% (5/37) of cases by at least two physicians, and in 5% (2/37) of cases by all 3 physicians. Usability was rated highly; participants agreed or strongly agreed with the 7 Technology Acceptance Model usability questions with a mean score of 84.6%, although “satisfaction” and “enjoyment” were rated low. Conclusions This study provides preliminary evidence that a symptom checker can provide acceptable usability and diagnostic accuracy for patients with various urgent conditions. A total of 14% (5/37) of symptom checker triage recommendations were deemed unsafe and too risky by at least two physicians based on the symptoms recorded, similar to the results of studies on telephone and nurse triage. Larger studies are needed of diagnosis and triage performance with direct patient use in different clinical environments.
Affiliation(s)
- Hamish S F Fraser
- Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- School of Public Health, Brown University, Providence, RI, United States
- Gregory Cohan
- Warren Alpert Medical School, Brown University, Providence, RI, United States
- Christopher Koehler
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Jared Anderson
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Alexis Lawrence
- Harvard Medical Faculty Physicians, Department of Emergency Medicine, St Luke's Hospital, New Bedford, MA, United States
- John Pateña
- Brown-Lifespan Center for Digital Health, Providence, RI, United States
- Ian Bacher
- Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- Megan L Ranney
- School of Public Health, Brown University, Providence, RI, United States
- Department of Emergency Medicine, Brown University, Providence, RI, United States
- Brown-Lifespan Center for Digital Health, Providence, RI, United States
10
Schmude M, Salim N, Azadzoy H, Bane M, Millen E, O'Donnell L, Bode P, Türk E, Vaidya R, Gilbert S. Investigating the Potential for Clinical Decision Support in Sub-Saharan Africa With AFYA (Artificial Intelligence-Based Assessment of Health Symptoms in Tanzania): Protocol for a Prospective, Observational Pilot Study. JMIR Res Protoc 2022; 11:e34298. [PMID: 35671073 PMCID: PMC9214611 DOI: 10.2196/34298]
Abstract
BACKGROUND Low- and middle-income countries face difficulties in providing adequate health care. One of the reasons is a shortage of qualified health workers. Diagnostic decision support systems are designed to aid clinicians in their work and have the potential to mitigate pressure on health care systems. OBJECTIVE The Artificial Intelligence-Based Assessment of Health Symptoms in Tanzania (AFYA) study will evaluate the potential of an English-language artificial intelligence-based prototype diagnostic decision support system for mid-level health care practitioners in a low- or middle-income setting. METHODS This is an observational, prospective clinical study conducted in a busy Tanzanian district hospital. In addition to usual care visits, study participants will consult a mid-level health care practitioner, who will use a prototype diagnostic decision support system, and a study physician. The accuracy and comprehensiveness of the differential diagnosis provided by the diagnostic decision support system will be evaluated against a gold-standard differential diagnosis provided by an expert panel. RESULTS Patient recruitment started in October 2021. Participants were recruited directly in the waiting room of the outpatient clinic at the hospital. Data collection will conclude in May 2022. Data analysis is planned to be finished by the end of June 2022. The results will be published in a peer-reviewed journal. CONCLUSIONS Most diagnostic decision support systems have been developed and evaluated in high-income countries, but there is great potential for these systems to improve the delivery of health care in low- and middle-income countries. The findings of this real-patient study will provide insights into the performance and usability of a prototype diagnostic decision support system in low- or middle-income countries. TRIAL REGISTRATION ClinicalTrials.gov NCT04958577; http://clinicaltrials.gov/ct2/show/NCT04958577.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/34298.
Affiliation(s)
- Nahya Salim
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Mustafa Bane
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Stephen Gilbert
- Ada Health GmbH, Berlin, Germany
- Else Kröner Fresenius Center for Digital Health, University Hospital Carl Gustav Carus Dresden, Technische Universität Dresden, Dresden, Germany