1. Hammoud M, Douglas S, Darmach M, Alawneh S, Sanyal S, Kanbour Y. Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study. JMIR AI 2024;3:e46875. [PMID: 38875676] [DOI: 10.2196/46875]
Abstract
BACKGROUND Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and of daily life, as patients increasingly use them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches. OBJECTIVE This study aims to evaluate and report the accuracies of several known and new symptom checkers using a standard and transparent methodology that allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics. METHODS We propose a 4-stage experimentation methodology that builds on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 of 7 independent and experienced primary care physicians. To establish a frame of reference for interpreting the results, we further compared the best-performing symptom checker against 3 primary care physicians with an average of 16.6 (SD 9.42) years of experience. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of the differential list, F1-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality. RESULTS The diagnostic accuracies of the 6 tested symptom checkers varied substantially. For instance, the differences (ie, the ranges) in M1, F1-score, and NDCG between the best- and worst-performing symptom checkers were 65.3%, 39.2%, and 74.2%, respectively. The same held among the participating physicians, whose M1, F1-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% on F1-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% on M1 and NDCG, respectively. CONCLUSIONS The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. Notably, the best-performing symptom checker was an artificial intelligence (AI)-based one, underscoring the promise of AI for improving the diagnostic capabilities of symptom checkers as the technology continues to advance.
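For readers unfamiliar with the ranking metric this abstract leans on, a minimal sketch of how NDCG scores a differential diagnosis list might look as follows; the relevance grades and the example list are illustrative assumptions, not data from the study:

```python
import math

def ndcg(relevances):
    """Normalized Discounted Cumulative Gain for a ranked list of
    graded relevance scores (higher grade = more relevant)."""
    def dcg(rels):
        # Position i is discounted by log2(i + 2), so rank 1 divides by 1.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical differential list: the main diagnosis (grade 2) is ranked
# second instead of first, so the list is penalized relative to the ideal.
print(round(ndcg([1, 2, 0]), 2))  # 0.86
print(ndcg([2, 1, 0]))            # 1.0 for a perfectly ordered list
```

Because the discount is logarithmic, misplacing the main diagnosis near the top costs less than burying it deep in the list, which is why NDCG is a reasonable summary of differential-list quality.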
2. Wetzel AJ, Klemmt M, Müller R, Rieger MA, Joos S, Koch R. Only the anxious ones? Identifying characteristics of symptom checker app users: a cross-sectional survey. BMC Med Inform Decis Mak 2024;24:21. [PMID: 38262993] [PMCID: PMC10804572] [DOI: 10.1186/s12911-024-02430-5]
Abstract
BACKGROUND Symptom checker applications (SCAs) may help laypeople classify their symptoms and receive recommendations on medically appropriate actions. Further research is necessary to estimate the influence of user characteristics, attitudes, and (e)health-related competencies. OBJECTIVE The objective of this study is to identify meaningful predictors of SCA use based on user characteristics. METHODS An explorative cross-sectional survey was conducted to investigate German citizens' demographics, eHealth literacy, hypochondria, self-efficacy, and affinity for technology using validated German-language questionnaires. A total of 869 participants were eligible for inclusion. The n = 67 SCA users were matched 1:1 with non-users, yielding a sample of n = 134 participants for the main analysis. A four-step analysis was conducted involving explorative predictor selection, model comparisons, and parameter estimates for selected predictors, including sensitivity and post hoc analyses. RESULTS Hypochondria and self-efficacy were identified as meaningful predictors of SCA use. Hypochondria showed a consistent and significant effect across all analyses (OR 1.24-1.26; 95% CI 1.1-1.4). Self-efficacy (OR 0.64-0.93; 95% CI 0.3-1.4) showed inconsistent and nonsignificant results, leaving its role in SCA use unclear. Over half of the SCA users in our sample met the classification for hypochondria (cut-off of 5 on the WI). CONCLUSIONS Hypochondria emerged as a significant predictor of SCA use with a consistently stable effect; yet, according to the literature, individuals with this trait may be less likely to benefit from SCAs despite their greater likelihood of using them. These users could be further unsettled by risk-averse triage and by unlikely but serious diagnosis suggestions. TRIAL REGISTRATION The study was registered in the German Clinical Trials Register (DRKS): DRKS00022465, DERR1- https://doi.org/10.2196/34026
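As a side note on the statistics reported above: an odds ratio from a logistic regression is simply the exponential of the fitted coefficient. A minimal illustration, where the coefficient value is a hypothetical chosen to land inside the reported range, not a figure from the study:

```python
import math

# Hypothetical logistic-regression coefficient for a one-point increase
# on the hypochondria scale; exp(beta) converts it to an odds ratio.
beta = 0.215
odds_ratio = math.exp(beta)
print(round(odds_ratio, 2))  # 1.24, i.e. within the 1.24-1.26 range above
```

An OR above 1 means each one-point increase on the predictor multiplies the odds of SCA use, which is why a CI excluding 1 (as for hypochondria, 1.1-1.4) counts as significant while one spanning 1 (as for self-efficacy, 0.3-1.4) does not.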
Affiliation(s)
- Anna-Jasmin Wetzel
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
- Malte Klemmt
- Institute of Applied Social Sciences, Technical University of Applied Sciences Würzburg-Schweinfurt, Tiepolostraße 6, 97070, Würzburg, Germany
- Regina Müller
- Institute for Philosophy, University of Bremen, Enrique-Schmidt-Str 7, 28359, Bremen, Germany
- Monika A Rieger
- Institute of Occupational Medicine, Social Medicine and Health Services Research, University Hospital Tübingen, Wilhelmstr 27, 72074, Tübingen, Germany
- Stefanie Joos
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
- Roland Koch
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
3. Wetzel AJ, Koch R, Koch N, Klemmt M, Müller R, Preiser C, Rieger M, Rösel I, Ranisch R, Ehni HJ, Joos S. 'Better see a doctor?' Status quo of symptom checker apps in Germany: A cross-sectional survey with a mixed-methods design (CHECK.APP). Digit Health 2024;10:20552076241231555. [PMID: 38434790] [PMCID: PMC10908232] [DOI: 10.1177/20552076241231555]
Abstract
Background Symptom checker apps (SCAs) offer symptom classification and low-threshold self-triage for laypeople. They are already in use despite their poor accuracy and concerns that they may negatively affect primary care. This study assesses the extent to which SCAs are used by medical laypeople in Germany and which software is most popular. We examined associations between satisfaction with the general practitioner (GP) and SCA use, as well as between the number of GP visits and SCA use. Furthermore, we assessed the reasons for intentional non-use. Methods We conducted a survey comprising standardised and open-ended questions. Quantitative data were weighted, and open-ended responses were examined using thematic analysis. Results This study included 850 participants. The SCA usage rate was 8%, and approximately 50% of SCA non-users were uninterested in trying SCAs. The most commonly used SCAs were NetDoktor and Ada. Surprisingly, SCAs were most frequently used in the age group of 51-55 years. No significant associations were found between SCA usage and satisfaction with the GP, or between SCA usage and the number of GP visits. Thematic analysis revealed skepticism regarding the results and recommendations of SCAs and discrepancies between users' requirements and the apps' features. Conclusion SCAs are still widely unknown in the German population and have been sparsely used so far. Many participants were not interested in trying SCAs, and we found no positive or negative associations between SCA use and primary care.
Affiliation(s)
- Anna-Jasmin Wetzel
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Roland Koch
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Nadine Koch
- Institute of Software Engineering, University of Stuttgart, Stuttgart, Germany
- Malte Klemmt
- Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Regina Müller
- Institute of Philosophy, University of Bremen, Bremen, Germany
- Christine Preiser
- Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Monika Rieger
- Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Inka Rösel
- Institute of Clinical Epidemiology and Applied Biometry, University Hospital Tübingen, Tübingen, Germany
- Robert Ranisch
- Faculty of Health Sciences, University of Potsdam, Potsdam, Germany
- Hans-Jörg Ehni
- Institute of Ethics and History of Medicine, University Hospital Tübingen, Tübingen, Germany
- Stefanie Joos
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
4. Li BR, Wang J. Research status of internet-delivered cognitive behavioral therapy in cancer patients. World J Psychiatry 2023;13:831-837. [DOI: 10.5498/wjp.v13.i11.831]
Abstract
The latest global cancer burden data, released by the International Agency for Research on Cancer of the World Health Organization in 2020, show that there were 19.29 million new cancer cases worldwide, with 4.57 million in China, the highest of any country. The number of cancer survivors is increasing, with a 5-year survival rate exceeding 85%, but emotional disorders are common among survivors. Cognitive behavioral therapy (CBT) can improve negative emotions and has significant effects on patients. However, the number of physicians is limited and costs are high, so internet-delivered interventions have become a solution. The feasibility of web-based interventions for breast cancer patients has been proven, and research on internet-delivered CBT is increasing. The purpose of this study was to review the concept of web-based CBT and its application in cancer survivors, in order to provide a reference for researchers and to inform psychological therapy for patients.
Affiliation(s)
- Bing-Rui Li
- Operating Room, The Fourth Affiliated Hospital of China Medical University, Shenyang 110033, Liaoning Province, China
- Jing Wang
- Operating Room, The Fourth Affiliated Hospital of China Medical University, Shenyang 110033, Liaoning Province, China
5. Gellert GA, Rasławska-Socha J, Marcjasz N, Price T, Kuszczyński K, Młodawska A, Jędruch A, Orzechowski PM. How Virtual Triage Can Improve Patient Experience and Satisfaction: A Narrative Review and Look Forward. Telemed Rep 2023;4:292-306. [PMID: 37817871] [PMCID: PMC10561746] [DOI: 10.1089/tmr.2023.0037]
Abstract
Objective To review the literature on patient experience and satisfaction as it relates to the potential for virtual triage (VT), or symptom checkers, to enhance and enable improvements in these important health care delivery objectives. Methods Review and synthesis of the literature on patient experience and satisfaction, informed by emerging evidence indicating the potential for VT to favorably impact these clinical care objectives and outcomes. Results/Conclusions VT enhances potential clinical effectiveness through early detection and referral, can reduce avoidable care delivery due to late clinical presentation, and can divert primary care needs to more clinically appropriate outpatient settings rather than high-acuity emergency departments. Delivery of earlier, faster, and more acuity-appropriate care, as well as patient avoidance of excess care acuity (and associated cost), offers promise as a contributor to improved patient experience and satisfaction. The application of digital triage as a front door to health care delivery organizations offers care engagement that can reduce the need to visit a medical facility for low-acuity conditions more suitable for self-care, thus avoiding unpleasant queues and reducing microbiological and other risks associated with visits to medical facilities. VT also offers providers an opportunity to make patients' health care experiences more personalized.
6. Wiedermann CJ, Mahlknecht A, Piccoliori G, Engl A. Redesigning Primary Care: The Emergence of Artificial-Intelligence-Driven Symptom Diagnostic Tools. J Pers Med 2023;13:1379. [PMID: 37763147] [PMCID: PMC10532810] [DOI: 10.3390/jpm13091379]
Abstract
Modern healthcare is facing a juxtaposition of increasing patient demands owing to an aging population and a decreasing general practitioner workforce, leading to strained access to primary care. The coronavirus disease 2019 pandemic has emphasized the potential for alternative consultation methods, highlighting opportunities to minimize unnecessary care. This article discusses the role of artificial-intelligence-driven symptom checkers, particularly their efficiency, utility, and challenges in primary care. Based on a study conducted in Italian general practices, insights from both physicians and patients were gathered regarding this emergent technology, highlighting differences in perceived utility, user satisfaction, and potential challenges. While symptom checkers are seen as potential tools for addressing healthcare challenges, concerns regarding their accuracy and the potential for misdiagnosis persist. Patients generally viewed them positively, valuing their ease of use and the empowerment they provide in managing health. However, some general practitioners perceive these tools as challenges to their expertise. This article proposes that artificial-intelligence-based symptom checkers can optimize medical-history taking for the benefit of both general practitioners and patients, with potential enhancements in complex diagnostic tasks rather than routine diagnoses. It underscores the importance of carefully integrating digital innovations while preserving the essential human touch in healthcare. Symptom checkers offer promising solutions; ensuring their accuracy, reliability, and effective integration into primary care requires rigorous research, clinical guidance, and an understanding of varied user perceptions. Collaboration among technologists, clinicians, and patients is paramount for the successful evolution of digital tools in healthcare.
Affiliation(s)
- Christian J. Wiedermann
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Department of Public Health, Medical Decision Making and HTA, University of Health Sciences, Medical Informatics and Technology-Tyrol, 6060 Hall, Austria
- Angelika Mahlknecht
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Giuliano Piccoliori
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Adolf Engl
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
7. Hurvitz N, Ilan Y. The Constrained-Disorder Principle Assists in Overcoming Significant Challenges in Digital Health: Moving from "Nice to Have" to Mandatory Systems. Clin Pract 2023;13:994-1014. [PMID: 37623270] [PMCID: PMC10453547] [DOI: 10.3390/clinpract13040089]
Abstract
The success of artificial intelligence in healthcare depends on whether it can overcome the boundaries of evidence-based medicine, the lack of supporting policies, and the resistance of medical professionals to its use. The failure of digital health to meet expectations requires rethinking some of the challenges it faces. We discuss some of the most significant challenges faced by patients, physicians, payers, pharmaceutical companies, and health systems in the digital world. The goal of healthcare systems is to improve outcomes. A tool that assists in diagnosing, collecting data, and simplifying processes is "nice to have," but it is not essential, and many such systems have yet to be shown to improve outcomes. Current outcome-based expectations and economic constraints make tools that merely assist or ease processes insufficient. Complex biological systems are defined by their inherent disorder, bounded by dynamic boundaries, as described by the constrained-disorder principle (CDP). The CDP provides a platform for correcting system malfunctions by regulating their degree of variability. A CDP-based second-generation artificial intelligence system offers solutions to some of the challenges digital health faces: therapeutic interventions delivered by these systems are held to the standard of improving outcomes. In addition to improving clinically meaningful endpoints, CDP-based second-generation algorithms support patient and physician engagement and reduce health system costs.
Affiliation(s)
- Yaron Ilan
- Hadassah Medical Center, Department of Medicine, Faculty of Medicine, Hebrew University, POB 1200, Jerusalem IL91120, Israel
8. Nov O, Singh N, Mann D. Putting ChatGPT's Medical Advice to the (Turing) Test: Survey Study. JMIR Med Educ 2023;9:e46939. [PMID: 37428540] [PMCID: PMC10366957] [DOI: 10.2196/46939]
Abstract
BACKGROUND Chatbots are being piloted to draft responses to patient questions, but patients' ability to distinguish between provider and chatbot responses, and patients' trust in chatbots' functions, are not well established. OBJECTIVE This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence-based chatbot for patient-provider communication. METHODS A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients' questions were entered into ChatGPT with a request to respond using approximately the same word count as the human provider's response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked, and financially incentivized, to correctly identify the response source. Participants were also asked about their trust in chatbots' functions in patient-provider communication, using a Likert scale from 1-5. RESULTS A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of respondents analyzed were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of cases, and human provider responses in 65.1% (1276/1960) of cases. On average, patients' trust in chatbots' functions was weakly positive (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task increased. CONCLUSIONS ChatGPT responses to patient questions were only weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care.
Affiliation(s)
- Oded Nov
- Department of Technology Management, Tandon School of Engineering, New York University, New York, NY, United States
- Nina Singh
- Department of Population Health, Grossman School of Medicine, New York University, New York, NY, United States
- Devin Mann
- Department of Population Health, Grossman School of Medicine, New York University, New York, NY, United States
- Medical Center Information Technology, Langone Health, New York University, New York, NY, United States
9. Kopka M, Scatturin L, Napierala H, Fürstenau D, Feufel MA, Balzer F, Schmieding ML. Characteristics of Users and Nonusers of Symptom Checkers in Germany: Cross-Sectional Survey Study. J Med Internet Res 2023;25:e46231. [PMID: 37338970] [DOI: 10.2196/46231]
Abstract
BACKGROUND Previous studies have revealed that users of symptom checkers (SCs, apps that support self-diagnosis and self-triage) are predominantly female, are younger than average, and have higher levels of formal education. Little data are available for Germany, and no study has so far compared usage patterns with people's awareness of SCs and their perceived usefulness. OBJECTIVE We explored the sociodemographic and individual characteristics that are associated with the awareness, usage, and perceived usefulness of SCs in the German population. METHODS We conducted a cross-sectional online survey among 1084 German residents in July 2022 regarding personal characteristics and people's awareness and usage of SCs. Using random sampling from a commercial panel, we collected participant responses stratified by gender, state of residence, income, and age to reflect the German population. We analyzed the collected data exploratively. RESULTS Of all respondents, 16.3% (177/1084) were aware of SCs and 6.5% (71/1084) had used them before. Those aware of SCs were younger (mean 38.8, SD 14.6 years, vs mean 48.3, SD 15.7 years), were more often female (107/177, 60.5%, vs 453/907, 49.9%), and had higher formal education levels (eg, 72/177, 40.7%, vs 238/907, 26.2%, with a university/college degree) than those unaware. The same observation applied to users compared to nonusers. It disappeared, however, when comparing users to nonusers who were aware of SCs. Among users, 40.8% (29/71) considered these tools useful. Those considering them useful reported higher self-efficacy (mean 4.21, SD 0.66, vs mean 3.63, SD 0.81, on a scale of 1-5) and a higher net household income (mean EUR 2591.63, SD EUR 1103.96 [mean US $2798.96, SD US $1192.28], vs mean EUR 1626.60, SD EUR 649.05 [mean US $1756.73, SD US $700.97]) than those who considered them not useful. More women considered SCs unhelpful (13/44, 29.5%) compared to men (4/26, 15.4%). CONCLUSIONS Concurring with studies from other countries, our findings show associations between sociodemographic characteristics and SC usage in a German sample: users were on average younger, of higher socioeconomic status, and more commonly female compared to nonusers. However, usage cannot be explained by sociodemographic differences alone. It rather seems that sociodemographics explain who is or is not aware of the technology, but those who are aware of SCs are equally likely to use them, independently of sociodemographic differences. Although in some groups (eg, people with anxiety disorder), more participants reported knowing and using SCs, they tended to perceive them as less useful. In other groups (eg, male participants), fewer respondents were aware of SCs, but those who used them perceived them to be more useful. Thus, SCs should be designed to fit specific user needs, and strategies should be developed to help reach individuals who could benefit but are not yet aware of SCs.
Affiliation(s)
- Marvin Kopka
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Lennart Scatturin
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Hendrik Napierala
- Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Daniel Fürstenau
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Department of Business IT, IT University of Copenhagen, København, Denmark
- Markus A Feufel
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Malte L Schmieding
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
10. Exploratory study: Evaluation of a symptom checker effectiveness for providing a diagnosis and evaluating the situation emergency compared to emergency physicians using simulated and standardized patients. PLoS One 2023;18:e0277568. [PMID: 36827277] [PMCID: PMC9955603] [DOI: 10.1371/journal.pone.0277568]
Abstract
BACKGROUND The overloading of health care systems is an international problem. In this context, new tools such as symptom checkers (SCs) are emerging to improve patient orientation and triage. These SCs should be rigorously evaluated, and we can take a cue from the way we evaluate medical students: objective structured clinical examinations (OSCEs) with simulated patients. OBJECTIVE The main objective of this study was to evaluate the performance of a symptom checker against emergency physicians using OSCEs as the assessment method. METHODS We explored a simulation-based method to evaluate the ability to establish a diagnosis and assess the urgency of a situation. A panel of medical experts wrote 220 simulated patient cases. Each case was played twice by an actor trained for the role: once for the SC, then for an emergency physician. As in a teleconsultation, only the patient's voice was accessible. We performed a prospective non-inferiority study; if the primary analysis failed to detect non-inferiority, a superiority analysis was planned. RESULTS The SC established only 30% of the main diagnoses, whereas the emergency physicians found 81%. The emergency physicians were also superior to the SC in suggesting secondary diagnoses (92% versus 52%). For patient triage (vital emergency or not), the physicians again proved superior (96% versus 71%). The SC was non-inferior to the physicians in terms of interview time. CONCLUSIONS AND RELEVANCE Simulated patients, rather than written clinical cases, should be used to evaluate the effectiveness of SCs.
11. Judson TJ, Pierce L, Tutman A, Mourad M, Neinstein AB, Shuler G, Gonzales R, Odisho AY. Utilization patterns and efficiency gains from use of a fully EHR-integrated COVID-19 self-triage and self-scheduling tool: a retrospective analysis. J Am Med Inform Assoc 2022;29:2066-2074. [PMID: 36029243] [PMCID: PMC9667153] [DOI: 10.1093/jamia/ocac161]
Abstract
OBJECTIVE Symptom checkers can help address high demand for SARS-CoV-2 (COVID-19) testing and care by providing patients with self-service access to triage recommendations. However, health systems may be hesitant to invest in these tools, as their associated efficiency gains have not been studied. We aimed to quantify the operational efficiency gains associated with use of an online COVID-19 symptom checker as an alternative to a telephone hotline. METHODS In our health system, ambulatory patients can use either an online symptom checker or a telephone hotline to be triaged and connected to COVID-19 care. We performed a retrospective analysis of adults who used either method between October 20, 2021 and January 10, 2022, using call logs, electronic health record data, and local wages to calculate labor costs. RESULTS Of the 15,549 total COVID-19 triage encounters, 1820 (11.7%) used only the telephone hotline and 13,729 (88.3%) used the symptom checker. Only 271 (2%) of the patients who used the symptom checker also called the hotline. Hotline encounters required more clinician time than symptom-checker encounters (17.8 vs 0.4 min/encounter), resulting in higher average labor costs ($24.21 vs $0.55 per encounter). The symptom checker saved over 4200 clinician labor hours. CONCLUSION When given the option, most patients completed COVID-19 triage and visit scheduling online, resulting in substantial efficiency gains. These benefits may encourage health system investment in such tools.
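The efficiency arithmetic in this abstract can be sketched directly from the per-encounter minutes it reports. The hourly wage below is an assumed illustrative figure, not a study number, and this simplification lands slightly under the ~4200 hours the authors report, presumably because it omits call-handling details beyond clinician minutes:

```python
HOTLINE_MIN = 17.8      # clinician minutes per hotline encounter (from abstract)
CHECKER_MIN = 0.4       # clinician minutes per symptom-checker encounter
SC_ENCOUNTERS = 13_729  # symptom-checker encounters (from abstract)
WAGE_PER_HOUR = 80.0    # assumed blended clinician wage, not a study figure

# Clinician time avoided by routing these encounters through the checker
saved_minutes = SC_ENCOUNTERS * (HOTLINE_MIN - CHECKER_MIN)
saved_hours = saved_minutes / 60
saved_dollars = saved_hours * WAGE_PER_HOUR

print(f"{saved_hours:.0f} clinician hours avoided")   # 3981 under these inputs
print(f"${saved_dollars:,.0f} in labor cost avoided")
```

The point of the sketch is that the gain scales with both the per-encounter time gap and the share of patients who self-serve, which is why the 88.3% online uptake matters as much as the 17.4-minute difference.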
Affiliation(s)
- Timothy J Judson
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Logan Pierce
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Avi Tutman
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Michelle Mourad
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Aaron B Neinstein
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Gina Shuler
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Ralph Gonzales
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Clinical Innovation Center, University of California San Francisco, San Francisco, California, USA
- Anobel Y Odisho
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
12
Gilbert A, Diep AN, Boufraioua M, Pétré B, Donneau AF, Ghuysen A. Patients' self-triage for unscheduled urgent care: a preliminary study on the accuracy and factors affecting the performance of a Belgian self-triage platform. BMC Health Serv Res 2022; 22:1199. [PMID: 36151563 PMCID: PMC9508742 DOI: 10.1186/s12913-022-08571-5]
Abstract
Background Management of unscheduled urgent care is a complex concern for many healthcare providers. Facing the challenge of appropriately dispatching unscheduled care, primary and emergency physicians have collaboratively implemented innovative strategies such as telephone triage. Currently, new original solutions tend to emerge with the development of new technologies. We created an interactive patient self-triage platform, ODISSEE, and aimed to explore its accuracy and potential factors affecting its performance using clinical case scenarios. Methods The ODISSEE platform was developed based on previously validated triage protocols for out-of-hours primary care. ODISSEE is composed of 18 icons leading to algorithmic questions that finally provide an advised orientation (emergency or primary care services). To investigate ODISSEE's performance, we used 100 clinical case scenarios, each associated with a preestablished orientation determined by a group of experts. Fifteen volunteers were asked to self-triage with 50 randomly selected scenarios using ODISSEE on a digital tablet. Their triage results were compared with the experts' references. Results The 15 participants performed a total of 750 self-triages, which matched the experts' references regarding the level of care in 85.6% of the cases. The orientation was incorrect in 14.4%, with an undertriage rate of 1.9% and an overtriage rate of 12.5%. The tool's specificity and sensitivity to advise participants on the appropriate level of care were 69% (95% CI 64-74) and 97% (95% CI 95-98), respectively. When combined with advice on the level of urgency, the tool only found the correct orientation in 68.4% of cases, with 9.2% undertriages and 22.4% overtriages. Some participant characteristics and the types of medical conditions demonstrated a significant association with the tool's performance.
Conclusion Self-triage apps, such as the ODISSEE platform, could represent an innovative method to allow patients to self-triage to the most appropriate level of care. This study based on clinical vignettes highlights some positive arguments regarding ODISSEE safety, but further research is needed to assess the generalizability of such tools to the population without equity issues. Supplementary Information The online version contains supplementary material available at 10.1186/s12913-022-08571-5.
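The sensitivity and specificity reported above follow the standard binary confusion-matrix definitions, with "emergency" as the positive class, undertriage as a false negative, and overtriage as a false positive. A minimal sketch of the computation, using hypothetical cell counts chosen only to reproduce the reported 97% sensitivity and 69% specificity (the study does not publish the raw 2x2 counts):

```python
def triage_metrics(tp, fn, fp, tn):
    """Sensitivity and specificity for a binary triage decision.

    tp: emergency cases correctly routed to emergency care
    fn: emergency cases routed to primary care (undertriage)
    fp: primary-care cases routed to emergency care (overtriage)
    tn: primary-care cases correctly routed to primary care
    """
    sensitivity = tp / (tp + fn)  # share of true emergencies detected
    specificity = tn / (tn + fp)  # share of non-emergencies correctly routed
    return sensitivity, specificity

# Hypothetical counts (not from the study), scaled to yield the
# reported point estimates:
sens, spec = triage_metrics(tp=97, fn=3, fp=31, tn=69)
print(sens, spec)  # 0.97 0.69
```

Note that a high sensitivity with modest specificity, as here, is the typical signature of a deliberately risk-averse triage tool: it rarely misses emergencies at the cost of frequent overtriage.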
Affiliation(s)
- Allison Gilbert
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
- Anh Nguyet Diep
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
  - Biostatistics Unit, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Maryame Boufraioua
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
- Benoit Pétré
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Anne-Françoise Donneau
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
  - Biostatistics Unit, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Alexandre Ghuysen
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
13
Fraser HSF, Cohan G, Koehler C, Anderson J, Lawrence A, Pateña J, Bacher I, Ranney ML. Evaluation of Diagnostic and Triage Accuracy and Usability of a Symptom Checker in an Emergency Department: Observational Study. JMIR Mhealth Uhealth 2022; 10:e38364. [PMID: 36121688 PMCID: PMC9531004 DOI: 10.2196/38364]
Abstract
Background Symptom checkers are clinical decision support apps for patients, used by tens of millions of people annually. They are designed to provide diagnostic and triage advice and assist users in seeking the appropriate level of care. Little evidence is available regarding their diagnostic and triage accuracy with direct use by patients for urgent conditions. Objective The aim of this study is to determine the diagnostic and triage accuracy and usability of a symptom checker in use by patients presenting to an emergency department (ED). Methods We recruited a convenience sample of English-speaking patients presenting for care in an urban ED. Each consenting patient used a leading symptom checker from Ada Health before the ED evaluation. Diagnostic accuracy was evaluated by comparing the symptom checker’s diagnoses and those of 3 independent emergency physicians viewing the patient-entered symptom data, with the final diagnoses from the ED evaluation. The Ada diagnoses and triage were also critiqued by the independent physicians. The patients completed a usability survey based on the Technology Acceptance Model. Results A total of 40 (80%) of the 50 participants approached completed the symptom checker assessment and usability survey. Their mean age was 39.3 (SD 15.9; range 18-76) years, and they were 65% (26/40) female, 68% (27/40) White, 48% (19/40) Hispanic or Latino, and 13% (5/40) Black or African American. Some cases had missing data or a lack of a clear ED diagnosis; 75% (30/40) were included in the analysis of diagnosis, and 93% (37/40) for triage. The sensitivity for at least one of the final ED diagnoses by Ada (based on its top 5 diagnoses) was 70% (95% CI 54%-86%), close to the mean sensitivity for the 3 physicians (on their top 3 diagnoses) of 68.9%. The physicians rated the Ada triage decisions as 62% (23/37) fully agree and 24% (9/37) safe but too cautious. 
It was rated as unsafe and too risky in 22% (8/37) of cases by at least one physician, in 14% (5/37) of cases by at least two physicians, and in 5% (2/37) of cases by all 3 physicians. Usability was rated highly; participants agreed or strongly agreed with the 7 Technology Acceptance Model usability questions with a mean score of 84.6%, although “satisfaction” and “enjoyment” were rated low. Conclusions This study provides preliminary evidence that a symptom checker can provide acceptable usability and diagnostic accuracy for patients with various urgent conditions. A total of 14% (5/37) of symptom checker triage recommendations were deemed unsafe and too risky by at least two physicians based on the symptoms recorded, similar to the results of studies on telephone and nurse triage. Larger studies are needed of diagnosis and triage performance with direct patient use in different clinical environments.
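The diagnostic sensitivity used above ("at least one of the final ED diagnoses within the top 5 suggestions") is a top-k match rate. A minimal sketch of that computation on toy data, assuming exact string matching between suggestions and final diagnoses (the study itself relied on physician adjudication, not string comparison):

```python
def top_k_match_rate(cases, k):
    """Fraction of cases in which any of the top-k suggested diagnoses
    appears among the final ED diagnoses."""
    hits = sum(
        1
        for suggestions, final in cases
        if any(s in final for s in suggestions[:k])
    )
    return hits / len(cases)

# Toy cases (not study data): (ranked suggestions, final ED diagnoses)
cases = [
    (["migraine", "tension headache"], {"migraine"}),        # hit
    (["appendicitis", "gastroenteritis"], {"renal colic"}),  # miss
    (["asthma", "pneumonia", "bronchitis"], {"pneumonia"}),  # hit
]
rate = top_k_match_rate(cases, k=5)  # 2 of 3 cases matched
```

Because a larger k gives the checker more chances to match, top-5 figures for an app are not directly comparable to top-3 figures for physicians, which the study acknowledges by reporting both.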
Affiliation(s)
- Hamish S F Fraser
  - Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
  - School of Public Health, Brown University, Providence, RI, United States
- Gregory Cohan
  - Warren Alpert Medical School, Brown University, Providence, RI, United States
- Christopher Koehler
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
- Jared Anderson
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
- Alexis Lawrence
  - Harvard Medical Faculty Physicians, Department of Emergency Medicine, St Luke's Hospital, New Bedford, MA, United States
- John Pateña
  - Brown-Lifespan Center for Digital Health, Providence, RI, United States
- Ian Bacher
  - Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- Megan L Ranney
  - School of Public Health, Brown University, Providence, RI, United States
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
  - Brown-Lifespan Center for Digital Health, Providence, RI, United States
14
Liu AW, Odisho AY, Brown Rd W, Gonzales R, Neinstein AB, Judson T. Patient Experience and Feedback after Use of an EHR-integrated COVID-19 Symptom Checker. JMIR Hum Factors 2022; 9:e40064. [PMID: 35960593 PMCID: PMC9472505 DOI: 10.2196/40064]
Abstract
BACKGROUND Symptom checkers have been widely used during the COVID-19 pandemic to alleviate strain on health systems and offer patients a 24/7 self-service triage option. Although studies suggest that users may positively perceive online symptom checkers, no studies have quantified user feedback after use of an electronic health record (EHR)-integrated COVID-19 symptom checker with self-scheduling functionality. OBJECTIVE We aimed to understand user experience, user satisfaction, and user-reported alternatives to use of a COVID-19 symptom checker with self-triage and self-scheduling functionality. METHODS We launched a patient-portal based self-triage and self-scheduling tool in March 2020 for patients with COVID-19 symptoms, exposures, or questions. We made an optional, anonymous Qualtrics survey available to patients immediately after they completed the symptom checker. RESULTS Between December 16th, 2021 and March 28th, 2022, there were 395 unique responses to the survey. Overall, respondents reported high satisfaction across all demographics, with a median rating of 8 out of 10, and 47.6% of respondents giving a rating of 9 or 10 out of 10. User satisfaction scores were not associated with any demographic factors. The most common user-reported alternatives had the online tool not been available were calling the COVID-19 telephone hotline and sending a patient-portal message to their physician for advice. The ability to schedule a test online was the most important symptom checker feature for respondents. The most common categories of user feedback were regarding other COVID-19 services (e.g. telephone hotline), policies or procedures, or requesting additional features or functionality. CONCLUSIONS This analysis suggests that COVID-19 symptom checkers with self-triage and self-scheduling functionality may have high overall user satisfaction, regardless of user demographics. 
By allowing users to self-triage and self-schedule tests and visits, tools like this may prevent unnecessary calls and messages to clinicians. Individual feedback suggested that the user experience for this type of tool is highly dependent on the organization's operational workflows for COVID-19 testing and care. The study provides insight for the implementation and improvement of COVID-19 symptom checkers to ensure high user satisfaction.
Affiliation(s)
- Andrew Wayne Liu
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
- Anobel Youhana Odisho
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Urology, University of California, San Francisco, San Francisco, US
- William Brown Rd
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, US
- Ralph Gonzales
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Clinical Innovation Center, University of California, San Francisco, San Francisco, US
- Aaron B Neinstein
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
- Timothy Judson
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Office of Population Health, University of California, San Francisco, San Francisco, US
15
Wetzel AJ, Koch R, Preiser C, Müller R, Klemmt M, Ranisch R, Ehni HJ, Wiesing U, Rieger MA, Henking T, Joos S. Ethical, Legal, and Social Implications of Symptom Checker Apps in Primary Health Care (CHECK.APP): Protocol for an Interdisciplinary Mixed Methods Study. JMIR Res Protoc 2022; 11:e34026. [PMID: 35576570 PMCID: PMC9152714 DOI: 10.2196/34026]
Abstract
Background Symptom checker apps (SCAs) are accessible tools that provide early symptom assessment for users. The ethical, legal, and social implications of SCAs and their impact on the patient-physician relationship, the health care providers, and the health care system have sparsely been examined. This study protocol describes an approach to investigate the possible impacts and implications of SCAs on different levels of health care provision. It considers the perspectives of the users, nonusers, general practitioners (GPs), and health care experts. Objective We aim to assess a comprehensive overview of the use of SCAs and address problematic issues, if any. The primary outcomes of this study are empirically informed multi-perspective recommendations for different stakeholders on the ethical, legal, and social implications of SCAs. Methods Quantitative and qualitative methods will be used in several overlapping and interconnected study phases. In study phase 1, a comprehensive literature review will be conducted to assess the ethical, legal, social, and systemic impacts of SCAs. Study phase 2 comprises a survey that will be analyzed with a logistic regression. It aims to assess the degree of SCA use in Germany as well as the predictors of SCA usage. Study phase 3 will investigate self-observational diaries and user interviews, which will be analyzed as integrated cases to assess user perspectives, usage patterns, and arising problems. Study phase 4 will comprise GP interviews to assess their experiences, perspectives, self-image, and concepts and will be analyzed with the basic procedure by Kruse. Moreover, interviews with health care experts will be conducted in study phase 3 and will be analyzed using the reflexive thematic analysis approach of Braun and Clarke. Results Study phase 1 will be completed in November 2021. We expect the results of study phase 2 between December 2021 and February 2022. In study phase 3, interviews are currently being conducted.
The final study endpoint will be in February 2023. Conclusions The possible ethical, legal, social, and systemic impacts of a widespread use of SCAs that affect stakeholders and stakeholder groups on different levels of health care will be identified. The proposed methodological approach provides a multifaceted and diverse empirical basis for a broad discussion on these implications. Trial Registration German Clinical Trials Register (DRKS) DRKS00022465; https://tinyurl.com/yx53er67 International Registered Report Identifier (IRRID) DERR1-10.2196/34026
Affiliation(s)
- Anna-Jasmin Wetzel
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Roland Koch
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Christine Preiser
  - Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Regina Müller
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Malte Klemmt
  - Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Robert Ranisch
  - Faculty of Health Science Brandenburg, University of Potsdam, Potsdam, Germany
- Hans-Jörg Ehni
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Urban Wiesing
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Monika A Rieger
  - Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Tanja Henking
  - Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Stefanie Joos
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
16
Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation. J Med Internet Res 2022; 24:e31810. [PMID: 35536633 PMCID: PMC9131144 DOI: 10.2196/31810]
Abstract
Background Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. Objective This study aims to revisit the landmark index study to investigate whether and how symptom checkers’ capabilities have evolved since 2015 and how they currently compare with laypersons’ stand-alone triage appraisal. Methods In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons’ triage capability. Results We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. Conclusions Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. 
Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
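The risk-aversion odds quoted above (1.11:1 in 2020 vs 2.82:1 in 2015) are simply the ratio of overtriage errors to undertriage errors. A minimal sketch, with hypothetical error counts chosen only to reproduce the two reported ratios (the actual per-app counts are in the study itself):

```python
def risk_aversion_odds(overtriage_errors, undertriage_errors):
    """Ratio of overtriage to undertriage errors; values above 1
    indicate a cautious (risk-averse) triage tool."""
    return overtriage_errors / undertriage_errors

# Hypothetical error counts (not from the study):
odds_2020 = risk_aversion_odds(100, 90)  # roughly 1.11:1
odds_2015 = risk_aversion_odds(141, 50)  # 2.82:1
```

An odds value near 1 means the apps erred in both directions about equally often, which, given that undertriage misses emergencies, explains the study's concern about the >40% of missed emergencies.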
Affiliation(s)
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Konrad Schmidt
  - Institute of General Practice and Family Medicine, Jena University Hospital, Jena, Germany
  - Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sven Schulz-Niethammer
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
17
Kopka M, Feufel MA, Balzer F, Schmieding ML. Triage Capability of Laypersons: Retrospective, Exploratory Analysis. JMIR Form Res 2022; 6:e38977. [PMID: 36222793 PMCID: PMC9607917 DOI: 10.2196/38977]
Abstract
Background Although medical decision-making may be thought of as a task involving health professionals, many decisions, including critical health-related decisions, are made by laypersons alone. Specifically, as the first step in most care episodes, it is the patient who determines whether and where to seek health care (triage). Overcautious self-assessments (ie, overtriaging) may lead to overutilization of health care facilities and overcrowded emergency departments, whereas imprudent decisions (ie, undertriaging) constitute a risk to the patient's health. Recently, patient-facing decision support systems, commonly known as symptom checkers, have been developed to assist laypersons in these decisions. Objective The purpose of this study is to identify factors influencing laypersons' ability to self-triage and their risk averseness in self-triage decisions. Methods We analyzed publicly available data on 91 laypersons appraising 45 short fictitious patient descriptions (case vignettes; N=4095 appraisals). Using signal detection theory and descriptive and inferential statistics, we explored whether the type of medical decision laypersons face, their confidence in their decision, and sociodemographic factors influence their triage accuracy and the type of errors they make. We distinguished between 2 decisions: whether emergency care was required (decision 1) and whether self-care was sufficient (decision 2). Results The accuracy of detecting emergencies (decision 1) was higher (mean 82.2%, SD 5.9%) than that of deciding whether any type of medical care is required (decision 2; mean 75.9%, SD 5.25%; t(90)=8.4; P<.001; Cohen d=0.9). Sensitivity for decision 1 was lower (mean 67.5%, SD 16.4%) than its specificity (mean 89.6%, SD 8.6%), whereas sensitivity for decision 2 was higher (mean 90.5%, SD 8.3%) than its specificity (mean 46.7%, SD 15.95%).
Female participants were more risk averse and overtriaged more often than male participants, but age and level of education showed no association with participants' risk averseness. Participants' triage accuracy was higher when they were certain about their appraisal (2114/3381, 62.5%) than when they were uncertain (378/714, 52.9%). However, most errors occurred when participants were certain of their decision (1267/1603, 79%). Participants were more commonly certain of their overtriage errors (mean 80.9%, SD 23.8%) than of their undertriage errors (mean 72.5%, SD 30.9%; t(89)=3.7; P<.001; d=0.39). Conclusions Our study suggests that laypersons are overcautious in deciding whether they require medical care at all, but they miss identifying a considerable portion of emergencies. Our results further indicate that women are more risk averse than men in both types of decisions. Layperson participants made most triage errors when they were certain of their own appraisal. Thus, they might not follow or even seek advice (eg, from symptom checkers) in most instances where advice would be useful.
Affiliation(s)
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
18
Millen E, Salim N, Azadzoy H, Bane MM, O'Donnell L, Schmude M, Bode P, Tuerk E, Vaidya R, Gilbert SH. Study protocol for a pilot prospective, observational study investigating the condition suggestion and urgency advice accuracy of a symptom assessment app in sub-Saharan Africa: the AFYA-'Health' Study. BMJ Open 2022; 12:e055915. [PMID: 35410928 PMCID: PMC9003603 DOI: 10.1136/bmjopen-2021-055915]
Abstract
INTRODUCTION Due to a global shortage of healthcare workers, there is a lack of basic healthcare for 4 billion people worldwide, particularly affecting low-income and middle-income countries. The utilisation of AI-based healthcare tools such as symptom assessment applications (SAAs) has the potential to reduce the burden on healthcare systems. The purpose of the AFYA Study (AI-based Assessment oF health sYmptoms in TAnzania) is to evaluate the accuracy of the condition suggestions and urgency advice provided to users by a Swahili-language Ada SAA. METHODS AND ANALYSIS This study is designed as an observational prospective clinical study. The setting is a waiting room of a Tanzanian district hospital. It will include patients entering the outpatient clinic, across a range of conditions and age groups, including children and adolescents. Patients will be asked to use the SAA before proceeding to usual care. After usual care, they will have a consultation with a study-provided physician. Patients and healthcare practitioners will be blinded to the SAA's results. An expert panel will compare the Ada SAA's condition suggestions and urgency advice to the usual-care and study-provided differential diagnoses and triage. The primary outcome measures are the accuracy and comprehensiveness of the Ada SAA evaluated against the gold standard differential diagnoses. ETHICS AND DISSEMINATION Ethical approval was received from the ethics committee (EC) of Muhimbili University of Health and Allied Sciences with approval number MUHAS-REC-09-2019-044 and from the National Institute for Medical Research, NIMR/HQ/R.8c/Vol. I/922. All amendments to the protocol are reported and adapted on the basis of the requirements of the EC. The results from this study will be submitted to peer-reviewed journals and local and international stakeholders, and will be communicated in editorials/articles by Ada Health. TRIAL REGISTRATION NUMBER NCT04958577.
Affiliation(s)
- Nahya Salim
  - Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Mustafa Miraji Bane
  - Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Stephen Henry Gilbert
  - Ada Health GmbH, Berlin, Germany
  - EKFZ for Digital Health, Technische Universität Dresden, Dresden, Germany
19
Hennemann S, Kuhn S, Witthöft M, Jungmann SM. Diagnostic Performance of an App-Based Symptom Checker in Mental Disorders: Comparative Study in Psychotherapy Outpatients. JMIR Ment Health 2022; 9:e32832. [PMID: 35099395 PMCID: PMC8844983 DOI: 10.2196/32832]
Abstract
BACKGROUND Digital technologies have become a common starting point for health-related information-seeking. Web- or app-based symptom checkers aim to provide rapid and accurate condition suggestions and triage advice but have not yet been investigated for mental disorders in routine health care settings. OBJECTIVE This study aims to test the diagnostic performance of a widely available symptom checker in the context of formal diagnosis of mental disorders when compared with therapists' diagnoses based on structured clinical interviews. METHODS Adult patients from an outpatient psychotherapy clinic used the app-based symptom checker Ada-check your health (ADA; Ada Health GmbH) at intake. Accuracy was assessed as the agreement of the first and 1 of the first 5 condition suggestions of ADA with at least one of the interview-based therapist diagnoses. In addition, sensitivity, specificity, and interrater reliabilities (Gwet first-order agreement coefficient [AC1]) were calculated for the 3 most prevalent disorder categories. Self-reported usability (assessed using the System Usability Scale) and acceptance of ADA (assessed using an adapted feedback questionnaire) were evaluated. RESULTS A total of 49 patients (30/49, 61% women; mean age 33.41, SD 12.79 years) were included in this study. Across all patients, the interview-based diagnoses matched ADA's first condition suggestion in 51% (25/49; 95% CI 37.5-64.4) of cases and 1 of the first 5 condition suggestions in 69% (34/49; 95% CI 55.4-80.6) of cases. Within the main disorder categories, the accuracy of ADA's first condition suggestion was 0.82 for somatoform and associated disorders, 0.65 for affective disorders, and 0.53 for anxiety disorders. Interrater reliabilities ranged from low (AC1=0.15 for anxiety disorders) to good (AC1=0.76 for somatoform and associated disorders). The usability of ADA was rated as high in the System Usability Scale (mean 81.51, SD 11.82, score range 0-100). 
Approximately 71% (35/49) of participants would have preferred a face-to-face diagnostic assessment over an app-based one. CONCLUSIONS Overall, our findings suggest that a widely available symptom checker used in the formal diagnosis of mental disorders could provide clinicians with a list of condition suggestions with moderate-to-good accuracy. However, diagnostic performance was heterogeneous between disorder categories and included low interrater reliability. Although symptom checkers have some potential to complement the diagnostic process as a screening tool, their diagnostic performance should be tested in larger samples and in comparison with further diagnostic instruments.
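The interrater reliabilities reported above are Gwet first-order agreement coefficients (AC1). For the two-rater binary case (e.g. app and therapist each coding a disorder category as present or absent), AC1 reduces to a short formula; the sketch below is illustrative only, and the function name and example data are not from the study.

```python
def gwet_ac1(ratings_a, ratings_b):
    """Gwet's first-order agreement coefficient (AC1) for two raters
    and a binary (0/1) coding, e.g. disorder category present/absent."""
    n = len(ratings_a)
    # Observed agreement: proportion of cases both raters coded identically.
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Mean marginal probability of the positive category across both raters.
    pi = (sum(ratings_a) + sum(ratings_b)) / (2 * n)
    # Chance agreement under Gwet's model, binary case.
    pe = 2 * pi * (1 - pi)
    return (pa - pe) / (1 - pe)
```

Unlike Cohen's kappa, AC1 remains stable when category prevalence is very high or very low, which is why it suits skewed diagnostic samples like this one.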
Affiliation(s)
- Severin Hennemann
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
- Sebastian Kuhn
- Department of Digital Medicine, Medical Faculty OWL, Bielefeld University, Bielefeld, Germany
- Michael Witthöft
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
- Stefanie M Jungmann
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
20
Arellano Carmona K, Chittamuru D, Kravitz RL, Ramondt S, Ramírez AS. Beyond Dr. Google: Health information seeking from an intelligent online symptom checker: Cross-Sectional Questionnaire Study. J Med Internet Res 2022; 24:e36322. [PMID: 35984690 PMCID: PMC9440406 DOI: 10.2196/36322]
Abstract
Background The ever-growing amount of health information available on the web is increasing the demand for tools providing personalized and actionable health information. Such tools include symptom checkers that provide users with a potential diagnosis after they respond to a set of probes about their symptoms. Although the potential for their utility is great, little is known about such tools’ actual use and effects. Objective We aimed to understand who uses a web-based artificial intelligence–powered symptom checker and for what purposes, how they evaluate the experience of the web-based interview and quality of the information, what they intend to do with the recommendation, and predictors of future use. Methods A cross-sectional survey of web-based health information seekers was conducted following the completion of a symptom checker visit (N=2437). Measures of comprehensibility, confidence, usefulness, health-related anxiety, empowerment, and intention to use in the future were assessed. ANOVAs and the Wilcoxon rank sum test examined mean outcome differences in racial, ethnic, and sex groups. The relationship between perceptions of the symptom checker and intention to follow recommended actions was assessed using multilevel logistic regression. Results Buoy users were well-educated (1384/1704, 81.22% college or higher), primarily White (1227/1693, 72.47%), and female (2069/2437, 84.89%). Most had insurance (1449/1630, 88.89%), a regular health care provider (1307/1709, 76.48%), and reported good health (1000/1703, 58.72%). Three types of symptoms—pain (855/2437, 35.08%), gynecological issues (293/2437, 12.02%), and masses or lumps (204/2437, 8.37%)—accounted for more than half (1352/2437, 55.48%) of site visits. Buoy’s top three primary recommendations split across less-serious triage categories: primary care physician in 2 weeks (754/2141, 35.22%), self-treatment (452/2141, 21.11%), and primary care in 1 to 2 days (373/2141, 17.42%).
Common diagnoses were musculoskeletal (303/2437, 12.43%), gynecological (304/2437, 12.47%) and skin conditions (297/2437, 12.19%), and infectious diseases (300/2437, 12.31%). Users generally reported high confidence in Buoy, found it useful and easy to understand, and said that Buoy made them feel less anxious and more empowered to seek medical help. Users for whom Buoy recommended “Waiting/Watching” or “Self-Treatment” had strongest intentions to comply, whereas those advised to seek primary care had weaker intentions. Compared with White users, Latino and Black users had significantly more confidence in Buoy (P<.05), and the former also found it significantly more useful (P<.05). Latino (odds ratio 1.96, 95% CI 1.22-3.25) and Black (odds ratio 2.37, 95% CI 1.57-3.66) users also had stronger intentions to discuss recommendations with a provider than White users. Conclusions Results demonstrate the potential utility of a web-based health information tool to empower people to seek care and reduce health-related anxiety. However, despite encouraging results suggesting the tool may fulfill unmet health information needs among women and Black and Latino adults, analyses of the user base illustrate persistent second-level digital divide effects.
Affiliation(s)
- Deepti Chittamuru
- School of Social Sciences, Humanities and Arts, University of California, Merced, CA, United States
- Steven Ramondt
- Department of Donor Medicine Research, Sanquin Research, Amsterdam, Netherlands
- Department of Communication Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- A Susana Ramírez
- School of Social Sciences, Humanities and Arts, University of California, Merced, CA, United States
21
Chan F, Lai S, Pieterman M, Richardson L, Singh A, Peters J, Toy A, Piccininni C, Rouault T, Wong K, Quong JK, Wakabayashi AT, Pawelec-Brzychczy A. Performance of a new symptom checker in patient triage: Canadian cohort study. PLoS One 2021; 16:e0260696. [PMID: 34852016 PMCID: PMC8635379 DOI: 10.1371/journal.pone.0260696]
Abstract
BACKGROUND Computerized algorithms known as symptom checkers aim to help patients decide what to do should they have a new medical concern. However, despite widespread implementation, most studies on symptom checkers have involved simulated patients. Only limited evidence currently exists about symptom checker safety or accuracy when used by real patients. We developed a new prototype symptom checker and assessed its safety and accuracy in a prospective cohort of patients presenting to primary care and emergency departments with new medical concerns. METHODS A prospective cohort study was conducted to assess the prototype's performance. The cohort consisted of adult patients (≥16 years old) who presented to hospital emergency departments and family physician clinics. Primary outcomes were safety and accuracy of triage recommendations to seek hospital care, seek primary care, or manage symptoms at home. RESULTS Data from 281 hospital patients and 300 clinic patients were collected and analyzed. Sensitivity to emergencies was 100% (10/10 encounters). Sensitivity to urgencies was 90% (73/81) and 97% (34/35) for hospital and primary care patients, respectively. The prototype was significantly more accurate than patients at triage (73% versus 58%, p<0.01). Compliance with triage recommendations in this cohort using this iteration of the symptom checker would have reduced hospital visits by 55% but could have caused harm in 2-3% of patients because of delayed care. INTERPRETATION The prototype symptom checker was superior to patients in deciding the most appropriate treatment setting for medical issues. This symptom checker could reduce a significant number of unnecessary hospital visits, with accuracy and safety outcomes comparable to existing data on telephone triage.
Affiliation(s)
- Forson Chan
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Simon Lai
- University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Marcus Pieterman
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Lisa Richardson
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Amanda Singh
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Jocelynn Peters
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Alex Toy
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Caroline Piccininni
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Taiysa Rouault
- University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Kristie Wong
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Adrienne T. Wakabayashi
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Anna Pawelec-Brzychczy
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
22
Kopka M, Schmieding ML, Rieger T, Roesler E, Balzer F, Feufel MA. Trust Me, I’m Not a Doctor! Determinants of Laypersons’ Trust in Medical Decision Aids: Experimental Study. JMIR Hum Factors 2021; 9:e35219. [PMID: 35503248 PMCID: PMC9115664 DOI: 10.2196/35219]
Affiliation(s)
- Marvin Kopka
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Malte L Schmieding
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Tobias Rieger
- Work, Engineering and Organizational Psychology, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Eileen Roesler
- Work, Engineering and Organizational Psychology, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Felix Balzer
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
- Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
23
Woodcock C, Mittelstadt B, Busbridge D, Blank G. The Impact of Explanations on Layperson Trust in Artificial Intelligence-Driven Symptom Checker Apps: Experimental Study. J Med Internet Res 2021; 23:e29386. [PMID: 34730544 PMCID: PMC8600426 DOI: 10.2196/29386]
Abstract
BACKGROUND Artificial intelligence (AI)-driven symptom checkers are available to millions of users globally and are advocated as a tool to deliver health care more efficiently. To achieve the promoted benefits of a symptom checker, laypeople must trust and subsequently follow its instructions. In AI, explanations are seen as a tool to communicate the rationale behind black-box decisions to encourage trust and adoption. However, the effectiveness of the types of explanations used in AI-driven symptom checkers has not yet been studied. Explanations can follow many forms, including why-explanations and how-explanations. Social theories suggest that why-explanations are better at communicating knowledge and cultivating trust among laypeople. OBJECTIVE The aim of this study is to ascertain whether explanations provided by a symptom checker affect explanatory trust among laypeople and whether this trust is impacted by their existing knowledge of disease. METHODS A cross-sectional survey of 750 healthy participants was conducted. The participants were shown a video of a chatbot simulation that resulted in the diagnosis of either a migraine or temporal arteritis, chosen for their differing levels of epidemiological prevalence. These diagnoses were accompanied by one of four types of explanations. Each explanation type was selected either because of its current use in symptom checkers or because it was informed by theories of contrastive explanation. Exploratory factor analysis of participants' responses followed by comparison-of-means tests were used to evaluate group differences in trust. RESULTS Depending on the treatment group, two or three variables were generated, reflecting the prior knowledge and subsequent mental model that the participants held. When varying explanation type by disease, migraine was found to be nonsignificant (P=.65) and temporal arteritis, marginally significant (P=.09). 
Varying disease by explanation type resulted in statistical significance for input influence (P=.001), social proof (P=.049), and no explanation (P=.006), while counterfactual explanation was marginally significant (P=.053). The results suggest that trust in explanations is significantly affected by the disease being explained. When laypeople have existing knowledge of a disease, explanations have little impact on trust. Where the need for information is greater, different explanation types engender significantly different levels of trust. These results indicate that to be successful, symptom checkers need to tailor explanations to each user's specific question and discount the diseases that they may also be aware of. CONCLUSIONS System builders developing explanations for symptom-checking apps should consider the recipient's knowledge of a disease and tailor explanations to each user's specific need. Effort should be placed on generating explanations that are personalized to each user of a symptom checker to fully discount the diseases that they may be aware of and to close their information gap.
Affiliation(s)
- Claire Woodcock
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
- Brent Mittelstadt
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
- Grant Blank
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
24
Gilbert S, Fenech M, Upadhyay S, Wicks P, Novorol C. Quality of condition suggestions and urgency advice provided by the Ada symptom assessment app evaluated with vignettes optimised for Australia. Aust J Prim Health 2021; 27:377-381. [PMID: 34706813 DOI: 10.1071/py21032]
Abstract
When people face a health problem, they often first ask, 'Is there an app for that?'. We investigated the quality of advice provided by the Ada symptom assessment application to address the question, 'How do I know the app on my phone is safe and provides good advice?'. The app was tested with 48 independently created vignettes developed for a previous study, including 18 specifically developed for the Australian setting, using an independently developed methodology to evaluate the accuracy of condition suggestions and urgency advice. The correct condition was listed first in 65% of vignettes, and in the Top 3 results in 83% of vignettes. The urgency advice in the app exactly matched the gold standard in 63% of vignettes. The app's accuracy of condition suggestion and urgency advice is higher than that of the best-performing symptom assessment app reported in a previous study (61%, 77% and 52% for conditions suggested in the Top 1, Top 3 and exactly matching urgency advice respectively). These results are relevant to the application of symptom assessment in primary and community health, where medical quality and safety should determine app choice.
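The Top 1 and Top 3 figures above are top-k accuracies over the vignette set: the share of vignettes whose gold-standard condition appears among an app's first k suggestions. A minimal sketch, with hypothetical function name and toy data (not from the study):

```python
def top_k_accuracy(differential_lists, gold_diagnoses, k):
    """Proportion of vignettes whose gold-standard diagnosis appears
    among the top-k conditions suggested."""
    hits = sum(gold in suggestions[:k]
               for suggestions, gold in zip(differential_lists, gold_diagnoses))
    return hits / len(gold_diagnoses)

# Hypothetical toy data: three vignettes with ranked condition suggestions.
suggestions = [["influenza", "common cold", "covid-19"],
               ["migraine", "tension headache", "sinusitis"],
               ["gastritis", "peptic ulcer", "GERD"]]
gold = ["common cold", "migraine", "GERD"]
```

Top 3 accuracy is always at least Top 1 accuracy, since a first-ranked hit also counts as a top-3 hit.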
Affiliation(s)
- Stephen Gilbert
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany; and EKFZ for Digital Health, University Hospital Carl Gustav Carus Dresden, Technische Universität Dresden, Dresden, Germany (Corresponding author)
- Matthew Fenech
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
- Paul Wicks
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
- Claire Novorol
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
25
Levine DM, Mehrotra A. Assessment of Diagnosis and Triage in Validated Case Vignettes Among Nonphysicians Before and After Internet Search. JAMA Netw Open 2021; 4:e213287. [PMID: 33779741 PMCID: PMC8008286 DOI: 10.1001/jamanetworkopen.2021.3287]
Abstract
IMPORTANCE When confronted with new medical symptoms, many people turn to the internet to understand why they are ill as well as whether and where they should get care. Such searches may be harmful because they may facilitate misdiagnosis and inappropriate triage. OBJECTIVE To empirically measure the association of an internet search for health information with diagnosis, triage, and anxiety by laypeople. DESIGN, SETTING, AND PARTICIPANTS This survey study used a nationally representative sample of US adults who were recruited through an online platform between April 1, 2019, and April 15, 2019. A total of 48 validated case vignettes of both common (eg, viral illness) and severe (eg, heart attack) conditions were used. Participants were asked to relay their diagnosis, triage, and anxiety regarding 1 of these cases before and after searching the internet for health information. EXPOSURES Short, validated case vignettes written at or below the sixth-grade reading level randomly assigned to participants. MAIN OUTCOMES AND MEASURES Correct diagnosis, correct triage, and flipping (changing) or anchoring (not changing) diagnosis and triage decisions were the main outcomes. Multivariable modeling was performed to identify patient factors associated with correct triage and diagnosis. RESULTS Of the 5000 participants, 2549 were female (51.0%), 3819 were White (76.4%), and the mean (SD) age was 45.0 (16.9) years. Mean internet search time was 12.1 (95% CI, 10.7-13.5) minutes per case. No difference in triage accuracy was found before and after search (74.5% vs 74.1%; difference, -0.4 [95% CI, -1.4 to 0.6]; P = .06), but improved diagnostic accuracy was found (49.8% vs 54.0%; difference, 4.2% [95% CI, 3.1%-5.3%]; P < .001). Most participants (4254 [85.1%]) were anchored on their diagnosis. Of the 14.9% of participants (n = 746) who flipped their diagnosis, 9.6% (n = 478) flipped from incorrect to correct and 5.4% (n = 268) flipped from correct to incorrect. 
The following groups had an increased rate of correct diagnosis: adults 40 years or older (eg, 40-49 years: 5.1 [95% CI, 0.8-9.4] percentage points better than those aged <30 years; P = .02), women (9.4 [95% CI, 6.8-12.0] percentage points better than men; P < .001), and those with perceived poor health status (16.3 [95% CI, 6.9-25.6] percentage points better than those with excellent status; P = .001) and with more than 2 chronic diseases (6.8 [95% CI, 1.5-12.1] percentage points better than those with 0 conditions; P = .01). CONCLUSIONS AND RELEVANCE This study found that an internet search for health information was associated with small increases in diagnostic accuracy but not with triage accuracy.
Affiliation(s)
- David M. Levine
- Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, Massachusetts
- Harvard Medical School, Boston, Massachusetts
- Ateev Mehrotra
- Harvard Medical School, Boston, Massachusetts
- Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts
- Division of General Medicine and Primary Care, Beth Israel Deaconess Medical Center, Boston, Massachusetts
26
Versluijs Y, Brown LE, Ring D. Does a Previsit Phone Call from the Surgeon Reduce Decision Conflict? Telemed J E Health 2021; 27:1282-1287. [PMID: 33538643 DOI: 10.1089/tmj.2020.0475]
Abstract
Background: There is some evidence that previsit strategies can make in-person visits more productive and efficient. We compared people who received a phone call before a musculoskeletal specialty visit with people who did not, with respect to several factors: (1) decision conflict (difficulty deciding between two or more options), (2) perceived clinician empathy after an in-person visit, and (3) arrival for the scheduled in-person appointment. We also recorded the specialist's opinion that the phone call alone could adequately replace an in-person visit while maintaining quality, safety, and effectiveness. Materials and Methods: In this prospective randomized-controlled trial, 122 patients were enrolled and randomized to receive a previsit phone call by an orthopedic surgeon before a scheduled visit or not. After the in-person visit, patients completed a (1) demographic questionnaire including age, gender, race/ethnicity, marital status, level of education, work status, and comorbidities; (2) Decision Conflict Scale; and (3) Jefferson Scale of Patient Perceptions of Physician Empathy. Results: No significant difference was found between the two groups in decision conflict, perceived empathy, or not attending the scheduled visit. Of the 55 successful phone calls, the surgeon felt that 50 (91%) had the potential to safely and effectively replace an in-person visit. Conclusion: Although a previsit phone call did not reduce decision conflict or improve the patient experience as measured after one visit, there may be merit in studying an increased number of touch points, particularly with some subsets of illness featuring substantial stress or misconceptions. The identified potential for the application and transfer of specialty expertise through telephone alone also merits additional study.
Affiliation(s)
- Yvonne Versluijs
- Department of Surgery and Perioperative Care, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA; Department of Trauma Surgery, Leiden University Medical Center, Leiden, the Netherlands
- Laura E Brown
- Center for Health Communication, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA
- David Ring
- Department of Surgery and Perioperative Care, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA
27
Goetz CM, Arnetz JE, Sudan S, Arnetz BB. Perceptions of virtual primary care physicians: A focus group study of medical and data science graduate students. PLoS One 2020; 15:e0243641. [PMID: 33332409 PMCID: PMC7745971 DOI: 10.1371/journal.pone.0243641]
Abstract
BACKGROUND Artificial and virtual technologies in healthcare have advanced rapidly, and healthcare systems have been adapting care accordingly. An intriguing new development is the virtual physician, which can diagnose and treat patients independently. METHODS AND FINDINGS This qualitative study of advanced degree students aimed to assess their perceptions of using a virtual primary care physician as a patient. Four focus groups were held: first year medical students, fourth year medical students, first year engineering/data science graduate students, and fourth year engineering/data science graduate students. The focus groups were audiotaped, transcribed verbatim, and content analysis of the transcripts was performed using a data-driven inductive approach. Themes identified concerned advantages, disadvantages, and the future of virtual primary care physicians. Within those main categories, 13 themes and 31 subthemes emerged. DISCUSSION While participants appreciated that a virtual primary care physician would be convenient, efficient, and cost-effective, they also expressed concern about data privacy and the potential for misdiagnosis. To garner trust from its potential users, future virtual primary care physicians should be programmed with a sufficient amount of trustworthy data and have a high level of transparency and accountability for patients.
Affiliation(s)
- Courtney M. Goetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Judith E. Arnetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Sukhesh Sudan
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Bengt B. Arnetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
28
Gilbert S, Mehl A, Baluch A, Cawley C, Challiner J, Fraser H, Millen E, Montazeri M, Multmeier J, Pick F, Richter C, Türk E, Upadhyay S, Virani V, Vona N, Wicks P, Novorol C. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. BMJ Open 2020; 10:e040269. [PMID: 33328258 PMCID: PMC7745523 DOI: 10.1136/bmjopen-2020-040269]
Abstract
OBJECTIVES To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN Vignettes study. SETTING 200 primary care vignettes. INTERVENTION/COMPARATOR For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold-standard. PRIMARY OUTCOME MEASURES (1) Proportion of conditions 'covered' by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of 'safe' urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). RESULTS Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions and their performance was generally greater with the exclusion of corresponding vignettes. For safe urgency advice, tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had a safety performance within 2 SDs of GPs: Your.MD: 92.6%. Three apps had a safety performance outside 2 SDs of GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³). CONCLUSIONS The utility of digital symptom assessment apps relies on coverage, accuracy and safety. 
While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
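The 'safe' urgency criterion defined in the abstract (advice at the gold-standard level, more conservative, or no more than one level less conservative) can be sketched directly; the four-level urgency scale below is an assumption for illustration, as the abstract does not enumerate the actual advice levels.

```python
# Hypothetical ordered urgency scale, least to most conservative
# (assumption: the study's real advice levels are not listed in the abstract).
URGENCY_LEVELS = ["self_care", "non_urgent_care", "urgent_care", "emergency"]

def is_safe_advice(app_advice: str, gold_advice: str) -> bool:
    """Safe if at the gold-standard level, more conservative,
    or no more than one level less conservative."""
    app = URGENCY_LEVELS.index(app_advice)
    gold = URGENCY_LEVELS.index(gold_advice)
    return app >= gold - 1
```

Under this rule, over-triage is never counted as unsafe; only advice two or more levels below the gold standard is.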
Affiliation(s)
- Hamish Fraser
- Brown Center for Biomedical Informatics, Brown University, Rhode Island, USA
29
Ćirković A. Evaluation of Four Artificial Intelligence-Assisted Self-Diagnosis Apps on Three Diagnoses: Two-Year Follow-Up Study. J Med Internet Res 2020; 22:e18097. [PMID: 33275113 PMCID: PMC7748958 DOI: 10.2196/18097]
Abstract
Background Consumer-oriented mobile self-diagnosis apps have been developed using undisclosed algorithms, presumably based on machine learning and other artificial intelligence (AI) technologies. The US Food and Drug Administration now discerns apps with learning AI algorithms from those with stable ones and treats the former as medical devices. To the author’s knowledge, no self-diagnosis app testing has been performed in the field of ophthalmology so far. Objective The objective of this study was to test apps that were previously mentioned in the scientific literature on a set of diagnoses in a deliberate time interval, comparing the results and looking for differences that hint at “nonlocked” learning algorithms. Methods Four apps from the literature were chosen (Ada, Babylon, Buoy, and Your.MD). A set of three ophthalmology diagnoses (glaucoma, retinal tear, dry eye syndrome) representing three levels of urgency was used to simultaneously test the apps’ diagnostic efficiency and treatment recommendations in this specialty. Two years was the chosen time interval between the tests (2018 and 2020). Scores were awarded by one evaluating physician using a defined scheme. Results Two apps (Ada and Your.MD) received significantly higher scores than the other two. All apps either worsened in their results between 2018 and 2020 or remained unchanged at a low level. The variation in the results over time indicates “nonlocked” learning algorithms using AI technologies. None of the apps provided correct diagnoses and treatment recommendations for all three diagnoses in 2020. Two apps (Babylon and Your.MD) asked significantly fewer questions than the other two (P<.001). Conclusions “Nonlocked” algorithms are used by self-diagnosis apps. The diagnostic efficiency of the tested apps seems to worsen over time, with some apps being more capable than others. 
Systematic studies on a wider scale are necessary for health care providers and patients to correctly assess the safety and efficacy of such apps and for correct classification by health care regulating authorities.
30
Morse KE, Ostberg NP, Jones VG, Chan AS. Use Characteristics and Triage Acuity of a Digital Symptom Checker in a Large Integrated Health System: Population-Based Descriptive Study. J Med Internet Res 2020; 22:e20549. [PMID: 33170799] [PMCID: PMC7717918] [DOI: 10.2196/20549]
Abstract
Background Pressure on the US health care system has been increasing due to a combination of aging populations, rising health care expenditures, and most recently, the COVID-19 pandemic. Responses to this pressure are hindered in part by reliance on a limited supply of highly trained health care professionals, creating a need for scalable technological solutions. Digital symptom checkers are artificial intelligence–supported software tools that use a conversational “chatbot” format to support rapid diagnosis and consistent triage. The COVID-19 pandemic has brought new attention to these tools due to the need to avoid face-to-face contact and preserve urgent care capacity. However, evidence-based deployment of these chatbots requires an understanding of user demographics and associated triage recommendations generated by a large general population. Objective In this study, we evaluate the user demographics and levels of triage acuity provided by a symptom checker chatbot deployed in partnership with a large integrated health system in the United States. Methods This population-based descriptive study included all web-based symptom assessments completed on the website and patient portal of the Sutter Health system (24 hospitals in Northern California) from April 24, 2019, to February 1, 2020. User demographics were compared to relevant US Census population data. Results A total of 26,646 symptom assessments were completed during the study period. Most assessments (17,816/26,646, 66.9%) were completed by female users. The mean user age was 34.3 years (SD 14.4 years), compared to a median age of 37.3 years of the general population. The most common initial symptom was abdominal pain (2060/26,646, 7.7%). A substantial number of assessments (12,357/26,646, 46.4%) were completed outside of typical physician office hours. Most users were advised to seek medical care on the same day (7299/26,646, 27.4%) or within 2-3 days (6301/26,646, 23.6%). 
Over a quarter of the assessments indicated a high degree of urgency (7723/26,646, 29.0%). Conclusions Users of the symptom checker chatbot were broadly representative of our patient population, although they skewed toward younger and female users. The triage recommendations were comparable to those of nurse-staffed telephone triage lines. Although the emergence of COVID-19 has increased the interest in remote medical assessment tools, it is important to take an evidence-based approach to their deployment.
Affiliation(s)
- Keith E Morse: Department of Pediatrics, Stanford University School of Medicine, Palo Alto, CA, United States
- Nicolai P Ostberg: Center for Biomedical Informatics Research, Stanford University School of Medicine, Palo Alto, CA, United States
- Veena G Jones: Clinical Leadership Team, Sutter Health, Sacramento, CA, United States; Palo Alto Medical Foundation Research Institute, Palo Alto, CA, United States
- Albert S Chan: Clinical Leadership Team, Sutter Health, Sacramento, CA, United States; Palo Alto Medical Foundation Research Institute, Palo Alto, CA, United States
31
Coquet J, Blayney DW, Brooks JD, Hernandez-Boussard T. Association between patient-initiated emails and overall 2-year survival in cancer patients undergoing chemotherapy: Evidence from the real-world setting. Cancer Med 2020; 9:8552-8561. [PMID: 32986931] [PMCID: PMC7666724] [DOI: 10.1002/cam4.3483]
Abstract
PURPOSE Prior studies suggest email communication between patients and providers may improve patient engagement and health outcomes. The purpose of this study was to determine whether patient-initiated emails are associated with overall survival benefits among cancer patients undergoing chemotherapy. PATIENTS AND METHODS We identified patient-initiated emails through the patient portal in electronic health records (EHR) among 9900 cancer patients receiving chemotherapy between 2013 and 2018. Email users were defined as patients who sent at least one email from 12 months before to 2 months after chemotherapy started. A propensity score-matched cohort analysis was carried out to reduce bias due to confounding (age, primary cancer type, gender, insurance payor, ethnicity, race, stage, income, Charlson score, county of residence). The cohort included 3223 email users and 3223 non-email users. The primary outcome was overall 2-year survival stratified by email use. Secondary outcomes included number of face-to-face visits, prescriptions, and telephone calls. The healthcare teams' response to emails and other forms of communication was also investigated. Finally, a quality measure related to chemotherapy-related inpatient and emergency department visits was evaluated. RESULTS Overall 2-year survival was higher in patients who were email users, with an adjusted hazard ratio of 0.80 (95% CI 0.72-0.90; p < 0.001). Email users had higher rates of healthcare utilization, including face-to-face visits (63 vs. 50; p < 0.001), drug prescriptions (28 vs. 21; p < 0.001), and phone calls (18 vs. 16; p < 0.001). The clinical quality outcome measure of inpatient use was better among email users (p = 0.015). CONCLUSION Patient-initiated emails are associated with a survival benefit among cancer patients receiving chemotherapy and may be a proxy for patient engagement. 
As value-based payment models emphasize incorporating the patients' voice into their care, email communications could serve as a novel source of patient-generated data.
Affiliation(s)
- Jean Coquet: Department of Medicine, Stanford University, Stanford, CA, USA
- Douglas W Blayney: Department of Medicine, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
- James D Brooks: Department of Urology, Stanford University School of Medicine, Stanford, CA, USA
- Tina Hernandez-Boussard: Department of Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA, USA; Department of Surgery, Stanford University School of Medicine, Stanford, CA, USA
32
Meskó B, Görög M. A short guide for medical professionals in the era of artificial intelligence. NPJ Digit Med 2020; 3:126. [PMID: 33043150] [PMCID: PMC7518439] [DOI: 10.1038/s41746-020-00333-z]
Abstract
Artificial intelligence (A.I.) is expected to significantly influence the practice of medicine and the delivery of healthcare in the near future. While there are only a handful of practical examples of its medical use with sufficient evidence, hype and attention around the topic are significant. Amid the abundance of papers, conference talks, misleading news headlines, and study interpretations, a short and visual guide that any medical professional can refer back to in their professional life might be useful. For this, it is critical that physicians understand the basics of the technology so they can see beyond the hype, evaluate A.I.-based studies and clinical validation, and acknowledge the limitations and opportunities of A.I. This paper aims to serve as a short, visual, and digestible repository of the information and details every physician might need to know in the age of A.I. We describe a simple definition of A.I., its levels and methods, the differences between the methods with medical examples, and the potential benefits, dangers, and challenges of A.I., and attempt to provide a futuristic vision of its use in everyday medical practice.
Affiliation(s)
- Bertalan Meskó: The Medical Futurist Institute, Budapest, Hungary; Semmelweis University, Budapest, Hungary
- Marton Görög: The Medical Futurist Institute, Budapest, Hungary
33
Miller S, Gilbert S, Virani V, Wicks P. Patients' Utilization and Perception of an Artificial Intelligence-Based Symptom Assessment and Advice Technology in a British Primary Care Waiting Room: Exploratory Pilot Study. JMIR Hum Factors 2020; 7:e19713. [PMID: 32540836] [PMCID: PMC7382011] [DOI: 10.2196/19713]
Abstract
BACKGROUND When someone needs to know whether and when to seek medical attention, there are a range of options to consider. Each will have consequences for the individual (primarily considering trust, convenience, usefulness, and opportunity costs) and for the wider health system (affecting clinical throughput, cost, and system efficiency). Digital symptom assessment technologies that leverage artificial intelligence may help patients navigate to the right type of care with the correct degree of urgency. However, a recent review highlighted a gap in the literature on the real-world usability of these technologies. OBJECTIVE We sought to explore the usability, acceptability, and utility of one such symptom assessment technology, Ada, in a primary care setting. METHODS Patients with a new complaint attending a primary care clinic in South London were invited to use a custom version of the Ada symptom assessment mobile app. This exploratory pilot study was conducted between November 2017 and January 2018 in a practice with 20,000 registered patients. Participants were asked to complete an Ada self-assessment about their presenting complaint on a study smartphone, with assistance provided if required. Perceptions on the app and its utility were collected through a self-completed study questionnaire following completion of the Ada self-assessment. RESULTS Over a 3-month period, 523 patients participated. Most were female (n=325, 62.1%), mean age 39.79 years (SD 17.7 years), with a larger proportion (413/506, 81.6%) of working-age individuals (aged 15-64) than the general population (66.0%). Participants rated Ada's ease of use highly, with most (511/522, 97.8%) reporting it was very or quite easy. Most would use Ada again (443/503, 88.1%) and agreed they would recommend it to a friend or relative (444/520, 85.3%). 
We identified a number of age-related trends among respondents, with a directional trend for more young respondents (50/54, 93% of those aged 18-24) than older respondents (19/32, 59% of those aged 70+) to report that Ada had provided helpful advice. We found no sex differences on any of the usability questions fielded. While most respondents reported that using the symptom checker would not have made a difference in their care-seeking behavior (425/494, 86.0%), a sizable minority (63/494, 12.8%) reported they would have used lower-intensity care such as self-care, a pharmacy, or delaying their appointment. This proportion was higher for patients aged 18-24 (11/50, 22%) than for those aged 70+ (0/28, 0%). CONCLUSIONS In this exploratory pilot study, the digital symptom checker was rated as highly usable and acceptable by patients in a primary care setting. Further research is needed to confirm whether the app might appropriately direct patients to timely care and to understand how this might save resources for the health system. More work is also needed to ensure the benefits accrue equally to older age groups.
34