1. Hammoud M, Douglas S, Darmach M, Alawneh S, Sanyal S, Kanbour Y. Evaluating the Diagnostic Performance of Symptom Checkers: Clinical Vignette Study. JMIR AI 2024;3:e46875. [PMID: 38875676] [DOI: 10.2196/46875]
Abstract
BACKGROUND Medical self-diagnostic tools (or symptom checkers) are becoming an integral part of digital health and of daily life, as patients increasingly use them to identify the underlying causes of their symptoms. As such, it is essential to rigorously investigate and comprehensively report the diagnostic performance of symptom checkers using standard clinical and scientific approaches. OBJECTIVE This study aims to evaluate and report the accuracies of several known and new symptom checkers using a standard and transparent methodology that allows the scientific community to cross-validate and reproduce the reported results, a step much needed in health informatics. METHODS We propose a 4-stage experimentation methodology that builds on the standard clinical vignette approach to evaluate 6 symptom checkers. To this end, we developed and peer-reviewed 400 vignettes, each approved by at least 5 of 7 independent and experienced primary care physicians. To establish a frame of reference for interpreting the results, we further compared the best-performing symptom checker against 3 primary care physicians with an average of 16.6 (SD 9.42) years of experience. To measure accuracy, we used 7 standard metrics, including M1 as a measure of a symptom checker's or a physician's ability to return a vignette's main diagnosis at the top of the differential list, F1-score as a trade-off measure between recall and precision, and Normalized Discounted Cumulative Gain (NDCG) as a measure of a differential list's ranking quality. RESULTS The diagnostic accuracies of the 6 tested symptom checkers varied substantially. For instance, the differences (ie, the ranges) in M1, F1-score, and NDCG between the best- and worst-performing symptom checkers were 65.3%, 39.2%, and 74.2%, respectively. The same held among the participating physicians, whose M1, F1-score, and NDCG ranges were 22.8%, 15.3%, and 21.3%, respectively. When compared against each other, physicians outperformed the best-performing symptom checker by an average of 1.2% on F1-score, whereas the best-performing symptom checker outperformed physicians by averages of 10.2% and 25.1% on M1 and NDCG, respectively. CONCLUSIONS The performance variation between symptom checkers is substantial, suggesting that symptom checkers cannot be treated as a single entity. Notably, the best-performing symptom checker was an artificial intelligence (AI)-based one, underscoring the promise of AI for improving the diagnostic capabilities of symptom checkers as the technology continues to advance.
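For readers unfamiliar with the ranking metric this abstract leans on, a minimal sketch of how NDCG scores a differential diagnosis list might look as follows; the relevance grades and the example list are illustrative assumptions, not data from the study:

```python
import math

def ndcg(relevances):
    """Normalized Discounted Cumulative Gain for a ranked list of
    graded relevance scores (higher grade = more relevant)."""
    def dcg(rels):
        # Position i is discounted by log2(i + 2), so rank 1 divides by 1.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical differential list: the main diagnosis (grade 2) is ranked
# second instead of first, so the list is penalized relative to the ideal.
print(round(ndcg([1, 2, 0]), 2))  # 0.86
print(ndcg([2, 1, 0]))            # 1.0 for a perfectly ordered list
```

Because the discount is logarithmic, misplacing the main diagnosis near the top costs less than burying it deep in the list, which is why NDCG is a reasonable summary of differential-list quality.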
2. Wetzel AJ, Klemmt M, Müller R, Rieger MA, Joos S, Koch R. Only the anxious ones? Identifying characteristics of symptom checker app users: a cross-sectional survey. BMC Med Inform Decis Mak 2024;24:21. [PMID: 38262993] [PMCID: PMC10804572] [DOI: 10.1186/s12911-024-02430-5]
Abstract
BACKGROUND Symptom checker applications (SCAs) may help laypeople classify their symptoms and receive recommendations on medically appropriate actions. Further research is necessary to estimate the influence of user characteristics, attitudes, and (e)health-related competencies. OBJECTIVE The objective of this study is to identify meaningful predictors of SCA use based on user characteristics. METHODS An explorative cross-sectional survey was conducted to investigate German citizens' demographics, eHealth literacy, hypochondria, self-efficacy, and affinity for technology using validated German-language questionnaires. A total of 869 participants were eligible for inclusion. The n = 67 SCA users were matched 1:1 with non-users, yielding a sample of n = 134 participants for the main analysis. A four-step analysis was conducted involving explorative predictor selection, model comparisons, and parameter estimates for selected predictors, including sensitivity and post hoc analyses. RESULTS Hypochondria and self-efficacy were identified as meaningful predictors of SCA use. Hypochondria showed a consistent and significant effect across all analyses (OR 1.24-1.26; 95% CI 1.1-1.4). Self-efficacy (OR 0.64-0.93; 95% CI 0.3-1.4) showed inconsistent and nonsignificant results, leaving its role in SCA use unclear. Over half of the SCA users in our sample met the classification for hypochondria (cut-off of 5 on the WI). CONCLUSIONS Hypochondria emerged as a significant predictor of SCA use with a consistently stable effect; yet, according to the literature, individuals with this trait may be less likely to benefit from SCAs despite their greater likelihood of using them. These users could be further unsettled by risk-averse triage and by unlikely but serious diagnosis suggestions. TRIAL REGISTRATION The study was registered in the German Clinical Trials Register (DRKS): DRKS00022465, DERR1- https://doi.org/10.2196/34026
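As a side note on the statistics reported above: an odds ratio from a logistic regression is simply the exponential of the fitted coefficient. A minimal illustration, where the coefficient value is a hypothetical chosen to land inside the reported range, not a figure from the study:

```python
import math

# Hypothetical logistic-regression coefficient for a one-point increase
# on the hypochondria scale; exp(beta) converts it to an odds ratio.
beta = 0.215
odds_ratio = math.exp(beta)
print(round(odds_ratio, 2))  # 1.24, i.e. within the 1.24-1.26 range above
```

An OR above 1 means each one-point increase on the predictor multiplies the odds of SCA use, which is why a CI excluding 1 (as for hypochondria, 1.1-1.4) counts as significant while one spanning 1 (as for self-efficacy, 0.3-1.4) does not.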
Affiliation(s)
- Anna-Jasmin Wetzel
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
- Malte Klemmt
- Institute of Applied Social Sciences, Technical University of Applied Sciences Würzburg-Schweinfurt, Tiepolostraße 6, 97070, Würzburg, Germany
- Regina Müller
- Institute for Philosophy, University of Bremen, Enrique-Schmidt-Str 7, 28359, Bremen, Germany
- Monika A Rieger
- Institute of Occupational Medicine, Social Medicine and Health Services Research, University Hospital Tübingen, Wilhelmstr 27, 72074, Tübingen, Germany
- Stefanie Joos
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
- Roland Koch
- Institute for General Practice and Interprofessional Care, University Hospital Tübingen, Osianderstr 5, 72076, Tübingen, Germany
3. Wetzel AJ, Koch R, Koch N, Klemmt M, Müller R, Preiser C, Rieger M, Rösel I, Ranisch R, Ehni HJ, Joos S. 'Better see a doctor?' Status quo of symptom checker apps in Germany: A cross-sectional survey with a mixed-methods design (CHECK.APP). Digit Health 2024;10:20552076241231555. [PMID: 38434790] [PMCID: PMC10908232] [DOI: 10.1177/20552076241231555]
Abstract
Background Symptom checker apps (SCAs) offer symptom classification and low-threshold self-triage for laypeople. They are already in use despite their poor accuracy and concerns that they may negatively affect primary care. This study assesses the extent to which SCAs are used by medical laypeople in Germany and which software is most popular. We examined associations between satisfaction with the general practitioner (GP) and SCA use, as well as between the number of GP visits and SCA use. Furthermore, we assessed the reasons for intentional non-use. Methods We conducted a survey comprising standardised and open-ended questions. Quantitative data were weighted, and open-ended responses were examined using thematic analysis. Results This study included 850 participants. The SCA usage rate was 8%, and approximately 50% of SCA non-users were uninterested in trying SCAs. The most commonly used SCAs were NetDoktor and Ada. Surprisingly, SCAs were most frequently used in the age group of 51-55 years. No significant associations were found between SCA usage and satisfaction with the GP, or between SCA usage and the number of GP visits. Thematic analysis revealed skepticism regarding the results and recommendations of SCAs and discrepancies between users' requirements and the apps' features. Conclusion SCAs are still widely unknown in the German population and have been sparsely used so far. Many participants were not interested in trying SCAs, and we found no positive or negative associations between SCA use and primary care.
Affiliation(s)
- Anna-Jasmin Wetzel
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Roland Koch
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Nadine Koch
- Institute of Software Engineering, University of Stuttgart, Stuttgart, Germany
- Malte Klemmt
- Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Regina Müller
- Institute of Philosophy, University of Bremen, Bremen, Germany
- Christine Preiser
- Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Monika Rieger
- Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Inka Rösel
- Institute of Clinical Epidemiology and Applied Biometry, University Hospital Tübingen, Tübingen, Germany
- Robert Ranisch
- Faculty of Health Sciences, University of Potsdam, Potsdam, Germany
- Hans-Jörg Ehni
- Institute of Ethics and History of Medicine, University Hospital Tübingen, Tübingen, Germany
- Stefanie Joos
- Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
4. Li BR, Wang J. Research status of internet-delivered cognitive behavioral therapy in cancer patients. World J Psychiatry 2023;13:831-837. [DOI: 10.5498/wjp.v13.i11.831]
Abstract
The latest global cancer burden data, released by the International Agency for Research on Cancer of the World Health Organization in 2020, show that there were 19.29 million new cancer cases worldwide, with 4.57 million in China, the highest of any country. The number of cancer survivors is increasing, with a 5-year survival rate exceeding 85%, but emotional disorders are common among survivors. Cognitive behavioral therapy (CBT) can improve negative emotions and has significant effects on patients. However, the number of physicians is limited and costs are high, so internet-delivered interventions have become a solution. The feasibility of web-based interventions for breast cancer patients has been proven, and research on internet-delivered CBT is increasing. The purpose of this study was to review the concept of web-based CBT and its application in cancer survivors, in order to provide a reference for researchers and to inform psychological therapy for patients.
Affiliation(s)
- Bing-Rui Li
- Operating Room, The Fourth Affiliated Hospital of China Medical University, Shenyang 110033, Liaoning Province, China
- Jing Wang
- Operating Room, The Fourth Affiliated Hospital of China Medical University, Shenyang 110033, Liaoning Province, China
5. Gellert GA, Rasławska-Socha J, Marcjasz N, Price T, Kuszczyński K, Młodawska A, Jędruch A, Orzechowski PM. How Virtual Triage Can Improve Patient Experience and Satisfaction: A Narrative Review and Look Forward. Telemed Rep 2023;4:292-306. [PMID: 37817871] [PMCID: PMC10561746] [DOI: 10.1089/tmr.2023.0037]
Abstract
Objective To review the literature on patient experience and satisfaction as it relates to the potential for virtual triage (VT), or symptom checkers, to enhance and enable improvements in these important health care delivery objectives. Methods Review and synthesis of the literature on patient experience and satisfaction, informed by emerging evidence indicating the potential for VT to favorably impact these clinical care objectives and outcomes. Results/Conclusions VT enhances potential clinical effectiveness through early detection and referral, can reduce avoidable care delivery due to late clinical presentation, and can divert primary care needs to more clinically appropriate outpatient settings rather than high-acuity emergency departments. Delivery of earlier, faster, and more acuity-appropriate care, as well as patient avoidance of excess care acuity (and associated cost), offers promise as a contributor to improved patient experience and satisfaction. The application of digital triage as a front door to health care delivery organizations offers care engagement that can reduce the need to visit a medical facility for low-acuity conditions more suitable for self-care, thus avoiding unpleasant queues and reducing microbiological and other risks associated with visits to medical facilities. VT also offers providers an opportunity to make patients' health care experiences more personalized.
6. Wiedermann CJ, Mahlknecht A, Piccoliori G, Engl A. Redesigning Primary Care: The Emergence of Artificial-Intelligence-Driven Symptom Diagnostic Tools. J Pers Med 2023;13:1379. [PMID: 37763147] [PMCID: PMC10532810] [DOI: 10.3390/jpm13091379]
Abstract
Modern healthcare is facing a juxtaposition of increasing patient demands owing to an aging population and a decreasing general practitioner workforce, leading to strained access to primary care. The coronavirus disease 2019 pandemic has emphasized the potential for alternative consultation methods, highlighting opportunities to minimize unnecessary care. This article discusses the role of artificial-intelligence-driven symptom checkers, particularly their efficiency, utility, and challenges in primary care. Based on a study conducted in Italian general practices, insights from both physicians and patients were gathered regarding this emergent technology, highlighting differences in perceived utility, user satisfaction, and potential challenges. While symptom checkers are seen as potential tools for addressing healthcare challenges, concerns regarding their accuracy and the potential for misdiagnosis persist. Patients generally viewed them positively, valuing their ease of use and the empowerment they provide in managing health. However, some general practitioners perceive these tools as challenges to their expertise. This article proposes that artificial-intelligence-based symptom checkers can optimize medical-history taking for the benefit of both general practitioners and patients, with potential enhancements in complex diagnostic tasks rather than routine diagnoses. It underscores the importance of carefully integrating digital innovations while preserving the essential human touch in healthcare. Symptom checkers offer promising solutions; ensuring their accuracy, reliability, and effective integration into primary care requires rigorous research, clinical guidance, and an understanding of varied user perceptions. Collaboration among technologists, clinicians, and patients is paramount for the successful evolution of digital tools in healthcare.
Affiliation(s)
- Christian J. Wiedermann
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Department of Public Health, Medical Decision Making and HTA, University of Health Sciences, Medical Informatics and Technology-Tyrol, 6060 Hall, Austria
- Angelika Mahlknecht
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Giuliano Piccoliori
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
- Adolf Engl
- Institute of General Practice and Public Health, Claudiana—College of Health Professions, 39100 Bolzano, Italy
7. Hurvitz N, Ilan Y. The Constrained-Disorder Principle Assists in Overcoming Significant Challenges in Digital Health: Moving from "Nice to Have" to Mandatory Systems. Clin Pract 2023;13:994-1014. [PMID: 37623270] [PMCID: PMC10453547] [DOI: 10.3390/clinpract13040089]
Abstract
The success of artificial intelligence in healthcare depends on whether it can overcome the boundaries of evidence-based medicine, the lack of supporting policies, and the resistance of medical professionals to its use. The failure of digital health to meet expectations requires rethinking some of the challenges it faces. We discuss some of the most significant challenges faced by patients, physicians, payers, pharmaceutical companies, and health systems in the digital world. The goal of healthcare systems is to improve outcomes. A tool that assists in diagnosing, collecting data, and simplifying processes is "nice to have," but it is not essential, and many such systems have yet to be shown to improve outcomes. Current outcome-based expectations and economic constraints make tools that merely assist or ease processes insufficient. Complex biological systems are defined by their inherent disorder, bounded by dynamic boundaries, as described by the constrained-disorder principle (CDP). The CDP provides a platform for correcting system malfunctions by regulating their degree of variability. A CDP-based second-generation artificial intelligence system offers solutions to some of the challenges digital health faces: therapeutic interventions delivered by these systems are held to the standard of improving outcomes. In addition to improving clinically meaningful endpoints, CDP-based second-generation algorithms support patient and physician engagement and reduce health system costs.
Affiliation(s)
- Yaron Ilan
- Hadassah Medical Center, Department of Medicine, Faculty of Medicine, Hebrew University, POB 1200, Jerusalem IL91120, Israel
8. Nov O, Singh N, Mann D. Putting ChatGPT's Medical Advice to the (Turing) Test: Survey Study. JMIR Med Educ 2023;9:e46939. [PMID: 37428540] [PMCID: PMC10366957] [DOI: 10.2196/46939]
Abstract
BACKGROUND Chatbots are being piloted to draft responses to patient questions, but patients' ability to distinguish between provider and chatbot responses, and patients' trust in chatbots' functions, are not well established. OBJECTIVE This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence-based chatbot for patient-provider communication. METHODS A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients' questions were entered into ChatGPT with a request to respond using approximately the same word count as the human provider's response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked, and financially incentivized, to correctly identify the response source. Participants were also asked about their trust in chatbots' functions in patient-provider communication, using a Likert scale from 1-5. RESULTS A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of respondents analyzed were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of cases, and human provider responses in 65.1% (1276/1960) of cases. On average, patients' trust in chatbots' functions was weakly positive (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task increased. CONCLUSIONS ChatGPT responses to patient questions were only weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care.
Affiliation(s)
- Oded Nov
- Department of Technology Management, Tandon School of Engineering, New York University, New York, NY, United States
- Nina Singh
- Department of Population Health, Grossman School of Medicine, New York University, New York, NY, United States
- Devin Mann
- Department of Population Health, Grossman School of Medicine, New York University, New York, NY, United States
- Medical Center Information Technology, Langone Health, New York University, New York, NY, United States
9. Kopka M, Scatturin L, Napierala H, Fürstenau D, Feufel MA, Balzer F, Schmieding ML. Characteristics of Users and Nonusers of Symptom Checkers in Germany: Cross-Sectional Survey Study. J Med Internet Res 2023;25:e46231. [PMID: 37338970] [DOI: 10.2196/46231]
Abstract
BACKGROUND Previous studies have revealed that users of symptom checkers (SCs, apps that support self-diagnosis and self-triage) are predominantly female, are younger than average, and have higher levels of formal education. Little data are available for Germany, and no study has so far compared usage patterns with people's awareness of SCs and their perceived usefulness. OBJECTIVE We explored the sociodemographic and individual characteristics that are associated with the awareness, usage, and perceived usefulness of SCs in the German population. METHODS We conducted a cross-sectional online survey among 1084 German residents in July 2022 regarding personal characteristics and people's awareness and usage of SCs. Using random sampling from a commercial panel, we collected participant responses stratified by gender, state of residence, income, and age to reflect the German population. We analyzed the collected data exploratively. RESULTS Of all respondents, 16.3% (177/1084) were aware of SCs and 6.5% (71/1084) had used them before. Those aware of SCs were younger (mean 38.8, SD 14.6 years, vs mean 48.3, SD 15.7 years), were more often female (107/177, 60.5%, vs 453/907, 49.9%), and had higher formal education levels (eg, 72/177, 40.7%, vs 238/907, 26.2%, with a university/college degree) than those unaware. The same observation applied to users compared to nonusers. It disappeared, however, when comparing users to nonusers who were aware of SCs. Among users, 40.8% (29/71) considered these tools useful. Those considering them useful reported higher self-efficacy (mean 4.21, SD 0.66, vs mean 3.63, SD 0.81, on a scale of 1-5) and a higher net household income (mean EUR 2591.63, SD EUR 1103.96 [mean US $2798.96, SD US $1192.28], vs mean EUR 1626.60, SD EUR 649.05 [mean US $1756.73, SD US $700.97]) than those who considered them not useful. More women considered SCs unhelpful (13/44, 29.5%) compared to men (4/26, 15.4%). CONCLUSIONS Concurring with studies from other countries, our findings show associations between sociodemographic characteristics and SC usage in a German sample: users were on average younger, of higher socioeconomic status, and more commonly female compared to nonusers. However, usage cannot be explained by sociodemographic differences alone. It rather seems that sociodemographics explain who is or is not aware of the technology, but those who are aware of SCs are equally likely to use them, independently of sociodemographic differences. Although in some groups (eg, people with anxiety disorder), more participants reported knowing and using SCs, they tended to perceive them as less useful. In other groups (eg, male participants), fewer respondents were aware of SCs, but those who used them perceived them to be more useful. Thus, SCs should be designed to fit specific user needs, and strategies should be developed to help reach individuals who could benefit but are not yet aware of SCs.
Affiliation(s)
- Marvin Kopka
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Lennart Scatturin
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Hendrik Napierala
- Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Daniel Fürstenau
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Department of Business IT, IT University of Copenhagen, København, Denmark
- Markus A Feufel
- Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Malte L Schmieding
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
10. Exploratory study: Evaluation of a symptom checker effectiveness for providing a diagnosis and evaluating the situation emergency compared to emergency physicians using simulated and standardized patients. PLoS One 2023;18:e0277568. [PMID: 36827277] [PMCID: PMC9955603] [DOI: 10.1371/journal.pone.0277568]
Abstract
BACKGROUND The overloading of health care systems is an international problem. In this context, new tools such as symptom checkers (SCs) are emerging to improve patient orientation and triage. These SCs should be rigorously evaluated, and we can take a cue from the way we evaluate medical students: objective structured clinical examinations (OSCEs) with simulated patients. OBJECTIVE The main objective of this study was to evaluate the performance of a symptom checker against emergency physicians using OSCEs as the assessment method. METHODS We explored a simulation-based method to evaluate the ability to establish a diagnosis and assess the urgency of a situation. A panel of medical experts wrote 220 simulated patient cases. Each case was played twice by an actor trained for the role: once for the SC, then for an emergency physician. As in a teleconsultation, only the patient's voice was accessible. We performed a prospective non-inferiority study; if the primary analysis failed to detect non-inferiority, a superiority analysis was planned. RESULTS The SC established only 30% of the main diagnoses, whereas the emergency physicians found 81%. The emergency physicians were also superior to the SC in suggesting secondary diagnoses (92% versus 52%). For patient triage (vital emergency or not), the physicians again proved superior (96% versus 71%). The SC was non-inferior to the physicians in terms of interview time. CONCLUSIONS AND RELEVANCE Simulated patients, rather than written clinical cases, should be used to evaluate the effectiveness of SCs.
11. Judson TJ, Pierce L, Tutman A, Mourad M, Neinstein AB, Shuler G, Gonzales R, Odisho AY. Utilization patterns and efficiency gains from use of a fully EHR-integrated COVID-19 self-triage and self-scheduling tool: a retrospective analysis. J Am Med Inform Assoc 2022;29:2066-2074. [PMID: 36029243] [PMCID: PMC9667153] [DOI: 10.1093/jamia/ocac161]
Abstract
OBJECTIVE Symptom checkers can help address high demand for SARS-CoV-2 (COVID-19) testing and care by providing patients with self-service access to triage recommendations. However, health systems may be hesitant to invest in these tools, as their associated efficiency gains have not been studied. We aimed to quantify the operational efficiency gains associated with use of an online COVID-19 symptom checker as an alternative to a telephone hotline. METHODS In our health system, ambulatory patients can use either an online symptom checker or a telephone hotline to be triaged and connected to COVID-19 care. We performed a retrospective analysis of adults who used either method between October 20, 2021 and January 10, 2022, using call logs, electronic health record data, and local wages to calculate labor costs. RESULTS Of the 15,549 total COVID-19 triage encounters, 1820 (11.7%) used only the telephone hotline and 13,729 (88.3%) used the symptom checker. Only 271 (2%) of the patients who used the symptom checker also called the hotline. Hotline encounters required more clinician time than symptom-checker encounters (17.8 vs 0.4 min/encounter), resulting in higher average labor costs ($24.21 vs $0.55 per encounter). The symptom checker saved over 4200 clinician labor hours. CONCLUSION When given the option, most patients completed COVID-19 triage and visit scheduling online, resulting in substantial efficiency gains. These benefits may encourage health system investment in such tools.
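The efficiency arithmetic in this abstract can be sketched directly from the per-encounter minutes it reports. The hourly wage below is an assumed illustrative figure, not a study number, and this simplification lands slightly under the ~4200 hours the authors report, presumably because it omits call-handling details beyond clinician minutes:

```python
HOTLINE_MIN = 17.8      # clinician minutes per hotline encounter (from abstract)
CHECKER_MIN = 0.4       # clinician minutes per symptom-checker encounter
SC_ENCOUNTERS = 13_729  # symptom-checker encounters (from abstract)
WAGE_PER_HOUR = 80.0    # assumed blended clinician wage, not a study figure

# Clinician time avoided by routing these encounters through the checker
saved_minutes = SC_ENCOUNTERS * (HOTLINE_MIN - CHECKER_MIN)
saved_hours = saved_minutes / 60
saved_dollars = saved_hours * WAGE_PER_HOUR

print(f"{saved_hours:.0f} clinician hours avoided")   # 3981 under these inputs
print(f"${saved_dollars:,.0f} in labor cost avoided")
```

The point of the sketch is that the gain scales with both the per-encounter time gap and the share of patients who self-serve, which is why the 88.3% online uptake matters as much as the 17.4-minute difference.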
Affiliation(s)
- Timothy J Judson
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Logan Pierce
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Avi Tutman
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Michelle Mourad
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Aaron B Neinstein
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
- Gina Shuler
  - Office of Population Health, University of California San Francisco, San Francisco, California, USA
- Ralph Gonzales
  - Department of Medicine, University of California San Francisco, San Francisco, California, USA
  - Clinical Innovation Center, University of California San Francisco, San Francisco, California, USA
- Anobel Y Odisho
  - Center for Digital Health Innovation, University of California San Francisco, San Francisco, California, USA
  - Department of Urology, University of California San Francisco, San Francisco, California, USA
12
Gilbert A, Diep AN, Boufraioua M, Pétré B, Donneau AF, Ghuysen A. Patients' self-triage for unscheduled urgent care: a preliminary study on the accuracy and factors affecting the performance of a Belgian self-triage platform. BMC Health Serv Res 2022; 22:1199. [PMID: 36151563 PMCID: PMC9508742 DOI: 10.1186/s12913-022-08571-5]
Abstract
Background Management of unscheduled urgent care is a complex concern for many healthcare providers. Facing the challenge of appropriately dispatching unscheduled care, primary and emergency physicians have collaboratively implemented innovative strategies such as telephone triage. Currently, new original solutions tend to emerge with the development of new technologies. We created an interactive patient self-triage platform, ODISSEE, and aimed to explore its accuracy and potential factors affecting its performance using clinical case scenarios. Methods The ODISSEE platform was developed based on previously validated triage protocols for out-of-hours primary care. ODISSEE is composed of 18 icons leading to algorithmic questions that finally provide an advised orientation (emergency or primary care services). To investigate ODISSEE's performance, we used 100 clinical case scenarios, each associated with a preestablished orientation determined by a group of experts. Fifteen volunteers were asked to self-triage with 50 randomly selected scenarios using ODISSEE on a digital tablet. Their triage results were compared with the experts' references. Results The 15 participants performed a total of 750 self-triages, which matched the experts' references regarding the level of care in 85.6% of the cases. The orientation was incorrect in 14.4%, with an undertriage rate of 1.9% and an overtriage rate of 12.5%. The tool's specificity and sensitivity to advise participants on the appropriate level of care were 69% (95% CI 64-74) and 97% (95% CI 95-98), respectively. When combined with advice on the level of urgency, the tool only found the correct orientation in 68.4% of cases, with 9.2% undertriages and 22.4% overtriages. Some participant characteristics and the types of medical conditions demonstrated a significant association with the tool's performance.
Conclusion Self-triage apps, such as the ODISSEE platform, could represent an innovative method to allow patients to self-triage to the most appropriate level of care. This study based on clinical vignettes highlights some positive arguments regarding ODISSEE safety, but further research is needed to assess the generalizability of such tools to the population without equity issues. Supplementary Information The online version contains supplementary material available at 10.1186/s12913-022-08571-5.
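The sensitivity and specificity reported above follow the standard binary confusion-matrix definitions, with "emergency" as the positive class, undertriage as a false negative, and overtriage as a false positive. A minimal sketch of the computation, using hypothetical cell counts chosen only to reproduce the reported 97% sensitivity and 69% specificity (the study does not publish the raw 2x2 counts):

```python
def triage_metrics(tp, fn, fp, tn):
    """Sensitivity and specificity for a binary triage decision.

    tp: emergency cases correctly routed to emergency care
    fn: emergency cases routed to primary care (undertriage)
    fp: primary-care cases routed to emergency care (overtriage)
    tn: primary-care cases correctly routed to primary care
    """
    sensitivity = tp / (tp + fn)  # share of true emergencies detected
    specificity = tn / (tn + fp)  # share of non-emergencies correctly routed
    return sensitivity, specificity

# Hypothetical counts (not from the study), scaled to yield the
# reported point estimates:
sens, spec = triage_metrics(tp=97, fn=3, fp=31, tn=69)
print(sens, spec)  # 0.97 0.69
```

Note that a high sensitivity with modest specificity, as here, is the typical signature of a deliberately risk-averse triage tool: it rarely misses emergencies at the cost of frequent overtriage.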
Affiliation(s)
- Allison Gilbert
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
- Anh Nguyet Diep
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
  - Biostatistics Unit, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Maryame Boufraioua
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
- Benoit Pétré
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Anne-Françoise Donneau
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
  - Biostatistics Unit, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
- Alexandre Ghuysen
  - Emergency Department, University Hospital Center, Avenue de L'Hôpital 1, 4000, Liège, Belgium
  - Public Health Department, University of Liège, Quartier Hôpital, Av. Hippocrate 13, CHU B23, 4000, Liège, Belgium
13
Fraser HSF, Cohan G, Koehler C, Anderson J, Lawrence A, Pateña J, Bacher I, Ranney ML. Evaluation of Diagnostic and Triage Accuracy and Usability of a Symptom Checker in an Emergency Department: Observational Study. JMIR Mhealth Uhealth 2022; 10:e38364. [PMID: 36121688 PMCID: PMC9531004 DOI: 10.2196/38364]
Abstract
Background Symptom checkers are clinical decision support apps for patients, used by tens of millions of people annually. They are designed to provide diagnostic and triage advice and assist users in seeking the appropriate level of care. Little evidence is available regarding their diagnostic and triage accuracy with direct use by patients for urgent conditions. Objective The aim of this study is to determine the diagnostic and triage accuracy and usability of a symptom checker in use by patients presenting to an emergency department (ED). Methods We recruited a convenience sample of English-speaking patients presenting for care in an urban ED. Each consenting patient used a leading symptom checker from Ada Health before the ED evaluation. Diagnostic accuracy was evaluated by comparing the symptom checker’s diagnoses and those of 3 independent emergency physicians viewing the patient-entered symptom data, with the final diagnoses from the ED evaluation. The Ada diagnoses and triage were also critiqued by the independent physicians. The patients completed a usability survey based on the Technology Acceptance Model. Results A total of 40 (80%) of the 50 participants approached completed the symptom checker assessment and usability survey. Their mean age was 39.3 (SD 15.9; range 18-76) years, and they were 65% (26/40) female, 68% (27/40) White, 48% (19/40) Hispanic or Latino, and 13% (5/40) Black or African American. Some cases had missing data or a lack of a clear ED diagnosis; 75% (30/40) were included in the analysis of diagnosis, and 93% (37/40) for triage. The sensitivity for at least one of the final ED diagnoses by Ada (based on its top 5 diagnoses) was 70% (95% CI 54%-86%), close to the mean sensitivity for the 3 physicians (on their top 3 diagnoses) of 68.9%. The physicians rated the Ada triage decisions as 62% (23/37) fully agree and 24% (9/37) safe but too cautious. 
It was rated as unsafe and too risky in 22% (8/37) of cases by at least one physician, in 14% (5/37) of cases by at least two physicians, and in 5% (2/37) of cases by all 3 physicians. Usability was rated highly; participants agreed or strongly agreed with the 7 Technology Acceptance Model usability questions with a mean score of 84.6%, although “satisfaction” and “enjoyment” were rated low. Conclusions This study provides preliminary evidence that a symptom checker can provide acceptable usability and diagnostic accuracy for patients with various urgent conditions. A total of 14% (5/37) of symptom checker triage recommendations were deemed unsafe and too risky by at least two physicians based on the symptoms recorded, similar to the results of studies on telephone and nurse triage. Larger studies are needed of diagnosis and triage performance with direct patient use in different clinical environments.
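The diagnostic sensitivity used above ("at least one of the final ED diagnoses within the top 5 suggestions") is a top-k match rate. A minimal sketch of that computation on toy data, assuming exact string matching between suggestions and final diagnoses (the study itself relied on physician adjudication, not string comparison):

```python
def top_k_match_rate(cases, k):
    """Fraction of cases in which any of the top-k suggested diagnoses
    appears among the final ED diagnoses."""
    hits = sum(
        1
        for suggestions, final in cases
        if any(s in final for s in suggestions[:k])
    )
    return hits / len(cases)

# Toy cases (not study data): (ranked suggestions, final ED diagnoses)
cases = [
    (["migraine", "tension headache"], {"migraine"}),        # hit
    (["appendicitis", "gastroenteritis"], {"renal colic"}),  # miss
    (["asthma", "pneumonia", "bronchitis"], {"pneumonia"}),  # hit
]
rate = top_k_match_rate(cases, k=5)  # 2 of 3 cases matched
```

Because a larger k gives the checker more chances to match, top-5 figures for an app are not directly comparable to top-3 figures for physicians, which the study acknowledges by reporting both.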
Affiliation(s)
- Hamish S F Fraser
  - Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
  - School of Public Health, Brown University, Providence, RI, United States
- Gregory Cohan
  - Warren Alpert Medical School, Brown University, Providence, RI, United States
- Christopher Koehler
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
- Jared Anderson
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
- Alexis Lawrence
  - Harvard Medical Faculty Physicians, Department of Emergency Medicine, St Luke's Hospital, New Bedford, MA, United States
- John Pateña
  - Brown-Lifespan Center for Digital Health, Providence, RI, United States
- Ian Bacher
  - Brown Center for Biomedical Informatics, Warren Alpert Medical School, Brown University, Providence, RI, United States
- Megan L Ranney
  - School of Public Health, Brown University, Providence, RI, United States
  - Department of Emergency Medicine, Brown University, Providence, RI, United States
  - Brown-Lifespan Center for Digital Health, Providence, RI, United States
14
Liu AW, Odisho AY, Brown Rd W, Gonzales R, Neinstein AB, Judson T. Patient Experience and Feedback after Use of an EHR-integrated COVID-19 Symptom Checker. JMIR Hum Factors 2022; 9:e40064. [PMID: 35960593 PMCID: PMC9472505 DOI: 10.2196/40064]
Abstract
BACKGROUND Symptom checkers have been widely used during the COVID-19 pandemic to alleviate strain on health systems and offer patients a 24/7 self-service triage option. Although studies suggest that users may positively perceive online symptom checkers, no studies have quantified user feedback after use of an electronic health record (EHR)-integrated COVID-19 symptom checker with self-scheduling functionality. OBJECTIVE We aimed to understand user experience, user satisfaction, and user-reported alternatives to use of a COVID-19 symptom checker with self-triage and self-scheduling functionality. METHODS We launched a patient-portal based self-triage and self-scheduling tool in March 2020 for patients with COVID-19 symptoms, exposures, or questions. We made an optional, anonymous Qualtrics survey available to patients immediately after they completed the symptom checker. RESULTS Between December 16th, 2021 and March 28th, 2022, there were 395 unique responses to the survey. Overall, respondents reported high satisfaction across all demographics, with a median rating of 8 out of 10, and 47.6% of respondents giving a rating of 9 or 10 out of 10. User satisfaction scores were not associated with any demographic factors. The most common user-reported alternatives had the online tool not been available were calling the COVID-19 telephone hotline and sending a patient-portal message to their physician for advice. The ability to schedule a test online was the most important symptom checker feature for respondents. The most common categories of user feedback were regarding other COVID-19 services (e.g. telephone hotline), policies or procedures, or requesting additional features or functionality. CONCLUSIONS This analysis suggests that COVID-19 symptom checkers with self-triage and self-scheduling functionality may have high overall user satisfaction, regardless of user demographics. 
By allowing users to self-triage and self-schedule tests and visits, tools like this may prevent unnecessary calls and messages to clinicians. Individual feedback suggested that the user experience for this type of tool is highly dependent on the organization's operational workflows for COVID-19 testing and care. The study provides insight for the implementation and improvement of COVID-19 symptom checkers to ensure high user satisfaction.
Affiliation(s)
- Andrew Wayne Liu
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
- Anobel Youhana Odisho
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Urology, University of California, San Francisco, San Francisco, US
- William Brown Rd
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, US
- Ralph Gonzales
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Clinical Innovation Center, University of California, San Francisco, San Francisco, US
- Aaron B Neinstein
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
- Timothy Judson
  - Center for Digital Health Innovation, University of California, San Francisco, 1700 Owens St, Suite 541, San Francisco, US
  - Department of Medicine, University of California, San Francisco, 521 Parnassus Avenue, Suite U127, Box 0131, San Francisco, US
  - Office of Population Health, University of California, San Francisco, San Francisco, US
15
Wetzel AJ, Koch R, Preiser C, Müller R, Klemmt M, Ranisch R, Ehni HJ, Wiesing U, Rieger MA, Henking T, Joos S. Ethical, Legal, and Social Implications of Symptom Checker Apps in Primary Health Care (CHECK.APP): Protocol for an Interdisciplinary Mixed Methods Study. JMIR Res Protoc 2022; 11:e34026. [PMID: 35576570 PMCID: PMC9152714 DOI: 10.2196/34026]
Abstract
Background Symptom checker apps (SCAs) are accessible tools that provide early symptom assessment for users. The ethical, legal, and social implications of SCAs and their impact on the patient-physician relationship, the health care providers, and the health care system have sparsely been examined. This study protocol describes an approach to investigate the possible impacts and implications of SCAs on different levels of health care provision. It considers the perspectives of the users, nonusers, general practitioners (GPs), and health care experts. Objective We aim to assess a comprehensive overview of the use of SCAs and address problematic issues, if any. The primary outcomes of this study are empirically informed multi-perspective recommendations for different stakeholders on the ethical, legal, and social implications of SCAs. Methods Quantitative and qualitative methods will be used in several overlapping and interconnected study phases. In study phase 1, a comprehensive literature review will be conducted to assess the ethical, legal, social, and systemic impacts of SCAs. Study phase 2 comprises a survey that will be analyzed with a logistic regression. It aims to assess the degree of SCA use in Germany as well as the predictors of SCA usage. Study phase 3 will investigate self-observational diaries and user interviews, which will be analyzed as integrated cases to assess user perspectives, usage patterns, and arising problems. Study phase 4 will comprise GP interviews to assess their experiences, perspectives, self-image, and concepts and will be analyzed with the basic procedure by Kruse. Moreover, interviews with health care experts will be conducted in study phase 3 and will be analyzed using the reflexive thematic analysis approach of Braun and Clarke. Results Study phase 1 will be completed in November 2021. We expect the results of study phase 2 between December 2021 and February 2022. In study phase 3, interviews are currently being conducted.
The final study endpoint will be in February 2023. Conclusions The possible ethical, legal, social, and systemic impacts of a widespread use of SCAs that affect stakeholders and stakeholder groups on different levels of health care will be identified. The proposed methodological approach provides a multifaceted and diverse empirical basis for a broad discussion on these implications. Trial Registration German Clinical Trials Register (DRKS) DRKS00022465; https://tinyurl.com/yx53er67 International Registered Report Identifier (IRRID) DERR1-10.2196/34026
Affiliation(s)
- Anna-Jasmin Wetzel
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Roland Koch
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
- Christine Preiser
  - Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Regina Müller
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Malte Klemmt
  - Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Robert Ranisch
  - Faculty of Health Science Brandenburg, University of Potsdam, Potsdam, Germany
- Hans-Jörg Ehni
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Urban Wiesing
  - Institute of Ethics and History of Medicine, University Tübingen, Tübingen, Germany
- Monika A Rieger
  - Institute of Occupational and Social Medicine and Health Services Research, University Hospital Tübingen, Tübingen, Germany
- Tanja Henking
  - Institute of Applied Social Science, University of Applied Science Würzburg-Schweinfurt, Würzburg, Germany
- Stefanie Joos
  - Institute of General Practice and Interprofessional Care, University Hospital Tübingen, Tübingen, Germany
16
Schmieding ML, Kopka M, Schmidt K, Schulz-Niethammer S, Balzer F, Feufel MA. Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation. J Med Internet Res 2022; 24:e31810. [PMID: 35536633 PMCID: PMC9131144 DOI: 10.2196/31810]
Abstract
Background Symptom checkers are digital tools assisting laypersons in self-assessing the urgency and potential causes of their medical complaints. They are widely used but face concerns from both patients and health care professionals, especially regarding their accuracy. A 2015 landmark study substantiated these concerns using case vignettes to demonstrate that symptom checkers commonly err in their triage assessment. Objective This study aims to revisit the landmark index study to investigate whether and how symptom checkers’ capabilities have evolved since 2015 and how they currently compare with laypersons’ stand-alone triage appraisal. Methods In early 2020, we searched for smartphone and web-based applications providing triage advice. We evaluated these apps on the same 45 case vignettes as the index study. Using descriptive statistics, we compared our findings with those of the index study and with publicly available data on laypersons’ triage capability. Results We retrieved 22 symptom checkers providing triage advice. The median triage accuracy in 2020 (55.8%, IQR 15.1%) was close to that in 2015 (59.1%, IQR 15.5%). The apps in 2020 were less risk averse (odds 1.11:1, the ratio of overtriage errors to undertriage errors) than those in 2015 (odds 2.82:1), missing >40% of emergencies. Few apps outperformed laypersons in either deciding whether emergency care was required or whether self-care was sufficient. No apps outperformed the laypersons on both decisions. Conclusions Triage performance of symptom checkers has, on average, not improved over the course of 5 years. It decreased in 2 use cases (advice on when emergency care is required and when no health care is needed for the moment). However, triage capability varies widely within the sample of symptom checkers. Whether it is beneficial to seek advice from symptom checkers depends on the app chosen and on the specific question to be answered. 
Future research should develop resources (eg, case vignette repositories) to audit the capabilities of symptom checkers continuously and independently and provide guidance on when and to whom they should be recommended.
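The risk-aversion odds quoted above (1.11:1 in 2020 vs 2.82:1 in 2015) are simply the ratio of overtriage errors to undertriage errors. A minimal sketch, with hypothetical error counts chosen only to reproduce the two reported ratios (the actual per-app counts are in the study itself):

```python
def risk_aversion_odds(overtriage_errors, undertriage_errors):
    """Ratio of overtriage to undertriage errors; values above 1
    indicate a cautious (risk-averse) triage tool."""
    return overtriage_errors / undertriage_errors

# Hypothetical error counts (not from the study):
odds_2020 = risk_aversion_odds(100, 90)  # roughly 1.11:1
odds_2015 = risk_aversion_odds(141, 50)  # 2.82:1
```

An odds value near 1 means the apps erred in both directions about equally often, which, given that undertriage misses emergencies, explains the study's concern about the >40% of missed emergencies.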
Affiliation(s)
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Konrad Schmidt
  - Institute of General Practice and Family Medicine, Jena University Hospital, Jena, Germany
  - Institute of General Practice and Family Medicine, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sven Schulz-Niethammer
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
17
Kopka M, Feufel MA, Balzer F, Schmieding ML. Triage Capability of Laypersons: Retrospective, Exploratory Analysis. JMIR Form Res 2022; 6:e38977. [PMID: 36222793 PMCID: PMC9607917 DOI: 10.2196/38977]
Abstract
Background Although medical decision-making may be thought of as a task involving health professionals, many decisions, including critical health-related decisions, are made by laypersons alone. Specifically, as the first step in most care episodes, it is the patient who determines whether and where to seek health care (triage). Overcautious self-assessments (ie, overtriaging) may lead to overutilization of health care facilities and overcrowded emergency departments, whereas imprudent decisions (ie, undertriaging) constitute a risk to the patient's health. Recently, patient-facing decision support systems, commonly known as symptom checkers, have been developed to assist laypersons in these decisions. Objective The purpose of this study is to identify factors influencing laypersons' ability to self-triage and their risk averseness in self-triage decisions. Methods We analyzed publicly available data on 91 laypersons appraising 45 short fictitious patient descriptions (case vignettes; N=4095 appraisals). Using signal detection theory and descriptive and inferential statistics, we explored whether the type of medical decision laypersons face, their confidence in their decision, and sociodemographic factors influence their triage accuracy and the type of errors they make. We distinguished between 2 decisions: whether emergency care was required (decision 1) and whether self-care was sufficient (decision 2). Results The accuracy of detecting emergencies (decision 1) was higher (mean 82.2%, SD 5.9%) than that of deciding whether any type of medical care is required (decision 2; mean 75.9%, SD 5.25%; t(90)=8.4; P<.001; Cohen d=0.9). Sensitivity for decision 1 was lower (mean 67.5%, SD 16.4%) than its specificity (mean 89.6%, SD 8.6%), whereas sensitivity for decision 2 was higher (mean 90.5%, SD 8.3%) than its specificity (mean 46.7%, SD 15.95%).
Female participants were more risk averse and overtriaged more often than male participants, but age and level of education showed no association with participants' risk averseness. Participants' triage accuracy was higher when they were certain about their appraisal (2114/3381, 62.5%) than when they were uncertain (378/714, 52.9%). However, most errors occurred when participants were certain of their decision (1267/1603, 79%). Participants were more commonly certain of their overtriage errors (mean 80.9%, SD 23.8%) than of their undertriage errors (mean 72.5%, SD 30.9%; t(89)=3.7; P<.001; d=0.39). Conclusions Our study suggests that laypersons are overcautious in deciding whether they require medical care at all, but they miss identifying a considerable portion of emergencies. Our results further indicate that women are more risk averse than men in both types of decisions. Layperson participants made most triage errors when they were certain of their own appraisal. Thus, they might not follow or even seek advice (eg, from symptom checkers) in most instances where advice would be useful.
Affiliation(s)
- Marvin Kopka
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Markus A Feufel
  - Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Malte L Schmieding
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
18
Millen E, Salim N, Azadzoy H, Bane MM, O'Donnell L, Schmude M, Bode P, Tuerk E, Vaidya R, Gilbert SH. Study protocol for a pilot prospective, observational study investigating the condition suggestion and urgency advice accuracy of a symptom assessment app in sub-Saharan Africa: the AFYA-'Health' Study. BMJ Open 2022; 12:e055915. [PMID: 35410928 PMCID: PMC9003603 DOI: 10.1136/bmjopen-2021-055915]
Abstract
INTRODUCTION Due to a global shortage of healthcare workers, there is a lack of basic healthcare for 4 billion people worldwide, particularly affecting low-income and middle-income countries. The utilisation of AI-based healthcare tools such as symptom assessment applications (SAAs) has the potential to reduce the burden on healthcare systems. The purpose of the AFYA Study (AI-based Assessment oF health sYmptoms in TAnzania) is to evaluate the accuracy of the condition suggestions and urgency advice provided to users by a Swahili-language Ada SAA. METHODS AND ANALYSIS This study is designed as an observational prospective clinical study. The setting is a waiting room of a Tanzanian district hospital. It will include patients entering the outpatient clinic, across a range of conditions and age groups, including children and adolescents. Patients will be asked to use the SAA before proceeding to usual care. After usual care, they will have a consultation with a study-provided physician. Patients and healthcare practitioners will be blinded to the SAA's results. An expert panel will compare the Ada SAA's condition suggestions and urgency advice to the usual-care and study-provided differential diagnoses and triage. The primary outcome measures are the accuracy and comprehensiveness of the Ada SAA evaluated against the gold standard differential diagnoses. ETHICS AND DISSEMINATION Ethical approval was received from the ethics committee (EC) of Muhimbili University of Health and Allied Sciences with approval number MUHAS-REC-09-2019-044 and from the National Institute for Medical Research, NIMR/HQ/R.8c/Vol. I/922. All amendments to the protocol are reported and adapted on the basis of the requirements of the EC. The results from this study will be submitted to peer-reviewed journals and local and international stakeholders, and will be communicated in editorials/articles by Ada Health. TRIAL REGISTRATION NUMBER NCT04958577.
Affiliation(s)
- Nahya Salim
  - Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Mustafa Miraji Bane
  - Muhimbili University of Health and Allied Sciences, Dar es Salaam, United Republic of Tanzania
- Stephen Henry Gilbert
  - Ada Health GmbH, Berlin, Germany
  - EKFZ for Digital Health, Technische Universität Dresden, Dresden, Germany
19
Hennemann S, Kuhn S, Witthöft M, Jungmann SM. Diagnostic Performance of an App-Based Symptom Checker in Mental Disorders: Comparative Study in Psychotherapy Outpatients. JMIR Ment Health 2022; 9:e32832. [PMID: 35099395 PMCID: PMC8844983 DOI: 10.2196/32832]
Abstract
BACKGROUND Digital technologies have become a common starting point for health-related information-seeking. Web- or app-based symptom checkers aim to provide rapid and accurate condition suggestions and triage advice but have not yet been investigated for mental disorders in routine health care settings. OBJECTIVE This study aims to test the diagnostic performance of a widely available symptom checker in the context of formal diagnosis of mental disorders when compared with therapists' diagnoses based on structured clinical interviews. METHODS Adult patients from an outpatient psychotherapy clinic used the app-based symptom checker Ada-check your health (ADA; Ada Health GmbH) at intake. Accuracy was assessed as the agreement of the first and 1 of the first 5 condition suggestions of ADA with at least one of the interview-based therapist diagnoses. In addition, sensitivity, specificity, and interrater reliabilities (Gwet first-order agreement coefficient [AC1]) were calculated for the 3 most prevalent disorder categories. Self-reported usability (assessed using the System Usability Scale) and acceptance of ADA (assessed using an adapted feedback questionnaire) were evaluated. RESULTS A total of 49 patients (30/49, 61% women; mean age 33.41, SD 12.79 years) were included in this study. Across all patients, the interview-based diagnoses matched ADA's first condition suggestion in 51% (25/49; 95% CI 37.5-64.4) of cases and 1 of the first 5 condition suggestions in 69% (34/49; 95% CI 55.4-80.6) of cases. Within the main disorder categories, the accuracy of ADA's first condition suggestion was 0.82 for somatoform and associated disorders, 0.65 for affective disorders, and 0.53 for anxiety disorders. Interrater reliabilities ranged from low (AC1=0.15 for anxiety disorders) to good (AC1=0.76 for somatoform and associated disorders). The usability of ADA was rated as high in the System Usability Scale (mean 81.51, SD 11.82, score range 0-100). 
Approximately 71% (35/49) of participants would have preferred a face-to-face diagnostic assessment over an app-based one. CONCLUSIONS Overall, our findings suggest that a widely available symptom checker used in the formal diagnosis of mental disorders could provide clinicians with a list of condition suggestions with moderate-to-good accuracy. However, diagnostic performance was heterogeneous between disorder categories and included low interrater reliability. Although symptom checkers have some potential to complement the diagnostic process as a screening tool, their diagnostic performance should be tested in larger samples and in comparison with further diagnostic instruments.
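The interrater reliabilities reported above are Gwet first-order agreement coefficients (AC1). For the two-rater binary case (e.g. app and therapist each coding a disorder category as present or absent), AC1 reduces to a short formula; the sketch below is illustrative only, and the function name and example data are not from the study.

```python
def gwet_ac1(ratings_a, ratings_b):
    """Gwet's first-order agreement coefficient (AC1) for two raters
    and a binary (0/1) coding, e.g. disorder category present/absent."""
    n = len(ratings_a)
    # Observed agreement: proportion of cases both raters coded identically.
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Mean marginal probability of the positive category across both raters.
    pi = (sum(ratings_a) + sum(ratings_b)) / (2 * n)
    # Chance agreement under Gwet's model, binary case.
    pe = 2 * pi * (1 - pi)
    return (pa - pe) / (1 - pe)
```

Unlike Cohen's kappa, AC1 remains stable when category prevalence is very high or very low, which is why it suits skewed diagnostic samples like this one.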
Affiliation(s)
- Severin Hennemann
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
- Sebastian Kuhn
- Department of Digital Medicine, Medical Faculty OWL, Bielefeld University, Bielefeld, Germany
- Michael Witthöft
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
- Stefanie M Jungmann
- Department of Clinical Psychology, Psychotherapy and Experimental Psychopathology, University of Mainz, Mainz, Germany
20
Arellano Carmona K, Chittamuru D, Kravitz RL, Ramondt S, Ramírez AS. Beyond Dr. Google: Health information seeking from an intelligent online symptom checker: Cross-Sectional Questionnaire Study. J Med Internet Res 2022; 24:e36322. [PMID: 35984690 PMCID: PMC9440406 DOI: 10.2196/36322]
Abstract
Background The ever-growing amount of health information available on the web is increasing the demand for tools providing personalized and actionable health information. Such tools include symptom checkers that provide users with a potential diagnosis after they respond to a set of probes about their symptoms. Although the potential for their utility is great, little is known about such tools’ actual use and effects. Objective We aimed to understand who uses a web-based artificial intelligence–powered symptom checker and for what purposes, how they evaluate the experience of the web-based interview and quality of the information, what they intend to do with the recommendation, and predictors of future use. Methods A cross-sectional survey of web-based health information seekers was conducted following the completion of a symptom checker visit (N=2437). Measures of comprehensibility, confidence, usefulness, health-related anxiety, empowerment, and intention to use in the future were assessed. ANOVAs and the Wilcoxon rank sum test examined mean outcome differences in racial, ethnic, and sex groups. The relationship between perceptions of the symptom checker and intention to follow recommended actions was assessed using multilevel logistic regression. Results Buoy users were well-educated (1384/1704, 81.22% college or higher), primarily White (1227/1693, 72.47%), and female (2069/2437, 84.89%). Most had insurance (1449/1630, 88.89%), a regular health care provider (1307/1709, 76.48%), and reported good health (1000/1703, 58.72%). Three types of symptoms—pain (855/2437, 35.08%), gynecological issues (293/2437, 12.02%), and masses or lumps (204/2437, 8.37%)—accounted for more than half (1352/2437, 55.48%) of site visits. Buoy’s top three primary recommendations split across less-serious triage categories: primary care physician in 2 weeks (754/2141, 35.22%), self-treatment (452/2141, 21.11%), and primary care in 1 to 2 days (373/2141, 17.42%).
Common diagnoses were musculoskeletal (303/2437, 12.43%), gynecological (304/2437, 12.47%) and skin conditions (297/2437, 12.19%), and infectious diseases (300/2437, 12.31%). Users generally reported high confidence in Buoy, found it useful and easy to understand, and said that Buoy made them feel less anxious and more empowered to seek medical help. Users for whom Buoy recommended “Waiting/Watching” or “Self-Treatment” had strongest intentions to comply, whereas those advised to seek primary care had weaker intentions. Compared with White users, Latino and Black users had significantly more confidence in Buoy (P<.05), and the former also found it significantly more useful (P<.05). Latino (odds ratio 1.96, 95% CI 1.22-3.25) and Black (odds ratio 2.37, 95% CI 1.57-3.66) users also had stronger intentions to discuss recommendations with a provider than White users. Conclusions Results demonstrate the potential utility of a web-based health information tool to empower people to seek care and reduce health-related anxiety. However, despite encouraging results suggesting the tool may fulfill unmet health information needs among women and Black and Latino adults, analyses of the user base illustrate persistent second-level digital divide effects.
Affiliation(s)
- Deepti Chittamuru
- School of Social Sciences, Humanities and Arts, University of California, Merced, CA, United States
- Steven Ramondt
- Department of Donor Medicine Research, Sanquin Research, Amsterdam, Netherlands
- Department of Communication Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- A Susana Ramírez
- School of Social Sciences, Humanities and Arts, University of California, Merced, CA, United States
21
Chan F, Lai S, Pieterman M, Richardson L, Singh A, Peters J, Toy A, Piccininni C, Rouault T, Wong K, Quong JK, Wakabayashi AT, Pawelec-Brzychczy A. Performance of a new symptom checker in patient triage: Canadian cohort study. PLoS One 2021; 16:e0260696. [PMID: 34852016 PMCID: PMC8635379 DOI: 10.1371/journal.pone.0260696]
Abstract
BACKGROUND Computerized algorithms known as symptom checkers aim to help patients decide what to do should they have a new medical concern. However, despite widespread implementation, most studies on symptom checkers have involved simulated patients. Only limited evidence currently exists about symptom checker safety or accuracy when used by real patients. We developed a new prototype symptom checker and assessed its safety and accuracy in a prospective cohort of patients presenting to primary care and emergency departments with new medical concerns. METHODS A prospective cohort study was conducted to assess the prototype's performance. The cohort consisted of adult patients (≥16 years old) who presented to hospital emergency departments and family physician clinics. Primary outcomes were safety and accuracy of triage recommendations to seek hospital care, seek primary care, or manage symptoms at home. RESULTS Data from 281 hospital patients and 300 clinic patients were collected and analyzed. Sensitivity to emergencies was 100% (10/10 encounters). Sensitivity to urgencies was 90% (73/81) and 97% (34/35) for hospital and primary care patients, respectively. The prototype was significantly more accurate than patients at triage (73% versus 58%, p<0.01). Compliance with triage recommendations in this cohort using this iteration of the symptom checker would have reduced hospital visits by 55% but could have caused harm in 2-3% of patients because of delayed care. INTERPRETATION The prototype symptom checker was superior to patients in deciding the most appropriate treatment setting for medical issues. This symptom checker could reduce a significant number of unnecessary hospital visits, with accuracy and safety outcomes comparable to existing data on telephone triage.
Affiliation(s)
- Forson Chan
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Simon Lai
- University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Marcus Pieterman
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Lisa Richardson
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Amanda Singh
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Jocelynn Peters
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Alex Toy
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Caroline Piccininni
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Taiysa Rouault
- University of British Columbia, Faculty of Medicine, Health Sciences Mall, Vancouver, Canada
- Kristie Wong
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Adrienne T. Wakabayashi
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
- Anna Pawelec-Brzychczy
- Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, WCPHFM, London, ON, Canada
22
Kopka M, Schmieding ML, Rieger T, Roesler E, Balzer F, Feufel MA. Trust Me, I’m Not a Doctor! Determinants of Laypersons’ Trust in Medical Decision Aids: Experimental Study. JMIR Hum Factors 2021; 9:e35219. [PMID: 35503248 PMCID: PMC9115664 DOI: 10.2196/35219]
Affiliation(s)
- Marvin Kopka
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Cognitive Psychology and Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Malte L Schmieding
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Tobias Rieger
- Work, Engineering and Organizational Psychology, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Eileen Roesler
- Work, Engineering and Organizational Psychology, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
- Felix Balzer
- Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Markus A Feufel
- Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
23
Woodcock C, Mittelstadt B, Busbridge D, Blank G. The Impact of Explanations on Layperson Trust in Artificial Intelligence-Driven Symptom Checker Apps: Experimental Study. J Med Internet Res 2021; 23:e29386. [PMID: 34730544 PMCID: PMC8600426 DOI: 10.2196/29386]
Abstract
BACKGROUND Artificial intelligence (AI)-driven symptom checkers are available to millions of users globally and are advocated as a tool to deliver health care more efficiently. To achieve the promoted benefits of a symptom checker, laypeople must trust and subsequently follow its instructions. In AI, explanations are seen as a tool to communicate the rationale behind black-box decisions to encourage trust and adoption. However, the effectiveness of the types of explanations used in AI-driven symptom checkers has not yet been studied. Explanations can follow many forms, including why-explanations and how-explanations. Social theories suggest that why-explanations are better at communicating knowledge and cultivating trust among laypeople. OBJECTIVE The aim of this study is to ascertain whether explanations provided by a symptom checker affect explanatory trust among laypeople and whether this trust is impacted by their existing knowledge of disease. METHODS A cross-sectional survey of 750 healthy participants was conducted. The participants were shown a video of a chatbot simulation that resulted in the diagnosis of either a migraine or temporal arteritis, chosen for their differing levels of epidemiological prevalence. These diagnoses were accompanied by one of four types of explanations. Each explanation type was selected either because of its current use in symptom checkers or because it was informed by theories of contrastive explanation. Exploratory factor analysis of participants' responses followed by comparison-of-means tests were used to evaluate group differences in trust. RESULTS Depending on the treatment group, two or three variables were generated, reflecting the prior knowledge and subsequent mental model that the participants held. When varying explanation type by disease, migraine was found to be nonsignificant (P=.65) and temporal arteritis, marginally significant (P=.09). 
Varying disease by explanation type resulted in statistical significance for input influence (P=.001), social proof (P=.049), and no explanation (P=.006), while counterfactual explanation was marginally significant (P=.053). The results suggest that trust in explanations is significantly affected by the disease being explained. When laypeople have existing knowledge of a disease, explanations have little impact on trust. Where the need for information is greater, different explanation types engender significantly different levels of trust. These results indicate that to be successful, symptom checkers need to tailor explanations to each user's specific question and discount the diseases that they may also be aware of. CONCLUSIONS System builders developing explanations for symptom-checking apps should consider the recipient's knowledge of a disease and tailor explanations to each user's specific need. Effort should be placed on generating explanations that are personalized to each user of a symptom checker to fully discount the diseases that they may be aware of and to close their information gap.
Affiliation(s)
- Claire Woodcock
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
- Brent Mittelstadt
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
- Grant Blank
- Oxford Internet Institute, University of Oxford, Oxford, United Kingdom
24
Gilbert S, Fenech M, Upadhyay S, Wicks P, Novorol C. Quality of condition suggestions and urgency advice provided by the Ada symptom assessment app evaluated with vignettes optimised for Australia. Aust J Prim Health 2021; 27:377-381. [PMID: 34706813 DOI: 10.1071/py21032]
Abstract
When people face a health problem, they often first ask, 'Is there an app for that?'. We investigated the quality of advice provided by the Ada symptom assessment application to address the question, 'How do I know the app on my phone is safe and provides good advice?'. The app was tested with 48 independently created vignettes developed for a previous study, including 18 specifically developed for the Australian setting, using an independently developed methodology to evaluate the accuracy of condition suggestions and urgency advice. The correct condition was listed first in 65% of vignettes, and in the Top 3 results in 83% of vignettes. The urgency advice in the app exactly matched the gold standard in 63% of vignettes. The app's accuracy of condition suggestion and urgency advice is higher than that of the best-performing symptom assessment app reported in a previous study (61%, 77% and 52% for conditions suggested in the Top 1, Top 3 and exactly matching urgency advice respectively). These results are relevant to the application of symptom assessment in primary and community health, where medical quality and safety should determine app choice.
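The Top 1 and Top 3 figures above are top-k accuracies over the vignette set: the share of vignettes whose gold-standard condition appears among an app's first k suggestions. A minimal sketch, with hypothetical function name and toy data (not from the study):

```python
def top_k_accuracy(differential_lists, gold_diagnoses, k):
    """Proportion of vignettes whose gold-standard diagnosis appears
    among the top-k conditions suggested."""
    hits = sum(gold in suggestions[:k]
               for suggestions, gold in zip(differential_lists, gold_diagnoses))
    return hits / len(gold_diagnoses)

# Hypothetical toy data: three vignettes with ranked condition suggestions.
suggestions = [["influenza", "common cold", "covid-19"],
               ["migraine", "tension headache", "sinusitis"],
               ["gastritis", "peptic ulcer", "GERD"]]
gold = ["common cold", "migraine", "GERD"]
```

Top 3 accuracy is always at least Top 1 accuracy, since a first-ranked hit also counts as a top-3 hit.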
Affiliation(s)
- Stephen Gilbert
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany; and EKFZ for Digital Health, University Hospital Carl Gustav Carus Dresden, Technische Universität Dresden, Dresden, Germany (Corresponding author)
- Matthew Fenech
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
- Paul Wicks
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
- Claire Novorol
- Ada Health GmbH, Karl-Liebknecht-Straße 1, 10178 Berlin, Germany
25
Levine DM, Mehrotra A. Assessment of Diagnosis and Triage in Validated Case Vignettes Among Nonphysicians Before and After Internet Search. JAMA Netw Open 2021; 4:e213287. [PMID: 33779741 PMCID: PMC8008286 DOI: 10.1001/jamanetworkopen.2021.3287]
Abstract
IMPORTANCE When confronted with new medical symptoms, many people turn to the internet to understand why they are ill as well as whether and where they should get care. Such searches may be harmful because they may facilitate misdiagnosis and inappropriate triage. OBJECTIVE To empirically measure the association of an internet search for health information with diagnosis, triage, and anxiety by laypeople. DESIGN, SETTING, AND PARTICIPANTS This survey study used a nationally representative sample of US adults who were recruited through an online platform between April 1, 2019, and April 15, 2019. A total of 48 validated case vignettes of both common (eg, viral illness) and severe (eg, heart attack) conditions were used. Participants were asked to relay their diagnosis, triage, and anxiety regarding 1 of these cases before and after searching the internet for health information. EXPOSURES Short, validated case vignettes written at or below the sixth-grade reading level randomly assigned to participants. MAIN OUTCOMES AND MEASURES Correct diagnosis, correct triage, and flipping (changing) or anchoring (not changing) diagnosis and triage decisions were the main outcomes. Multivariable modeling was performed to identify patient factors associated with correct triage and diagnosis. RESULTS Of the 5000 participants, 2549 were female (51.0%), 3819 were White (76.4%), and the mean (SD) age was 45.0 (16.9) years. Mean internet search time was 12.1 (95% CI, 10.7-13.5) minutes per case. No difference in triage accuracy was found before and after search (74.5% vs 74.1%; difference, -0.4 [95% CI, -1.4 to 0.6]; P = .06), but improved diagnostic accuracy was found (49.8% vs 54.0%; difference, 4.2% [95% CI, 3.1%-5.3%]; P < .001). Most participants (4254 [85.1%]) were anchored on their diagnosis. Of the 14.9% of participants (n = 746) who flipped their diagnosis, 9.6% (n = 478) flipped from incorrect to correct and 5.4% (n = 268) flipped from correct to incorrect. 
The following groups had an increased rate of correct diagnosis: adults 40 years or older (eg, 40-49 years: 5.1 [95% CI, 0.8-9.4] percentage points better than those aged <30 years; P = .02), women (9.4 [95% CI, 6.8-12.0] percentage points better than men; P < .001), and those with perceived poor health status (16.3 [95% CI, 6.9-25.6] percentage points better than those with excellent status; P = .001) and with more than 2 chronic diseases (6.8 [95% CI, 1.5-12.1] percentage points better than those with 0 conditions; P = .01). CONCLUSIONS AND RELEVANCE This study found that an internet search for health information was associated with small increases in diagnostic accuracy but not with triage accuracy.
Affiliation(s)
- David M. Levine
- Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, Massachusetts
- Harvard Medical School, Boston, Massachusetts
- Ateev Mehrotra
- Harvard Medical School, Boston, Massachusetts
- Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts
- Division of General Medicine and Primary Care, Beth Israel Deaconess Medical Center, Boston, Massachusetts
26
Versluijs Y, Brown LE, Ring D. Does a Previsit Phone Call from the Surgeon Reduce Decision Conflict? Telemed J E Health 2021; 27:1282-1287. [PMID: 33538643 DOI: 10.1089/tmj.2020.0475]
Abstract
Background: There is some evidence that previsit strategies can make in-person visits more productive and efficient. We compared people who received a phone call before a musculoskeletal specialty visit with people who did not, with respect to several factors: (1) decision conflict (difficulty deciding between two or more options), (2) perceived clinician empathy after an in-person visit, and (3) arrival for the scheduled in-person appointment. We also recorded the specialist's opinion that the phone call alone could adequately replace an in-person visit while maintaining quality, safety, and effectiveness. Materials and Methods: In this prospective randomized-controlled trial, 122 patients were enrolled and randomized to receive a previsit phone call by an orthopedic surgeon before a scheduled visit or not. After the in-person visit, patients completed a (1) demographic questionnaire including age, gender, race/ethnicity, marital status, level of education, work status, and comorbidities; (2) Decision Conflict Scale; and (3) Jefferson Scale of Patient Perceptions of Physician Empathy. Results: No significant difference was found between the two groups in decision conflict, perceived empathy, or not attending the scheduled visit. Of the 55 successful phone calls, the surgeon felt that 50 (91%) had the potential to safely and effectively replace an in-person visit. Conclusion: Although a previsit phone call did not reduce decision conflict or improve the patient experience as measured after one visit, there may be merit in studying an increased number of touch points, particularly with some subsets of illness featuring substantial stress or misconceptions. The identified potential for the application and transfer of specialty expertise through telephone alone also merits additional study.
Affiliation(s)
- Yvonne Versluijs
- Department of Surgery and Perioperative Care, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA; Department of Trauma Surgery, Leiden University Medical Center, Leiden, the Netherlands
- Laura E Brown
- Center for Health Communication, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA
- David Ring
- Department of Surgery and Perioperative Care, Dell Medical School-The University of Texas at Austin, Austin, Texas, USA
27
Goetz CM, Arnetz JE, Sudan S, Arnetz BB. Perceptions of virtual primary care physicians: A focus group study of medical and data science graduate students. PLoS One 2020; 15:e0243641. [PMID: 33332409 PMCID: PMC7745971 DOI: 10.1371/journal.pone.0243641]
Abstract
BACKGROUND Artificial and virtual technologies in healthcare have advanced rapidly, and healthcare systems have been adapting care accordingly. An intriguing new development is the virtual physician, which can diagnose and treat patients independently. METHODS AND FINDINGS This qualitative study of advanced degree students aimed to assess their perceptions of using a virtual primary care physician as a patient. Four focus groups were held: first year medical students, fourth year medical students, first year engineering/data science graduate students, and fourth year engineering/data science graduate students. The focus groups were audiotaped, transcribed verbatim, and content analysis of the transcripts was performed using a data-driven inductive approach. Themes identified concerned advantages, disadvantages, and the future of virtual primary care physicians. Within those main categories, 13 themes and 31 subthemes emerged. DISCUSSION While participants appreciated that a virtual primary care physician would be convenient, efficient, and cost-effective, they also expressed concern about data privacy and the potential for misdiagnosis. To garner trust from its potential users, future virtual primary care physicians should be programmed with a sufficient amount of trustworthy data and have a high level of transparency and accountability for patients.
Affiliation(s)
- Courtney M. Goetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Judith E. Arnetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Sukhesh Sudan
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Bengt B. Arnetz
- Department of Family Medicine, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
28
Gilbert S, Mehl A, Baluch A, Cawley C, Challiner J, Fraser H, Millen E, Montazeri M, Multmeier J, Pick F, Richter C, Türk E, Upadhyay S, Virani V, Vona N, Wicks P, Novorol C. How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs. BMJ Open 2020; 10:e040269. [PMID: 33328258 PMCID: PMC7745523 DOI: 10.1136/bmjopen-2020-040269]
Abstract
OBJECTIVES To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN Vignettes study. SETTING 200 primary care vignettes. INTERVENTION/COMPARATOR For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold-standard. PRIMARY OUTCOME MEASURES (1) Proportion of conditions 'covered' by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of 'safe' urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). RESULTS Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions and their performance was generally greater with the exclusion of corresponding vignettes. For safe urgency advice, tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had a safety performance within 2 SDs of GPs: Your.MD: 92.6%. Three apps had a safety performance outside 2 SDs of GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³). CONCLUSIONS The utility of digital symptom assessment apps relies on coverage, accuracy and safety. 
While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
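The 'safe' urgency criterion defined in the abstract (advice at the gold-standard level, more conservative, or no more than one level less conservative) can be sketched directly; the four-level urgency scale below is an assumption for illustration, as the abstract does not enumerate the actual advice levels.

```python
# Hypothetical ordered urgency scale, least to most conservative
# (assumption: the study's real advice levels are not listed in the abstract).
URGENCY_LEVELS = ["self_care", "non_urgent_care", "urgent_care", "emergency"]

def is_safe_advice(app_advice: str, gold_advice: str) -> bool:
    """Safe if at the gold-standard level, more conservative,
    or no more than one level less conservative."""
    app = URGENCY_LEVELS.index(app_advice)
    gold = URGENCY_LEVELS.index(gold_advice)
    return app >= gold - 1
```

Under this rule, over-triage is never counted as unsafe; only advice two or more levels below the gold standard is.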
Affiliation(s)
- Hamish Fraser
- Brown Center for Biomedical Informatics, Brown University, Rhode Island, USA
29
Ćirković A. Evaluation of Four Artificial Intelligence-Assisted Self-Diagnosis Apps on Three Diagnoses: Two-Year Follow-Up Study. J Med Internet Res 2020; 22:e18097. [PMID: 33275113 PMCID: PMC7748958 DOI: 10.2196/18097]
Abstract
Background Consumer-oriented mobile self-diagnosis apps have been developed using undisclosed algorithms, presumably based on machine learning and other artificial intelligence (AI) technologies. The US Food and Drug Administration now discerns apps with learning AI algorithms from those with stable ones and treats the former as medical devices. To the author’s knowledge, no self-diagnosis app testing has been performed in the field of ophthalmology so far. Objective The objective of this study was to test apps that were previously mentioned in the scientific literature on a set of diagnoses in a deliberate time interval, comparing the results and looking for differences that hint at “nonlocked” learning algorithms. Methods Four apps from the literature were chosen (Ada, Babylon, Buoy, and Your.MD). A set of three ophthalmology diagnoses (glaucoma, retinal tear, dry eye syndrome) representing three levels of urgency was used to simultaneously test the apps’ diagnostic efficiency and treatment recommendations in this specialty. Two years was the chosen time interval between the tests (2018 and 2020). Scores were awarded by one evaluating physician using a defined scheme. Results Two apps (Ada and Your.MD) received significantly higher scores than the other two. All apps either worsened in their results between 2018 and 2020 or remained unchanged at a low level. The variation in the results over time indicates “nonlocked” learning algorithms using AI technologies. None of the apps provided correct diagnoses and treatment recommendations for all three diagnoses in 2020. Two apps (Babylon and Your.MD) asked significantly fewer questions than the other two (P<.001). Conclusions “Nonlocked” algorithms are used by self-diagnosis apps. The diagnostic efficiency of the tested apps seems to worsen over time, with some apps being more capable than others. 
Systematic studies on a wider scale are necessary for health care providers and patients to correctly assess the safety and efficacy of such apps and for correct classification by health care regulating authorities.
30
Morse KE, Ostberg NP, Jones VG, Chan AS. Use Characteristics and Triage Acuity of a Digital Symptom Checker in a Large Integrated Health System: Population-Based Descriptive Study. J Med Internet Res 2020; 22:e20549. [PMID: 33170799] [PMCID: PMC7717918] [DOI: 10.2196/20549]
Abstract
Background Pressure on the US health care system has been increasing due to a combination of aging populations, rising health care expenditures, and most recently, the COVID-19 pandemic. Responses to this pressure are hindered in part by reliance on a limited supply of highly trained health care professionals, creating a need for scalable technological solutions. Digital symptom checkers are artificial intelligence–supported software tools that use a conversational “chatbot” format to support rapid diagnosis and consistent triage. The COVID-19 pandemic has brought new attention to these tools due to the need to avoid face-to-face contact and preserve urgent care capacity. However, evidence-based deployment of these chatbots requires an understanding of user demographics and associated triage recommendations generated by a large general population. Objective In this study, we evaluate the user demographics and levels of triage acuity provided by a symptom checker chatbot deployed in partnership with a large integrated health system in the United States. Methods This population-based descriptive study included all web-based symptom assessments completed on the website and patient portal of the Sutter Health system (24 hospitals in Northern California) from April 24, 2019, to February 1, 2020. User demographics were compared to relevant US Census population data. Results A total of 26,646 symptom assessments were completed during the study period. Most assessments (17,816/26,646, 66.9%) were completed by female users. The mean user age was 34.3 years (SD 14.4 years), compared to a median age of 37.3 years of the general population. The most common initial symptom was abdominal pain (2060/26,646, 7.7%). A substantial number of assessments (12,357/26,646, 46.4%) were completed outside of typical physician office hours. Most users were advised to seek medical care on the same day (7299/26,646, 27.4%) or within 2-3 days (6301/26,646, 23.6%). 
Over a quarter of the assessments indicated a high degree of urgency (7723/26,646, 29.0%). Conclusions Users of the symptom checker chatbot were broadly representative of our patient population, although they skewed toward younger and female users. The triage recommendations were comparable to those of nurse-staffed telephone triage lines. Although the emergence of COVID-19 has increased the interest in remote medical assessment tools, it is important to take an evidence-based approach to their deployment.
Affiliation(s)
- Keith E Morse: Department of Pediatrics, Stanford University School of Medicine, Palo Alto, CA, United States
- Nicolai P Ostberg: Center for Biomedical Informatics Research, Stanford University School of Medicine, Palo Alto, CA, United States
- Veena G Jones: Clinical Leadership Team, Sutter Health, Sacramento, CA, United States; Palo Alto Medical Foundation Research Institute, Palo Alto, CA, United States
- Albert S Chan: Clinical Leadership Team, Sutter Health, Sacramento, CA, United States; Palo Alto Medical Foundation Research Institute, Palo Alto, CA, United States
31
Coquet J, Blayney DW, Brooks JD, Hernandez-Boussard T. Association between patient-initiated emails and overall 2-year survival in cancer patients undergoing chemotherapy: Evidence from the real-world setting. Cancer Med 2020; 9:8552-8561. [PMID: 32986931] [PMCID: PMC7666724] [DOI: 10.1002/cam4.3483]
Abstract
PURPOSE Prior studies suggest email communication between patients and providers may improve patient engagement and health outcomes. The purpose of this study was to determine whether patient-initiated emails are associated with overall survival benefits among cancer patients undergoing chemotherapy. PATIENTS AND METHODS We identified patient-initiated emails through the patient portal in electronic health records (EHR) among 9900 cancer patients receiving chemotherapy between 2013 and 2018. Email users were defined as patients who sent at least one email from 12 months before to 2 months after chemotherapy started. A propensity score-matched cohort analysis was carried out to reduce bias due to confounding (age, primary cancer type, gender, insurance payor, ethnicity, race, stage, income, Charlson score, county of residence). The cohort included 3223 email users and 3223 non-email users. The primary outcome was overall 2-year survival stratified by email use. Secondary outcomes included number of face-to-face visits, prescriptions, and telephone calls. The healthcare teams' response to emails and other forms of communication was also investigated. Finally, a quality measure related to chemotherapy-related inpatient and emergency department visits was evaluated. RESULTS Overall 2-year survival was higher in patients who were email users, with an adjusted hazard ratio of 0.80 (95% CI 0.72-0.90; p < 0.001). Email users had higher rates of healthcare utilization, including face-to-face visits (63 vs. 50; p < 0.001), drug prescriptions (28 vs. 21; p < 0.001), and phone calls (18 vs. 16; p < 0.001). The clinical quality outcome measure of inpatient use was better among email users (p = 0.015). CONCLUSION Patient-initiated emails are associated with a survival benefit among cancer patients receiving chemotherapy and may be a proxy for patient engagement. 
As value-based payment models emphasize incorporating the patients' voice into their care, email communications could serve as a novel source of patient-generated data.
Affiliation(s)
- Jean Coquet: Department of Medicine, Stanford University, Stanford, CA, USA
- Douglas W Blayney: Department of Medicine, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
- James D Brooks: Department of Urology, Stanford University School of Medicine, Stanford, CA, USA
- Tina Hernandez-Boussard: Department of Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA, USA; Department of Surgery, Stanford University School of Medicine, Stanford, CA, USA
32
Meskó B, Görög M. A short guide for medical professionals in the era of artificial intelligence. NPJ Digit Med 2020; 3:126. [PMID: 33043150] [PMCID: PMC7518439] [DOI: 10.1038/s41746-020-00333-z]
Abstract
Artificial intelligence (A.I.) is expected to significantly influence the practice of medicine and the delivery of healthcare in the near future. While there are only a handful of practical examples of its medical use with sufficient evidence, hype and attention around the topic are significant. Amid the abundance of papers, conference talks, misleading news headlines, and study interpretations, a short and visual guide that any medical professional can refer back to in their professional life might be useful. For this, it is critical that physicians understand the basics of the technology so they can see beyond the hype, evaluate A.I.-based studies and clinical validation, and acknowledge the limitations and opportunities of A.I. This paper aims to serve as a short, visual, and digestible repository of the information and details every physician might need to know in the age of A.I. We describe a simple definition of A.I., its levels and methods, the differences between the methods with medical examples, and the potential benefits, dangers, and challenges of A.I., and attempt to provide a futuristic vision of its use in everyday medical practice.
Affiliation(s)
- Bertalan Meskó: The Medical Futurist Institute, Budapest, Hungary; Semmelweis University, Budapest, Hungary
- Marton Görög: The Medical Futurist Institute, Budapest, Hungary
33
Miller S, Gilbert S, Virani V, Wicks P. Patients' Utilization and Perception of an Artificial Intelligence-Based Symptom Assessment and Advice Technology in a British Primary Care Waiting Room: Exploratory Pilot Study. JMIR Hum Factors 2020; 7:e19713. [PMID: 32540836] [PMCID: PMC7382011] [DOI: 10.2196/19713]
Abstract
BACKGROUND When someone needs to know whether and when to seek medical attention, there are a range of options to consider. Each will have consequences for the individual (primarily considering trust, convenience, usefulness, and opportunity costs) and for the wider health system (affecting clinical throughput, cost, and system efficiency). Digital symptom assessment technologies that leverage artificial intelligence may help patients navigate to the right type of care with the correct degree of urgency. However, a recent review highlighted a gap in the literature on the real-world usability of these technologies. OBJECTIVE We sought to explore the usability, acceptability, and utility of one such symptom assessment technology, Ada, in a primary care setting. METHODS Patients with a new complaint attending a primary care clinic in South London were invited to use a custom version of the Ada symptom assessment mobile app. This exploratory pilot study was conducted between November 2017 and January 2018 in a practice with 20,000 registered patients. Participants were asked to complete an Ada self-assessment about their presenting complaint on a study smartphone, with assistance provided if required. Perceptions on the app and its utility were collected through a self-completed study questionnaire following completion of the Ada self-assessment. RESULTS Over a 3-month period, 523 patients participated. Most were female (n=325, 62.1%), mean age 39.79 years (SD 17.7 years), with a larger proportion (413/506, 81.6%) of working-age individuals (aged 15-64) than the general population (66.0%). Participants rated Ada's ease of use highly, with most (511/522, 97.8%) reporting it was very or quite easy. Most would use Ada again (443/503, 88.1%) and agreed they would recommend it to a friend or relative (444/520, 85.3%). 
We identified a number of age-related trends among respondents, with a directional trend for more young respondents (50/54, 93% of those aged 18-24) than older respondents (19/32, 59% of those aged 70+) to report that Ada had provided helpful advice. We found no sex differences on any of the usability questions fielded. While most respondents reported that using the symptom checker would not have made a difference in their care-seeking behavior (425/494, 86.0%), a sizable minority (63/494, 12.8%) reported they would have used lower-intensity care such as self-care, a pharmacy, or delaying their appointment. This proportion was higher for patients aged 18-24 (11/50, 22%) than for those aged 70+ (0/28, 0%). CONCLUSIONS In this exploratory pilot study, the digital symptom checker was rated as highly usable and acceptable by patients in a primary care setting. Further research is needed to confirm whether the app might appropriately direct patients to timely care and to understand how this might save resources for the health system. More work is also needed to ensure the benefits accrue equally to older age groups.
34