1
Koscelny SN, Sadralashrafi S, Neyens DM. Generative AI responses are a dime a dozen; Making them count is the challenge - Evaluating information presentation styles in healthcare chatbots using hierarchical Bayesian regression models. APPLIED ERGONOMICS 2025; 128:104515. [PMID: 40250134 DOI: 10.1016/j.apergo.2025.104515] [Received: 07/31/2024] [Revised: 12/16/2024] [Accepted: 03/29/2025] [Indexed: 04/20/2025]
Abstract
The emergence of large language models offers new opportunities to deliver effective healthcare information through web-based healthcare chatbots. Health information is often complex and technical, making it crucial to design human-AI interactions that effectively meet user needs. Employing a 2x2 between-subjects design, we controlled for two independent variables: communication style (conversational vs. informative) and language style (technical vs. non-technical). We used hierarchical Bayesian regression models to assess the impact of varying information presentation styles on effectiveness, trustworthiness, and usability. The findings revealed that perceptions of low usability significantly decreased the effectiveness of the healthcare chatbot. Additionally, participants exposed to the conversational style of the chatbot were significantly more likely to perceive it as more usable but were also more likely to be less trusting of the chatbot. These results indicate that varying information presentation styles can impact user experience and offer insights for future research with healthcare chatbots and other AI systems.
Affiliation(s)
- Samuel N Koscelny
  - Department of Industrial Engineering, 100 Freeman Hall, Clemson University, Clemson, SC, 29634, USA.
- Sara Sadralashrafi
  - Department of Industrial Engineering, 100 Freeman Hall, Clemson University, Clemson, SC, 29634, USA.
- David M Neyens
  - Department of Industrial Engineering, 100 Freeman Hall, Clemson University, Clemson, SC, 29634, USA; Department of Bioengineering, Clemson University, Clemson, SC, 29634, USA.
2
Sundaramoorthy S, Ratra V, Shankar V, Dorairajan R, Maskati Q, Fredrick TN, Ratra A, Ratra D. Conversational Guide for Cataract Surgery Complications: A Comparative Study of Surgeons versus Large Language Model-Based Chatbot Generated Instructions for Patient Interaction. Ophthalmic Epidemiol 2025:1-8. [PMID: 40172978 DOI: 10.1080/09286586.2025.2484772] [Received: 05/13/2024] [Revised: 01/21/2025] [Accepted: 03/19/2025] [Indexed: 04/04/2025]
Abstract
PURPOSE It is difficult to explain the complications of surgery to patients. Care has to be taken to convey the facts clearly and objectively while expressing concern for their wellbeing. This study compared responses from surgeons with responses from a large language model (LLM)-based chatbot. METHODS We presented 10 common scenarios of cataract surgery complications to seven senior surgeons and a chatbot. The responses were graded by two independent graders for comprehension, readability, and complexity of language using previously validated indices. The responses were analyzed for accuracy and completeness. Honesty and empathy were graded for both groups. Scores were averaged and tabulated. RESULTS The readability scores indicated that the surgeons' responses (10.64) were significantly less complex than the chatbot's (12.54) (p < 0.001). The responses from the surgeons were shorter, whereas the chatbot tended to give more detailed answers. The average accuracy and completeness score of chatbot-generated conversations was 2.36 (0.55), which was similar to the surgeons' score of 2.58 (0.36) (p = 0.164). The responses from the chatbot were more generalized, lacking specific alternative measures. While empathy scores were higher for surgeons (1.81 vs. 1.20, p = 0.041), honesty scores showed no significant difference. CONCLUSIONS The LLM-based chatbot gave a detailed description of the complication but was less specific about the alternative measures. The surgeons had a more in-depth understanding of the situation. The chatbot showed complete honesty but scored lower for empathy. With more training using complex real-world scenarios and specialized ophthalmologic data, the chatbots could be used to assist the surgeons in counselling patients for postoperative complications.
Affiliation(s)
- Vineet Ratra
  - Department of Comprehensive Ophthalmology, Medical Research Foundation, Sankara Nethralaya, Chennai, India
- Vijay Shankar
  - Department of Cataract and Refractive Surgery, Shanker Eye clinic, Chennai, India
- Ramesh Dorairajan
  - Department of Cataract and Refractive Surgery, Sundar Eye Clinic, Chennai, India
- Quresh Maskati
  - Department of Cataract and Refractive Surgery, Maskati Eye Clinic, Mumbai, India
- T Nirmal Fredrick
  - Department of Comprehensive Ophthalmology, Nirmal's Eye Clinic, Chennai, India
- Aashna Ratra
  - Department of Ophthalmology, Stanley Medical College, Chennai, India
- Dhanashree Ratra
  - Department of Vitreoretinal Diseases, Medical Research Foundation, Sankara Nethralaya, Chennai, India
3
Wang H, Masselos K, Tong J, Connor HRM, Scully J, Zhang S, Rafla D, Posarelli M, Tan JCK, Agar A, Kalloniatis M, Phu J. ChatGPT for Addressing Patient-centered Frequently Asked Questions in Glaucoma Clinical Practice. Ophthalmol Glaucoma 2025; 8:157-166. [PMID: 39424063 DOI: 10.1016/j.ogla.2024.10.005] [Received: 07/16/2024] [Revised: 09/17/2024] [Accepted: 10/11/2024] [Indexed: 10/21/2024]
Abstract
PURPOSE Large language models such as ChatGPT-3.5 are often used by the public to answer questions related to daily life, including health advice. This study evaluated the responses of ChatGPT-3.5 in answering patient-centered frequently asked questions (FAQs) relevant in glaucoma clinical practice. DESIGN Prospective cross-sectional survey. PARTICIPANTS Expert graders. METHODS Twelve experts across a range of clinical, education, and research practices in optometry and ophthalmology. Over 200 patient-centric FAQs from authoritative professional society, hospital and advocacy websites were distilled and filtered into 40 questions across 4 themes: definition and risk factors, diagnosis and testing, lifestyle and other accompanying conditions, and treatment and follow-up. The questions were individually input into ChatGPT-3.5 to generate responses. The responses were graded by the 12 experts individually. MAIN OUTCOME MEASURES A 5-point Likert scale (1 = strongly disagree; 5 = strongly agree) was used to grade ChatGPT-3.5 responses across 4 domains: coherency, factuality, comprehensiveness, and safety. RESULTS Across all themes and domains, median scores were all 4 ("agree"). Comprehensiveness had the lowest scores across domains (mean 3.7 ± 0.9), followed by factuality (mean 3.9 ± 0.9) and coherency and safety (mean 4.1 ± 0.8 for both). Examination of the individual 40 questions showed that 8 (20%), 17 (42.5%), 24 (60%), and 8 (20%) of the questions had average scores below 4 (i.e., below "agree") for the coherency, factuality, comprehensiveness, and safety domains, respectively. Free-text comments by the experts highlighted omissions of facts and comprehensiveness (e.g., secondary glaucoma) and remarked on the vagueness of some responses (i.e., that the response did not account for individual patient circumstances). CONCLUSIONS ChatGPT-3.5 responses to FAQs in glaucoma were generally agreeable in terms of coherency, factuality, comprehensiveness, and safety. 
However, areas of weakness were identified, precluding recommendations for routine use to provide patients with tailored counseling in glaucoma, especially with respect to development of glaucoma and its management. FINANCIAL DISCLOSURE(S) Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
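The percentages quoted in the RESULTS above follow directly from the per-domain counts of questions out of 40. A minimal sketch of that arithmetic is given here; the variable and function names are hypothetical and none of this code comes from the study.

```python
# Counts of questions (out of 40) whose average expert score fell below 4
# ("agree"), as reported in the abstract. All names here are illustrative.
TOTAL_QUESTIONS = 40

below_agree_counts = {
    "coherency": 8,
    "factuality": 17,
    "comprehensiveness": 24,
    "safety": 8,
}


def percent_of_total(count: int, total: int = TOTAL_QUESTIONS) -> float:
    """Return count expressed as a percentage of total."""
    return 100.0 * count / total


# Percentage of the 40 questions scoring below "agree" in each domain.
below_agree_percent = {
    domain: percent_of_total(count)
    for domain, count in below_agree_counts.items()
}

if __name__ == "__main__":
    for domain, pct in below_agree_percent.items():
        print(f"{domain}: {pct:g}%")
```

Comprehensiveness is the only domain in which more than half of the questions averaged below "agree", which matches its being the lowest-scoring domain overall.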
Affiliation(s)
- Henrietta Wang
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia; Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia
- Katherine Masselos
  - Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia; Department of Ophthalmology, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Janelle Tong
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia; Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia
- Heather R M Connor
  - School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia; The Royal Victorian Eye and Ear Hospital, East Melbourne, Victoria, Australia
- Janelle Scully
  - Australian College of Optometry, Carlton, Victoria, Australia
- Sophia Zhang
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia; Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia
- Daniel Rafla
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia
- Matteo Posarelli
  - Department of Ophthalmology, Liverpool University Hospitals, Liverpool, UK
- Jeremy C K Tan
  - Department of Ophthalmology, Prince of Wales Hospital, Randwick, New South Wales, Australia; Faculty of Medicine and Health, University of New South Wales, Kensington, New South Wales, Australia
- Ashish Agar
  - Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia; Department of Ophthalmology, Prince of Wales Hospital, Randwick, New South Wales, Australia; Faculty of Medicine and Health, University of New South Wales, Kensington, New South Wales, Australia
- Michael Kalloniatis
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia; School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia; University of Houston College of Optometry, Houston, Texas
- Jack Phu
  - School of Optometry and Vision Science, University of New South Wales, Kensington, New South Wales, Australia; Centre for Eye Health, University of New South Wales, Kensington, New South Wales, Australia; Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia; Concord Clinical School, Concord Repatriation General Hospital, Concord, New South Wales, Australia.
4
Chow JCL, Li K. Developing Effective Frameworks for Large Language Model-Based Medical Chatbots: Insights From Radiotherapy Education With ChatGPT. JMIR Cancer 2025; 11:e66633. [PMID: 39965195 PMCID: PMC11888077 DOI: 10.2196/66633] [Received: 09/18/2024] [Revised: 12/15/2024] [Accepted: 01/16/2025] [Indexed: 02/20/2025] Open
Abstract
This Viewpoint proposes a robust framework for developing a medical chatbot dedicated to radiotherapy education, emphasizing accuracy, reliability, privacy, ethics, and future innovations. By analyzing existing research, the framework evaluates chatbot performance and identifies challenges such as content accuracy, bias, and system integration. The findings highlight opportunities for advancements in natural language processing, personalized learning, and immersive technologies. When designed with a focus on ethical standards and reliability, large language model-based chatbots could significantly impact radiotherapy education and health care delivery, positioning them as valuable tools for future developments in medical education globally.
Affiliation(s)
- James C L Chow
  - Department of Medical Physics, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Radiation Oncology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Kay Li
  - Department of English, Faculty of Arts and Science, University of Toronto, Toronto, ON, Canada
5
Pupong K, Hunsrisakhun J, Pithpornchaiyakul S, Naorungroj S. Development of Chatbot-Based Oral Health Care for Young Children and Evaluation of its Effectiveness, Usability, and Acceptability: Mixed Methods Study. JMIR Pediatr Parent 2025; 8:e62738. [PMID: 39899732 PMCID: PMC11809939 DOI: 10.2196/62738] [Received: 05/30/2024] [Revised: 11/24/2024] [Accepted: 11/26/2024] [Indexed: 02/05/2025] Open
Abstract
Background Chatbots are increasingly accepted in public health for their ability to replicate human-like communication and provide scalable, 24/7 services. The high prevalence of dental caries in children underscores the need for early and effective intervention. Objective This study aimed to develop the 30-Day FunDee chatbot and evaluate its effectiveness, usability, and acceptability in delivering oral health education to caregivers of children aged 6 to 36 months. Methods The chatbot was created using the artificial intelligence (AI) chatbot behavior change model, integrating behavioral change theories into content designed for 3-5 minutes of daily use over 30 days. A pre-post experimental study was conducted from December 2021 to February 2022 in Hat Yai District, Songkhla Province, and Maelan District, Pattani Province, Thailand. Fifty-eight caregivers completed a web-based structured questionnaire at baseline and 2 months post baseline to evaluate knowledge, protection motivation theory-based perceptions, and tooth-brushing practices. Usability was assessed via chatbot logfiles and a web-based questionnaire at 2 months post baseline. Acceptability was evaluated through three methods: (1) open-ended chatbot interactions on day 30, (2) a web-based structured questionnaire at 2 months post baseline, and (3) semistructured telephone interviews with 15 participants 2 weeks post intervention. Participants for interviews were stratified by adherence levels and randomly selected from Hatyai and Maelan districts. All self-reported variables were measured on a 5-point Likert scale (1=lowest, 5=highest). Results The chatbot was successfully developed based on the 4 components of the AI chatbot behavior change model. Participants had a mean age of 34.5 (SD 8.6) years. The frequency of tooth brushing among caregivers significantly improved, increasing from 72.4% at baseline to 93.1% two months post baseline (P=.006). 
Protection motivation theory-based perceptions also showed significant improvement, with mean scores rising from 4.0 (SD 0.6) at baseline to 4.5 (SD 0.6) two months post baseline (P<.001). The chatbot received high ratings for satisfaction (4.7/5, SD 0.6) and usability (4.7/5, SD 0.5). Participants engaged with the chatbot for an average of 24.7 (SD 7.2) days out of 30. Caregivers praised the chatbot's content quality, empathetic communication, and multimedia design, but noted the intervention's lengthy duration and messaging system as limitations. Conclusions The 30-Day FunDee chatbot effectively enhanced caregivers' perceptions of oral health care and improved tooth-brushing practices for children aged 6-36 months. High user satisfaction and engagement demonstrate its potential as an innovative tool for oral health education. These findings warrant further validation through large-scale, randomized controlled trials.
Affiliation(s)
- Kittiwara Pupong
  - Dental Public Health Division, Maelan Hospital, Pattani, Thailand
- Jaranya Hunsrisakhun
  - Department of Preventive Dentistry, Faculty of Dentistry, Prince of Songkla University, 15 Kanjanavanich Rd, Hatyai, Songkhla, 90112, Thailand, 66 74429875, 66 74429875
  - Improvement of Oral Health Care Research Unit, Faculty of Dentistry, Prince of Songkla University, Hatyai, Songkhla, Thailand
- Samerchit Pithpornchaiyakul
  - Department of Preventive Dentistry, Faculty of Dentistry, Prince of Songkla University, 15 Kanjanavanich Rd, Hatyai, Songkhla, 90112, Thailand, 66 74429875, 66 74429875
  - Improvement of Oral Health Care Research Unit, Faculty of Dentistry, Prince of Songkla University, Hatyai, Songkhla, Thailand
- Supawadee Naorungroj
  - Department of Conservative Dentistry, Faculty of Dentistry, Prince of Songkla University, Hatyai, Songkhla, Thailand
6
Su B, Jones R, Chen K, Kostenko E, Schmid M, DeMaria AL, Villa A, Swarup M, Weida J, Tuuli MG. Chatbot for patient education for prenatal aneuploidy testing: A multicenter randomized controlled trial. PATIENT EDUCATION AND COUNSELING 2025; 131:108557. [PMID: 39642634 DOI: 10.1016/j.pec.2024.108557] [Received: 10/24/2023] [Revised: 11/03/2024] [Accepted: 11/14/2024] [Indexed: 12/09/2024]
Abstract
INTRODUCTION Digital tools could assist obstetric providers by delivering information given increasing options for fetal aneuploidy screening. PURPOSE To determine the impact of a chatbot for pre-test education and counseling in low-risk pregnancies. METHODS Two sites participated in this randomized controlled trial. Patients in the intervention group used a chatbot prior to the provider visit, while patients in the control group only received education by the provider. The primary outcome was change in patient knowledge scores after provider education. Analysis was by intention to treat. RESULTS Overall, 258 women participated (n = 130; intervention and n = 128; control). Knowledge gain was significantly higher among patients using the chatbot (mean increase in correct answers [out of 20]: +4.1 vs +1.9, p < 0.001). Both groups reported high satisfaction, with no statistically significant difference between intervention and control groups (mean patient satisfaction [1-10]: 8.2 vs 8.5 respectively, p = 0.35). Providers also reported high satisfaction scores with no significant difference between intervention and control groups (mean provider satisfaction [1 - 10]: 8.7 vs 8.4 respectively, p = 0.13). CONCLUSIONS Pre-test education via a chatbot can increase patient knowledge of prenatal testing choices, with high patient and provider satisfaction.
Affiliation(s)
- Bowdoin Su
  - Ariosa Diagnostics, Inc., Roche Diagnostics Solutions, San Jose, CA, USA.
- Renee Jones
  - Ariosa Diagnostics, Inc., Roche Diagnostics Solutions, San Jose, CA, USA
- Kelly Chen
  - Ariosa Diagnostics, Inc., Roche Diagnostics Solutions, San Jose, CA, USA
- Emilia Kostenko
  - Ariosa Diagnostics, Inc., Roche Diagnostics Solutions, San Jose, CA, USA
- Maximilian Schmid
  - Ariosa Diagnostics, Inc., Roche Diagnostics Solutions, San Jose, CA, USA
- Andrea L DeMaria
  - Department of Public Health, Purdue University, West Lafayette, IN, USA
- Andrew Villa
  - New Horizons Women's Care Branch of Arizona Ob/Gyn Affiliates, Chandler, AZ, USA
- Monte Swarup
  - New Horizons Women's Care Branch of Arizona Ob/Gyn Affiliates, Chandler, AZ, USA
- Jennifer Weida
  - Department of Obstetrics & Gynecology, Indiana University School of Medicine, Indianapolis, IN, USA
- Methodius G Tuuli
  - Department of Obstetrics & Gynecology, Brown University School of Medicine, Providence, RI, USA
7
Mohamed Jasim K, Malathi A, Bhardwaj S, Aw ECX. A systematic review of AI-based chatbot usages in healthcare services. J Health Organ Manag 2025. [PMID: 39865955 DOI: 10.1108/jhom-12-2023-0376] [Indexed: 01/28/2025]
Abstract
PURPOSE This systematic literature review aims to provide a comprehensive and structured synthesis of the existing knowledge about chatbots in healthcare from both a theoretical and methodological perspective. DESIGN/METHODOLOGY/APPROACH To this end, a systematic literature review was conducted with 89 articles selected through a SPAR-4-SLR systematic procedure. The documents for this systematic review were collected from the Scopus database. The VOSviewer software was used to analyze keyword co-occurrence and form the fundamental structure of the subject field. FINDINGS In addition, this study proposes a future research agenda revolving around three main themes: (1) telemedicine, (2) mental health and (3) medical information. ORIGINALITY/VALUE This study underscores the significance, implications and predictors of chatbot usage in healthcare services. It is concluded that adopting the proposed future direction and further research on chatbots in healthcare will help to refine chatbot systems to better meet the needs of patients.
Affiliation(s)
- K Mohamed Jasim
  - VIT Business School, Vellore Institute of Technology, Vellore, India
- A Malathi
  - VIT Business School, Vellore Institute of Technology, Vellore, India
- Seema Bhardwaj
  - Symbiosis Institute of Business Management, Nagpur, Symbiosis International (Deemed University), Pune, India
  - Middlesex University, Dubai, United Arab Emirates
- Eugene Cheng-Xi Aw
  - UCSI University Kuala Lumpur Campus, Kuala Lumpur, Malaysia
  - Faculty of International Tourism and Management, City University of Macau, Macau, China
8
Wahab N, Forsyth RA. Experiences of patients with hard-to-heal wounds: insights from a pilot survey. J Wound Care 2024; 33:788-794. [PMID: 39388206 DOI: 10.12968/jowc.2024.0109] [Indexed: 10/12/2024]
Abstract
OBJECTIVE To learn about the experiences of people who seek treatment for hard-to-heal wounds, we distributed a nationwide pilot survey, asking questions about the nature of their wound, how it shaped their daily lives, pathways to receiving care and experiences with treatment. The long-term objective is to quantify the journey of patients with hard-to-heal wounds to identify ideal intervention points that will lead to the best outcomes. This article summarises the findings, implications, limitations and suggestions for future research. METHOD Qualitative data were self-reported from patients with hard-to-heal wounds (open for ≥4 weeks) in a pilot chatbot survey, the Wound Expert Survey (WES), provided online in the US on Meta platforms (Facebook and Instagram) between 2021 and 2022. RESULTS The US national pilot survey attracted responses from 780 patients, 27 of whom provided a video testimonial. Some 57% of patients delayed treatment because they believed their wound would heal on its own, and only 4% saw a wound care specialist. Respondents reported the cost of care as the most frequent reason for not following all of a doctor's treatment recommendations. Queries regarding quality of life (QoL) revealed that more than half (65%) said they have negative thoughts associated with their wound at least every few days. Some 19% of respondents said their wound had an odour and, of them, 34% said odour had a major or severe negative impact on their self-confidence. Economically, nearly one-quarter of respondents said having a wound led to a drop in their total household income and 17% said their wound led to a change in their employment status. CONCLUSION A national pilot survey of patients with hard-to-heal wounds revealed that many delay seeking professional assistance and only a small minority see a wound care specialist. Experiencing an ulcer, even for a few months, can have significant negative effects on a patient's QoL. 
Patients frequently had negative thoughts associated with their wound, and odour compounded these negative effects, leading to major or severe negative impacts on self-confidence. Households experienced a decline in income, due to both the direct reduction or loss of patient employment and the additional time spent by family members assisting in patient recovery. Thus, a variety of factors contribute to poor outcomes for patients with hard-to-heal wounds. To validate and extend these preliminary results, future surveys of patients with hard-to-heal wounds should focus on additional reasons patients do not seek professional help sooner. To improve health outcomes and QoL, assessment of patient socioeconomic variables should occur whenever wound closure stalls.
Affiliation(s)
- Naz Wahab
  - Wound Care Experts, NV, US
  - HCA Mountain View Hospital, NV, US
  - Roseman University College of Medicine, NV, US
  - Common Spirit Dignity Hospitals, NV, US
- R Allyn Forsyth
  - MIMEDX Group Inc., GA, US
  - Department of Biology, San Diego State University, CA, US
9
Laymouna M, Ma Y, Lessard D, Schuster T, Engler K, Lebouché B. Roles, Users, Benefits, and Limitations of Chatbots in Health Care: Rapid Review. J Med Internet Res 2024; 26:e56930. [PMID: 39042446 PMCID: PMC11303905 DOI: 10.2196/56930] [Received: 02/02/2024] [Revised: 04/07/2024] [Accepted: 04/12/2024] [Indexed: 07/24/2024] Open
Abstract
BACKGROUND Chatbots, or conversational agents, have emerged as significant tools in health care, driven by advancements in artificial intelligence and digital technology. These programs are designed to simulate human conversations, addressing various health care needs. However, no comprehensive synthesis of health care chatbots' roles, users, benefits, and limitations is available to inform future research and application in the field. OBJECTIVE This review aims to describe health care chatbots' characteristics, focusing on their diverse roles in the health care pathway, user groups, benefits, and limitations. METHODS A rapid review of published literature from 2017 to 2023 was performed with a search strategy developed in collaboration with a health sciences librarian and implemented in the MEDLINE and Embase databases. Primary research studies reporting on chatbot roles or benefits in health care were included. Two reviewers dual-screened the search results. Extracted data on chatbot roles, users, benefits, and limitations were subjected to content analysis. RESULTS The review categorized chatbot roles into 2 themes: delivery of remote health services, including patient support, care management, education, skills building, and health behavior promotion, and provision of administrative assistance to health care providers. User groups spanned across patients with chronic conditions as well as patients with cancer; individuals focused on lifestyle improvements; and various demographic groups such as women, families, and older adults. Professionals and students in health care also emerged as significant users, alongside groups seeking mental health support, behavioral change, and educational enhancement. The benefits of health care chatbots were also classified into 2 themes: improvement of health care quality and efficiency and cost-effectiveness in health care delivery. 
The identified limitations encompassed ethical challenges, medicolegal and safety concerns, technical difficulties, user experience issues, and societal and economic impacts. CONCLUSIONS Health care chatbots offer a wide spectrum of applications, potentially impacting various aspects of health care. While they are promising tools for improving health care efficiency and quality, their integration into the health care system must be approached with consideration of their limitations to ensure optimal, safe, and equitable use.
Affiliation(s)
- Moustafa Laymouna
  - Department of Family Medicine, Faculty of Medicine and Health Sciences, McGill University, Montreal, QC, Canada
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program, Research Institute of McGill University Health Centre, Montreal, QC, Canada
- Yuanchao Ma
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program, Research Institute of McGill University Health Centre, Montreal, QC, Canada
  - Chronic and Viral Illness Service, Division of Infectious Disease, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
  - Department of Biomedical Engineering, Polytechnique Montréal, Montreal, QC, Canada
- David Lessard
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program, Research Institute of McGill University Health Centre, Montreal, QC, Canada
  - Chronic and Viral Illness Service, Division of Infectious Disease, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
- Tibor Schuster
  - Department of Family Medicine, Faculty of Medicine and Health Sciences, McGill University, Montreal, QC, Canada
- Kim Engler
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program, Research Institute of McGill University Health Centre, Montreal, QC, Canada
  - Chronic and Viral Illness Service, Division of Infectious Disease, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
- Bertrand Lebouché
  - Department of Family Medicine, Faculty of Medicine and Health Sciences, McGill University, Montreal, QC, Canada
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
  - Infectious Diseases and Immunity in Global Health Program, Research Institute of McGill University Health Centre, Montreal, QC, Canada
  - Chronic and Viral Illness Service, Division of Infectious Disease, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
10
Sezgin E, Kocaballi AB, Dolce M, Skeens M, Militello L, Huang Y, Stevens J, Kemper AR. Chatbot for Social Need Screening and Resource Sharing With Vulnerable Families: Iterative Design and Evaluation Study. JMIR Hum Factors 2024; 11:e57114. [PMID: 39028995 PMCID: PMC11297373 DOI: 10.2196/57114] [Received: 02/05/2024] [Revised: 05/03/2024] [Accepted: 05/24/2024] [Indexed: 07/21/2024] Open
Abstract
BACKGROUND Health outcomes are significantly influenced by unmet social needs. Although screening for social needs has become common in health care settings, there is often poor linkage to resources after needs are identified. The structural barriers (eg, staffing, time, and space) to helping address social needs could be overcome by a technology-based solution. OBJECTIVE This study aims to present the design and evaluation of a chatbot, DAPHNE (Dialog-Based Assistant Platform for Healthcare and Needs Ecosystem), which screens for social needs and links patients and families to resources. METHODS This research used a three-stage study approach: (1) an end-user survey to understand unmet needs and perception toward chatbots, (2) iterative design with interdisciplinary stakeholder groups, and (3) a feasibility and usability assessment. In study 1, a web-based survey was conducted with low-income US resident households (n=201). Following that, in study 2, web-based sessions were held with an interdisciplinary group of stakeholders (n=10) using thematic and content analysis to inform the chatbot's design and development. Finally, in study 3, the assessment on feasibility and usability was completed via a mix of a web-based survey and focus group interviews following scenario-based usability testing with community health workers (family advocates; n=4) and social workers (n=9). We reported descriptive statistics and chi-square test results for the household survey. Content analysis and thematic analysis were used to analyze qualitative data. Usability score was descriptively reported. RESULTS Among the survey participants, employed and younger individuals reported a higher likelihood of using a chatbot to address social needs, in contrast to the oldest age group. Regarding designing the chatbot, the stakeholders emphasized the importance of provider-technology collaboration, inclusive conversational design, and user education. 
The participants found that the chatbot's capabilities met expectations and that the chatbot was easy to use (System Usability Scale score=72/100). However, there were common concerns about the accuracy of suggested resources, electronic health record integration, and trust with a chatbot. CONCLUSIONS Chatbots can provide personalized feedback for families to identify and meet social needs. Our study highlights the importance of user-centered iterative design and development of chatbots for social needs. Future research should examine the efficacy, cost-effectiveness, and scalability of chatbot interventions to address social needs.
Affiliation(s)
- Emre Sezgin
- Nationwide Children's Hospital, Columbus, OH, United States
| | - A Baki Kocaballi
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
| | - Millie Dolce
- Nationwide Children's Hospital, Columbus, OH, United States
| | - Micah Skeens
- Nationwide Children's Hospital, Columbus, OH, United States
| | | | - Yungui Huang
- Nationwide Children's Hospital, Columbus, OH, United States
| | - Jack Stevens
- Nationwide Children's Hospital, Columbus, OH, United States
| | - Alex R Kemper
- Nationwide Children's Hospital, Columbus, OH, United States
| |
|
11
|
Nadarzynski T, Knights N, Husbands D, Graham CA, Llewellyn CD, Buchanan T, Montgomery I, Ridge D. Achieving health equity through conversational AI: A roadmap for design and implementation of inclusive chatbots in healthcare. PLOS DIGITAL HEALTH 2024; 3:e0000492. [PMID: 38696359 PMCID: PMC11065243 DOI: 10.1371/journal.pdig.0000492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Accepted: 03/25/2024] [Indexed: 05/04/2024]
Abstract
BACKGROUND The rapid evolution of conversational and generative artificial intelligence (AI) has led to the increased deployment of AI tools in healthcare settings. While these conversational AI tools promise efficiency and expanded access to healthcare services, there are growing ethical, practical, and inclusivity concerns. This study aimed to identify activities which reduce bias in conversational AI and make their designs and implementation more equitable. METHODS A qualitative research approach was employed to develop an analytical framework based on the content analysis of 17 guidelines about AI use in clinical settings. A stakeholder consultation was subsequently conducted with a total of 33 ethnically diverse community members, AI designers, industry experts and relevant health professionals to further develop a roadmap for equitable design and implementation of conversational AI in healthcare. Framework analysis was conducted on the interview data. RESULTS A 10-stage roadmap was developed to outline activities relevant to equitable conversational AI design and implementation phases: 1) Conception and planning, 2) Diversity and collaboration, 3) Preliminary research, 4) Co-production, 5) Safety measures, 6) Preliminary testing, 7) Healthcare integration, 8) Service evaluation and auditing, 9) Maintenance, and 10) Termination. DISCUSSION We have made specific recommendations to increase conversational AI's equity as part of healthcare services. These emphasise the importance of a collaborative approach and the involvement of patient groups in navigating the rapid evolution of conversational AI technologies. Further research must assess the impact of recommended activities on chatbots' fairness and their ability to reduce health inequalities.
Affiliation(s)
- Tom Nadarzynski
- School of Social Sciences, University of Westminster, London, United Kingdom
| | - Nicky Knights
- School of Social Sciences, University of Westminster, London, United Kingdom
| | - Deborah Husbands
- School of Social Sciences, University of Westminster, London, United Kingdom
| | - Cynthia A. Graham
- Kinsey Institute and Department of Gender Studies, Indiana University, Bloomington, United States of America
| | - Carrie D. Llewellyn
- Brighton and Sussex Medical School, University of Sussex, Brighton, United Kingdom
| | - Tom Buchanan
- School of Social Sciences, University of Westminster, London, United Kingdom
| | | | - Damien Ridge
- School of Social Sciences, University of Westminster, London, United Kingdom
| |
|
12
|
Chou YH, Lin C, Lee SH, Lee YF, Cheng LC. User-Friendly Chatbot to Mitigate the Psychological Stress of Older Adults During the COVID-19 Pandemic: Development and Usability Study. JMIR Form Res 2024; 8:e49462. [PMID: 38477965 DOI: 10.2196/49462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 11/19/2023] [Accepted: 02/13/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND To safeguard the most vulnerable individuals during the COVID-19 pandemic, numerous governments enforced measures such as stay-at-home orders, social distancing, and self-isolation. These social restrictions had a particularly negative effect on older adults, as they are more vulnerable and experience increased loneliness, which has various adverse effects, including increasing the risk of mental health problems and mortality. Chatbots can potentially reduce loneliness and provide companionship during a pandemic. However, existing chatbots do not cater to the specific needs of older adult populations. OBJECTIVE We aimed to develop a user-friendly chatbot tailored to the specific needs of older adults with anxiety or depressive disorders during the COVID-19 pandemic and to examine their perspectives on mental health chatbot use. The primary research objective was to investigate whether chatbots can mitigate the psychological stress of older adults during COVID-19. METHODS Participants were older adults belonging to two age groups (≥65 years and <65 years) from a psychiatric outpatient department who had been diagnosed with depressive or anxiety disorders by certified psychiatrists according to the Diagnostic and Statistical Manual of Mental Disorders (Fifth Edition) (DSM-5) criteria. The participants were required to use mobile phones, have internet access, and possess literacy skills. The chatbot's content includes monitoring and tracking health data and providing health information. Participants had access to the chatbot for at least 4 weeks. Self-report questionnaires for loneliness, depression, and anxiety were administered before and after chatbot use. The participants also rated their attitudes toward the chatbot. RESULTS A total of 35 participants (mean age 65.21, SD 7.51 years) were enrolled in the trial, comprising 74% (n=26) female and 26% (n=9) male participants. 
The participants demonstrated a high utilization rate during the intervention, with over 82% engaging with the chatbot daily. Loneliness significantly improved in the older group ≥65 years. This group also responded positively to the chatbot, as evidenced by changes in University of California Los Angeles Loneliness Scale scores, suggesting that this demographic can derive benefits from chatbot interaction. Conversely, the younger group, <65 years, exhibited no significant changes in loneliness after the intervention. Both the older and younger age groups provided good scores in relation to chatbot design with respect to usability (mean scores of 6.33 and 6.05, respectively) and satisfaction (mean scores of 5.33 and 5.15, respectively), rated on a 7-point Likert scale. CONCLUSIONS The chatbot interface was found to be user-friendly and demonstrated promising results among participants 65 years and older who were receiving care at psychiatric outpatient clinics and experiencing relatively stable symptoms of depression and anxiety. The chatbot not only provided caring companionship but also showed the potential to alleviate loneliness during the challenging circumstances of a pandemic.
Affiliation(s)
- Ya-Hsin Chou
- Department of Psychiatry, Taoyuan Chang Gung Memorial Hospital, Taoyuan County, Taiwan
| | - Chemin Lin
- Department of Psychiatry, Keelung Chang Gung Memorial Hospital, Keelung City, Taiwan
- College of Medicine, Chang Gung University, Taoyuan County, Taiwan
- Community Medicine Research Center, Chang Gung Memorial Hospital, Keelung, Taiwan
| | - Shwu-Hua Lee
- College of Medicine, Chang Gung University, Taoyuan County, Taiwan
- Department of Psychiatry, Linkou Chang Gung Memorial Hospital, Taoyuan County, Taiwan
| | - Yen-Fen Lee
- Department of Information and Finance Management, National Taipei University of Technology, Taipei, Taiwan
| | - Li-Chen Cheng
- Department of Information and Finance Management, National Taipei University of Technology, Taipei, Taiwan
| |
|
13
|
Ding H, Simmich J, Vaezipour A, Andrews N, Russell T. Evaluation framework for conversational agents with artificial intelligence in health interventions: a systematic scoping review. J Am Med Inform Assoc 2024; 31:746-761. [PMID: 38070173 PMCID: PMC10873847 DOI: 10.1093/jamia/ocad222] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/23/2023] [Revised: 11/04/2023] [Accepted: 11/24/2023] [Indexed: 02/18/2024]
Abstract
OBJECTIVES Conversational agents (CAs) with emerging artificial intelligence present new opportunities to assist in health interventions but are difficult to evaluate, deterring their applications in the real world. We aimed to synthesize existing evidence and knowledge and outline an evaluation framework for CA interventions. MATERIALS AND METHODS We conducted a systematic scoping review to investigate designs and outcome measures used in the studies that evaluated CAs for health interventions. We then nested the results into an overarching digital health framework proposed by the World Health Organization (WHO). RESULTS The review included 81 studies evaluating CAs in experimental (n = 59), observational (n = 15) trials, and other research designs (n = 7). Most studies (n = 72, 89%) were published in the past 5 years. The proposed CA-evaluation framework includes 4 evaluation stages: (1) feasibility/usability, (2) efficacy, (3) effectiveness, and (4) implementation, aligning with WHO's stepwise evaluation strategy. Across these stages, this article presents the essential evidence of different study designs (n = 8), sample sizes, and main evaluation categories (n = 7) with subcategories (n = 40). The main evaluation categories included (1) functionality, (2) safety and information quality, (3) user experience, (4) clinical and health outcomes, (5) costs and cost benefits, (6) usage, adherence, and uptake, and (7) user characteristics for implementation research. Furthermore, the framework highlighted the essential evaluation areas (potential primary outcomes) and gaps across the evaluation stages. DISCUSSION AND CONCLUSION This review presents a new framework with practical design details to support the evaluation of CA interventions in healthcare research. PROTOCOL REGISTRATION The Open Science Framework (https://osf.io/9hq2v) on March 22, 2021.
Affiliation(s)
- Hang Ding
- RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia
- STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia
| | - Joshua Simmich
- RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia
- STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia
| | - Atiyeh Vaezipour
- RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia
- STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia
| | - Nicole Andrews
- RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia
- STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia
- The Tess Cramond Pain and Research Centre, Metro North Hospital and Health Service, Brisbane, QLD, Australia
- The Occupational Therapy Department, The Royal Brisbane and Women’s Hospital, Metro North Hospital and Health Service, Brisbane, QLD, Australia
| | - Trevor Russell
- RECOVER Injury Research Centre, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD, Australia
- STARS Education and Research Alliance, Surgical Treatment and Rehabilitation Service (STARS), The University of Queensland and Metro North Health, Brisbane, QLD, Australia
| |
|
14
|
Gengatharan D, Saggi SS, Bin Abd Razak HR. Pre-operative Planning of High Tibial Osteotomy With ChatGPT: Are We There Yet? Cureus 2024; 16:e54858. [PMID: 38533173 PMCID: PMC10964394 DOI: 10.7759/cureus.54858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/23/2024] [Indexed: 03/28/2024]
Abstract
INTRODUCTION ChatGPT (Chat Generative Pre-trained Transformer), developed by OpenAI (San Francisco, CA, USA), has gained attention in the medical field. It has the potential to enhance and simplify tasks, such as preoperative planning in orthopedic surgery. We aimed to test ChatGPT's accuracy in measuring the angle of correction for high tibial osteotomy for cases planned and performed at a tertiary teaching hospital in Singapore. MATERIALS AND METHODS Peri-operative angular parameters from 114 consecutive patients who underwent medial opening wedge high tibial osteotomy (MOWHTO) were used to query ChatGPT 3.0. First, ChatGPT 3.0 was queried on what information it required to plan a MOWHTO. Based on its response, pre-operative medial proximal tibial angle (MPTA) and joint line congruence angle (JLCA) were provided. ChatGPT 3.0 then responded with its recommended angle of correction. This was compared against the manually planned surgical correction by our fellowship-trained surgeon. A root mean square analysis was then performed to compare ChatGPT 3.0 and manual planning. RESULTS The root mean square error (RMSE) of ChatGPT 3.0 in predicting correction angle in MOWHTO was 2.96, suggesting a very poor model fit. CONCLUSION Although ChatGPT 3.0 represents a significant breakthrough in large language models with extensive capabilities, it is not currently optimized to effectively perform complex pre-operative planning in orthopedic surgery, specifically in the context of MOWHTO. Further refinement and consideration of specific factors are necessary to enhance its accuracy and suitability for such applications.
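The abstract above reports an RMSE but does not spell out the calculation. As a rough illustration only, the sketch below applies the standard definition of root mean square error to paired correction angles. The function name `rmse` and the sample angles are hypothetical and are not taken from the study.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each paired difference, average the squares, then take the root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical correction angles in degrees (illustrative, not study data).
chatbot_angles = [8.0, 10.0, 12.0]
surgeon_angles = [9.0, 10.0, 14.0]
print(round(rmse(chatbot_angles, surgeon_angles), 2))
```

Because the differences are squared before averaging, a few large disagreements between the two planning methods raise the RMSE more than many small ones.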
Affiliation(s)
| | | | - Hamid Rahmatullah Bin Abd Razak
- Musculoskeletal Sciences, Duke-Nus Medical School, Singapore, SGP
- Orthopaedic Surgery, Sengkang General Hospital, Singapore, SGP
| |
|
15
|
Jiang Z, Huang X, Wang Z, Liu Y, Huang L, Luo X. Embodied Conversational Agents for Chronic Diseases: Scoping Review. J Med Internet Res 2024; 26:e47134. [PMID: 38194260 PMCID: PMC10806449 DOI: 10.2196/47134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Revised: 10/19/2023] [Accepted: 11/29/2023] [Indexed: 01/10/2024]
Abstract
BACKGROUND Embodied conversational agents (ECAs) are computer-generated animated humanlike characters that interact with users through verbal and nonverbal behavioral cues. They are increasingly used in a range of fields, including health care. OBJECTIVE This scoping review aims to identify the current practice in the development and evaluation of ECAs for chronic diseases. METHODS We applied a methodological framework in this review. A total of 6 databases (ie, PubMed, Embase, CINAHL, ACM Digital Library, IEEE Xplore Digital Library, and Web of Science) were searched using a combination of terms related to ECAs and health in October 2023. Two independent reviewers selected the studies and extracted the data. This review followed the PRISMA-ScR (Preferred Reporting Items of Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) statement. RESULTS The literature search found 6332 papers, of which 36 (0.57%) met the inclusion criteria. Among the 36 studies, 27 (75%) originated from the United States, and 28 (78%) were published from 2020 onward. The reported ECAs covered a wide range of chronic diseases, with a focus on cancers, atrial fibrillation, and type 2 diabetes, primarily to promote screening and self-management. Most ECAs were depicted as middle-aged women based on screenshots and communicated with users through voice and nonverbal behavior. The most frequently reported evaluation outcomes were acceptability and effectiveness. CONCLUSIONS This scoping review provides valuable insights for technology developers and health care professionals regarding the development and implementation of ECAs. It emphasizes the importance of technological advances in the embodiment, personalized strategy, and communication modality and requires in-depth knowledge of user preferences regarding appearance, animation, and intervention content. 
Future studies should incorporate measures of cost, efficiency, and productivity to provide a comprehensive evaluation of the benefits of using ECAs in health care.
Affiliation(s)
- Zhili Jiang
- Department of Nursing, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Xiting Huang
- Department of Nursing, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Zhiqian Wang
- Department of Nursing, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yang Liu
- Department of Nursing, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Lihua Huang
- Department of Nursing, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Xiaolin Luo
- Department of Quality Evaluation, Zhejiang Evaluation Center for Medical Service and Administration, Hangzhou, China
| |
|
16
|
Talyshinskii A, Naik N, Hameed BMZ, Juliebø-Jones P, Somani BK. Potential of AI-Driven Chatbots in Urology: Revolutionizing Patient Care Through Artificial Intelligence. Curr Urol Rep 2024; 25:9-18. [PMID: 37723300 PMCID: PMC10787686 DOI: 10.1007/s11934-023-01184-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Accepted: 09/05/2023] [Indexed: 09/20/2023]
Abstract
PURPOSE OF REVIEW Artificial intelligence (AI) chatbots have emerged as a potential tool to transform urology by improving patient care and physician efficiency. With an emphasis on their potential advantages and drawbacks, this literature review offers a thorough assessment of the state of AI-driven chatbots in urology today. RECENT FINDINGS The capacity of AI-driven chatbots in urology to give patients individualized and timely medical advice is one of its key advantages. Chatbots can help patients prioritize their symptoms and give advice on the best course of treatment. By automating administrative duties and offering clinical decision support, chatbots can also help healthcare providers. Before chatbots are widely used in urology, there are a few issues that need to be resolved. The precision of chatbot diagnoses and recommendations might be impacted by technical constraints like system errors and flaws. Additionally, issues regarding the security and privacy of patient data must be resolved, and chatbots must adhere to all applicable laws. Important issues that must be addressed include accuracy and dependability because any mistakes or inaccuracies could seriously harm patients. The final obstacle is resistance from patients and healthcare professionals who are hesitant to use new technology or who value in-person encounters. AI-driven chatbots have the potential to significantly improve urology care and efficiency. However, it is essential to thoroughly test and ensure the accuracy of chatbots, address privacy and security concerns, and design user-friendly chatbots that can integrate into existing workflows. By exploring various scenarios and examining the current literature, this review provides an analysis of the prospects and limitations of implementing chatbots in urology.
Affiliation(s)
- Ali Talyshinskii
- Department of Urology, Astana Medical University, Astana, Kazakhstan
| | - Nithesh Naik
- Department of Mechanical and Industrial Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
| | - B M Zeeshan Hameed
- Department of Urology, Father Muller Medical College, Mangalore, Karnataka, India
| | - Patrick Juliebø-Jones
- Department of Urology, Haukeland University Hospital, Bergen, Norway.
- Department of Clinical Medicine, University of Bergen, Bergen, Norway.
| | | |
|
17
|
Andrews NE, Ireland D, Vijayakumar P, Burvill L, Hay E, Westerman D, Rose T, Schlumpf M, Strong J, Claus A. Acceptability of a Pain History Assessment and Education Chatbot (Dolores) Across Age Groups in Populations With Chronic Pain: Development and Pilot Testing. JMIR Form Res 2023; 7:e47267. [PMID: 37801342 PMCID: PMC10589833 DOI: 10.2196/47267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 08/28/2023] [Accepted: 08/28/2023] [Indexed: 10/07/2023]
Abstract
BACKGROUND The delivery of education on pain neuroscience and the evidence for different treatment approaches has become a key component of contemporary persistent pain management. Chatbots, or more formally conversation agents, are increasingly being used in health care settings due to their versatility in providing interactive and individualized approaches to both capture and deliver information. Research focused on the acceptability of diverse chatbot formats can assist in developing a better understanding of the educational needs of target populations. OBJECTIVE This study aims to detail the development and initial pilot testing of a multimodality pain education chatbot (Dolores) that can be used across different age groups and investigate whether acceptability and feedback were comparable across age groups following pilot testing. METHODS Following an initial design phase involving software engineers (n=2) and expert clinicians (n=6), a total of 60 individuals with chronic pain who attended an outpatient clinic at 1 of 2 pain centers in Australia were recruited for pilot testing. The 60 individuals consisted of 20 (33%) adolescents (aged 10-18 years), 20 (33%) young adults (aged 19-35 years), and 20 (33%) adults (aged >35 years) with persistent pain. Participants spent 20 to 30 minutes completing interactive chatbot activities that enabled the Dolores app to gather a pain history and provide education about pain and pain treatments. After the chatbot activities, participants completed a custom-made feedback questionnaire measuring the acceptability constructs pertaining to health education chatbots. To determine the effect of age group on the acceptability ratings and feedback provided, a series of binomial logistic regression models and cumulative odds ordinal logistic regression models with proportional odds were generated. 
RESULTS Overall, acceptability was high for the following constructs: engagement, perceived value, usability, accuracy, responsiveness, adoption intention, esthetics, and overall quality. The effect of age group on all acceptability ratings was small and not statistically significant. An analysis of open-ended question responses revealed that major frustrations with the app were related to Dolores' speech, which was explored further through a comparative analysis. With respect to providing negative feedback about Dolores' speech, a logistic regression model showed that the effect of age group was statistically significant (χ²₂=11.7; P=.003) and explained 27.1% of the variance (Nagelkerke R²). Adults and young adults were less likely to comment on Dolores' speech compared with adolescent participants (odds ratio 0.20, 95% CI 0.05-0.84 and odds ratio 0.05, 95% CI 0.01-0.43, respectively). Comments were related to both speech rate (too slow) and quality (unpleasant and robotic). CONCLUSIONS This study provides support for the acceptability of pain history and education chatbots across different age groups. Chatbot acceptability for adolescent cohorts may be improved by enabling the self-selection of speech characteristics such as rate and personable tone.
Affiliation(s)
- Nicole Emma Andrews
- RECOVER Injury Research Centre, The University of Queensland, Herston, Australia
- Tess Cramond Pain and Research Centre, The Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Australia
- The Occupational Therapy Department, The Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Australia
- Surgical Treatment and Rehabilitation Service (STARS) Education and Research Alliance, The University of Queensland and Metro North Health, Herston, Australia
| | - David Ireland
- Australian eHealth Research Centre, The Commonwealth Scientific and Industrial Research Organisation, Herston, Australia
| | - Pranavie Vijayakumar
- Australian eHealth Research Centre, The Commonwealth Scientific and Industrial Research Organisation, Herston, Australia
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia
| | - Lyza Burvill
- School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Australia
| | - Elizabeth Hay
- School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Australia
| | - Daria Westerman
- Queensland Interdisciplinary Paediatric Persistent Pain Service, Queensland Children's Hospital, South Brisbane, Australia
| | - Tanya Rose
- School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Australia
| | - Mikaela Schlumpf
- Queensland Interdisciplinary Paediatric Persistent Pain Service, Queensland Children's Hospital, South Brisbane, Australia
| | - Jenny Strong
- Tess Cramond Pain and Research Centre, The Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Australia
- The Occupational Therapy Department, The Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Australia
- School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Australia
| | - Andrew Claus
- Tess Cramond Pain and Research Centre, The Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Australia
- School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, Australia
| |
|
18
|
Griffin AC, Khairat S, Bailey SC, Chung AE. A chatbot for hypertension self-management support: user-centered design, development, and usability testing. JAMIA Open 2023; 6:ooad073. [PMID: 37693367 PMCID: PMC10491950 DOI: 10.1093/jamiaopen/ooad073] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/18/2022] [Revised: 07/02/2023] [Accepted: 08/30/2023] [Indexed: 09/12/2023]
Abstract
OBJECTIVES Health-related chatbots have demonstrated early promise for improving self-management behaviors but have seldom been utilized for hypertension. This research focused on the design, development, and usability evaluation of a chatbot for hypertension self-management, called "Medicagent." MATERIALS AND METHODS A user-centered design process was used to iteratively design and develop a text-based chatbot using Google Cloud's Dialogflow natural language understanding platform. Then, usability testing sessions were conducted among patients with hypertension. Each session comprised: (1) background questionnaires, (2) 10 representative tasks within Medicagent, (3) System Usability Scale (SUS) questionnaire, and (4) a brief semi-structured interview. Sessions were video and audio recorded using Zoom. Qualitative and quantitative analyses were used to assess effectiveness, efficiency, and satisfaction of the chatbot. RESULTS Participants (n = 10) completed nearly all tasks (98%, 98/100) and spent an average of 18 min (SD = 10 min) interacting with Medicagent. Only 11 (8.6%) utterances were not successfully mapped to an intent. Medicagent achieved a mean SUS score of 78.8/100, which demonstrated acceptable usability. Several participants had difficulties navigating the conversational interface without menu and back buttons, felt additional information would be useful for redirection when utterances were not recognized, and desired a health professional persona within the chatbot. DISCUSSION The text-based chatbot was viewed favorably for assisting with blood pressure and medication-related tasks and had good usability. CONCLUSION Flexibility of interaction styles, handling unrecognized utterances gracefully, and having a credible persona were highlighted as design components that may further enrich the user experience of chatbots for hypertension self-management.
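The abstract reports a mean SUS score without describing how individual scores are derived. As a minimal sketch, assuming the standard 10-item System Usability Scale scoring (5-point responses, odd-numbered items positively worded, even-numbered items negatively worded), one participant's score could be computed as below. The function name `sus_score` and the sample responses are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Convert ten 1-5 SUS item responses into a 0-100 score."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered (positively worded) items contribute response - 1.
            total += response - 1
        else:
            # Even-numbered (negatively worded) items contribute 5 - response.
            total += 5 - response
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5


# Hypothetical responses for one participant (illustrative only).
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
```

A study-level figure such as the mean reported above would then be the average of these per-participant scores.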
Affiliation(s)
- Ashley C Griffin
- VA Palo Alto Health Care System, Palo Alto, CA 94025, United States
- Department of Health Policy, Stanford University School of Medicine, Stanford, CA 94305, United States
| | - Saif Khairat
- Carolina Health Informatics Program, University of North Carolina at Chapel Hill (UNC), Chapel Hill, NC 27599, United States
- School of Nursing, UNC, Chapel Hill, NC 27599, United States
| | - Stacy C Bailey
- Division of General Internal Medicine, Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, United States
| | - Arlene E Chung
- Department of Biostatistics & Bioinformatics, Duke School of Medicine, Durham, NC 27710, United States
| |
|
19
|
Giansanti D. The Chatbots Are Invading Us: A Map Point on the Evolution, Applications, Opportunities, and Emerging Problems in the Health Domain. Life (Basel) 2023; 13:life13051130. [PMID: 37240775 DOI: 10.3390/life13051130] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/29/2023] [Revised: 04/26/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023]
Abstract
The inclusion of chatbots is potentially disruptive in society, introducing opportunities but also important implications that need to be addressed across different domains. The aim of this study is to examine chatbots in depth by mapping out their technological evolution, current usage, and potential applications, opportunities, and emerging problems within the health domain. The study examined three points of view. The first traces the technological evolution of chatbots. The second reports the fields of application of chatbots, covering the expectations of use and the expected benefits from a cross-domain point of view, which also affect the health domain. The third and main point of view is the analysis of the state of use of chatbots in the health domain, based on the scientific literature represented by systematic reviews. The overview identified the topics of greatest interest and the associated opportunities. The analysis revealed the need for initiatives that evaluate multiple domains together in a synergistic way. Concerted efforts to achieve this are recommended. It is also considered important to monitor both the process of osmosis between other sectors and the health domain and the chatbots that can create psychological and behavioural problems with an impact on the health domain.
20
Jabir AI, Martinengo L, Lin X, Torous J, Subramaniam M, Tudor Car L. Evaluating Conversational Agents for Mental Health: Scoping Review of Outcomes and Outcome Measurement Instruments. J Med Internet Res 2023; 25:e44548. [PMID: 37074762 PMCID: PMC10157460 DOI: 10.2196/44548] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/23/2022] [Revised: 03/01/2023] [Accepted: 03/31/2023] [Indexed: 04/03/2023] Open
Abstract
BACKGROUND Rapid proliferation of mental health interventions delivered through conversational agents (CAs) calls for high-quality evidence to support their implementation and adoption. Selecting appropriate outcomes, instruments for measuring outcomes, and assessment methods is crucial for ensuring that interventions are evaluated effectively and with a high level of quality. OBJECTIVE We aimed to identify the types of outcomes, outcome measurement instruments, and assessment methods used to assess the clinical, user experience, and technical outcomes in studies that evaluated the effectiveness of CA interventions for mental health. METHODS We undertook a scoping review of the relevant literature to review the types of outcomes, outcome measurement instruments, and assessment methods in studies that evaluated the effectiveness of CA interventions for mental health. We performed a comprehensive search of electronic databases, including PubMed, Cochrane Central Register of Controlled Trials, Embase (Ovid), PsycINFO, and Web of Science, as well as Google Scholar and Google. We included experimental studies evaluating CA mental health interventions. The screening and data extraction were performed independently by 2 review authors in parallel. Descriptive and thematic analyses of the findings were performed. RESULTS We included 32 studies that targeted the promotion of mental well-being (17/32, 53%) and the treatment and monitoring of mental health symptoms (21/32, 66%). The studies reported 203 outcome measurement instruments used to measure clinical outcomes (123/203, 60.6%), user experience outcomes (75/203, 36.9%), technical outcomes (2/203, 1.0%), and other outcomes (3/203, 1.5%). Most of the outcome measurement instruments were used in only 1 study (150/203, 73.9%) and were self-reported questionnaires (170/203, 83.7%), and most were delivered electronically via survey platforms (61/203, 30.0%).
No validity evidence was cited for more than half of the outcome measurement instruments (107/203, 52.7%), which were largely created or adapted for the study in which they were used (95/107, 88.8%). CONCLUSIONS The diversity of outcomes and the choice of outcome measurement instruments employed in studies on CAs for mental health point to the need for an established minimum core outcome set and greater use of validated instruments. Future studies should also capitalize on the affordances made available by CAs and smartphones to streamline the evaluation and reduce participants' input burden inherent to self-reporting.
Affiliation(s)
- Ahmad Ishqi Jabir
- Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Future Health Technologies, Singapore-ETH Centre, Campus for Research Excellence And Technological Enterprise, Singapore, Singapore
- Laura Martinengo
- Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Xiaowen Lin
- Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- John Torous
- Beth Israel Deaconess Medical Center, Boston, MA, United States
- Mythily Subramaniam
- Institute of Mental Health, Singapore, Singapore
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- Lorainne Tudor Car
- Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Department of Primary Care and Public Health, School of Public Health, Imperial College London, London, United Kingdom
21
Denecke K. Framework for Guiding the Development of High-Quality Conversational Agents in Healthcare. Healthcare (Basel) 2023; 11:healthcare11081061. [PMID: 37107895 PMCID: PMC10137907 DOI: 10.3390/healthcare11081061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 03/29/2023] [Accepted: 04/05/2023] [Indexed: 04/29/2023] Open
Abstract
Evaluating conversational agents (CAs) that are intended for use in healthcare settings and ensuring their quality is essential to avoid patient harm and ensure efficacy of the CA-delivered intervention. However, a guideline for a standardized quality assessment of health CAs is still missing. The objective of this work is to describe a framework that provides guidance for development and evaluation of health CAs. In previous work, consensus on categories for evaluating health CAs has been found. In this work, we identify concrete metrics, heuristics, and checklists for these evaluation categories to form a framework. We focus on a specific type of health CA, namely rule-based systems that rely on written input and output and have a simple personality without any kind of embodiment. First, we identified relevant metrics, heuristics, and checklists to be linked to the evaluation categories through a literature search. Second, five experts judged the relevance of the metrics for the evaluation and development of health CAs. The final framework considers nine aspects from a general perspective, five aspects from a response understanding perspective, one aspect from a response generation perspective, and three aspects from an aesthetics perspective. Existing tools and heuristics specifically designed for evaluating CAs were linked to these evaluation aspects (e.g., Bot usability scale, design heuristics for CAs); tools related to mHealth evaluation were adapted when necessary (e.g., aspects from the ISO technical specification for mHealth Apps). The resulting framework comprises aspects to be considered not only as part of a system evaluation, but also during development. In particular, aspects related to accessibility or security have to be addressed in the design phase (e.g., which input and output options are provided to ensure accessibility?) and have to be verified after the implementation phase.
As a next step, transfer of the framework to other types of health CAs has to be studied. The framework has to be validated by applying it during health CA design and development.
Affiliation(s)
- Kerstin Denecke
- Institute for Medical Informatics, Bern University of Applied Sciences, Quellgasse 21, 2502 Biel, Switzerland
22
Clinical-chatbot AHP evaluation based on "quality in use" of ISO/IEC 25010. Int J Med Inform 2023; 170:104951. [PMID: 36525800 DOI: 10.1016/j.ijmedinf.2022.104951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 11/23/2022] [Accepted: 12/01/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND Conversational agents are currently a valid alternative to humans in first-level interviews with users who need information, even in-depth information, about services or products. In application domains such as health care, this technology can become pervasive only if the perceived "quality in use" is appropriate. How to measure chatbot quality is an open question. The international standard ISO/IEC 25010 proposes a set of characteristics (effectiveness, efficiency, satisfaction, freedom from risk, and context coverage) to be considered when the "quality in use" of a software system has to be measured. BASIC PROCEDURE This study proposes a clinical chatbot comparison method based on quality. The proposed approach is based on the Analytic Hierarchy Process (AHP) methodology. FINDINGS Our contribution is twofold. First, we propose a set of measures for each characteristic of ISO/IEC 25010 according to three classes of functionality: providing information, providing prescriptions, and process management. Second, a quantitative method is proposed for making the pairwise weights homogeneous when the AHP is used for the "quality-in-use" comparison. As a case study, a comparison of two versions of a chatbot was performed. CONCLUSIONS The results show that the proposed approach provides an effective reference base for performing quality comparisons of medical chatbots compliant with the ISO/IEC 25010 standard.
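For readers unfamiliar with AHP, the following is a minimal sketch of how priority weights can be derived from a pairwise-comparison matrix over the five ISO/IEC 25010 quality-in-use characteristics named in the abstract. It is not the authors' method: it uses the common row geometric-mean approximation, the importance scores and all function names are hypothetical, and the paper's own procedure for homogenizing pairwise weights is not reproduced here.

```python
import math

# Quality-in-use characteristics named in ISO/IEC 25010 (from the abstract).
CHARACTERISTICS = ["effectiveness", "efficiency", "satisfaction",
                   "freedom from risk", "context coverage"]

# Hypothetical importance scores, used only to build an example matrix.
SCORES = [4.0, 2.0, 2.0, 1.0, 1.0]

# Reciprocal pairwise-comparison matrix: entry [i][j] states how much more
# important characteristic i is judged to be than characteristic j.
PAIRWISE = [[si / sj for sj in SCORES] for si in SCORES]

# Saaty's random consistency index for matrices of order 3 to 5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A*w."""
    n = len(matrix)
    weighted = [sum(matrix[i][j] * weights[j] for j in range(n))
                for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


weights = ahp_weights(PAIRWISE)
ratio = consistency_ratio(PAIRWISE, weights)
for name, weight in zip(CHARACTERISTICS, weights):
    print(f"{name}: {weight:.2f}")
print(f"consistency ratio: {ratio:.3f}")
```

Because the example matrix is built from a single score vector, it is perfectly consistent and the ratio is effectively zero; judgments elicited from real evaluators would normally be checked against the conventional 0.1 threshold.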
23
Eysenbach G, May R. Developing a Technical-Oriented Taxonomy to Define Archetypes of Conversational Agents in Health Care: Literature Review and Cluster Analysis. J Med Internet Res 2023; 25:e41583. [PMID: 36716093 PMCID: PMC9926340 DOI: 10.2196/41583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 11/20/2022] [Accepted: 12/19/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The evolution of artificial intelligence and natural language processing generates new opportunities for conversational agents (CAs) that communicate and interact with individuals. In the health domain, CAs have become popular because they can simulate the real-life experience of a health care setting, namely the conversation with a physician. However, it is still unclear which technical archetypes of health CAs can be distinguished. Such technical archetypes are required, among other things, for harmonizing evaluation metrics or describing the landscape of health CAs. OBJECTIVE The objective of this work was to develop a technical-oriented taxonomy for health CAs and characterize archetypes of health CAs based on their technical characteristics. METHODS We developed a taxonomy of technical characteristics for health CAs based on scientific literature and empirical data and by applying a taxonomy development framework. To demonstrate the applicability of the taxonomy, we analyzed the landscape of health CAs of recent years based on a literature review. To form technical design archetypes of health CAs, we applied a k-means clustering method. RESULTS Our taxonomy comprises 18 unique dimensions corresponding to 4 perspectives of technical characteristics (setting, data processing, interaction, and agent appearance). Each dimension consists of 2 to 5 characteristics. The taxonomy was validated based on 173 unique health CAs that were identified out of 1671 initially retrieved publications. The 173 CAs were clustered into 4 distinctive archetypes: a text-based ad hoc supporter; a multilingual, hybrid ad hoc supporter; a hybrid, single-language temporary advisor; and, finally, an embodied temporary advisor, rule based with hybrid input and output options. CONCLUSIONS From the cluster analysis, we learned that the time dimension is important from a technical perspective to distinguish health CA archetypes.
Moreover, we were able to identify additional distinctive, dominant characteristics that are relevant when evaluating health-related CAs (eg, input and output options or the complexity of the CA personality). Our archetypes reflect the current landscape of health CAs, which is characterized by rule based, simple systems in terms of CA personality and interaction. With an increase in research interest in this field, we expect that more complex systems will arise. The archetype-building process should be repeated after some time to check whether new design archetypes emerge.
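To illustrate the clustering step mentioned in the abstract, here is a minimal sketch of k-means (Lloyd's algorithm) applied to one-hot encoded taxonomy characteristics. The two dimensions, the six example agents, and the function name `kmeans` are hypothetical; the abstract does not describe the authors' encoding, initialization, or tooling, so this shows the general technique only.

```python
import random

# Hypothetical one-hot encoding of two taxonomy dimensions per agent:
# [input=text, input=voice, embodiment=none, embodiment=avatar]
POINTS = [
    [1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0],  # text-based, not embodied
    [0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1],  # voice-based, embodied
]


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance, ties go to the lower index).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each non-empty cluster's centroid to the mean.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


labels, centers = kmeans(POINTS, k=2)
print(labels)
```

With real taxonomy data, each of the 18 dimensions would contribute its own group of one-hot columns, and the number of clusters would be chosen with a separate criterion rather than fixed in advance.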
Affiliation(s)
- Richard May
- Harz University of Applied Sciences, Wernigerode, Germany
24
Chagas BA, Pagano AS, Prates RO, Praes EC, Ferreguetti K, Vaz H, Reis ZSN, Ribeiro LB, Ribeiro ALP, Pedroso TM, Beleigoli A, Oliveira CRA, Marcolino MS. Evaluating user experience with a chatbot designed as a public health response to the Covid-19 pandemic in Brazil: a mixed-methods study. JMIR Hum Factors 2023; 10:e43135. [PMID: 36634267 PMCID: PMC10131797 DOI: 10.2196/43135] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/30/2022] [Revised: 12/18/2022] [Accepted: 01/12/2023] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND The potential of chatbots for screening and monitoring COVID-19 has been envisioned since the very outbreak of the disease. Chatbots can help disseminate up-to-date and trustworthy information, promote healthy social behavior, and support the provision of healthcare services safely and at scale. In this scenario, and in view of its far-reaching post-pandemic impact, it is critically important to evaluate user experience with this kind of application. OBJECTIVE To evaluate the quality of user experience with a chatbot designed in response to the COVID-19 pandemic by a large telehealth service in Brazil, focusing on an analysis of usability with real users and on an exploration of strengths and shortcomings of the chatbot as revealed in reports by participants in simulated scenarios. METHODS We examined a chatbot developed by a multidisciplinary team and used as a component within the workflow of a local public healthcare service. The chatbot had two core functionalities: assisting online screening of COVID-19 symptom severity and providing evidence-based information to the population. From October 2020 to January 2021, we used a mixed-methods approach to evaluate user experience with our chatbot in two ways: (i) a post-task usability Likert-scale survey presented to all users upon concluding their interaction with the bot; and (ii) an interview with volunteer participants who engaged in a simulated interaction with the bot guided by the interviewer. RESULTS Usability assessment with 63 users revealed very good scores for chatbot usefulness (4.57), likelihood of being recommended (4.48), ease of use (4.44) and user satisfaction (4.38). Interviews with 15 volunteers provided insights into strengths and shortcomings in our bot. Comments on positive aspects and problems reported by users were analyzed in terms of recurrent themes.
We identified six positive aspects and fifteen issues organized in two main categories: usability of the chatbot and health support offered by it, the former referring to usability of the chatbot and its interactive resources and the latter to the chatbot goal in supporting people during the pandemic through the screening process and education to users through informative content. We found six themes accounting for what people liked most about our chatbot and why they found it useful, three themes pertaining to the usability domain and three regarding health support. Besides positive feedback, our findings identified 15 types of problems producing a negative impact on users, ten of them related to the usability of the chatbot and five related to the health support it provides. CONCLUSIONS Our results indicate that users had an overall positive experience with the chatbot and found the health support relevant. Nonetheless, the qualitative evaluation of the chatbot indicated challenges and directions to be pursued in improving not only our COVID chatbot but health chatbots in general.
Affiliation(s)
- Bruno Azevedo Chagas
- Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, BR
- Helena Vaz
- Arts Faculty, Universidade Federal de Minas Gerais, Belo Horizonte, BR
- Zilma Silveira Nogueira Reis
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Leonardo Bonisson Ribeiro
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Antonio Luiz Pinho Ribeiro
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Department of Internal Medicine, Medical School, Universidade Federal de Minas Gerais, Belo Horizonte, BR
- Thais Marques Pedroso
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Alline Beleigoli
- Flinders Digital Health Research Centre and Caring Futures Institute, Flinders University, Adelaide, AU
- Clara Rodrigues Alves Oliveira
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Department of Internal Medicine, Medical School, Universidade Federal de Minas Gerais, Belo Horizonte, BR
- Milena Soriano Marcolino
- Telehealth Center, University Hospital, and Telehealth Network of Minas Gerais, Universidade Federal de Minas Gerais, Avenida Professor Alfredo Balena, 110 1o andar Sala 107 Ala Sul, Belo Horizonte, BR
- Department of Internal Medicine, Medical School, Universidade Federal de Minas Gerais, Belo Horizonte, BR
25
Hayakawa M, Watanabe O, Shiga K, Fujishita M, Yamaki C, Ogo Y, Takahashi T, Ikeguchi Y, Takayama T. Exploring types of conversational agents for resolving cancer patients' questions and concerns: Analysis of 100 telephone consultations on breast cancer. PATIENT EDUCATION AND COUNSELING 2023; 106:75-84. [PMID: 36244948 DOI: 10.1016/j.pec.2022.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 09/20/2022] [Accepted: 10/07/2022] [Indexed: 06/16/2023]
Abstract
OBJECTIVE This study was conducted to investigate the types of conversational agents (CA) that can help address questions and concerns ("lay topics" [LTs]). METHODS We analyzed audio recordings of telephone consultations with 100 breast cancer patients and their families. (1) We identified the content and mode of expression of LTs about breast cancer raised during actual telephone consultations. (2) We checked for the presence of clue information (CI) that can help patients resolve their LTs. RESULTS None of the 805 LTs of the 100 callers were the same. Treatment-related questions occurred in 70 of the 100 consultations. CIs were present in 52.5% of the LTs. CONCLUSION The results suggest that chatbots (a type of CA) that offer CIs are more feasible than chatbots that answer each question directly in cancer consultations. Moreover, it is difficult to answer questions directly because preparing answers to all LTs in a breast cancer consultation is challenging owing to LT differences. Therefore, preparing high-quality CIs focused on treatments is required. PRACTICE IMPLICATIONS An increasing number of cancer patients are seeking information to resolve their LTs. CAs can help supplement the limited human resources available if they are supplied with appropriate CIs.
Affiliation(s)
- Masayo Hayakawa
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan.
- Otome Watanabe
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Kumiko Shiga
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Manami Fujishita
- Center for Cancer Registries, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Chikako Yamaki
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Yuko Ogo
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Tomoko Takahashi
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Yoshiko Ikeguchi
- Department of Nursing, Faculty of Health Science Technology, Bunkyo Gakuin University, Tokyo, Japan
- Tomoko Takayama
- Division of Cancer Information Service, Group for Cancer Control Services, Institute for Cancer Control, National Cancer Center, Tokyo, Japan
26
White BK, Martin A, White JA. User Experience of COVID-19 Chatbots: Scoping Review. J Med Internet Res 2022; 24:e35903. [PMID: 36520624 PMCID: PMC9822175 DOI: 10.2196/35903] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/29/2021] [Revised: 06/02/2022] [Accepted: 12/08/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The COVID-19 pandemic has had global impacts and caused some health systems to experience substantial pressure. The need for accurate health information has been felt widely. Chatbots have great potential to reach people with authoritative information, and a number of chatbots have been quickly developed to disseminate information about COVID-19. However, little is known about user experiences of and perspectives on these tools. OBJECTIVE This study aimed to describe what is known about the user experience and user uptake of COVID-19 chatbots. METHODS A scoping review was carried out in June 2021 using keywords to cover the literature concerning chatbots, user engagement, and COVID-19. The search strategy included databases covering health, communication, marketing, and the COVID-19 pandemic specifically, including MEDLINE Ovid, Embase, CINAHL, ACM Digital Library, Emerald, and EBSCO. Studies that assessed the design, marketing, and user features of COVID-19 chatbots or those that explored user perspectives and experience were included. We excluded papers that were not related to COVID-19; did not include any reporting on user perspectives, experience, or the general use of chatbot features or marketing; or where a version was not available in English. The authors independently screened results for inclusion, using both backward and forward citation checking of the included papers. A thematic analysis was carried out with the included papers. RESULTS A total of 517 papers were sourced from the literature, and 10 were included in the final review. Our scoping review identified a number of factors impacting adoption and engagement including content, trust, digital ability, and acceptability. The papers included discussions about chatbots developed for COVID-19 screening and general COVID-19 information, as well as studies investigating user perceptions and opinions on COVID-19 chatbots. 
CONCLUSIONS The COVID-19 pandemic presented a unique and specific challenge for digital health interventions. Design and implementation were required at a rapid speed as digital health service adoption accelerated globally. Chatbots for COVID-19 have been developed quickly as the pandemic has challenged health systems. There is a need for more comprehensive and routine reporting of factors impacting adoption and engagement. This paper has shown both the potential of chatbots to reach users in an emergency and the need to better understand how users engage and what they want.
Affiliation(s)
- Becky K White
- School of Population Health, Curtin University, Perth, Australia
- Reach Health Promotion Innovations, Perth, Australia
27
Ta-Johnson VP, Boatfield C, Wang X, DeCero E, Krupica IC, Rasof SD, Motzer A, Pedryc WM. Assessing the Topics and Motivating Factors Behind Human-Social Chatbot Interactions: Thematic Analysis of User Experiences. JMIR Hum Factors 2022; 9:e38876. [PMID: 36190745 PMCID: PMC9577709 DOI: 10.2196/38876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Revised: 07/30/2022] [Accepted: 08/29/2022] [Indexed: 11/13/2022] Open
Abstract
Background Although social chatbot usage is expected to increase as language models and artificial intelligence improve, very little is known about the dynamics of human-social chatbot interactions. Specifically, there is a paucity of research examining why human-social chatbot interactions are initiated and the topics that are discussed. Objective We sought to identify the motivating factors behind initiating contact with Replika, a popular social chatbot, and the topics discussed in these interactions. Methods A sample of Replika users completed a survey that included open-ended questions pertaining to the reasons why they initiated contact with Replika and the topics they typically discuss. Thematic analyses were then used to extract themes and subthemes regarding the motivational factors behind Replika use and the types of discussions that take place in conversations with Replika. Results Users initiated contact with Replika out of interest, in search of social support, and to cope with mental and physical health conditions. Users engaged in a wide variety of discussion topics with their Replika, including intellectual topics, life and work, recreation, mental health, connection, Replika, current events, and other people. Conclusions Given the wide range of motivational factors and discussion topics that were reported, our results imply that multifaceted support can be provided by a single social chatbot. While previous research already established that social chatbots can effectively help address mental and physical health issues, these capabilities have been dispersed across several different social chatbots instead of deriving from a single one. Our results also highlight a motivating factor of human-social chatbot usage that has received less attention than other motivating factors: interest. Users most frequently reported using Replika out of interest and sought to explore its capabilities and learn more about artificial intelligence. 
Thus, while developers and researchers study human-social chatbot interactions with the efficacy of the social chatbot and its targeted user base in mind, it is equally important to consider how its usage can shape public perceptions and support for social chatbots and artificial agents in general.
Affiliation(s)
- Vivian P Ta-Johnson
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- Carolynn Boatfield
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- College of Health Professions, Rosalind Franklin University, North Chicago, IL, United States
- Xinyu Wang
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- Department of Psychology, Columbia University, New York City, NY, United States
- Esther DeCero
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, United States
- Isabel C Krupica
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- College of Health Professions, Rosalind Franklin University, North Chicago, IL, United States
- Sophie D Rasof
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- Amelie Motzer
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
- Wiktoria M Pedryc
- Department of Psychology, Lake Forest College, Lake Forest, IL, United States
28
Noble JM, Zamani A, Gharaat M, Merrick D, Maeda N, Lambe Foster A, Nikolaidis I, Goud R, Stroulia E, Agyapong VIO, Greenshaw AJ, Lambert S, Gallson D, Porter K, Turner D, Zaiane O. Developing, Implementing, and Evaluating an Artificial Intelligence-Guided Mental Health Resource Navigation Chatbot for Health Care Workers and Their Families During and Following the COVID-19 Pandemic: Protocol for a Cross-sectional Study. JMIR Res Protoc 2022; 11:e33717. [PMID: 35877158 PMCID: PMC9361145 DOI: 10.2196/33717] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/20/2021] [Revised: 02/11/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background Approximately 1 in 3 Canadians will experience an addiction or mental health challenge at some point in their lifetime. Unfortunately, there are multiple barriers to accessing mental health care, including system fragmentation, episodic care, long wait times, and insufficient support for health system navigation. In addition, stigma may further reduce an individual’s likelihood of seeking support. Digital technologies present new and exciting opportunities to bridge significant gaps in mental health care service provision, reduce barriers pertaining to stigma, and improve health outcomes for patients and mental health system integration and efficiency. Chatbots (ie, software systems that use artificial intelligence to carry out conversations with people) may be explored to support those in need of information or access to services and present the opportunity to address gaps in traditional, fragmented, or episodic mental health system structures on demand with personalized attention. The recent COVID-19 pandemic has exacerbated even further the need for mental health support among Canadians and called attention to the inefficiencies of our system. As health care workers and their families are at an even greater risk of mental illness and psychological distress during the COVID-19 pandemic, this technology will be first piloted with the goal of supporting this vulnerable group. Objective This pilot study seeks to evaluate the effectiveness of the Mental Health Intelligent Information Resource Assistant in supporting health care workers and their families in the Canadian provinces of Alberta and Nova Scotia with the provision of appropriate information on mental health issues, services, and programs based on personalized needs. Methods The effectiveness of the technology will be assessed via voluntary follow-up surveys and an analysis of client interactions and engagement with the chatbot. Client satisfaction with the chatbot will also be assessed. 
Results This project was initiated on April 1, 2021. Ethics approval was granted on August 12, 2021, by the University of Alberta Health Research Board (PRO00109148) and on April 21, 2022, by the Nova Scotia Health Authority Research Ethics Board (1027474). Data collection is anticipated to take place from May 2, 2022, to May 2, 2023. Publication of preliminary results will be sought in spring or summer 2022, with a more comprehensive evaluation completed by spring 2023 following the collection of a larger data set. Conclusions Our findings can be incorporated into public policy and planning around mental health system navigation by Canadian mental health care providers—from large public health authorities to small community-based, not-for-profit organizations. This may serve to support the development of an additional touch point, or point of entry, for individuals to access the appropriate services or care when they need them, wherever they are. International Registered Report Identifier (IRRID) PRR1-10.2196/33717
Affiliation(s)
- Jasmine M Noble
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Department of Psychiatry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
| | - Ali Zamani
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Alberta Machine Intelligence Institute, Edmonton, AB, Canada
| | - MohamadAli Gharaat
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Alberta Machine Intelligence Institute, Edmonton, AB, Canada
| | - Dylan Merrick
- Department of Indigenous Studies, University of Saskatchewan, Regina, SK, Canada
| | - Nathanial Maeda
- Rehabilitation Robotics Lab, Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, AB, Canada
| | | | | | - Rachel Goud
- Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Eleni Stroulia
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
| | - Vincent I O Agyapong
- Department of Psychiatry, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
| | - Andrew J Greenshaw
- Department of Psychiatry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada; Asia-Pacific Economic Cooperation Digital Hub for Mental Health, Vancouver, BC, Canada
| | - Simon Lambert
- Department of Indigenous Studies, University of Saskatchewan, Regina, SK, Canada; Network Environments for Indigenous Health Research National Coordinating Centre, Saskatoon, SK, Canada
| | - Dave Gallson
- Mood Disorders Society of Canada, Ottawa, ON, Canada
| | - Ken Porter
- Mood Disorders Society of Canada, Ottawa, ON, Canada
| | - Debbie Turner
- Mood Disorders Society of Canada, Ottawa, ON, Canada
| | - Osmar Zaiane
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Alberta Machine Intelligence Institute, Edmonton, AB, Canada
|
29
|
Zidoun Y, Kaladhara S, Powell L, Nour R, Al Suwaidi H, Zary N. Contextual Conversational Agent to address Vaccine Hesitancy: Protocol for a design-based research study. JMIR Res Protoc 2022; 11:e38043. [PMID: 35797423 PMCID: PMC9397500 DOI: 10.2196/38043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/16/2022] [Revised: 06/14/2022] [Accepted: 07/07/2022] [Indexed: 11/13/2022] Open
Abstract
Background Since the beginning of the COVID-19 pandemic, people have been exposed to misinformation, leading to many myths about SARS-CoV-2 and the vaccines against it. As this situation does not seem likely to end soon, many authorities and health organizations, including the World Health Organization (WHO), are utilizing conversational agents (CAs) in their fight against it. Although the impact and usage of these novel digital strategies are noticeable, the design of the CAs remains key to their success. Objective This study describes the use of design-based research (DBR) for contextual CA design to address vaccine hesitancy. In addition, this protocol will examine the impact of DBR on CA design to understand how this iterative process can enhance accuracy and performance. Methods A DBR methodology will be used for this study. Each phase of analysis, design, and evaluation of each design cycle informs the next one via its outcomes. An anticipated generic strategy will be formed after completing the first iteration. Using multiple research studies, frameworks and theoretical approaches are tested and evaluated through the different design cycles. User perception of the CA will be collected and analyzed by implementing a usability assessment during every evaluation phase using the System Usability Scale. The PARADISE (PARAdigm for Dialogue System Evaluation) method will be adopted to calculate the performance of this text-based CA. Results Two phases of the first design cycle (design and evaluation) were completed at the time of this writing (April 2022). The research team is currently reviewing the natural-language understanding model as part of the conversation-driven development (CDD) process in preparation for the first pilot intervention, which will conclude the CA’s first design cycle. In addition, conversational data will be analyzed quantitatively and qualitatively as part of the reflection and revision process to inform the subsequent design cycles. 
This project plans for three rounds of design cycles, resulting in various studies spreading outcomes and conclusions. The results of the first study describing the entire first design cycle are expected to be submitted for publication before the end of 2022. Conclusions CAs constitute an innovative way of delivering health communication information. However, they are primarily used to contribute to behavioral change or educate people about health issues. Therefore, health chatbots’ impact should be carefully designed to meet outcomes. DBR can help shape a holistic understanding of the process of CA conception. This protocol describes the design of VWise, a contextual CA that aims to address vaccine hesitancy using the DBR methodology. The results of this study will help identify the strengths and flaws of DBR’s application to such innovative projects.
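The protocol above names the System Usability Scale (SUS) as its usability instrument. As a minimal sketch of the standard SUS scoring rule (not the study's own analysis code; the function name `sus_score` is hypothetical), one respondent's ten ratings could be converted to a 0-100 score as follows:

```python
def sus_score(responses):
    """Return the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5
```

A respondent who answers 3 on every item scores 50.0, the midpoint of the scale; study-level results are usually reported as the mean of the per-respondent scores.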
Affiliation(s)
- Youness Zidoun
- Institute for Excellence in Health Professions Education, Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, P.O. Box 505055, Dubai, AE
| | - Sreelekshmi Kaladhara
- Institute for Excellence in Health Professions Education, Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, P.O. Box 505055, Dubai, AE
| | - Leigh Powell
- Institute for Excellence in Health Professions Education, Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, P.O. Box 505055, Dubai, AE
| | - Radwa Nour
- Institute for Excellence in Health Professions Education, Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, P.O. Box 505055, Dubai, AE
| | - Hanan Al Suwaidi
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, AE
| | - Nabil Zary
- Institute for Excellence in Health Professions Education, Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, P.O. Box 505055, Dubai, AE
|
30
|
He Y, Yang L, Zhu X, Wu B, Zhang S, Qian C, Tian T. Mental health chatbot for young adults with depressive symptoms: a single-blind, three-arm, randomized controlled trial (Preprint). J Med Internet Res 2022; 24:e40719. [DOI: 10.2196/40719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Revised: 10/14/2022] [Accepted: 11/03/2022] [Indexed: 11/06/2022] Open
|
31
|
Sagstad MH, Morken NH, Lund A, Dingsør LJ, Nilsen ABV, Sorbye LM. Quantitative User Data From a Chatbot Developed for Women With Gestational Diabetes Mellitus: Observational Study. JMIR Form Res 2022; 6:e28091. [PMID: 35436213 PMCID: PMC9062719 DOI: 10.2196/28091] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/20/2021] [Revised: 06/24/2021] [Accepted: 02/19/2022] [Indexed: 02/06/2023] Open
Abstract
Background The rising prevalence of gestational diabetes mellitus (GDM) calls for the use of innovative methods to inform and empower these pregnant women. An information chatbot, Dina, was developed for women with GDM and is Norway’s first health chatbot, integrated into the national digital health platform. Objective The aim of this study is to investigate what kind of information users seek in a health chatbot providing support on GDM. Furthermore, we sought to explore when and how the chatbot is used by time of day and the number of questions in each dialogue and to categorize the questions the chatbot was unable to answer (fallback). The overall goal is to explore quantitative user data in the chatbot’s log, thereby contributing to further development of the chatbot. Methods An observational study was designed. We used quantitative anonymous data (dialogues) from the chatbot’s log and platform during an 8-week period in 2018 and a 12-week period in 2019 and 2020. Dialogues between the user and the chatbot were the unit of analysis. Questions from the users were categorized by theme. The time of day the dialogue occurred and the number of questions in each dialogue were registered, and questions resulting in a fallback message were identified. Results are presented using descriptive statistics. Results We identified 610 dialogues with a total of 2838 questions during the 20 weeks of data collection. Questions regarding blood glucose, GDM, diet, and physical activity represented 58.81% (1669/2838) of all questions. In total, 58.0% (354/610) of dialogues occurred during daytime (8 AM to 3:59 PM), Monday through Friday. Most dialogues were short, containing 1-3 questions (340/610, 55.7%), and there was a decrease in dialogues containing 4-6 questions in the second period (P=.013). The chatbot was able to answer 88.51% (2512/2838) of all posed questions. The mean number of dialogues per week was 36 in the first period and 26.83 in the second period. 
Conclusions Frequently asked questions seem to mirror the cornerstones of GDM treatment and may indicate that the chatbot is used as a low-threshold way to quickly access information already provided to users by the health care service. Our results underline the need to actively promote and integrate the chatbot into antenatal care as well as the importance of continuous content improvement in order to provide relevant information.
Affiliation(s)
- Mari Haaland Sagstad
- Department of Health and Caring Sciences, Faculty of Health and Social Sciences, Western Norway University of Applied Sciences, Bergen, Norway
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
| | - Nils-Halvdan Morken
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Agnethe Lund
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Linn Jannike Dingsør
- Department of Health and Caring Sciences, Faculty of Health and Social Sciences, Western Norway University of Applied Sciences, Bergen, Norway
| | - Anne Britt Vika Nilsen
- Department of Health and Caring Sciences, Faculty of Health and Social Sciences, Western Norway University of Applied Sciences, Bergen, Norway
| | - Linn Marie Sorbye
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
- Norwegian Research Centre for Women's Health, Rikshospitalet, Oslo University Hospital, Oslo, Norway
|
32
|
Denecke K, Schmid N, Nüssli S. Implementation of Cognitive Behavioral Therapy in e-Mental Health Apps: Literature Review. J Med Internet Res 2022; 24:e27791. [PMID: 35266875 PMCID: PMC8949700 DOI: 10.2196/27791] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/06/2021] [Revised: 07/27/2021] [Accepted: 12/28/2021] [Indexed: 12/24/2022] Open
Abstract
Background To address the matter of limited resources for treating individuals with mental disorders, e–mental health has gained interest in recent years. More specifically, mobile health (mHealth) apps have been suggested as electronic mental health interventions accompanied by cognitive behavioral therapy (CBT). Objective This study aims to identify the therapeutic aspects of CBT that have been implemented in existing mHealth apps and the technologies used. From these, we aim to derive research gaps that should be addressed in the future. Methods Three databases were screened for studies on mHealth apps in the context of mental disorders that implement techniques of CBT: PubMed, IEEE Xplore, and ACM Digital Library. The studies were independently selected by 2 reviewers, who then extracted data from the included studies. Data on CBT techniques and their technical implementation in mHealth apps were synthesized narratively. Results Of the 530 retrieved citations, 34 (6.4%) studies were included in this review. mHealth apps for CBT exploit two groups of technologies: technologies that implement CBT techniques for cognitive restructuring, behavioral activation, and problem solving (exposure is not yet realized in mHealth apps) and technologies that aim to increase user experience, adherence, and engagement. The synergy of these technologies enables patients to self-manage and self-monitor their mental state and access relevant information on their mental illness, which helps them cope with mental health problems and allows self-treatment. Conclusions There are CBT techniques that can be implemented in mHealth apps. Additional research is needed on the efficacy of the mHealth interventions and their side effects, including inequalities because of the digital divide, addictive internet behavior, lack of trust in mHealth, anonymity issues, risks and biases for user groups and social contexts, and ethical implications. 
Further research is also required to integrate and test psychological theories to improve the impact of mHealth and adherence to the e–mental health interventions.
Affiliation(s)
- Kerstin Denecke
- Institute for Medical Informatics, Bern University of Applied Sciences, Biel, Switzerland
| | | | - Stephan Nüssli
- Institute for Medical Informatics, Bern University of Applied Sciences, Biel, Switzerland
|
33
|
Lahti L. Detecting the patient’s need for help with machine learning based on expressions. BMC Med Res Methodol 2022; 22:60. [PMID: 35249538 PMCID: PMC8898191 DOI: 10.1186/s12874-021-01502-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Accepted: 12/30/2021] [Indexed: 11/10/2022] Open
Abstract
Background
Developing machine learning models to support health analytics requires increased understanding about statistical properties of self-rated expression statements used in health-related communication and decision making. To address this, our current research analyzes self-rated expression statements concerning the coronavirus COVID-19 epidemic and with a new methodology identifies how statistically significant differences between groups of respondents can be linked to machine learning results.
Methods
We conducted a quantitative cross-sectional study that gathered “need for help” ratings for twenty health-related expression statements concerning the coronavirus epidemic on an 11-point Likert scale, along with nine answers about the person’s health and wellbeing, sex and age. The study involved online respondents between 30 May and 3 August 2020 recruited from Finnish patient and disabled people’s organizations, other health-related organizations and professionals, and educational institutions (n = 673). We propose and experimentally motivate a new methodology of influence analysis concerning machine learning, to be applied for evaluating how machine learning results depend on and are influenced by various properties of the data that are identified with traditional statistical methods.
Results
We found statistically significant Kendall rank-correlations and high cosine similarity values between various health-related expression statement pairs concerning the “need for help” ratings and a background question pair. With tests of Wilcoxon rank-sum, Kruskal-Wallis and one-way analysis of variance (ANOVA) between groups we identified statistically significant rating differences for several health-related expression statements in respect to groupings based on the answer values of background questions, such as the ratings of suspecting to have the coronavirus infection and having it depending on the estimated health condition, quality of life and sex. Our new methodology enabled us to identify how statistically significant rating differences were linked to machine learning results thus helping to develop better human-understandable machine learning models.
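The results above rest on Kendall rank correlations and cosine similarity between pairs of rating vectors. The sketch below illustrates both measures in plain Python under stated assumptions: the function names are hypothetical, and the Kendall variant shown is tau-a, which does not correct for ties. The abstract does not say which variant was used, and tie-corrected tau-b is the more common choice for Likert-type ratings.

```python
import math
from itertools import combinations


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs, no tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both ratings move in the same direction
        elif direction < 0:
            discordant += 1      # ratings move in opposite directions
        # direction == 0 means a tie in x or y; tau-a counts it as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs
```

For two statements rated identically in rank order by every respondent, `kendall_tau_a` returns 1.0; a fully reversed order gives -1.0.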
Conclusions
The self-rated “need for help” concerning health-related expression statements differs statistically significantly depending on the person’s background information, such as their estimated health condition, quality of life and sex. With our new methodology, statistically significant rating differences can be linked to machine learning results, thus enabling the development of better machine learning to identify, interpret and address the patient’s needs for well-personalized care.
|
34
|
Parviainen J, Rantala J. Chatbot breakthrough in the 2020s? An ethical reflection on the trend of automated consultations in health care. MEDICINE, HEALTH CARE, AND PHILOSOPHY 2022; 25:61-71. [PMID: 34480711 PMCID: PMC8416570 DOI: 10.1007/s11019-021-10049-w] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Accepted: 08/30/2021] [Indexed: 05/20/2023]
Abstract
Many experts have emphasised that chatbots are not sufficiently mature to be able to technically diagnose patient conditions or replace the judgements of health professionals. The COVID-19 pandemic, however, has significantly increased the utilisation of health-oriented chatbots, for instance, as a conversational interface to answer questions, recommend care options, check symptoms and complete tasks such as booking appointments. In this paper, we take a proactive approach and consider how the emergence of task-oriented chatbots as partially automated consulting systems can influence clinical practices and expert-client relationships. We suggest the need for new approaches in professional ethics as the large-scale deployment of artificial intelligence may revolutionise professional decision-making and client-expert interaction in healthcare organisations. We argue that the implementation of chatbots amplifies the project of rationality and automation in clinical practice and alters traditional decision-making practices based on epistemic probability and prudence. This article contributes to the discussion on the ethical challenges posed by chatbots from the perspective of healthcare professional ethics.
Affiliation(s)
- Jaana Parviainen
- Faculty of Social Sciences, Tampere University, Tampere, Finland
| | - Juho Rantala
- Faculty of Social Sciences, Tampere University, Tampere, Finland
|
35
|
Parmar P, Ryu J, Pandya S, Sedoc J, Agarwal S. Health-focused conversational agents in person-centered care: a review of apps. NPJ Digit Med 2022; 5:21. [PMID: 35177772 PMCID: PMC8854396 DOI: 10.1038/s41746-022-00560-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/12/2021] [Accepted: 01/07/2022] [Indexed: 11/09/2022] Open
Abstract
Health-focused apps with chatbots ("healthbots") have a critical role in addressing gaps in quality healthcare. There is limited evidence on how such healthbots are developed and applied in practice. Our review of healthbots aims to classify types of healthbots, contexts of use, and their natural language processing capabilities. Eligible apps were those that were health-related, had an embedded text-based conversational agent, available in English, and were available for free download through the Google Play or Apple iOS store. Apps were identified using 42Matters software, a mobile app search engine. Apps were assessed using an evaluation framework addressing chatbot characteristics and natural language processing features. The review suggests uptake across 33 low- and high-income countries. Most healthbots are patient-facing, available on a mobile interface and provide a range of functions including health education and counselling support, assessment of symptoms, and assistance with tasks such as scheduling. Most of the 78 apps reviewed focus on primary care and mental health, only 6 (7.59%) had a theoretical underpinning, and 10 (12.35%) complied with health information privacy regulations. Our assessment indicated that only a few apps use machine learning and natural language processing approaches, despite such marketing claims. Most apps allowed for a finite-state input, where the dialogue is led by the system and follows a predetermined algorithm. Healthbots are potentially transformative in centering care around the user; however, they are in a nascent state of development and require further research on development, automation and adoption for a population-level health impact.
Affiliation(s)
- Pritika Parmar
- The Johns Hopkins University Krieger School of Arts and Sciences, Baltimore, MD, USA
| | - Jina Ryu
- The Johns Hopkins University Krieger School of Arts and Sciences, Baltimore, MD, USA
| | - Shivani Pandya
- Department of International Health, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - João Sedoc
- Department of Technology, Operations and Statistics, New York University Stern School of Business, New York City, NY, USA
| | - Smisha Agarwal
- Department of International Health, The Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA.
|
36
|
Chan WW, Fitzsimmons-Craft EE, Smith AC, Firebaugh ML, Fowler LA, DePietro B, Topooco N, Wilfley DE, Taylor CB, Jacobson NC. The Challenges in Designing a Prevention Chatbot for Eating Disorders: Observational Study. JMIR Form Res 2022; 6:e28003. [PMID: 35044314 PMCID: PMC8811687 DOI: 10.2196/28003] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/16/2021] [Revised: 05/28/2021] [Accepted: 11/30/2021] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Chatbots have the potential to provide cost-effective mental health prevention programs at scale and increase interactivity, ease of use, and accessibility of intervention programs. OBJECTIVE The development of chatbot prevention for eating disorders (EDs) is still in its infancy. Our aim is to present examples of and solutions to challenges in designing and refining a rule-based prevention chatbot program for EDs, targeted at adult women at risk for developing an ED. METHODS Participants were 2409 individuals who at least began to use an EDs prevention chatbot in response to social media advertising. Over 6 months, the research team reviewed up to 52,129 comments from these users to identify inappropriate responses that negatively impacted users' experience and technical glitches. Problems identified by reviewers were then presented to the entire research team, who then generated possible solutions and implemented new responses. RESULTS The most common problem with the chatbot was a general limitation in understanding and responding appropriately to unanticipated user responses. We developed several workarounds to limit these problems while retaining some interactivity. CONCLUSIONS Rule-based chatbots have the potential to reach large populations at low cost but are limited in understanding and responding appropriately to unanticipated user responses. They can be most effective in providing information and simple conversations. Workarounds can reduce conversation errors.
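The limitation described above, that a rule-based chatbot cannot respond appropriately to unanticipated input, can be illustrated with a toy example. This is only a sketch under stated assumptions: the rules, replies, and function names below are hypothetical and are not taken from the study's chatbot.

```python
import re

# Hypothetical rules: each pairs a keyword pattern with a canned reply.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! What would you like to talk about today?"),
    (re.compile(r"\b(sleep|tired)\b", re.IGNORECASE),
     "Here is some general information about sleep habits."),
]

# Reply used when no rule matches (the "fallback" case).
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


def reply(message):
    """Return (response, matched); matched is False when the fallback is used."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response, True
    return FALLBACK, False


def fallback_rate(messages):
    """Share of messages that no rule could handle."""
    if not messages:
        raise ValueError("need at least one message")
    misses = sum(1 for m in messages if not reply(m)[1])
    return misses / len(messages)
```

Logging which messages trigger the fallback gives reviewers a concrete list of unanticipated inputs to examine, which is the kind of transcript review the abstract describes doing by hand.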
Affiliation(s)
- William W Chan
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Center for m2Health, Palo Alto University, Los Altos, CA, United States
| | | | - Arielle C Smith
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, United States
| | - Marie-Laure Firebaugh
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, United States
| | - Lauren A Fowler
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, United States
| | - Bianca DePietro
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, United States
| | - Naira Topooco
- Center for m2Health, Palo Alto University, Los Altos, CA, United States
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden
| | - Denise E Wilfley
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, United States
| | - C Barr Taylor
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Center for m2Health, Palo Alto University, Los Altos, CA, United States
| | - Nicholas C Jacobson
- Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
|
37
|
Oh YJ, Zhang J, Fang ML, Fukuoka Y. A systematic review of artificial intelligence chatbots for promoting physical activity, healthy diet, and weight loss. Int J Behav Nutr Phys Act 2021; 18:160. [PMID: 34895247 PMCID: PMC8665320 DOI: 10.1186/s12966-021-01224-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 05/31/2021] [Accepted: 11/10/2021] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND This systematic review aimed to evaluate AI chatbot characteristics, functions, and core conversational capacities and investigate whether AI chatbot interventions were effective in changing physical activity, healthy eating, weight management behaviors, and other related health outcomes. METHODS In collaboration with a medical librarian, six electronic bibliographic databases (PubMed, EMBASE, ACM Digital Library, Web of Science, PsycINFO, and IEEE) were searched to identify relevant studies. Only randomized controlled trials or quasi-experimental studies were included. Studies were screened by two independent reviewers, and any discrepancy was resolved by a third reviewer. The National Institutes of Health quality assessment tools were used to assess risk of bias in individual studies. We applied the AI Chatbot Behavior Change Model to characterize components of chatbot interventions, including chatbot characteristics, persuasive and relational capacity, and evaluation of outcomes. RESULTS The database search retrieved 1692 citations, and 9 studies met the inclusion criteria. Of the 9 studies, 4 were randomized controlled trials and 5 were quasi-experimental studies. Five out of the seven studies suggest chatbot interventions are promising strategies in increasing physical activity. In contrast, the number of studies focusing on changing diet and weight status was limited. Outcome assessments, however, were reported inconsistently across the studies. Eighty-nine and thirty-three percent of the studies specified a name and gender (i.e., woman) of the chatbot, respectively. Over half (56%) of the studies used a constrained chatbot (i.e., rule-based), while the remaining studies used unconstrained chatbots that resemble human-to-human communication. CONCLUSION Chatbots may improve physical activity, but we were not able to make definitive conclusions regarding the efficacy of chatbot interventions on physical activity, diet, and weight management/loss. 
Application of AI chatbots is an emerging field of research in lifestyle modification programs and is expected to grow exponentially. Thus, standardization of designing and reporting chatbot interventions is warranted in the near future. SYSTEMATIC REVIEW REGISTRATION International Prospective Register of Systematic Reviews (PROSPERO): CRD42020216761.
Affiliation(s)
- Yoo Jung Oh
- Department of Communication, University of California Davis, Davis, USA
| | - Jingwen Zhang
- Department of Communication, University of California Davis, Davis, USA
- Department of Public Health Sciences, University of California Davis, Davis, USA
| | - Min-Lin Fang
- Education and Research Services, University of California, San Francisco (UCSF) Library, UCSF, San Francisco, USA
| | - Yoshimi Fukuoka
- Department of Physiological Nursing, UCSF, San Francisco, USA
|
38
|
Denecke K, Abd-Alrazaq A, Househ M, Warren J. Evaluation Metrics for Health Chatbots: A Delphi Study. Methods Inf Med 2021; 60:171-179. [PMID: 34719011 DOI: 10.1055/s-0041-1736664] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]
Abstract
BACKGROUND In recent years, an increasing number of health chatbots has been published in app stores and described in research literature. Given the sensitive data they are processing and the care settings for which they are developed, evaluation is essential to avoid harm to users. However, evaluations of those systems are reported inconsistently and without using a standardized set of evaluation metrics. Missing standards in health chatbot evaluation prevent comparisons of systems, and this may hamper acceptability since their reliability is unclear. OBJECTIVES The objective of this paper is to make an important step toward developing a health-specific chatbot evaluation framework by finding consensus on relevant metrics. METHODS We used an adapted Delphi study design to verify and select potential metrics that we retrieved initially from a scoping review. We invited researchers, health professionals, and health informaticians to score each metric for inclusion in the final evaluation framework, over three survey rounds. We distinguished metrics scored relevant with high, moderate, and low consensus. The initial set of metrics comprised 26 metrics (categorized as global metrics, metrics related to response generation, response understanding and aesthetics). RESULTS Twenty-eight experts joined the first round and 22 (75%) persisted to the third round. Twenty-four metrics achieved high consensus and three metrics achieved moderate consensus. The core set for our framework comprises mainly global metrics (e.g., ease of use, security, content accuracy), metrics related to response generation (e.g., appropriateness of responses), and metrics related to response understanding. Metrics on aesthetics (font type and size, color) are less well agreed upon; only moderate or low consensus was achieved for those metrics. CONCLUSION The results indicate that experts largely agree on metrics and that the consensus set is broad. 
This implies that health chatbot evaluation must be multifaceted to ensure acceptability.
Affiliation(s)
- Kerstin Denecke
- School of Engineering and Computer Science, Institute for Medical Informatics, Bern University of Applied Sciences, Biel, Switzerland
| | - Alaa Abd-Alrazaq
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| | - Mowafa Househ
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
| | - Jim Warren
- Faculty of Science, School of Computer Science, University of Auckland, Auckland, New Zealand
| |
|
39
|
Issom DZ, Hardy-Dessources MD, Romana M, Hartvigsen G, Lovis C. Toward a Conversational Agent to Support the Self-Management of Adults and Young Adults With Sickle Cell Disease: Usability and Usefulness Study. Front Digit Health 2021; 3:600333. [PMID: 34713087 PMCID: PMC8521934 DOI: 10.3389/fdgth.2021.600333] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/29/2020] [Accepted: 01/07/2021] [Indexed: 11/25/2022] Open
Abstract
Sickle cell disease (SCD) is the most common genetic blood disorder in the world and affects millions of people. With aging, patients encounter an increasing number of comorbidities that can be acute, chronic, and potentially lethal (e.g., pain, multiple organ damages, lung disease). Comprehensive and preventive care for adults with SCD faces disparities (e.g., shortage of well-trained providers). Consequently, many patients do not receive adequate treatment, as outlined by evidence-based guidelines, and suffer from mistrust, stigmatization or neglect. Thus, adult patients often avoid necessary care, seek treatment only as a last resort, and rely on self-management to maintain control over the course of the disease. Hopefully, self-management positively impacts health outcomes. However, few patients possess the required skills (e.g., disease-specific knowledge, self-efficacy), and many lack motivation for effective self-care. Health coaching has emerged as a new approach to enhance patients' self-management and support health behavior changes. Recent studies have demonstrated that conversational agents (chatbots) could effectively support chronic patients' self-management needs, improve self-efficacy, encourage behavior changes, and reduce disease-severity. To date, the use of chatbots to support SCD self-management remains largely under-researched. Consequently, we developed a high-fidelity prototype of a fully automated health coaching chatbot, following patient-important requirements and preferences collected during our previous work. We recruited a small convenience sample of adults with SCD to examine the usability and perceived usefulness of the system. Participants completed a post-test survey using the System Usability Scale and the Usefulness Scale for Patient Information Material questionnaire. Thirty-three patients participated. The majority (64%) was affected by the most clinically severe SCD genotypes (Hb SS, HbSβ0). 
Most participants (94%) rated the chatbot as easy and fun to use, while 88% perceived it as useful support for patient empowerment. In the qualitative phase, 72% of participants expressed enthusiasm about using the chatbot, and 82% emphasized its ability to improve their knowledge about self-management. Findings suggest that chatbots could be used to promote the acquisition of recommended health behaviors and self-care practices related to the prevention of the main symptoms of SCD. Further work is needed to refine the system and to assess clinical validity.
Affiliation(s)
- David-Zacharie Issom
- Department of Radiology and Medical Informatics, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | | | - Marc Romana
- INSERM U1134 Biologie Intégrée du Globule Rouge, Paris, France
| | - Gunnar Hartvigsen
- Department of Computer Science, UiT the Arctic University of Norway, Tromsø, Norway
| | - Christian Lovis
- Department of Radiology and Medical Informatics, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| |
|
40
|
Figueroa CA, Luo TC, Jacobo A, Munoz A, Manuel M, Chan D, Canny J, Aguilera A. Conversational Physical Activity Coaches for Spanish and English Speaking Women: A User Design Study. Front Digit Health 2021; 3:747153. [PMID: 34713207 PMCID: PMC8531260 DOI: 10.3389/fdgth.2021.747153] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/25/2021] [Accepted: 09/06/2021] [Indexed: 11/17/2022] Open
Abstract
Introduction: Digital technologies, including text messaging and mobile phone apps, can be leveraged to increase people's physical activity and manage health. Chatbots, powered by artificial intelligence, can automatically interact with individuals through natural conversation. They may be more engaging than one-way messaging interventions. To our knowledge, physical activity chatbots have not been developed with low-income participants, nor in Spanish, the second most dominant language in the U.S. We recommend best practices for physical activity chatbots in English and Spanish for low-income women. Methods: We designed a prototype physical activity text-message based conversational agent based on various psychotherapeutic techniques. We recruited participants through SNAP-Ed (Supplemental Nutrition Assistance Program Education) in California (Alameda County) and Tennessee (Shelby County). We conducted qualitative interviews with participants during testing of our prototype chatbot, held a Wizard of Oz study, and facilitated a co-design workshop in Spanish with a subset of our participants. Results: We included 10 Spanish- and 8 English-speaking women between 27 and 41 years old. The majority was Hispanic/Latina (n = 14), 2 were White and 2 were Black/African American. More than half were monolingual Spanish speakers, and the majority was born outside the US (>50% in Mexico). Most participants were unfamiliar with chatbots and were initially skeptical. After testing our prototype, most users felt positively about health chatbots. They desired a personalized chatbot that addresses their concerns about privacy, and stressed the need for a comprehensive system to also aid with nutrition, health information, stress, and involve family members. Differences between English and monolingual Spanish speakers were found mostly in exercise app use, digital literacy, and the wish for family inclusion. 
Conclusion: Low-income Spanish- and English-speaking women are interested in using chatbots to improve their physical activity and other health related aspects. Researchers developing health chatbots for this population should focus on issues of digital literacy, app familiarity, linguistic and cultural issues, privacy concerns, and personalization. Designing and testing this intervention for and with this group using co-creation techniques and involving community partners will increase the probability that it will ultimately be effective.
Affiliation(s)
- Caroline A. Figueroa
- School of Social Welfare, University of California, Berkeley, Berkeley, CA, United States
| | - Tiffany C. Luo
- School of Social Welfare, University of California, Berkeley, Berkeley, CA, United States
| | - Andrea Jacobo
- School of Public Health, University of California, Berkeley, Berkeley, CA, United States
| | - Alan Munoz
- School of Social Welfare, University of California, Berkeley, Berkeley, CA, United States
| | - Minx Manuel
- School of Public Health, University of California, Berkeley, Berkeley, CA, United States
| | - David Chan
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, United States
| | - John Canny
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, United States
| | - Adrian Aguilera
- School of Social Welfare, University of California, Berkeley, Berkeley, CA, United States
- Department of Psychiatry and Behavioral Sciences, Zuckerberg San Francisco General Hospital, University of California, San Francisco, San Francisco, CA, United States
| |
|
41
|
Mauriello ML, Tantivasadakarn N, Mora-Mendoza MA, Lincoln ET, Hon G, Nowruzi P, Simon D, Hansen L, Goenawan NH, Kim J, Gowda N, Jurafsky D, Paredes PE. A Suite of Mobile Conversational Agents for Daily Stress Management (Popbots): Mixed Methods Exploratory Study. JMIR Form Res 2021; 5:e25294. [PMID: 34519655 PMCID: PMC8479600 DOI: 10.2196/25294] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/27/2020] [Revised: 12/11/2020] [Accepted: 08/01/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Approximately 60%-80% of the primary care visits have a psychological stress component, but only 3% of patients receive stress management advice during these visits. Given recent advances in natural language processing, there is renewed interest in mental health chatbots. Conversational agents that can understand a user's problems and deliver advice that mitigates the effects of daily stress could be an effective public health tool. However, such systems are complex to build and costly to develop. OBJECTIVE To address these challenges, our aim is to develop and evaluate a fully automated mobile suite of shallow chatbots (we call them Popbots) that may serve as a new species of chatbots and further complement human assistance in an ecosystem of stress management support. METHODS After conducting an exploratory Wizard of Oz study (N=14) to evaluate the feasibility of a suite of multiple chatbots, we conducted a web-based study (N=47) to evaluate the implementation of our prototype. Each participant was randomly assigned to a different chatbot designed on the basis of a proven cognitive or behavioral intervention method. To measure the effectiveness of the chatbots, the participants' stress levels were determined using self-reported psychometric evaluations (eg, web-based daily surveys and Patient Health Questionnaire-4). The participants in these studies were recruited through email and enrolled on the web, and some of them participated in follow-up interviews that were conducted in person or on the web (as necessary). RESULTS Of the 47 participants, 31 (66%) completed the main study. The findings suggest that the users viewed the conversations with our chatbots as helpful or at least neutral and came away with increasingly positive sentiment toward the use of chatbots for proactive stress management. 
Moreover, those users who used the system more often (ie, they had more than or equal to the median number of conversations) noted a decrease in depression symptoms compared with those who used the system less often based on a Wilcoxon signed-rank test (W=91.50; Z=-2.54; P=.01; r=0.47). The follow-up interviews with a subset of the participants indicated that half of the common daily stressors could be discussed with chatbots, potentially reducing the burden on human coping resources. CONCLUSIONS Our work suggests that suites of shallow chatbots may offer benefits for both users and designers. As a result, this study's contributions include the design and evaluation of a novel suite of shallow chatbots for daily stress management, a summary of benefits and challenges associated with random delivery of multiple conversational interventions, and design guidelines and directions for future research into similar systems, including authoring chatbot systems and artificial intelligence-enabled recommendation algorithms.
Affiliation(s)
- Matthew Louis Mauriello
- Department of Computer and Information Sciences, University of Delaware, Newark, DE, United States
| | - Nantanick Tantivasadakarn
- Symbolic Systems Program, School of Humanities and Sciences, Stanford University, Stanford, CA, United States
| | | | | | - Grace Hon
- Stanford School of Medicine, Stanford University, Stanford, CA, United States
| | - Parsa Nowruzi
- Stanford School of Medicine, Stanford University, Stanford, CA, United States
| | - Dorien Simon
- Computer Science Department, College of Engineering, Stanford University, Stanford, CA, United States
| | - Luke Hansen
- Symbolic Systems Program, School of Humanities and Sciences, Stanford University, Stanford, CA, United States
| | - Nathaniel H Goenawan
- Computer Science Department, College of Engineering, Stanford University, Stanford, CA, United States
| | - Joshua Kim
- Computer Science Department, College of Engineering, Stanford University, Stanford, CA, United States
| | - Nikhil Gowda
- Alliance Innovation Lab, Silicon Valley, CA, United States
| | - Dan Jurafsky
- Computer Science Department, College of Engineering, Stanford University, Stanford, CA, United States
- Department of Linguistics, School of Humanities and Sciences, Stanford University, Stanford, CA, United States
| | | |
|
42
|
Intelligent Cognitive Assistants for Attitude and Behavior Change Support in Mental Health: State-of-the-Art Technical Review. ELECTRONICS 2021. [DOI: 10.3390/electronics10111250] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
Intelligent cognitive assistant (ICA) technology is used in various domains to emulate human behavior expressed through synchronous communication, especially written conversation. Due to their ability to use individually tailored natural language, ICAs present a powerful vessel to support attitude and behavior change. Behavior change support systems are emerging as a crucial tool in digital mental health services, and ICAs excel in effective support, especially for stress, anxiety and depression (SAD), where ICAs guide people’s thought processes and actions by analyzing their affective and cognitive phenomena. Currently, there is no comprehensive review of such ICAs from a technical standpoint, and existing work is conducted exclusively from a psychological or medical perspective. This technical state-of-the-art review tried to discern and systematize current technological approaches and trends as well as detail the highly interdisciplinary landscape of intersections between ICAs, attitude and behavior change, and mental health, focusing on text-based ICAs for SAD. Ten papers describing systems that fit our criteria were selected. The systems varied significantly in their approaches, with the most successful opting for comprehensive user models, classification-based assessment, personalized intervention, and dialogue tree conversational models.
|
43
|
Bérubé C, Schachner T, Keller R, Fleisch E, V Wangenheim F, Barata F, Kowatsch T. Voice-Based Conversational Agents for the Prevention and Management of Chronic and Mental Health Conditions: Systematic Literature Review. J Med Internet Res 2021; 23:e25933. [PMID: 33658174 PMCID: PMC8042539 DOI: 10.2196/25933] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/30/2020] [Revised: 02/10/2021] [Accepted: 03/03/2021] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Chronic and mental health conditions are increasingly prevalent worldwide. As devices in our everyday lives offer more and more voice-based self-service, voice-based conversational agents (VCAs) have the potential to support the prevention and management of these conditions in a scalable manner. However, evidence on VCAs dedicated to the prevention and management of chronic and mental health conditions is unclear. OBJECTIVE This study provides a better understanding of the current methods used in the evaluation of health interventions for the prevention and management of chronic and mental health conditions delivered through VCAs. METHODS We conducted a systematic literature review using PubMed MEDLINE, Embase, PsycINFO, Scopus, and Web of Science databases. We included primary research involving the prevention or management of chronic or mental health conditions through a VCA and reporting an empirical evaluation of the system either in terms of system accuracy, technology acceptance, or both. A total of 2 independent reviewers conducted the screening and data extraction, and agreement between them was measured using Cohen kappa. A narrative approach was used to synthesize the selected records. RESULTS Of 7170 prescreened papers, 12 met the inclusion criteria. All studies were nonexperimental. The VCAs provided behavioral support (n=5), health monitoring services (n=3), or both (n=4). The interventions were delivered via smartphones (n=5), tablets (n=2), or smart speakers (n=3). In 2 cases, no device was specified. A total of 3 VCAs targeted cancer, whereas 2 VCAs targeted diabetes and heart failure. The other VCAs targeted hearing impairment, asthma, Parkinson disease, dementia, autism, intellectual disability, and depression. The majority of the studies (n=7) assessed technology acceptance, but only few studies (n=3) used validated instruments. 
Half of the studies (n=6) reported performance measures either on speech recognition or on the ability of VCAs to respond to health-related queries. Only a minority of the studies (n=2) reported behavioral measures or a measure of attitudes toward intervention-targeted health behavior. Moreover, only a minority of studies (n=4) reported controlling for participants' previous experience with technology. Finally, risk of bias varied markedly. CONCLUSIONS The heterogeneity in the methods, the limited number of studies identified, and the high risk of bias show that research on VCAs for chronic and mental health conditions is still in its infancy. Although the results of system accuracy and technology acceptance are encouraging, there is still a need to establish more conclusive evidence on the efficacy of VCAs for the prevention and management of chronic and mental health conditions, both in absolute terms and in comparison with standard health care.
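The abstract above reports that agreement between the two reviewers was measured with Cohen kappa. As a minimal sketch of how that statistic is computed, kappa compares the observed agreement with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function name `cohen_kappa` and the include/exclude labels below are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label for every item; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for four records (not data from the review).
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

In this hypothetical example the reviewers agree on three of four records (p_o = 0.75) and chance agreement is 0.5, so kappa is 0.5.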
Affiliation(s)
- Caterina Bérubé
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
| | - Theresa Schachner
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
| | - Roman Keller
- Future Health Technologies Programme, Campus for Research Excellence and Technological Enterprise (CREATE), Singapore-ETH Centre, Singapore, Singapore
| | - Elgar Fleisch
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
- Future Health Technologies Programme, Campus for Research Excellence and Technological Enterprise (CREATE), Singapore-ETH Centre, Singapore, Singapore
- Center for Digital Health Interventions, Institute of Technology Management, University of St. Gallen, St. Gallen, Switzerland
| | - Florian V Wangenheim
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
- Future Health Technologies Programme, Campus for Research Excellence and Technological Enterprise (CREATE), Singapore-ETH Centre, Singapore, Singapore
| | - Filipe Barata
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
| | - Tobias Kowatsch
- Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Zurich, Switzerland
- Future Health Technologies Programme, Campus for Research Excellence and Technological Enterprise (CREATE), Singapore-ETH Centre, Singapore, Singapore
- Center for Digital Health Interventions, Institute of Technology Management, University of St. Gallen, St. Gallen, Switzerland
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
| |
|
44
|
Abd-Alrazaq AA, Alajlani M, Ali N, Denecke K, Bewick BM, Househ M. Perceptions and Opinions of Patients About Mental Health Chatbots: Scoping Review. J Med Internet Res 2021; 23:e17828. [PMID: 33439133 PMCID: PMC7840290 DOI: 10.2196/17828] [Citation(s) in RCA: 92] [Impact Index Per Article: 23.0] [Received: 01/16/2020] [Revised: 06/01/2020] [Accepted: 06/21/2020] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Chatbots have been used in the last decade to improve access to mental health care services. Perceptions and opinions of patients influence the adoption of chatbots for health care. Many studies have been conducted to assess the perceptions and opinions of patients about mental health chatbots. To the best of our knowledge, there has been no review of the evidence surrounding perceptions and opinions of patients about mental health chatbots. OBJECTIVE This study aims to conduct a scoping review of the perceptions and opinions of patients about chatbots for mental health. METHODS The scoping review was carried out in line with the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) extension for scoping reviews guidelines. Studies were identified by searching 8 electronic databases (eg, MEDLINE and Embase) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. In total, 2 reviewers independently selected studies and extracted data from the included studies. Data were synthesized using thematic analysis. RESULTS Of 1072 citations retrieved, 37 unique studies were included in the review. The thematic analysis generated 10 themes from the findings of the studies: usefulness, ease of use, responsiveness, understandability, acceptability, attractiveness, trustworthiness, enjoyability, content, and comparisons. CONCLUSIONS The results demonstrated overall positive perceptions and opinions of patients about chatbots for mental health. Important issues to be addressed in the future are the linguistic capabilities of the chatbots: they have to be able to deal adequately with unexpected user input, provide high-quality responses, and have to show high variability in responses. To be useful for clinical practice, we have to find ways to harmonize chatbot content with individual treatment recommendations, that is, a personalization of chatbot conversations is required.
Affiliation(s)
- Alaa A Abd-Alrazaq
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
| | - Mohannad Alajlani
- Institute of Digital Healthcare, University of Warwick, Warwick, United Kingdom
| | - Nashva Ali
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
| | - Kerstin Denecke
- Institute for Medical Informatics, Bern University of Applied Science, Bern, Switzerland
| | - Bridgette M Bewick
- Leeds Institute of Health Sciences, School of Medicine, University of Leeds, Leeds, United Kingdom
| | - Mowafa Househ
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar
| |
|
45
|
Mokmin NAM, Ibrahim NA. The evaluation of chatbot as a tool for health literacy education among undergraduate students. EDUCATION AND INFORMATION TECHNOLOGIES 2021; 26:6033-6049. [PMID: 34054328 PMCID: PMC8144870 DOI: 10.1007/s10639-021-10542-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/10/2021] [Accepted: 04/06/2021] [Indexed: 05/10/2023]
Abstract
This study discussed and evaluated the usefulness, performance, and technology acceptance of a chatbot developed to educate users and provide health literacy. A semi-structured interview was conducted and usage sessions were analyzed on the Google Analytics dashboard, and the users' acceptance of the technology was measured using the Unified Theory of Acceptance and Use of Technology 2 (UTAUT2). A total of 75 undergraduate students were involved over a total period of two months. Each respondent actively explored the health chatbot via mobile devices, seeking advice with phrases that matched the chatbot's intents. The evaluation results showed that 73.3% of the respondents found that the chatbot can help them understand several health issues and provide a good conversation. The performance evaluation also showed a low exit rate, with less than 37% of users exiting the application. The overall assessment showed that the developed chatbot has significant potential to be used as a conversational agent to increase health literacy, especially among students and young adults. However, more research should be done before the technology can replace humans in a real setting.
Affiliation(s)
| | - Nurul Anwar Ibrahim
- Centre of Instructional Technology and Multimedia, Universiti Sains Malaysia, Penang, Malaysia
| |
|
46
|
Zhang J, Oh YJ, Lange P, Yu Z, Fukuoka Y. Artificial Intelligence Chatbot Behavior Change Model for Designing Artificial Intelligence Chatbots to Promote Physical Activity and a Healthy Diet: Viewpoint. J Med Internet Res 2020; 22:e22845. [PMID: 32996892 PMCID: PMC7557439 DOI: 10.2196/22845] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Received: 07/27/2020] [Revised: 09/03/2020] [Accepted: 09/17/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Chatbots empowered by artificial intelligence (AI) can increasingly engage in natural conversations and build relationships with users. Applying AI chatbots to lifestyle modification programs is one of the promising areas to develop cost-effective and feasible behavior interventions to promote physical activity and a healthy diet. OBJECTIVE The purposes of this perspective paper are to present a brief literature review of chatbot use in promoting physical activity and a healthy diet, describe the AI chatbot behavior change model our research team developed based on extensive interdisciplinary research, and discuss ethical principles and considerations. METHODS We conducted a preliminary search of studies reporting chatbots for improving physical activity and/or diet in four databases in July 2020. We summarized the characteristics of the chatbot studies and reviewed recent developments in human-AI communication research and innovations in natural language processing. Based on the identified gaps and opportunities, as well as our own clinical and research experience and findings, we propose an AI chatbot behavior change model. RESULTS Our review found a lack of understanding around theoretical guidance and practical recommendations on designing AI chatbots for lifestyle modification programs. The proposed AI chatbot behavior change model consists of the following four components to provide such guidance: (1) designing chatbot characteristics and understanding user background; (2) building relational capacity; (3) building persuasive conversational capacity; and (4) evaluating mechanisms and outcomes. The rationale and evidence supporting the design and evaluation choices for this model are presented in this paper. 
CONCLUSIONS As AI chatbots become increasingly integrated into various digital communications, our proposed theoretical framework is the first step to conceptualize the scope of utilization in health behavior change domains and to synthesize all possible dimensions of chatbot features to inform intervention design and evaluation. There is a need for more interdisciplinary work to continue developing AI techniques to improve a chatbot's relational and persuasive capacities to change physical activity and diet behaviors with strong ethical principles.
Affiliation(s)
- Jingwen Zhang
- Department of Communication, University of California, Davis, Davis, CA, United States
- Department of Public Health Sciences, University of California, Davis, Davis, CA, United States
| | - Yoo Jung Oh
- Department of Communication, University of California, Davis, Davis, CA, United States
| | - Patrick Lange
- Department of Computer Science, University of California, Davis, Davis, CA, United States
| | - Zhou Yu
- Department of Computer Science, University of California, Davis, Davis, CA, United States
| | - Yoshimi Fukuoka
- Department of Physiological Nursing, University of California, San Francisco, San Francisco, CA, United States
| |
|
47
|
Text Messaging-Based Medical Diagnosis Using Natural Language Processing and Fuzzy Logic. JOURNAL OF HEALTHCARE ENGINEERING 2020. [DOI: 10.1155/2020/8839524] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022]
Abstract
The use of natural language processing (NLP) methods and their application to developing conversational systems for health diagnosis increases patients’ access to medical knowledge. In this study, a chatbot service was developed for the Covenant University Doctor (CUDoctor) telehealth system based on fuzzy logic rules and fuzzy inference. The service focuses on assessing the symptoms of tropical diseases in Nigeria. The Telegram Bot Application Programming Interface (API) was used to create the interconnection between the chatbot and the system, while the Twilio API was used for interconnectivity between the system and a short messaging service (SMS) subscriber. The service uses a knowledge base consisting of known facts on diseases and symptoms acquired from medical ontologies. A fuzzy support vector machine (SVM) is used to predict the disease based on the symptoms entered. User inputs are recognized by NLP and forwarded to CUDoctor for decision support. Finally, a notification message indicating the end of the diagnosis process is sent to the user. The result is a medical diagnosis system that provides a personalized diagnosis based on users' self-reported input. The usability of the developed system was evaluated using the system usability scale (SUS), yielding a mean SUS score of 80.4, which indicates an overall positive evaluation.
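For readers unfamiliar with how a mean SUS score such as the 80.4 reported above is obtained, the following is a minimal sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0-100 score. The function name `sus_score` and the respondent data are hypothetical and are not taken from the study.

```python
from statistics import mean


def sus_score(responses):
    """Score one respondent's ten SUS ratings (each 1-5) on the 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("each rating must be between 1 and 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher rating is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower rating is better.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings from three respondents (illustrative only).
respondents = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
mean_sus = mean(sus_score(r) for r in respondents)
```

A study-level figure is simply the mean of the per-respondent scores, as computed for `mean_sus` here.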
|