1
Gu JZ, Baird GL, Escamilla Guevara A, Sohn YJ, Lydston M, Doyle C, Tevis SEA, Miles RC. A systematic review and meta-analysis of English language online patient education materials in breast cancer: Is readability the only story? Breast 2024; 75:103722. [PMID: 38603836 PMCID: PMC11019273 DOI: 10.1016/j.breast.2024.103722]
Abstract
BACKGROUND Online patient education materials (OPEMs) are an increasingly popular resource for women seeking information about breast cancer. The American Medical Association (AMA) recommends that written patient materials be at or below a 6th-grade reading level to match the general public's health literacy. Metrics such as quality, understandability, and actionability also heavily influence the usability of health information, and thus should be evaluated alongside readability. PURPOSE A systematic review and meta-analysis was conducted to determine: 1) average readability scores and reporting methodologies of breast cancer readability studies; and 2) the inclusion frequency of additional health literacy-associated metrics. MATERIALS AND METHODS A registered systematic review and meta-analysis was conducted in Ovid MEDLINE, Web of Science, Embase.com, CENTRAL via Ovid, and ClinicalTrials.gov in June 2022, in adherence with the PRISMA 2020 statement. Eligible studies performed readability analyses on English-language, breast cancer-related OPEMs. Study characteristics, readability data, and reporting of non-readability health literacy metrics were extracted. Meta-analysis estimates were derived from generalized linear mixed modeling. RESULTS The meta-analysis included 30 studies yielding 4462 OPEMs. Overall average readability was grade 11.81 (95% CI [11.14, 12.49]), with a significant difference (p < 0.001) when grouped by OPEM category. Commercial organizations had the highest average readability at 12.2 [11.3, 13.0]; non-profit organizations had one of the lowest at 11.3 [10.6, 12.0]. Readability also varied by index, with New Fog, Lexile, and FORCAST yielding the lowest average scores (9.4 [8.6, 10.3], 10.4 [10.0, 10.8], and 10.7 [10.2, 11.1], respectively). Only 57% of studies calculated average readability with more than two indices, and only 60% of studies assessed other OPEM metrics associated with health literacy.
CONCLUSION Average readability of breast cancer OPEMs is nearly double the AMA's recommended 6th grade level. Readability and other health literacy-associated metrics are inconsistently reported in the current literature. Standardization of future readability studies, with a focus on holistic evaluation of patient materials, may aid shared decision-making and be critical to increased screening rates and breast cancer awareness.
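The indices compared above differ mainly in which surface features of the text they count. As a rough, self-contained illustration (not the tooling used in the review), the FORCAST formula, grade = 20 - N/10 where N is the number of single-syllable words in a 150-word sample, can be sketched with a naive vowel-group syllable heuristic:

```python
import re

def _syllables(word):
    # crude vowel-group heuristic; validated calculators use dictionary counts
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def forcast_grade(words):
    """FORCAST readability: grade = 20 - N/10, where N is the number of
    single-syllable words per 150-word sample."""
    mono = sum(1 for w in words if _syllables(w) == 1)
    n = mono * 150 / len(words)  # scale the count to a 150-word sample
    return 20 - n / 10
```

Because FORCAST ignores sentence length, it tends to score list-like patient materials lower than sentence-based indices, which is consistent with it producing some of the lowest averages in the review.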
Affiliation(s)
- Joey Z Gu
- Department of Medicine, Roger Williams Medical Center, Providence, RI, USA.
- Grayson L Baird
- Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Providence, RI, USA; Lifespan Biostatistics, Epidemiology, and Research Design, Providence, RI, USA
- Young-Jin Sohn
- Harvard Medical School Center for Primary Care, Boston, MA, USA
- Melis Lydston
- Treadwell Virtual Library, Massachusetts General Hospital, Boston, MA, USA
- Christopher Doyle
- Department of Radiology and Medical Imaging, Denver Health Hospital and Authority, Denver, CO, USA
- Sarah E A Tevis
- Department of Surgery, School of Medicine Anschutz Medical Campus, Aurora, CO, USA
- Randy C Miles
- Department of Radiology and Medical Imaging, Denver Health Hospital and Authority, Denver, CO, USA
2
DiSipio T, Scholte C, Diaz A. Evaluation of online text-based information resources of gynaecological cancer symptoms. Cancer Med 2024; 13:e7167. [PMID: 38676385 PMCID: PMC11053368 DOI: 10.1002/cam4.7167]
Abstract
BACKGROUND Gynaecological cancer symptoms are often vague and non-specific, and quality health information is central to timely cancer diagnosis and treatment. The aim of this study was to identify and evaluate the quality of online text-based patient information resources regarding gynaecological cancer symptoms. METHODS A targeted website search and Google search were conducted to identify health information resources published by the Australian government and non-government health organisations. Resources were classified by topic (gynaecological health, gynaecological cancers, cancer, general health) and assessed for reading level (Simple Measure of Gobbledygook, SMOG), reading difficulty (Flesch Reading Ease, FRE), and understandability and actionability (Patient Education Materials Assessment Tool, PEMAT, scored 0-100, with higher scores indicating better understandability/actionability). Seven criteria were used to assess cultural inclusivity specific to Aboriginal and Torres Strait Islander people; resources that met 3-5 items were deemed moderately inclusive and those that met 6 or more items were deemed inclusive. RESULTS A total of 109 resources were identified, and 76% provided information on symptoms in the context of gynaecological cancers. The average readability was equivalent to a grade 10 reading level on the SMOG and classified as 'difficult to read' on the FRE. The mean PEMAT scores were 95% (range 58-100) for understandability and 13% (range 0-80) for actionability. Five resources were evaluated as moderately culturally inclusive, and no resource met all the benchmarks. CONCLUSIONS This study highlights the inadequate quality of online resources available on pre-diagnosis gynaecological cancer symptom information. Resources should be revised in line with recommended standards for readability, understandability, and actionability, and to meet the needs of a culturally diverse population.
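Both readability metrics used in this study are simple functions of sentence, word, and syllable counts. A minimal sketch of the published SMOG and Flesch Reading Ease formulas follows; the vowel-group syllable heuristic is an assumption for illustration, whereas the validated tools the authors used rely on more careful syllable counting:

```python
import math
import re

def _syllables(word):
    # crude vowel-group heuristic, for illustration only
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog(text):
    """SMOG grade = 3.1291 + 1.0430 * sqrt(polysyllables * 30 / sentences)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    poly = sum(1 for w in words if _syllables(w) >= 3)
    return 3.1291 + 1.0430 * math.sqrt(poly * 30 / sentences)

def flesch_reading_ease(text):
    """FRE = 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word).
    Higher scores mean easier text; below ~50 is 'difficult to read'."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syl = sum(_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syl / len(words))
```

Note that SMOG returns a grade level (higher = harder) while FRE returns an ease score (higher = easier), which is why the study reports them separately.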
Affiliation(s)
- Tracey DiSipio
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
- Cate Scholte
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
- Abbey Diaz
- School of Public Health, The University of Queensland, Brisbane, Queensland, Australia
3
Garcia Valencia OA, Thongprayoon C, Miao J, Suppadungsuk S, Krisanapan P, Craici IM, Jadlowiec CC, Mao SA, Mao MA, Leeaphorn N, Budhiraja P, Cheungpasitporn W. Empowering inclusivity: improving readability of living kidney donation information with ChatGPT. Front Digit Health 2024; 6:1366967. [PMID: 38659656 PMCID: PMC11039889 DOI: 10.3389/fdgth.2024.1366967]
Abstract
Background Addressing disparities in living kidney donation requires making information accessible across literacy levels, especially important given that the average American adult reads at an 8th-grade level. This study evaluated the effectiveness of ChatGPT, an advanced AI language model, in simplifying living kidney donation information to an 8th-grade reading level or below. Methods We used ChatGPT versions 3.5 and 4.0 to modify 27 questions and answers from Donate Life America, a key resource on living kidney donation. We measured the readability of both original and modified texts using the Flesch-Kincaid formula. A paired t-test was conducted to assess changes in readability levels, and a statistical comparison between the two ChatGPT versions was performed. Results Originally, the FAQs had an average reading level of 9.6 ± 1.9. Post-modification, ChatGPT 3.5 achieved an average readability level of 7.72 ± 1.85, while ChatGPT 4.0 reached 4.30 ± 1.71, both with a p-value <0.001 indicating significant reduction. ChatGPT 3.5 made 59.26% of answers readable below 8th-grade level, whereas ChatGPT 4.0 did so for 96.30% of the texts. The grade level range for modified answers was 3.4-11.3 for ChatGPT 3.5 and 1-8.1 for ChatGPT 4.0. Conclusion Both ChatGPT 3.5 and 4.0 effectively lowered the readability grade levels of complex medical information, with ChatGPT 4.0 being more effective. This suggests ChatGPT's potential role in promoting diversity and equity in living kidney donation, indicating scope for further refinement in making medical information more accessible.
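The Flesch-Kincaid grade level used to score the original and modified answers is a linear function of average sentence length and average syllables per word. A minimal sketch of the published formula (the naive syllable counter is an assumption; production calculators use dictionary-based counts):

```python
import re

def fk_grade(text):
    """Flesch-Kincaid Grade Level:
    0.39*(words/sentence) + 11.8*(syllables/word) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    # naive vowel-group syllable estimate
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                    for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59
```

Scoring each answer before and after modification with a function like this, then comparing the paired scores, mirrors the paired t-test design described above.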
Affiliation(s)
- Oscar A. Garcia Valencia
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Charat Thongprayoon
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Jing Miao
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Supawadee Suppadungsuk
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, Thailand
- Pajaree Krisanapan
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Division of Nephrology, Department of Internal Medicine, Faculty of Medicine, Thammasat University, Pathum Thani, Thailand
- Division of Nephrology, Department of Internal Medicine, Thammasat University Hospital, Pathum Thani, Thailand
- Iasmina M. Craici
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
- Caroline C. Jadlowiec
- Division of Transplant Surgery, Department of Surgery, Mayo Clinic, Phoenix, AZ, United States
- Shennen A. Mao
- Division of Transplant Surgery, Department of Transplant, Mayo Clinic, Jacksonville, FL, United States
- Michael A. Mao
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Jacksonville, FL, United States
- Napat Leeaphorn
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Jacksonville, FL, United States
- Pooja Budhiraja
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Phoenix, AZ, United States
- Wisit Cheungpasitporn
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN, United States
4
Hazewinkel MHJ, Gfrerer L, Ashina S, Austen WG, Klassen AF, Pusic A, Kaur MN. Readability analysis and concept mapping of PROMs used for headache disorders. Headache 2024; 64:410-423. [PMID: 38525832 DOI: 10.1111/head.14706]
Abstract
OBJECTIVE To assess the readability and the comprehensiveness of patient-reported outcome measures (PROMs) utilized in primary headache disorders literature. BACKGROUND As the health-care landscape has evolved toward a patient-centric model, numerous PROMs have been developed to capture treatment outcomes in patients with headache disorders. For these PROMs to advance our understanding of headache disorders and their treatment impact, they must be easy to understand (i.e., reading grade level 6 or less) and comprehensively capture what matters to patients with headache. The aim of this study was to (a) assess the readability of PROMs utilized in headache disorders literature, and (b) assess the comprehensiveness of PROMs by mapping their content to a health-related quality of life framework. METHODS In this scoping review, recently published systematic reviews were used to identify PROMs used in primary headache disorders literature. Readability analysis was performed at the level of individual items and full PROM using established readability metrics. The content of the PROMs was mapped against a health-related quality-of-life framework by two independent reviewers. RESULTS In total, 22 PROMs (15 headache disorders related, 7 generic) were included. The median reading grade level varied between 7.1 (interquartile range [IQR] 6.3-7.8) and 12.7 (IQR 11.8-13.2). None of the PROMs were below the recommended reading grade level for patient-facing material (grade 6). Three PROMs, the Migraine-Treatment Assessment Questionnaire, the Eurolight, and the European Quality of Life 5 Dimensions 3 Level Version, were between reading grade levels 7 and 8; the remaining 19 PROMs were above reading grade level 8. In total, the PROMs included 425 items. Most items (n = 134, 32%) assessed physical function (e.g., work, activities of daily living). 
The remaining items assessed physical symptoms (n = 127, 30%; e.g., pain, nausea), treatment effects on symptoms (n = 65, 15%; e.g., accompanying symptoms relief, headache relief), treatment impact (n = 56, 13%; e.g., function, side effects), psychological well-being (n = 41, 10%; e.g., anger, frustration), social well-being (n = 29, 7%; e.g., missing out on social activities, relationships), psychological impact (n = 14, 3%; e.g., feeling [not] in control, feeling like a burden), and sexual well-being (n = 3, 1%; e.g., sexual activity, sexual interest). Some of the items pertained to treatment (n = 27, 6%), of which most were about treatment type and use (n = 12, 3%; e.g., medication, botulinum toxin), treatment access (n = 10, 2%; e.g., health-care utilization, cost of medication), and treatment experience (n = 9, 2%; e.g., treatment satisfaction, confidence in treatment). CONCLUSION The PROMs used in studies of headache disorders may be challenging for some patients to understand, leading to inaccurate or missing data. Furthermore, no available PROM comprehensively measures the health-related quality-of-life impact of headache disorders or their treatment, resulting in a limited understanding of patient-reported outcomes. The development of an easy-to-understand, comprehensive, and validated headache disorders-specific PROM is warranted.
Affiliation(s)
- Merel H J Hazewinkel
- Department of Plastic and Reconstructive Surgery, Weill Cornell Medicine, New York, New York, USA
- Lisa Gfrerer
- Department of Plastic and Reconstructive Surgery, Weill Cornell Medicine, New York, New York, USA
- Sait Ashina
- Department of Neurology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Department of Anesthesia, Critical Care, Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- William G Austen
- Department of Plastic and Reconstructive Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Anne F Klassen
- Department of Pediatrics, McMaster University, Hamilton, Ontario, Canada
- Andrea Pusic
- Patient Reported Outcomes, Value and Experience Center (PROVE), Brigham and Women's Hospital, Boston, Massachusetts, USA
- Manraj N Kaur
- Patient Reported Outcomes, Value and Experience Center (PROVE), Brigham and Women's Hospital, Boston, Massachusetts, USA
5
Schilstra CE, Ellis SJ, Cohen J, Gall A, Diaz A, Clarke K, Dumlao G, Chard J, Cumming TM, Davis E, Dhillon H, Burns MA, Docking K, Koh ES, O'Reilly J, Sansom-Daly UM, Shaw J, Speers N, Taylor N, Warne A, Fardell JE. Exploring Web-Based Information and Resources That Support Adolescents and Young Adults With Cancer to Resume Study and Work: Environmental Scan Study. JMIR Cancer 2024; 10:e47944. [PMID: 38526527 PMCID: PMC11002739 DOI: 10.2196/47944]
Abstract
BACKGROUND Adolescents and young adults (AYAs) diagnosed with cancer experience physical, cognitive, and psychosocial effects from cancer treatment that can negatively affect their ability to remain engaged in education or work through cancer treatment and in the long term. Disengagement from education or work can have lasting implications for AYAs' financial independence, psychosocial well-being, and quality of life. Australian AYAs with cancer lack access to adequate specialist support for their education and work needs and report a preference for web-based support that they can access from anywhere, in their own time. However, it remains unclear what web-based resources exist that are tailored to support AYAs with cancer in reaching their educational or work goals. OBJECTIVE This study aimed to determine what web-based resources exist for Australian AYAs with cancer to (1) support return to education or work and (2) identify the degree to which existing resources are age-specific, cancer-specific, culturally inclusive, and evidence-based; are co-designed with AYAs; use age-appropriate language; and are easy to find. METHODS We conducted an environmental scan by searching Google with English search terms in August 2022 to identify information resources about employment and education for AYAs ever diagnosed with cancer. Data extraction was conducted in Microsoft Excel, and the following were assessed: understandability and actionability (using the Patient Education and Materials Tool), readability (using the Sydney Health Literacy Laboratory Health Literacy Editor), and whether the resource was easy to locate, evidence-based, co-designed with AYAs, and culturally inclusive of Aboriginal and Torres Strait Islander peoples. The latter was assessed using 7 criteria previously developed by members of the research team. RESULTS We identified 24 web-based resources, comprising 22 written text resources and 12 video resources. 
Most resources (21/24, 88%) were published by nongovernmental organizations in Australia, Canada, the United States, and the United Kingdom. A total of 7 resources focused on education, 8 focused on work, and 9 focused on both education and work. The evaluation of resources demonstrated poor understandability and actionability. Resources were rarely evidence-based or co-designed by AYAs, difficult to locate on the internet, and largely not inclusive of Aboriginal and Torres Strait Islander populations. CONCLUSIONS Although web-based resources for AYAs with cancer are often available through the websites of hospitals or nongovernmental organizations, this environmental scan suggests they would benefit from more evidence-based and actionable resources that are available in multiple formats (eg, text and audio-visual) and tailored to be age-appropriate and culturally inclusive.
Affiliation(s)
- Clarissa E Schilstra
- Behavioural Sciences Unit, Kids Cancer Centre, Sydney Children's Hospital, Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Randwick, Australia
- Sarah J Ellis
- Behavioural Sciences Unit, Kids Cancer Centre, Sydney Children's Hospital, Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Randwick, Australia
- Jennifer Cohen
- Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Kensington, Australia
- Canteen Australia, Newtown, Australia
- Alana Gall
- National Centre for Naturopathic Medicine, Faculty of Health, Southern Cross University, Lismore, Australia
- Abbey Diaz
- School of Public Health, Faculty of Medicine, University of Queensland, Brisbane, Australia
- Gadiel Dumlao
- Behavioural Sciences Unit, Kids Cancer Centre, Sydney Children's Hospital, Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Randwick, Australia
- Jennifer Chard
- Western Sydney Youth Cancer Service, Crown Princess Mary Cancer Centre, Westmead Hospital, Westmead, Australia
- Therese M Cumming
- Faculty of Arts, Design and Architecture, University of New South Wales Sydney, Kensington, Australia
- Disability Innovation Institute, University of New South Wales Sydney, Kensington, Australia
- Haryana Dhillon
- School of Psychology, Faculty of Science, University of Sydney, Camperdown, Australia
- Mary Anne Burns
- School of Psychology, Faculty of Science, University of Sydney, Camperdown, Australia
- Kimberley Docking
- Faculty of Medicine and Health, University of Sydney, Camperdown, Australia
- Eng-Siew Koh
- South West Sydney Clinical School, Faculty of Medicine and Health, University of New South Wales Sydney, Liverpool, Australia
- Liverpool and Macarthur Cancer Therapy Centres, Liverpool, Australia
- Ingham Institute for Applied Medical Research, Liverpool, Australia
- Ursula M Sansom-Daly
- Behavioural Sciences Unit, Kids Cancer Centre, Sydney Children's Hospital, Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Randwick, Australia
- Sydney Youth Cancer Service, Nelune Comprehensive Cancer Centre, Prince of Wales Hospital, Randwick, Australia
- Joanne Shaw
- School of Psychology, Faculty of Science, University of Sydney, Camperdown, Australia
- Nicole Speers
- Cancer survivor representative, New South Wales, Australia
- Natalie Taylor
- Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Kensington, Australia
- Anthea Warne
- Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Kensington, Australia
- Joanna E Fardell
- Behavioural Sciences Unit, Kids Cancer Centre, Sydney Children's Hospital, Faculty of Medicine and Health, Randwick Clinical Campus, University of New South Wales Sydney, Randwick, Australia
- Western Sydney Youth Cancer Service, Crown Princess Mary Cancer Centre, Westmead Hospital, Westmead, Australia
6
Zaretsky J, Kim JM, Baskharoun S, Zhao Y, Austrian J, Aphinyanaphongs Y, Gupta R, Blecker SB, Feldman J. Generative Artificial Intelligence to Transform Inpatient Discharge Summaries to Patient-Friendly Language and Format. JAMA Netw Open 2024; 7:e240357. [PMID: 38466307 DOI: 10.1001/jamanetworkopen.2024.0357]
Abstract
Importance By law, patients have immediate access to discharge notes in their medical records. Technical language and abbreviations make notes difficult to read and understand for a typical patient. Large language models (LLMs [eg, GPT-4]) have the potential to transform these notes into patient-friendly language and format. Objective To determine whether an LLM can transform discharge summaries into a format that is more readable and understandable. Design, Setting, and Participants This cross-sectional study evaluated a sample of the discharge summaries of adult patients discharged from the General Internal Medicine service at NYU (New York University) Langone Health from June 1 to 30, 2023. Patients discharged as deceased were excluded. All discharge summaries were processed by the LLM between July 26 and August 5, 2023. Interventions A secure Health Insurance Portability and Accountability Act-compliant platform, Microsoft Azure OpenAI, was used to transform these discharge summaries into a patient-friendly format between July 26 and August 5, 2023. Main Outcomes and Measures Outcomes included readability as measured by Flesch-Kincaid Grade Level and understandability using Patient Education Materials Assessment Tool (PEMAT) scores. Readability and understandability of the original discharge summaries were compared with the transformed, patient-friendly discharge summaries created through the LLM. As balancing metrics, accuracy and completeness of the patient-friendly version were measured. Results Discharge summaries of 50 patients (31 female [62.0%] and 19 male [38.0%]) were included. The median patient age was 65.5 (IQR, 59.0-77.5) years. Mean (SD) Flesch-Kincaid Grade Level was significantly lower in the patient-friendly discharge summaries (6.2 [0.5] vs 11.0 [1.5]; P < .001). PEMAT understandability scores were significantly higher for patient-friendly discharge summaries (81% vs 13%; P < .001). 
Two physicians reviewed each patient-friendly discharge summary for accuracy on a 6-point scale, with 54 of 100 reviews (54.0%) giving the best possible rating of 6. Summaries were rated entirely complete in 56 reviews (56.0%). Eighteen reviews noted safety concerns, mostly involving omissions, but also several inaccurate statements (termed hallucinations). Conclusions and Relevance The findings of this cross-sectional study of 50 discharge summaries suggest that LLMs can be used to translate discharge summaries into patient-friendly language and formats that are significantly more readable and understandable than discharge summaries as they appear in electronic health records. However, implementation will require improvements in accuracy, completeness, and safety. Given the safety concerns, initial implementation will require physician review.
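The PEMAT scores reported above are computed as the percentage of applicable items rated "agree" within a domain (understandability or actionability), with not-applicable items excluded from the denominator. A minimal sketch of that scoring rule (function name and rating encoding are illustrative, not taken from the study):

```python
def pemat_score(ratings):
    """PEMAT domain score: percentage of applicable items rated 'agree'.
    ratings: list of 1 (agree), 0 (disagree), or None (not applicable)."""
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("no applicable items to score")
    return 100 * sum(applicable) / len(applicable)
```

Comparing domain scores computed this way for original versus transformed summaries reproduces the 13% versus 81% understandability contrast reported above in structure, though the actual item ratings come from trained human reviewers.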
Affiliation(s)
- Jonah Zaretsky
- Division of Hospital Medicine, Department of Medicine, NYU (New York University) Langone Health, New York, New York
- Jeong Min Kim
- Division of Hospital Medicine, Department of Medicine, NYU (New York University) Langone Health, New York, New York
- Yunan Zhao
- Department of Population Health, NYU Langone Health, New York
- Jonathan Austrian
- Division of Hospital Medicine, Department of Medicine, NYU (New York University) Langone Health, New York, New York
- Department of Health Informatics, NYU Langone Medical Center Information Technology, New York
- Yindalon Aphinyanaphongs
- Department of Population Health, NYU Langone Health, New York
- Predictive Analytics Unit, NYU Langone Health, New York
- Ravi Gupta
- Department of Internal Medicine, Long Island Community Hospital, NYU Langone Health, New York
- Saul B Blecker
- Division of Hospital Medicine, Department of Medicine, NYU (New York University) Langone Health, New York, New York
- Department of Population Health, NYU Langone Health, New York
- Jonah Feldman
- Department of Medicine, NYU Long Island School of Medicine, Mineola
- Department of Health Informatics, NYU Langone Medical Center Information Technology, New York
7
Ayre J, Mac O, McCaffery K, McKay BR, Liu M, Shi Y, Rezwan A, Dunn AG. New Frontiers in Health Literacy: Using ChatGPT to Simplify Health Information for People in the Community. J Gen Intern Med 2024; 39:573-577. [PMID: 37940756 DOI: 10.1007/s11606-023-08469-w]
Abstract
BACKGROUND Most health information does not meet the health literacy needs of our communities. Writing health information in plain language is time-consuming but the release of tools like ChatGPT may make it easier to produce reliable plain language health information. OBJECTIVE To investigate the capacity for ChatGPT to produce plain language versions of health texts. DESIGN Observational study of 26 health texts from reputable websites. METHODS ChatGPT was prompted to 'rewrite the text for people with low literacy'. Researchers captured three revised versions of each original text. MAIN MEASURES Objective health literacy assessment, including Simple Measure of Gobbledygook (SMOG), proportion of the text that contains complex language (%), number of instances of passive voice and subjective ratings of key messages retained (%). KEY RESULTS On average, original texts were written at grade 12.8 (SD = 2.2) and revised to grade 11.0 (SD = 1.2), p < 0.001. Original texts were on average 22.8% complex (SD = 7.5%) compared to 14.4% (SD = 5.6%) in revised texts, p < 0.001. Original texts had on average 4.7 instances (SD = 3.2) of passive text compared to 1.7 (SD = 1.2) in revised texts, p < 0.001. On average 80% of key messages were retained (SD = 15.0). The more complex original texts showed more improvements than less complex original texts. For example, when original texts were ≥ grade 13, revised versions improved by an average 3.3 grades (SD = 2.2), p < 0.001. Simpler original texts (< grade 11) improved by an average 0.5 grades (SD = 1.4), p < 0.001. CONCLUSIONS This study used multiple objective assessments of health literacy to demonstrate that ChatGPT can simplify health information while retaining most key messages. However, the revised texts typically did not meet health literacy targets for grade reading score, and improvements were marginal for texts that were already relatively simple.
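One of the objective measures above, the number of passive-voice instances, can be approximated with a simple pattern match. The sketch below is a crude heuristic for illustration only (it misses irregular participles such as "made" and can over-count adjectival participles); it is not the detector used in the study:

```python
import re

_BE = r"\b(?:is|are|was|were|be|been|being)\b"

def count_passive(text):
    """Rough passive-voice count: a form of 'be' followed by a word
    ending in -ed or -en (e.g. 'was written', 'is given')."""
    return len(re.findall(_BE + r"\s+\w+(?:ed|en)\b", text))
```

Counting such instances before and after revision gives a simple way to track the drop from 4.7 to 1.7 passive constructions per text reported above, albeit with a much blunter instrument than a validated grammar analyser.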
Affiliation(s)
- Julie Ayre
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia.
- Olivia Mac
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Kirsten McCaffery
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Brad R McKay
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Mingyi Liu
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Yi Shi
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Atria Rezwan
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
- Adam G Dunn
- Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
8
Guo Y, Qiu W, Leroy G, Wang S, Cohen T. Retrieval augmentation of large language models for lay language generation. J Biomed Inform 2024; 149:104580. [PMID: 38163514 PMCID: PMC10874606 DOI: 10.1016/j.jbi.2023.104580]
Abstract
The complex linguistic structures and specialized terminology of expert-authored content limit the accessibility of biomedical literature to the general public. Automated methods have the potential to render this literature more interpretable to readers with different educational backgrounds. Prior work has framed such lay language generation as a summarization or simplification task. However, adapting biomedical text for the lay public includes the additional and distinct task of background explanation: adding external content in the form of definitions, motivation, or examples to enhance comprehensibility. This task is especially challenging because the source document may not include the required background knowledge. Furthermore, background explanation capabilities have yet to be formally evaluated, and little is known about how best to enhance them. To address this problem, we introduce Retrieval-Augmented Lay Language (RALL) generation, which intuitively fits the need for external knowledge beyond that in expert-authored source documents. In addition, we introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation. To evaluate RALL, we augmented state-of-the-art text generation models with information retrieval of either term definitions from the UMLS and Wikipedia, or embeddings of explanations from Wikipedia documents. Of these, embedding-based RALL models improved summary quality and simplicity while maintaining factual correctness, suggesting that Wikipedia is a helpful source for background explanation in this context. We also evaluated the ability of both an open-source Large Language Model (Llama 2) and a closed-source Large Language Model (GPT-4) in background explanation, with and without retrieval augmentation. Results indicate that these LLMs can generate simplified content, but that the summary quality is not ideal. 
Taken together, this work presents the first comprehensive study of background explanation for lay language generation, paving the way for disseminating scientific knowledge to a broader audience. Our code and data are publicly available at: https://github.com/LinguisticAnomalies/pls_retrieval.
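The retrieve-then-generate pattern this abstract describes can be sketched in miniature. Everything below is an illustrative assumption rather than the paper's method: the toy passages stand in for a background-knowledge corpus, and bag-of-words cosine similarity stands in for the learned embeddings the RALL models use.

```python
from collections import Counter
import math

# Toy background passages (placeholders, not the CELLS data or Wikipedia).
PASSAGES = [
    "Hypertension is abnormally high blood pressure.",
    "A biomarker is a measurable indicator of a biological state.",
    "Chemotherapy uses drugs to kill fast-dividing cells.",
]

def bow(text: str) -> Counter:
    # Bag-of-words term counts (a crude stand-in for an embedding).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def augment(source: str, k: int = 1) -> str:
    # Retrieve the k most similar passages and prepend them as background,
    # mirroring the retrieve-then-generate pattern before the model runs.
    q = bow(source)
    ranked = sorted(PASSAGES, key=lambda p: cosine(q, bow(p)), reverse=True)
    background = " ".join(ranked[:k])
    return f"Background: {background}\nSimplify: {source}"
```

The augmented string would then be handed to the generation model, so the simplifier can draw on definitions that the source document itself never states.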
Affiliation(s)
- Yue Guo
- Biomedical and Health Informatics, University of Washington, United States of America
- Wei Qiu
- Paul G. Allen School of Computer Science, University of Washington, United States of America
- Gondy Leroy
- Management Information Systems, University of Arizona, United States of America
- Sheng Wang
- Paul G. Allen School of Computer Science, University of Washington, United States of America
- Trevor Cohen
- Biomedical and Health Informatics, University of Washington, United States of America
9
Ayre J, Muscat DM, Mac O, Bonner C, Dunn AG, Dalmazzo J, Mouwad D, McCaffery K. Helping patient educators meet health literacy needs: End-user testing and iterative development of an innovative health literacy editing tool. PEC INNOVATION 2023; 2:100162. [PMID: 37384149 PMCID: PMC10294045 DOI: 10.1016/j.pecinn.2023.100162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Revised: 03/15/2023] [Accepted: 05/06/2023] [Indexed: 06/30/2023]
Abstract
Objective The Sydney Health Literacy Lab (SHeLL) Editor is an online text-editing tool that provides real-time assessment and feedback on written health information (it assesses grade reading score, complex language, and passive voice). This study aimed to explore how the design could be further enhanced to help health information providers interpret and act on automated feedback. Methods The prototype was iteratively refined across four rounds of user-testing with health services staff (N = 20). Participants took part in online interviews and a brief follow-up survey using validated usability scales (System Usability Scale, Technology Acceptance Model). After each round, Yardley's (2021) optimisation criteria guided which changes would be implemented. Results Participants rated the Editor as having adequate usability (M = 82.8 out of 100, SD = 13.5). Most modifications sought to reduce information overload (e.g. simplifying instructions for new users) or make feedback motivating and actionable (e.g. using frequent incremental feedback to highlight how changes to the text altered assessment scores). Conclusion Iterative user-testing was critical to balancing academic values and the practical needs of the Editor's target users. The final version emphasises actionable real-time feedback and not just assessment. Innovation The Editor is a new tool that will help health information providers apply health literacy principles to written text.
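As a rough illustration of the kind of real-time check such an editor performs, the heuristic below flags likely passive-voice constructions (a be-verb followed by a past participle). The regex is my own simplification for illustration, not the SHeLL Editor's actual rule set: it misses irregular participles such as "put" or "held" and misfires on some adjectival uses.

```python
import re

# Be-verb + past-participle heuristic for likely passive voice.
# Approximate by design: regular "-ed"/"-en" participles only.
PASSIVE = re.compile(
    r"\b(am|is|are|was|were|be|been|being)\s+(\w+(?:ed|en))\b",
    re.IGNORECASE,
)

def flag_passive(text: str) -> list[str]:
    """Return each matched be-verb + participle span, e.g. 'was completed'."""
    return [m.group(0) for m in PASSIVE.finditer(text)]
```

In an editor like the one described, each flagged span could be highlighted inline so the author can rewrite it in active voice.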
Affiliation(s)
- Julie Ayre
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Danielle M. Muscat
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Olivia Mac
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Carissa Bonner
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Menzies Centre for Health Policy and Economics, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Adam G. Dunn
- Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Jason Dalmazzo
- Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
- Dana Mouwad
- Western Sydney Local Health District, Health Literacy Hub, Westmead, NSW, Australia
- Kirsten McCaffery
- Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, Australia
10
Tan DJY, Ko TK, Fan KS. The Readability and Quality of Web-Based Patient Information on Nasopharyngeal Carcinoma: Quantitative Content Analysis. JMIR Form Res 2023; 7:e47762. [PMID: 38010802 DOI: 10.2196/47762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Revised: 08/25/2023] [Accepted: 10/25/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND Nasopharyngeal carcinoma (NPC) is a rare disease that is strongly associated with exposure to the Epstein-Barr virus and is characterized by the formation of malignant cells in nasopharynx tissues. Early diagnosis of NPC is often difficult owing to the location of initial tumor sites and the nonspecificity of initial symptoms, resulting in a higher frequency of advanced-stage diagnoses and a poorer prognosis. Access to high-quality, readable information could improve the early detection of the disease and provide support to patients during disease management. OBJECTIVE This study aims to assess the quality and readability of publicly available web-based information in the English language about NPC, using the most popular search engines. METHODS Key terms relevant to NPC were searched across 3 of the most popular internet search engines: Google, Yahoo, and Bing. The top 25 results from each search engine were included in the analysis. Websites that contained text written in languages other than English, required paywall access, targeted medical professionals, or included nontext content were excluded. Readability for each website was assessed using the Flesch Reading Ease score and the Flesch-Kincaid grade level. Website quality was assessed using the Journal of the American Medical Association (JAMA) and DISCERN tools as well as the presence of a Health on the Net Foundation seal. RESULTS Overall, 57 suitable websites were included in this study; 26% (15/57) of the websites were academic. The mean JAMA and DISCERN scores of all websites were 2.80 (IQR 3) and 57.60 (IQR 19), respectively, with a median of 3 (IQR 2-4) and 61 (IQR 49-68), respectively. Health care industry websites (n=3) had the highest mean JAMA score of 4 (SD 0). Academic websites (15/57, 26%) had the highest mean DISCERN score of 77.5. The Health on the Net Foundation seal was present on only 1 website, which also achieved a JAMA score of 3 and a DISCERN score of 50. 
Significant differences were observed between the JAMA scores of hospital websites and those of industry websites (P=.04), news service websites (P=.048), and charity and nongovernmental organization websites (P=.03). Despite being a vital source for patients, general practitioner websites were found to have significantly lower JAMA scores than charity websites (P=.05). The overall mean readability scores reflected an average reading age of 14.3 (SD 1.1) years. CONCLUSIONS The results of this study suggest an inconsistent and suboptimal quality of information related to NPC on the internet. On average, websites presented readability challenges, as written information about NPC was above the recommended reading level of sixth grade. As such, web-based information requires improvement in both quality and accessibility, and health care providers should be selective about the information they recommend to patients, ensuring it is reliable and readable.
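The two readability indices this study applies are simple closed-form formulas over sentence, word, and syllable counts. The sketch below uses the published Flesch formulas with a vowel-group syllable heuristic; the heuristic is an approximation I'm assuming for illustration (dictionary-based syllable counts, as in dedicated readability tools, are more accurate).

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count vowel groups, subtracting a trailing silent 'e'.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_scores(text: str) -> tuple[float, float]:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    wps = len(words) / len(sentences)                            # words/sentence
    spw = sum(count_syllables(w) for w in words) / len(words)    # syllables/word
    ease = 206.835 - 1.015 * wps - 84.6 * spw     # Flesch Reading Ease
    grade = 0.39 * wps + 11.8 * spw - 15.59       # Flesch-Kincaid grade level
    return ease, grade
```

Higher Reading Ease means easier text, and a grade level near 6 meets the commonly cited recommendation; the websites in this study instead averaged a reading age of about 14 years.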
Affiliation(s)
- Denise Jia Yun Tan
- Department of Surgery, Royal Stoke University Hospital, Stoke on Trent, United Kingdom
- Tsz Ki Ko
- Department of Surgery, Royal Stoke University Hospital, Stoke on Trent, United Kingdom
- Ka Siu Fan
- Department of Surgery, Royal Surrey County Hospital, Guildford, Surrey, United Kingdom
11
Nattam A, Vithala T, Wu TC, Bindhu S, Bond G, Liu H, Thompson A, Wu DTY. Assessing the Readability of Online Patient Education Materials in Obstetrics and Gynecology Using Traditional Measures: Comparative Analysis and Limitations. J Med Internet Res 2023; 25:e46346. [PMID: 37647115 PMCID: PMC10500363 DOI: 10.2196/46346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Revised: 06/06/2023] [Accepted: 07/04/2023] [Indexed: 09/01/2023] Open
Abstract
BACKGROUND Patient education materials (PEMs) can be vital sources of information for the general population. However, despite American Medical Association (AMA) and National Institutes of Health (NIH) recommendations to make PEMs easier to read for patients with low health literacy, they often do not adhere to these recommendations. The readability of online PEMs in the obstetrics and gynecology (OB/GYN) field, in particular, has not been thoroughly investigated. OBJECTIVE The study sampled online OB/GYN PEMs and aimed to examine (1) agreeability across traditional readability measures (TRMs), (2) adherence of online PEMs to AMA and NIH recommendations, and (3) whether the readability level of online PEMs varied by web-based source and medical topic. This study is not a scoping review; rather, it focused on scoring the readability of OB/GYN PEMs using the traditional measures to add empirical evidence to the literature. METHODS A total of 1576 online OB/GYN PEMs were collected via 3 major search engines. In total, 93 were excluded for insufficient length (fewer than 100 words), yielding 1483 PEMs for analysis. Each PEM was scored by 4 TRMs: the Flesch-Kincaid grade level, the Gunning fog index, the Simple Measure of Gobbledygook, and the Dale-Chall formula. The PEMs were categorized by publication source and medical topic by 2 research team members, and the readability scores of the categories were compared statistically. RESULTS Results indicated that the 4 TRMs did not agree with each other, leading to the use of an averaged readability (composite) score for comparison. The composite scores across all online PEMs were not normally distributed and had a median at the 11th grade level. Governmental PEMs were the easiest to read among the source categories, and PEMs about menstruation were the most difficult. However, the differences in readability scores among the sources and topics were small.
CONCLUSIONS This study found that online OB/GYN PEMs did not meet the AMA and NIH readability recommendations and would be difficult to read and comprehend for patients with low health literacy. Both findings are consistent with the literature. This study highlights the need to improve the readability of OB/GYN PEMs to help patients make informed decisions. Research has been done to create more sophisticated readability measures for medical and health documents; once validated, these tools should be used by web-based creators of health education materials.
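The composite-score idea above, averaging several indices when they disagree, can be sketched with two of the four measures the study used. The Gunning fog and SMOG formulas below are the standard published ones; the vowel-group syllable counter is an approximation of my own, and the study's actual tooling is not specified here.

```python
import math
import re

def syllables(word: str) -> int:
    # Vowel-group heuristic with a silent-'e' adjustment (approximate).
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def composite_grade(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    poly = sum(1 for w in words if syllables(w) >= 3)   # "complex" words
    # Gunning fog index: sentence length plus share of complex words.
    fog = 0.4 * (len(words) / len(sentences) + 100 * poly / len(words))
    # SMOG grade: polysyllable count normalized to a 30-sentence sample.
    smog = 1.0430 * math.sqrt(poly * 30 / len(sentences)) + 3.1291
    return (fog + smog) / 2  # average the two grade estimates
```

Averaging smooths over the disagreement the study observed between indices, at the cost of blending formulas calibrated on different corpora.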
Affiliation(s)
- Anunita Nattam
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Tripura Vithala
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Tzu-Chun Wu
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Shwetha Bindhu
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Gregory Bond
- Department of Computer Science, University of Cincinnati, Cincinnati, OH, United States
- Hexuan Liu
- School of Criminal Justice, University of Cincinnati, Cincinnati, OH, United States
- Amy Thompson
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Danny T Y Wu
- College of Medicine, University of Cincinnati, Cincinnati, OH, United States
- Department of Computer Science, University of Cincinnati, Cincinnati, OH, United States