1. Xie H. A capsule network-based public health prediction system for chronic diseases: clinical and community implications. Front Public Health 2025; 13:1526360. PMID: 40161025; PMCID: PMC11949884; DOI: 10.3389/fpubh.2025.1526360.
Abstract
Objective To observe the role of a public health chronic disease prediction method based on capsule networks and information systems in clinical treatment and public health management. Methods Patients with hypertension, diabetes, and asthma admitted from May 2022 to October 2023 were included in the study and divided into a hypertension group (n = 341), a diabetes group (n = 341), and an asthma group (n = 341). The established chronic disease prediction method was used to diagnose these public health chronic diseases. The key influencing factors identified by the prediction method were compared with regression analysis results. In addition, the method's diagnostic accuracy and specificity were analyzed to explore its clinical diagnostic value. The method was then applied to public health management, and the management approach was improved based on the distribution and prevalence of chronic diseases. The effectiveness of public health management and residents' acceptance of it before and after the improvement were compared to explore the method's application value in public health management. Results The key factors affecting the three diseases identified by the prediction method were significantly correlated with disease occurrence in regression analysis (p < 0.05). The diagnostic accuracy, specificity, and sensitivity of the method were 88.6%, 89%, and 92%, respectively, higher than those of physicians' empirical diagnosis (p < 0.05). Compared with other existing AI-based chronic disease prediction methods, the proposed method achieved a significantly higher AUC (p < 0.05), indicating higher diagnostic accuracy.
After applying this method to public health management, the wellbeing of individuals with chronic conditions in the community was notably improved, and the incidence rate was notably reduced (p < 0.05). The acceptance level of residents toward the management work of public health management departments was also notably raised (p < 0.05). Conclusion The public health chronic disease prediction method based on information systems and capsule network has high clinical value in diagnosis and can help physicians accurately diagnose patients' conditions. In addition, this method has high application value in public health management. Management departments can adjust management strategies in a timely manner through predictive analysis results and propose targeted management measures based on the characteristics of residents in the management community.
Affiliation(s)
- Haiyan Xie
- Medical College of Changsha Social Work College, Changsha, China
2. Hasanzadeh F, Josephson CB, Waters G, Adedinsewo D, Azizi Z, White JA. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. NPJ Digit Med 2025; 8:154. PMID: 40069303; PMCID: PMC11897215; DOI: 10.1038/s41746-025-01503-7.
Abstract
Artificial intelligence (AI) is delivering value across all aspects of clinical practice. However, bias may exacerbate healthcare disparities. This review examines the origins of bias in healthcare AI, strategies for mitigation, and responsibilities of relevant stakeholders towards achieving fair and equitable use. We highlight the importance of systematically identifying bias and engaging relevant mitigation activities throughout the AI model lifecycle, from model conception through to deployment and longitudinal surveillance.
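Bias identification of the kind this review calls for often starts with a subgroup performance audit after model validation. The sketch below is illustrative only; the labels, predictions, and group assignments are invented, not drawn from the paper. It computes the true-positive rate per subgroup and reports the largest pairwise gap, a simple equal-opportunity signal that can be tracked across the model lifecycle:

```python
from collections import defaultdict

def subgroup_tpr_gap(y_true, y_pred, groups):
    """Compute the true-positive rate (sensitivity) per subgroup and
    the largest pairwise gap between subgroups."""
    tp = defaultdict(int)   # true positives per group
    pos = defaultdict(int)  # actual positives per group
    for yt, yp, g in zip(y_true, y_pred, groups):
        if yt == 1:
            pos[g] += 1
            if yp == 1:
                tp[g] += 1
    tpr = {g: tp[g] / pos[g] for g in pos if pos[g] > 0}
    gap = max(tpr.values()) - min(tpr.values())
    return tpr, gap

# Hypothetical audit: the model misses more positives in group "B"
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1]
groups = ["A", "A", "A", "B", "B", "B", "A", "B"]
tpr, gap = subgroup_tpr_gap(y_true, y_pred, groups)
# TPR is 3/3 for "A" but only 1/3 for "B": a gap this large would
# trigger the mitigation activities the review describes
```

A deployment-ready audit would use a fairness toolkit and confidence intervals rather than raw rates, but the quantity being monitored is the same.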
Affiliation(s)
- Fereshteh Hasanzadeh
- Libin Cardiovascular Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Colin B Josephson
- Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Gabriella Waters
- Morgan State University, Center for Equitable AI & Machine Learning Systems, Baltimore, MD, USA
- Zahra Azizi
- Department of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
- James A White
- Libin Cardiovascular Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
3. Modise LM, Alborzi Avanaki M, Ameen S, Celi LA, Chen VXY, Cordes A, Elmore M, Fiske A, Gallifant J, Hayes M, Marcelo A, Matos J, Nakayama L, Ozoani E, Silverman BC, Comeau DS. Introducing the Team Card: Enhancing governance for medical Artificial Intelligence (AI) systems in the age of complexity. PLOS Digital Health 2025; 4:e0000495. PMID: 40036250; DOI: 10.1371/journal.pdig.0000495.
Abstract
This paper introduces the Team Card (TC) as a protocol to address harmful biases in the development of clinical artificial intelligence (AI) systems by emphasizing the often-overlooked role of researchers' positionality. While harmful bias in medical AI, particularly in Clinical Decision Support (CDS) tools, is frequently attributed to issues of data quality, this limited framing neglects how researchers' worldviews, shaped by their training, backgrounds, and experiences, can influence AI design and deployment. These unexamined subjectivities can create epistemic limitations, amplifying biases and increasing the risk of inequitable applications in clinical settings. The TC emphasizes reflexivity (critical self-reflection) as an ethical strategy to identify and address biases stemming from the subjectivity of research teams. By systematically documenting team composition, positionality, and the steps taken to monitor and address unconscious bias, TCs establish a framework for assessing how diversity within teams affects AI development. Studies across business, science, and organizational contexts demonstrate that diversity improves outcomes, including innovation, decision-making quality, and overall performance. However, epistemic diversity (diverse ways of thinking and problem-solving) must be actively cultivated through intentional, collaborative processes to mitigate bias effectively. By embedding epistemic diversity into research practices, TCs may enhance model performance, improve fairness, and offer an empirical basis for evaluating how diversity influences bias mitigation efforts over time. This represents a critical step toward developing inclusive, ethical, and effective AI systems in clinical care. A publicly available prototype of the TC is accessible at https://www.teamcard.io/team/demo.
Affiliation(s)
- Lesedi Mamodise Modise
- Center for Bioethics, Harvard Medical School, Boston, Massachusetts, United States of America
- Mahsa Alborzi Avanaki
- Department of Radiology, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
- Saleem Ameen
- Department of Biomedical Informatics, Harvard Medical School, Harvard University, Boston, Massachusetts, United States of America
- Tasmanian School of Medicine, College of Health and Medicine, University of Tasmania, Hobart, Tasmania, Australia
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Leo A Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Victor Xin Yuan Chen
- Center for Bioethics, Harvard Medical School, Boston, Massachusetts, United States of America
- Faculty of Medicine, The Chinese University of Hong Kong, New Territories, Hong Kong SAR
- Ashley Cordes
- Indigenous Media in Environmental Studies Program and the Department of Data Science, University of Oregon, Eugene, Oregon, United States of America
- Matthew Elmore
- Duke Health, AI Evaluation and Governance, Duke University, Durham, North Carolina, United States of America
- Amelia Fiske
- Department of Preclinical Medicine, Institute of History and Ethics in Medicine, TUM School of Medicine and Health, Technical University of Munich, Bavaria, Germany
- Jack Gallifant
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Critical Care, Guy's and St. Thomas' NHS Trust, London, United Kingdom
- Megan Hayes
- Department of Environmental Studies, University of Oregon, Eugene, Oregon, United States of America
- Alvin Marcelo
- Medical Informatics Unit, College of Medicine, University of the Philippines Manila, Philippines
- Joao Matos
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Faculty of Engineering, University of Porto, Portugal
- Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
- Luis Nakayama
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Ophthalmology, Sao Paulo Federal University, Sao Paulo, Brazil
- Ezinwanne Ozoani
- Machine Learning and Ethics Research Engineer, Innovation n Ethics, Dublin, Ireland
- Benjamin C Silverman
- Center for Bioethics, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Human Research Affairs, Mass General Brigham, Somerville, Massachusetts, United States of America
- Institute for Technology in Psychiatry, McLean Hospital, Belmont, Massachusetts, United States of America
- Donnella S Comeau
- Department of Radiology, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
- Department of Human Research Affairs, Mass General Brigham, Somerville, Massachusetts, United States of America
4. Victor A. Artificial intelligence in global health: An unfair future for health in Sub-Saharan Africa? Health Affairs Scholar 2025; 3:qxaf023. PMID: 39949826; PMCID: PMC11823112; DOI: 10.1093/haschl/qxaf023.
Abstract
Artificial intelligence (AI) holds transformative potential for global health, particularly in underdeveloped regions like Africa. However, the integration of AI into healthcare systems raises significant concerns regarding equity and fairness. This debate paper explores the challenges and risks associated with implementing AI in healthcare in Africa, focusing on the lack of infrastructure, data quality issues, and inadequate governance frameworks. It also examines the geopolitical and economic dynamics that exacerbate these disparities, including the impact of global competition and weakened international institutions. While highlighting the risks, the paper acknowledges the potential benefits of AI, including improved healthcare access, standardization of care, and enhanced health communication. To ensure equitable outcomes, it advocates for targeted policy measures, including infrastructure investment, capacity building, regulatory frameworks, and international collaboration. This comprehensive approach is essential to mitigate risks, harness the benefits of AI, and promote social justice in global health.
Affiliation(s)
- Audêncio Victor
- Public Health Postgraduate Program, School of Public Health, University of São Paulo, São Paulo, SP 01246-904, Brazil
- Department of Nutrition, Ministry of Health, Zambezia 2400, Mozambique
5. Ameli N, Firoozi T, Gibson M, Lai H. Classification of periodontitis stage and grade using natural language processing techniques. PLOS Digital Health 2024; 3:e0000692. PMID: 39671337; PMCID: PMC11642968; DOI: 10.1371/journal.pdig.0000692.
Abstract
Periodontitis is a complex, microbiome-related inflammatory condition affecting the supporting tissues of the teeth. Emphasizing the potential of Clinical Decision Support Systems (CDSS), this study aims to facilitate early diagnosis of periodontitis by extracting patients' information collected as dental charts and notes. We developed a CDSS to predict the stage and grade of periodontitis using natural language processing (NLP) techniques, including bidirectional encoder representations from transformers (BERT), and compared its performance with that of a baseline feature-engineered model. A secondary data analysis was conducted using 309 anonymized patient periodontal charts and corresponding clinicians' notes obtained from the university periodontal clinic. After data preprocessing, we added a classification layer on top of the pre-trained BERT model to classify the clinical notes into their corresponding stages and grades, then fine-tuned the model on 70% of our data. Performance was evaluated on the clinical notes of 32 unseen new patients and compared with the output of a baseline feature-engineered algorithm coupled with a multilayer perceptron (MLP) for classifying the stage and grade of periodontitis. Our proposed BERT model predicted patients' stage and grade with 77% and 75% accuracy, respectively. The MLP model correctly classified the stage and grade of periodontitis on the set of 32 new unseen notes with 59.4% and 62.5% accuracy, respectively, whereas the BERT model predicted stage and grade on the same new dataset with higher accuracy (66% and 72%, respectively). The utilization of BERT in this context represents a novel application in dentistry, particularly in CDSS. Our BERT model outperformed the baseline model, even with reduced information, promising efficient review of patient notes. This integration of advanced NLP techniques with CDSS frameworks holds potential for timely interventions, preventing complications and reducing healthcare costs.
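The core modeling step this abstract describes, a classification layer trained on top of a pre-trained encoder, can be sketched with the encoder abstracted away as a fixed feature extractor. Everything below (the 2-D stand-in "embeddings", labels, and hyperparameters) is invented for illustration and is not the authors' code or data; a real implementation would pool BERT's token outputs and backpropagate through the encoder as well:

```python
import math

def train_softmax_head(feats, labels, n_classes, epochs=200, lr=0.5):
    """Train a linear + softmax classification layer on fixed encoder
    outputs (a stand-in for pooled BERT note embeddings)."""
    dim = len(feats[0])
    w = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            logits = [sum(wi * xi for wi, xi in zip(w[c], x)) + b[c]
                      for c in range(n_classes)]
            m = max(logits)                      # stabilize the softmax
            exps = [math.exp(z - m) for z in logits]
            s = sum(exps)
            probs = [e / s for e in exps]
            for c in range(n_classes):           # cross-entropy gradient step
                grad = probs[c] - (1.0 if c == y else 0.0)
                for i in range(dim):
                    w[c][i] -= lr * grad * x[i]
                b[c] -= lr * grad
    return w, b

def predict(w, b, x):
    logits = [sum(wi * xi for wi, xi in zip(wc, x)) + bc
              for wc, bc in zip(w, b)]
    return logits.index(max(logits))

# Toy "embeddings" for notes labeled stage 0 vs. stage 1
train = [([0.9, 0.1], 0), ([0.8, 0.2], 0), ([0.1, 0.9], 1), ([0.2, 0.8], 1)]
w, b = train_softmax_head([x for x, _ in train], [y for _, y in train], 2)
label = predict(w, b, [0.85, 0.15])  # classify an unseen note
```

The 70/30 split and held-out evaluation reported in the abstract correspond to partitioning the notes before calling the training routine and scoring `predict` only on the held-out portion.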
Affiliation(s)
- Nazila Ameli
- Mike Petryk School of Dentistry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Tahereh Firoozi
- Mike Petryk School of Dentistry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Monica Gibson
- Department of Periodontology, School of Dentistry, University of Indiana, Indianapolis, United States of America
- Hollis Lai
- Mike Petryk School of Dentistry, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
6. Collins BX, Bélisle-Pipon JC, Evans BJ, Ferryman K, Jiang X, Nebeker C, Novak L, Roberts K, Were M, Yin Z, Ravitsky V, Coco J, Hendricks-Sturrup R, Williams I, Clayton EW, Malin BA, Bridge2AI Ethics and Trustworthy AI Working Group. Addressing ethical issues in healthcare artificial intelligence using a lifecycle-informed process. JAMIA Open 2024; 7:ooae108. PMID: 39553826; PMCID: PMC11565898; DOI: 10.1093/jamiaopen/ooae108.
Abstract
Objectives Artificial intelligence (AI) proceeds through an iterative and evaluative process of development, use, and refinement which may be characterized as a lifecycle. Within this context, stakeholders can vary in their interests and perceptions of the ethical issues associated with this rapidly evolving technology in ways that can fail to identify and avert adverse outcomes. Identifying issues throughout the AI lifecycle in a systematic manner can facilitate better-informed ethical deliberation. Materials and Methods We analyzed existing lifecycles in the current literature on ethical issues of AI in healthcare to identify themes, which we consolidated into a more comprehensive lifecycle. We then considered the potential benefits and harms of AI through this lifecycle to identify the ethical questions that can arise at each step and where conflicts and errors could arise in ethical analysis. We illustrated the approach in 3 case studies that highlight how different ethical dilemmas arise at different points in the lifecycle. Results, Discussion, and Conclusion Through case studies, we show how a systematic lifecycle-informed approach to the ethical analysis of AI enables mapping of the effects of AI onto different steps to guide deliberations on benefits and harms. The lifecycle-informed approach has broad applicability to different stakeholders and can facilitate communication on ethical issues for patients, healthcare professionals, research participants, and other stakeholders.
Affiliation(s)
- Benjamin X Collins
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Center for Biomedical Ethics and Society, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Barbara J Evans
- Levin College of Law, University of Florida, Gainesville, FL 32611, United States
- Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL 32611, United States
- Kadija Ferryman
- Berman Institute of Bioethics, Johns Hopkins University, Baltimore, MD 21205, United States
- Xiaoqian Jiang
- McWilliams School of Biomedical Informatics, UTHealth Houston, Houston, TX 77030, United States
- Camille Nebeker
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, La Jolla, CA 92093, United States
- Laurie Novak
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Kirk Roberts
- McWilliams School of Biomedical Informatics, UTHealth Houston, Houston, TX 77030, United States
- Martin Were
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Zhijun Yin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37212, United States
- Joseph Coco
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Rachele Hendricks-Sturrup
- National Alliance against Disparities in Patient Health, Woodbridge, VA 22191, United States
- Margolis Center for Health Policy, Duke University, Washington, DC 20004, United States
- Ishan Williams
- School of Nursing, University of Virginia, Charlottesville, VA 22903, United States
- Ellen W Clayton
- Center for Biomedical Ethics and Society, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Law School, Vanderbilt University, Nashville, TN 37203, United States
- Bradley A Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37212, United States
7. Anibal JT, Huth HB, Gunkel J, Gregurick SK, Wood BJ. Simulated misuse of large language models and clinical credit systems. NPJ Digit Med 2024; 7:317. PMID: 39528596; PMCID: PMC11554647; DOI: 10.1038/s41746-024-01306-2.
Abstract
In the future, large language models (LLMs) may enhance the delivery of healthcare, but there are risks of misuse. These methods may be trained to allocate resources via unjust criteria involving multimodal data such as financial transactions, internet activity, social behaviors, and healthcare information. This study shows that LLMs may be biased in favor of collective/systemic benefit over the protection of individual rights and could facilitate AI-driven social credit systems.
Affiliation(s)
- James T Anibal
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, MD, USA
- Hannah B Huth
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, MD, USA
- Jasmine Gunkel
- Department of Bioethics, National Institutes of Health (NIH), Bethesda, MD, USA
- Susan K Gregurick
- Office of the Director, National Institutes of Health (NIH), Bethesda, MD, USA
- Bradford J Wood
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, MD, USA
8. Germani F, Spitale G, Biller-Andorno N. The Dual Nature of AI in Information Dissemination: Ethical Considerations. JMIR AI 2024; 3:e53505. PMID: 39405099; PMCID: PMC11522648; DOI: 10.2196/53505.
Abstract
Infodemics pose significant dangers to public health and to the societal fabric, as the spread of misinformation can have far-reaching consequences. While artificial intelligence (AI) systems have the potential to craft compelling and valuable information campaigns with positive repercussions for public health and democracy, concerns have arisen regarding the potential use of AI systems to generate convincing disinformation. The consequences of this dual nature of AI, capable of both illuminating and obscuring the information landscape, are complex and multifaceted. We contend that the rapid integration of AI into society demands a comprehensive understanding of its ethical implications and the development of strategies to harness its potential for the greater good while mitigating harm. Thus, in this paper we explore the ethical dimensions of AI's role in information dissemination and impact on public health, arguing that potential strategies to deal with AI and disinformation encompass generating regulated and transparent data sets used to train AI models, regulating content outputs, and promoting information literacy.
Affiliation(s)
- Federico Germani
- Institute of Biomedical Ethics and History of Medicine, University of Zurich, Zurich, Switzerland
- Giovanni Spitale
- Institute of Biomedical Ethics and History of Medicine, University of Zurich, Zurich, Switzerland
- Nikola Biller-Andorno
- Institute of Biomedical Ethics and History of Medicine, University of Zurich, Zurich, Switzerland
9. Yuan H. Toward real-world deployment of machine learning for health care: External validation, continual monitoring, and randomized clinical trials. Health Care Science 2024; 3:360-364. PMID: 39479276; PMCID: PMC11520244; DOI: 10.1002/hcs2.114.
Abstract
In this commentary, we elucidate three indispensable evaluation steps toward the real-world deployment of machine learning within the healthcare sector and demonstrate referable examples for diagnostic, therapeutic, and prognostic tasks. We encourage researchers to move beyond retrospective and within-sample validation, and step into the practical implementation at the bedside rather than leaving developed machine learning models in the dust of archived literature.
Affiliation(s)
- Han Yuan
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
10. Chinni BK, Manlhiot C. Emerging Analytical Approaches for Personalized Medicine Using Machine Learning in Pediatric and Congenital Heart Disease. Can J Cardiol 2024; 40:1880-1896. PMID: 39097187; DOI: 10.1016/j.cjca.2024.07.026.
Abstract
Precision and personalized medicine, in which patient management is tailored to individual circumstances, are now familiar terms to cardiologists, even though the field is still emerging. Although precision medicine relies most often on the underlying biology and pathophysiology of a patient's condition, personalized medicine relies on digital biomarkers generated through algorithms. Given the complexity of the underlying data, these digital biomarkers are most often generated through machine-learning algorithms. This review discusses a number of analytic considerations regarding the creation of digital biomarkers, including data preprocessing, time dependency and gating, dimensionality reduction, and novel methods in both supervised and unsupervised machine learning. Some of these considerations, such as sample size requirements and measurements of model performance, are particularly challenging in small and heterogeneous populations with rare outcomes, such as children with congenital heart disease. Finally, we review analytic considerations for the deployment of digital biomarkers in clinical settings, including the emerging field of clinical artificial intelligence (AI) operations, computational needs for deployment, efforts to increase the explainability of AI, algorithmic drift, and the need for distributed surveillance and federated learning. We conclude by discussing a recent simulation study showing that, despite these analytic challenges and complications, the use of digital biomarkers in managing clinical care might have substantial benefits for individual patient outcomes.
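The algorithmic-drift surveillance mentioned in this abstract reduces, in its simplest form, to tracking a performance metric over rolling windows of deployed predictions and flagging windows that fall a set margin below the validation baseline. A schematic sketch with invented numbers (not from the review; real monitoring would also track calibration and input distributions):

```python
def drift_alerts(window_scores, baseline, tolerance=0.05):
    """Return the indices of monitoring windows whose score drops
    more than `tolerance` below the validation baseline."""
    return [i for i, s in enumerate(window_scores)
            if baseline - s > tolerance]

# Validation accuracy was 0.90; performance degrades from window 3 on
monthly_accuracy = [0.89, 0.88, 0.90, 0.82, 0.80]
alerts = drift_alerts(monthly_accuracy, baseline=0.90)
# windows 3 and 4 breach the margin and would trigger a
# retraining/recalibration review
```

In a distributed-surveillance setting, each site would report its own window scores and the alerts would be aggregated centrally.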
Affiliation(s)
- Bhargava K Chinni
- The Blalock-Taussig-Thomas Pediatric and Congenital Heart Center, Department of Pediatrics, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Cedric Manlhiot
- The Blalock-Taussig-Thomas Pediatric and Congenital Heart Center, Department of Pediatrics, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
- Research Institute, SickKids Hospital, Department of Pediatrics, University of Toronto, Toronto, Ontario, Canada
11. Anibal J, Huth H, Gunkel J, Gregurick S, Wood B. Simulated Misuse of Large Language Models and Clinical Credit Systems. medRxiv [Preprint] 2024:2024.04.10.24305470. PMID: 38645190; PMCID: PMC11030492; DOI: 10.1101/2024.04.10.24305470.
Abstract
Large language models (LLMs) have been proposed to support many healthcare tasks, including disease diagnostics and treatment personalization. While AI may be applied to assist or enhance the delivery of healthcare, there is also a risk of misuse. LLMs could be used to allocate resources via unfair, unjust, or inaccurate criteria. For example, a social credit system uses big data to assess "trustworthiness" in society, penalizing those who score poorly based on evaluation metrics defined only by a power structure (e.g., a corporate entity or governing body). Such a system may be amplified by powerful LLMs that can evaluate individuals based on multimodal data such as financial transactions, internet activity, and other behavioral inputs. Healthcare data is perhaps the most sensitive information that can be collected and could potentially be used to violate civil liberties or other rights via a "clinical credit system", which may include limiting access to care. The results of this study show that LLMs may be biased in favor of collective or systemic benefit over protecting individual rights, potentially enabling this type of future misuse. Moreover, experiments in this report simulate how clinical datasets might be exploited with current LLMs, demonstrating the urgency of addressing these ethical dangers. Finally, strategies are proposed to mitigate these risks in the development of large AI models for healthcare.
Affiliation(s)
- James Anibal
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, Maryland, USA
- Hannah Huth
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, Maryland, USA
- Jasmine Gunkel
- Department of Bioethics, National Institutes of Health (NIH), Bethesda, Maryland, USA
- Susan Gregurick
- Office of the Director, National Institutes of Health (NIH), Bethesda, Maryland, USA
- Bradford Wood
- Center for Interventional Oncology, NIH Clinical Center, National Institutes of Health (NIH), Bethesda, Maryland, USA
12. Federico CA, Trotsyuk AA. Biomedical Data Science, Artificial Intelligence, and Ethics: Navigating Challenges in the Face of Explosive Growth. Annu Rev Biomed Data Sci 2024; 7:1-14. PMID: 38598860; DOI: 10.1146/annurev-biodatasci-102623-104553.
Abstract
Advances in biomedical data science and artificial intelligence (AI) are profoundly changing the landscape of healthcare. This article reviews the ethical issues that arise with the development of AI technologies, including threats to privacy, data security, consent, and justice, as they relate to donors of tissue and data. It also considers broader societal obligations, including the importance of assessing the unintended consequences of AI research in biomedicine. In addition, this article highlights the challenge of rapid AI development against the backdrop of disparate regulatory frameworks, calling for a global approach to address concerns around data misuse, unintended surveillance, and the equitable distribution of AI's benefits and burdens. Finally, a number of potential solutions to these ethical quandaries are offered, chief among them a collaborative, informed, and flexible regulatory approach that balances innovation with individual rights and public welfare, fostering a trustworthy AI-driven healthcare ecosystem.
Affiliation(s)
- Carole A Federico
- Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
- Artem A Trotsyuk
- Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, California, USA
13
Sánchez-Marqués R, García V, Sánchez JS. A data-centric machine learning approach to improve prediction of glioma grades using low-imbalance TCGA data. Sci Rep 2024; 14:17195. [PMID: 39060383 PMCID: PMC11282236 DOI: 10.1038/s41598-024-68291-0]
Abstract
Accurate prediction and grading of gliomas play a crucial role in evaluating brain tumor progression, assessing overall prognosis, and treatment planning. In addition to neuroimaging techniques, identifying molecular biomarkers that can guide diagnosis, prognosis, and prediction of the response to therapy has attracted the interest of researchers in using them together with machine learning and deep learning models. Most research in this field has been model-centric, i.e., based on finding better-performing algorithms. In practice, however, improving data quality can yield a better model. This study investigates a data-centric machine learning approach to determine its potential benefits in predicting glioma grades. We report six performance metrics to provide a complete picture of model performance. Experimental results indicate that standardization and oversampling the minority class increase the prediction performance of four popular machine learning models and two classifier ensembles applied to a low-imbalance data set consisting of clinical factors and molecular biomarkers. The experiments also show that the two classifier ensembles significantly outperform three of the four standard prediction models. Furthermore, we conduct a comprehensive descriptive analysis of the glioma data set to identify relevant statistical characteristics and discover the most informative attributes using four feature ranking algorithms.
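The data-centric preprocessing this abstract credits (standardization plus oversampling of the minority class) can be sketched in plain Python. The function names and the random-duplication strategy below are illustrative assumptions, not the authors' implementation:

```python
import random
from statistics import mean, pstdev

def standardize(values):
    """Z-score standardization: (x - mean) / std, a common preprocessing step."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate minority-class rows until every class matches the
    majority-class count (simple random oversampling; hypothetical helper)."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    by_class = {c: [s for s, l in zip(samples, labels) if l == c] for c in classes}
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for c in classes:
        rows = by_class[c]
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            out_x.append(row)
            out_y.append(c)
    return out_x, out_y
```

In practice libraries such as scikit-learn (`StandardScaler`) and imbalanced-learn provide hardened versions of both steps; the sketch only illustrates the idea of balancing a low-imbalance data set before model fitting.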
Affiliation(s)
- Raquel Sánchez-Marqués
- Fundación Estatal, Salud, Infancia y Bienestar Social, 28029, Madrid, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Infecciosas (CIBERINFEC), Instituto de Salud Carlos III, 28029, Madrid, Spain
- Vicente García
- Dept. Electrical and Computer Engineering, Instituto de Ingeniería y Tecnología, Universidad Autónoma de Ciudad Juárez, 32310, Ciudad Juárez, Mexico
- J Salvador Sánchez
- Dept. Computer Languages and Systems, Institute of New Imaging Technologies, Universitat Jaume I, 12071, Castelló, Spain
14
Wu CC, Poly TN, Weng YC, Lin MC, Islam MM. Machine Learning Models for Predicting Mortality in Critically Ill Patients with Sepsis-Associated Acute Kidney Injury: A Systematic Review. Diagnostics (Basel) 2024; 14:1594. [PMID: 39125470 PMCID: PMC11311778 DOI: 10.3390/diagnostics14151594]
Abstract
While machine learning (ML) models hold promise for enhancing the management of acute kidney injury (AKI) in sepsis patients, creating models that are equitable and unbiased is crucial for accurate patient stratification and timely interventions. This study aimed to systematically summarize existing evidence to determine the effectiveness of ML algorithms for predicting mortality in patients with sepsis-associated AKI. An exhaustive literature search was conducted across several electronic databases, including PubMed, Scopus, and Web of Science, employing specific search terms. This review included studies published from 1 January 2000 to 1 February 2024. Studies were included if they reported on the use of ML for predicting mortality in patients with sepsis-associated AKI. Studies not written in English or with insufficient data were excluded. Data extraction and quality assessment were performed independently by two reviewers. Five studies were included in the final analysis, reporting a male predominance (>50%) among patients with sepsis-associated AKI. Limited data on race and ethnicity were available across the studies, with White patients comprising the majority of the study cohorts. The predictive models demonstrated varying levels of performance, with area under the receiver operating characteristic curve (AUROC) values ranging from 0.60 to 0.87. Algorithms such as extreme gradient boosting (XGBoost), random forest (RF), and logistic regression (LR) showed the best performance in terms of accuracy. These findings suggest that ML models have strong potential to identify high-risk patients, predict the progression of AKI early, and thereby improve survival. However, the lack of fairness in ML models for predicting mortality in critically ill patients with sepsis-associated AKI could perpetuate existing healthcare disparities. It is therefore crucial to develop trustworthy ML models to ensure their widespread adoption and trust among both healthcare professionals and patients.
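The AUROC values the review compares (0.60 to 0.87) can be computed directly from raw scores via the Mann-Whitney rank identity: AUROC is the probability that a randomly chosen positive case outranks a randomly chosen negative one. This is a generic sketch, not code from any of the reviewed studies:

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U identity: the fraction of
    (positive, negative) pairs where the positive case scores higher,
    counting ties as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The O(P×N) pair loop is fine for illustration; production code would use a sort-based O(n log n) formulation such as scikit-learn's `roc_auc_score`.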
Affiliation(s)
- Chieh-Chen Wu
- Department of Healthcare Information and Management, School of Health and Medical Engineering, Ming Chuan University, Taipei 111, Taiwan
- Tahmina Nasrin Poly
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- Yung-Ching Weng
- Department of Healthcare Information and Management, School of Health and Medical Engineering, Ming Chuan University, Taipei 111, Taiwan
- Ming-Chin Lin
- Department of Healthcare Information and Management, School of Health and Medical Engineering, Ming Chuan University, Taipei 111, Taiwan
- Taipei Neuroscience Institute, Taipei Medical University, Taipei 110, Taiwan
- Department of Neurosurgery, Shuang Ho Hospital, Taipei Medical University, Taipei 110, Taiwan
- Md. Mohaimenul Islam
- Department of Outcomes and Translational Sciences, College of Pharmacy, The Ohio State University, Columbus, OH 43210, USA
15
Franklin G, Stephens R, Piracha M, Tiosano S, Lehouillier F, Koppel R, Elkin PL. The Sociodemographic Biases in Machine Learning Algorithms: A Biomedical Informatics Perspective. Life (Basel) 2024; 14:652. [PMID: 38929638 PMCID: PMC11204917 DOI: 10.3390/life14060652]
Abstract
Artificial intelligence models represented in machine learning algorithms are promising tools for the risk assessments used to guide clinical and other health care decisions. Machine learning algorithms, however, may house biases that propagate stereotypes, inequities, and discrimination, contributing to socioeconomic health care disparities. These include biases related to sociodemographic characteristics such as race, ethnicity, gender, age, insurance, and socioeconomic status arising from the use of erroneous electronic health record data. Additionally, there is concern that training data and algorithmic biases in large language models pose potential drawbacks. These biases affect the lives and livelihoods of a significant percentage of the population in the United States and globally. The social and economic consequences of the associated backlash should not be underestimated. Here, we outline some of the sociodemographic, training data, and algorithmic biases that undermine sound health care risk assessment and medical decision-making and that should be addressed in the health care system. We present a perspective and overview of these biases by gender, race, ethnicity, age, and historically marginalized communities, covering algorithmic bias, biased evaluations, implicit bias, selection/sampling bias, socioeconomic status bias, biased data distributions, cultural bias, insurance status bias, confirmation bias, information bias, and anchoring bias. We also make recommendations to improve large language model training data, including de-biasing techniques such as counterfactual role-reversed sentences during knowledge distillation, fine-tuning, prefix attachment at training time, the use of toxicity classifiers, retrieval-augmented generation, and algorithmic modification, to mitigate these biases moving forward.
Affiliation(s)
- Gillian Franklin
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
- Rachel Stephens
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Muhammad Piracha
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Shmuel Tiosano
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Frank Lehouillier
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
- Ross Koppel
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Institute for Biomedical Informatics, Perelman School of Medicine, and Sociology Department, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peter L. Elkin
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY 14203, USA
- Department of Veterans Affairs, Knowledge Based Systems and Western New York, Veterans Affairs, Buffalo, NY 14215, USA
16
Taha MA, Morren JA. The role of artificial intelligence in electrodiagnostic and neuromuscular medicine: Current state and future directions. Muscle Nerve 2024; 69:260-272. [PMID: 38151482 DOI: 10.1002/mus.28023]
Abstract
The rapid advancements in artificial intelligence (AI), including machine learning (ML) and deep learning (DL), have ushered in a new era of technological breakthroughs in healthcare. These technologies are revolutionizing the way we utilize medical data, enabling improved disease classification, more precise diagnoses, better treatment selection, therapeutic monitoring, and highly accurate prognostication. Different ML and DL models have been used to distinguish between electromyography signals in normal individuals and those with amyotrophic lateral sclerosis and myopathy, with accuracy ranging from 67% to 99.5%. DL models have also been successfully applied in neuromuscular ultrasound, with the use of segmentation techniques achieving diagnostic accuracy of at least 90% for nerve entrapment disorders and 87% for inflammatory myopathies. Other successful AI applications include prediction of treatment response and prognostication, including prediction of intensive care unit admissions for patients with myasthenia gravis. Despite these remarkable strides, significant knowledge, attitude, and practice gaps persist, including within the field of electrodiagnostic and neuromuscular medicine. In this narrative review, we highlight the fundamental principles of AI and draw parallels with the intricacies of human brain networks. Specifically, we explore the immense potential that AI holds for applications in electrodiagnostic studies, neuromuscular ultrasound, and other aspects of neuromuscular medicine. While there are exciting possibilities for the future, it is essential to acknowledge and understand the limitations of AI and take proactive steps to mitigate these challenges. This collective endeavor holds immense potential for the advancement of healthcare through the strategic and responsible integration of AI technologies.
Affiliation(s)
- Mohamed A Taha
- Neuromuscular Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
- John A Morren
- Neuromuscular Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
17
Leal Neto O, Von Wyl V. Digital Transformation of Public Health for Noncommunicable Diseases: Narrative Viewpoint of Challenges and Opportunities. JMIR Public Health Surveill 2024; 10:e49575. [PMID: 38271097 PMCID: PMC10853859 DOI: 10.2196/49575]
Abstract
The recent SARS-CoV-2 pandemic underscored the effectiveness and rapid deployment of digital public health interventions, notably digital proximity tracing apps, which leverage Bluetooth capabilities to trace and notify users about potential infection exposures. Backed by renowned organizations such as the World Health Organization and the European Union, digital proximity tracing showcased the promise of digital public health. As the world pivots from pandemic responses, it becomes imperative to address noncommunicable diseases (NCDs), which account for the vast majority of health care expenses and premature disability-adjusted life years lost. The narrative of digital transformation in the realm of NCD public health is distinct from that of infectious diseases. Public health, with its multifaceted approach drawing on disciplines such as medicine, epidemiology, and psychology, focuses on promoting healthy living and choices through functions categorized as "Assessment," "Policy Development," "Resource Allocation," "Assurance," and "Access." The power of artificial intelligence (AI) in this digital transformation is noteworthy. AI can automate repetitive tasks, freeing health care providers to prioritize personal interactions, especially those that cannot be digitalized, like emotional support. Moreover, AI offers tools for individuals to be proactive in their health management. However, the human touch remains irreplaceable; AI serves as a companion guiding users through the health care landscape. Digital evolution, while revolutionary, poses its own set of challenges. Issues of equity and access are at the forefront. Vulnerable populations, whether due to economic constraints, geographical barriers, or digital illiteracy, face the threat of being marginalized further. This transformation therefore demands an inclusive strategy focused not on amplifying existing health disparities but on eliminating them. Population-level digital interventions in NCD prevention demand societal agreement. Policies like smoking bans or sugar taxes, though effective, might affect those not directly benefiting. Hence, all involved parties, from policy makers to the public, should have a balanced perspective on the advantages, risks, and expenses of these digital shifts. For a successful digital shift in public health, especially concerning NCDs, AI's potential to enhance efficiency, effectiveness, user experience, and equity (the "quadruple aim") is undeniable. However, it is vital that AI-driven initiatives in public health remain purposeful, offering improvements without compromising other objectives. The broader success of digital public health hinges on transparent benchmarks and criteria, ensuring maximum benefit without sidelining minorities or vulnerable groups. Especially in population-centric decisions, like resource allocation, AI's ability to avoid bias is paramount. Therefore, the continuous involvement of stakeholders, including patients and minority groups, remains pivotal to the progression of AI-integrated digital public health.
Affiliation(s)
- Onicio Leal Neto
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Global Health Institute, Mel & Enid Zuckerman College of Public Health, University of Arizona, Tucson, AZ, United States
- Department of Epidemiology and Biostatistics, Mel & Enid Zuckerman College of Public Health, University of Arizona, Tucson, AZ, United States
- Viktor Von Wyl
- Institute for Implementation Science in Health Care, University of Zurich, Zurich, Switzerland
- Epidemiology, Biostatistics & Prevention Institute, University of Zurich, Zurich, Switzerland
18
Zaleski AL, Berkowsky R, Craig KJT, Pescatello LS. Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study. JMIR Med Educ 2024; 10:e51308. [PMID: 38206661 PMCID: PMC10811574 DOI: 10.2196/51308]
Abstract
BACKGROUND Regular physical activity is critical for health and disease prevention. Yet, health care providers and patients face barriers to implementing evidence-based lifestyle recommendations. The potential to augment care with the increased availability of artificial intelligence (AI) technologies is considerable; however, the suitability of AI-generated exercise recommendations has yet to be explored. OBJECTIVE The purpose of this study was to assess the comprehensiveness, accuracy, and readability of individualized exercise recommendations generated by a novel AI chatbot. METHODS A coding scheme was developed to score AI-generated exercise recommendations across ten categories informed by gold-standard exercise recommendations: (1) health condition-specific benefits of exercise, (2) exercise preparticipation health screening, (3) frequency, (4) intensity, (5) time, (6) type, (7) volume, (8) progression, (9) special considerations, and (10) references to the primary literature. The AI chatbot was prompted to provide individualized exercise recommendations for 26 clinical populations using an open-source application programming interface. Two independent reviewers coded AI-generated content for each category and calculated comprehensiveness (%) and factual accuracy (%) on a scale of 0%-100%. Readability was assessed using the Flesch-Kincaid formula. Qualitative analysis identified and categorized themes from AI-generated output. RESULTS AI-generated exercise recommendations were 41.2% (107/260) comprehensive and 90.7% (146/161) accurate, with the majority (8/15, 53%) of inaccuracies relating to the need for exercise preparticipation medical clearance. The average readability level of AI-generated exercise recommendations was at the college level (mean 13.7, SD 1.7), with an average Flesch reading ease score of 31.1 (SD 7.7). Recurring themes in AI-generated output included concern for liability and safety, a preference for aerobic exercise, and potential bias and direct discrimination against certain age-based populations and individuals with disabilities. CONCLUSIONS There were notable gaps in the comprehensiveness, accuracy, and readability of AI-generated exercise recommendations. Exercise and health care professionals should be aware of these limitations when using and endorsing AI-based technologies as a tool to support lifestyle change involving exercise.
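The readability statistics this study reports (Flesch-Kincaid grade level and Flesch reading ease) follow standard published formulas; a minimal sketch, assuming word, sentence, and syllable counts are already available (the study itself does not publish code):

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (standard published formula):
    higher scores mean easier text; ~30 corresponds to college-level prose."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level (standard published formula):
    maps the same two ratios onto a US school-grade scale."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

For example, a 100-word, 5-sentence sample with 150 syllables scores roughly 59.6 on reading ease and grade 9.9, consistent with "plain English"; the abstract's reported means (grade 13.7, ease 31.1) indicate much denser text.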
Affiliation(s)
- Amanda L Zaleski
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Department of Preventive Cardiology, Hartford Hospital, Hartford, CT, United States
- Rachel Berkowsky
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
- Kelly Jean Thomas Craig
- Clinical Evidence Development, Aetna Medical Affairs, CVS Health Corporation, Hartford, CT, United States
- Linda S Pescatello
- Department of Kinesiology, University of Connecticut, Storrs, CT, United States
19
Ferrara C, Sellitto G, Ferrucci F, Palomba F, De Lucia A. Fairness-aware machine learning engineering: how far are we? Empir Softw Eng 2023; 29:9. [PMID: 38027253 PMCID: PMC10673752 DOI: 10.1007/s10664-023-10402-y]
Abstract
Machine learning is part of the daily life of people and companies worldwide. Unfortunately, bias in machine learning algorithms risks unfairly influencing the decision-making process and reiterating possible discrimination. While the interest of the software engineering community in software fairness is rapidly increasing, there is still a lack of understanding of various aspects of fair machine learning engineering, i.e., the software engineering process involved in developing fairness-critical machine learning systems. Questions about practitioners' awareness and maturity regarding fairness, the skills required to deal with the matter, and the development phase(s) in which fairness should best be addressed are just some of the knowledge gaps currently open. In this paper, we provide insights into how fairness is perceived and managed in practice, to shed light on the instruments and approaches that practitioners might employ to properly handle fairness. We conducted a survey with 117 professionals who shared their knowledge and experience, highlighting the relevance of fairness in practice and the skills and tools required to handle it. The key results of our study show that fairness is still considered a second-class quality aspect in the development of artificial intelligence systems. Building specific methods and development environments, alongside automated validation tools, might help developers treat fairness throughout the software lifecycle and reverse this trend.
Affiliation(s)
- Carmine Ferrara
- Software Engineering (SeSa) Lab, University of Salerno, Salerno, Italy
- Giulia Sellitto
- Software Engineering (SeSa) Lab, University of Salerno, Salerno, Italy
- Filomena Ferrucci
- Software Engineering (SeSa) Lab, University of Salerno, Salerno, Italy
- Fabio Palomba
- Software Engineering (SeSa) Lab, University of Salerno, Salerno, Italy
- Andrea De Lucia
- Software Engineering (SeSa) Lab, University of Salerno, Salerno, Italy
20
Celeste C, Ming D, Broce J, Ojo DP, Drobina E, Louis-Jacques AF, Gilbert JE, Fang R, Parker IK. Ethnic disparity in diagnosing asymptomatic bacterial vaginosis using machine learning. NPJ Digit Med 2023; 6:211. [PMID: 37978250 PMCID: PMC10656445 DOI: 10.1038/s41746-023-00953-1]
Abstract
While machine learning (ML) has shown great promise in medical diagnostics, a major challenge is that ML models do not always perform equally well among ethnic groups. This is alarming for women's health, as existing health disparities already vary by ethnicity. Bacterial vaginosis (BV) is a common vaginal syndrome among women of reproductive age with clear diagnostic differences among ethnic groups. Here, we investigate the ability of four ML algorithms to diagnose BV. We determine the fairness in the prediction of asymptomatic BV using 16S rRNA sequencing data from Asian, Black, Hispanic, and white women. General-purpose ML model performance varies based on ethnicity. When evaluating false positive and false negative rates, we find that models perform least effectively for Hispanic and Asian women. Models generally have the highest performance for white women and the lowest for Asian women. These findings demonstrate a need for improved methodologies to increase model fairness for predicting BV.
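The per-group error-rate comparison this abstract describes (false positive rates stratified by ethnicity) can be illustrated generically; the helper names below are assumptions for illustration, not the authors' code:

```python
import math

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN) over binary labels; NaN if no negatives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else math.nan

def fpr_by_group(y_true, y_pred, groups):
    """Compute the false positive rate separately for each demographic
    group, the kind of stratified metric used to audit model fairness."""
    out = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        out[g] = false_positive_rate([y_true[i] for i in idx],
                                     [y_pred[i] for i in idx])
    return out
```

A large gap between groups' FPRs (or FNRs, computed symmetrically) is the signal of the kind of disparity the study reports.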
Affiliation(s)
- Cameron Celeste
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32610, USA
- Dion Ming
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32610, USA
- Justin Broce
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32611, USA
- Diandra P Ojo
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32611, USA
- Emma Drobina
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32611, USA
- Adetola F Louis-Jacques
- Department of Obstetrics and Gynecology, College of Medicine, University of Florida, Gainesville, FL, 32611, USA
- Juan E Gilbert
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32611, USA
- Ruogu Fang
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32610, USA
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, 32611, USA
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA
- Department of Radiology, University of Florida, Gainesville, FL, 32611, USA
- Ivana K Parker
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32610, USA
21
Tong W, Guan Y, Chen J, Huang X, Zhong Y, Zhang C, Zhang H. Artificial intelligence in global health equity: an evaluation and discussion on the application of ChatGPT, in the Chinese National Medical Licensing Examination. Front Med (Lausanne) 2023; 10:1237432. [PMID: 38020160 PMCID: PMC10656681 DOI: 10.3389/fmed.2023.1237432]
Abstract
Background The demand for healthcare is increasing globally, with notable disparities in access to resources, especially in Asia, Africa, and Latin America. The rapid development of artificial intelligence (AI) technologies, such as OpenAI's ChatGPT, has shown promise in revolutionizing healthcare. However, potential challenges, including the need for specialized medical training, privacy concerns, and language bias, require attention. Methods To assess the applicability and limitations of ChatGPT in Chinese and English settings, we designed an experiment evaluating its performance in the 2022 National Medical Licensing Examination (NMLE) in China. For a standardized evaluation, we used the comprehensive written part of the NMLE, translated into English by a bilingual expert. All questions were input into ChatGPT, which provided answers and reasons for choosing them. Responses were evaluated for "information quality" using the Likert scale. Results ChatGPT demonstrated a correct response rate of 81.25% for Chinese and 86.25% for English questions. Logistic regression analysis showed that neither the difficulty nor the subject matter of the questions was a significant factor in AI errors. The Brier scores were 0.19 for Chinese and 0.14 for English, indicating good predictive performance. The average quality score for English responses was excellent (4.43 points), slightly higher than that for Chinese (4.34 points). Conclusion While AI language models like ChatGPT show promise for global healthcare, language bias is a key challenge. Ensuring that such technologies are robustly trained and sensitive to multiple languages and cultures is vital. Further research into AI's role in healthcare, particularly in areas with limited resources, is warranted.
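The Brier score used here to summarize predictive accuracy is simply the mean squared difference between a predicted probability and the observed 0/1 outcome (lower is better, 0 is perfect). A minimal sketch, not the study's code:

```python
def brier_score(probabilities, outcomes):
    """Brier score: mean of (p - o)^2 over paired predicted probabilities
    p in [0, 1] and binary outcomes o in {0, 1}; 0 = perfect calibration."""
    assert len(probabilities) == len(outcomes) and probabilities
    return sum((p - o) ** 2
               for p, o in zip(probabilities, outcomes)) / len(probabilities)
```

An uninformative predictor that always outputs 0.5 scores 0.25, so the reported values of 0.19 (Chinese) and 0.14 (English) indicate better-than-chance, reasonably calibrated confidence.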
Affiliation(s)
- Wenting Tong
- Department of Pharmacy, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
- Yongfu Guan
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
- Jinping Chen
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
- Xixuan Huang
- Department of Mathematics, Xiamen University, Xiamen, Fujian, China
- Yuting Zhong
- Department of Anesthesiology, Gannan Medical University, Jiangxi, China
- Changrong Zhang
- Department of Chinese Medicine, Affiliated Hospital of Qinghai University, Xining, Qinghai, China
- Hui Zhang
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
- Chair of Endocrinology and Medical Sexology (ENDOSEX), Department of Experimental Medicine, University of Rome Tor Vergata, Rome, Italy
22
Tiribelli S, Monnot A, Shah SFH, Arora A, Toong PJ, Kong S. Ethics Principles for Artificial Intelligence-Based Telemedicine for Public Health. Am J Public Health 2023; 113:577-584. [PMID: 36893365 PMCID: PMC10088937 DOI: 10.2105/ajph.2023.307225]
Abstract
The use of artificial intelligence (AI) in the field of telemedicine has grown exponentially over the past decade, along with the adoption of AI-based telemedicine to support public health systems. Although AI-based telemedicine can open up novel opportunities for the delivery of clinical health and care and become a strong aid to public health systems worldwide, it also comes with ethical risks that should be detected, prevented, or mitigated for the responsible use of AI-based telemedicine in and for public health. However, despite the current proliferation of AI ethics frameworks, thus far, none have been developed for the design of AI-based telemedicine, especially for the adoption of AI-based telemedicine in and for public health. We aimed to fill this gap by mapping the most relevant AI ethics principles for AI-based telemedicine for public health and by showing the need to revise them via major ethical themes emerging from bioethics, medical ethics, and public health ethics toward the definition of a unified set of 6 AI ethics principles for the implementation of AI-based telemedicine. (Am J Public Health. 2023;113(5):577-584. https://doi.org/10.2105/AJPH.2023.307225).
Collapse
Affiliation(s)
- Simona Tiribelli
- Department of Political Sciences, Communication, and International Relations, University of Macerata, Macerata, Italy
- Institute for Technology and Global Health, Cambridge, MA
| | - Annabelle Monnot
- Polygeia, Global Health Think Tank, Cambridge, UK
| | - Syed F H Shah
- School of Clinical Medicine, University of Cambridge, Cambridge
| | - Anmol Arora
- School of Clinical Medicine, University of Cambridge, Cambridge
| | - Ping J Toong
- Department of Pathology, University of Cambridge
| | - Sokanha Kong
- Department of Medical Genetics, University of Cambridge
| |
|
23
|
Büdenbender B, Höfling TTA, Gerdes ABM, Alpers GW. Training machine learning algorithms for automatic facial coding: The role of emotional facial expressions' prototypicality. PLoS One 2023; 18:e0281309. [PMID: 36763694 PMCID: PMC9916590 DOI: 10.1371/journal.pone.0281309] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Accepted: 01/20/2023] [Indexed: 02/12/2023] Open
Abstract
Automatic facial coding (AFC) is a promising new research tool for efficiently analyzing emotional facial expressions. AFC uses machine learning procedures to infer emotion categories from facial movements (i.e., Action Units). State-of-the-art AFC accurately classifies intense and prototypical facial expressions, whereas it is less accurate for non-prototypical and less intense expressions. A potential reason is that AFC is typically trained on standardized, prototypical facial expression inventories. Because AFC would also be useful for analyzing less prototypical research material, we set out to determine the role of prototypicality in the training material. We trained established machine learning algorithms either with standardized expressions from widely used research inventories or with unstandardized emotional facial expressions obtained in a typical laboratory setting, and tested them on identical or cross-over material. All machine learning models achieved comparable accuracies when trained and tested on held-out partitions of the same dataset (acc. = [83.4%; 92.5%]). Strikingly, we found a substantial drop in accuracy for models trained on the highly prototypical standardized dataset and tested on the unstandardized dataset (acc. = [52.8%; 69.8%]). However, when models were trained with unstandardized expressions and tested on standardized datasets, accuracies held up (acc. = [82.7%; 92.5%]). These findings demonstrate a strong impact of the training material's prototypicality on AFC's ability to classify emotional faces. Because AFC would be useful for analyzing emotional facial expressions in research or even naturalistic scenarios, future developments should include more naturalistic facial expressions in training. This approach would improve the generalizability of AFC to more naturalistic facial expressions and increase robustness for future applications of this promising technology.
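The within-dataset versus cross-over evaluation design described in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the Action-Unit features are synthetic, and the dataset sizes, noise levels, and classifier choice are hypothetical assumptions chosen only to make the cross-over comparison runnable.

```python
# Sketch of a within-dataset vs. cross-dataset (cross-over) evaluation,
# using simulated Action-Unit-like feature vectors. All numbers here
# (17 features, 3 emotion classes, noise levels) are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 17))  # shared class prototypes across datasets

def make_dataset(n, noise):
    """Simulate n samples of Action-Unit intensities for 3 emotion classes."""
    y = rng.integers(0, 3, size=n)
    X = centers[y] + rng.normal(scale=noise, size=(n, 17))
    return X, y

# "Standardized" = prototypical, low within-class variability;
# "unstandardized" = naturalistic, higher variability (assumed values).
X_std, y_std = make_dataset(600, noise=0.4)
X_nat, y_nat = make_dataset(600, noise=2.5)

def within_and_cross(X_src, y_src, X_other, y_other):
    """Train on one dataset; report held-out and cross-dataset accuracy."""
    Xtr, Xte, ytr, yte = train_test_split(
        X_src, y_src, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(Xtr, ytr)
    return (accuracy_score(yte, clf.predict(Xte)),        # within-dataset
            accuracy_score(y_other, clf.predict(X_other)))  # cross-over

acc_std_within, acc_std_cross = within_and_cross(X_std, y_std, X_nat, y_nat)
acc_nat_within, acc_nat_cross = within_and_cross(X_nat, y_nat, X_std, y_std)
```

Under this toy setup, the model trained on the low-variability "standardized" data tends to lose accuracy on the noisier "naturalistic" data, mirroring the asymmetry the study reports; the exact accuracy values depend entirely on the assumed noise levels.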
Collapse
Affiliation(s)
- Björn Büdenbender
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
| | - Tim T. A. Höfling
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
| | - Antje B. M. Gerdes
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
| | - Georg W. Alpers
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
| |
|
24
|
Ferrara P, Battiato S, Polosa R. Progress and prospects for artificial intelligence in clinical practice: learning from COVID-19. Intern Emerg Med 2022; 17:1855-1857. [PMID: 36063262 PMCID: PMC9442555 DOI: 10.1007/s11739-022-03080-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Accepted: 08/11/2022] [Indexed: 12/05/2022]
Affiliation(s)
- Pietro Ferrara
- Center for Public Health Research, University of Milan Bicocca, Monza, Italy
- IRCCS Istituto Auxologico Italiano, Milan, Italy
| | - Sebastiano Battiato
- Department of Mathematics and Computer Science, University of Catania, Catania, Italy
| | - Riccardo Polosa
- Center of Excellence for the Acceleration of Harm Reduction (CoEHAR), University of Catania, Catania, Italy.
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy.
- Institute of Internal Medicine, AOU "Policlinico-V. Emanuele", Via S. Sofia, 78, Catania, Italy.
| |
|