1
HRU-Net: A high-resolution convolutional neural network for esophageal cancer radiotherapy target segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108177. [PMID: 38648704 DOI: 10.1016/j.cmpb.2024.108177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/10/2024] [Accepted: 04/13/2024] [Indexed: 04/25/2024]
Abstract
BACKGROUND AND OBJECTIVE The effective segmentation of esophageal squamous carcinoma lesions in CT scans is important for auxiliary diagnosis and treatment. However, accurate lesion segmentation is still a challenging task due to the irregular shape and small size of the esophagus, the inconsistency of its spatio-temporal structure, and the low contrast between the esophagus and its peripheral tissues in medical images. The objective of this study is to improve the segmentation of esophageal squamous cell carcinoma lesions. METHODS It is critical for a segmentation network to extract discriminative 3D features that distinguish esophageal cancers from visually similar adjacent tissues and organs. In this work, an efficient HRU-Net (High-Resolution U-Net) architecture was developed for esophagus and esophageal carcinoma segmentation in CT slices. Based on the idea of localization first and segmentation later, HRU-Net locates the esophageal region before segmentation. In addition, a Resolution Fusion Module (RFM) was designed to integrate information from adjacent-resolution feature maps to obtain strong semantic information while preserving high-resolution features. RESULTS Compared with five other typical methods, the devised HRU-Net generates superior segmentation results. CONCLUSIONS The proposed HRU-Net improves segmentation accuracy for esophageal squamous cell carcinoma and performs best among the compared models. The designed method may improve the efficiency of clinical diagnosis of esophageal squamous cell carcinoma lesions.
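A minimal sketch of what a fusion block of this kind might look like is given below; the specific layer choices (1×1 projection, bilinear upsampling, additive fusion) are illustrative assumptions, not the authors' published RFM design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResolutionFusionModule(nn.Module):
    """Fuse a low-resolution (semantic) map into a high-resolution map."""
    def __init__(self, high_channels: int, low_channels: int):
        super().__init__()
        # project the low-resolution branch to the high-resolution channel width
        self.project = nn.Conv2d(low_channels, high_channels, kernel_size=1)
        self.refine = nn.Sequential(
            nn.Conv2d(high_channels, high_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(high_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x_high: torch.Tensor, x_low: torch.Tensor) -> torch.Tensor:
        # upsample the semantic map to the high-resolution spatial size
        x_low = F.interpolate(self.project(x_low), size=x_high.shape[2:],
                              mode="bilinear", align_corners=False)
        # additive fusion keeps the high-resolution detail intact
        return self.refine(x_high + x_low)

# toy usage: a 1/4-resolution semantic map fused into a full-resolution map
high = torch.randn(1, 32, 128, 128)
low = torch.randn(1, 64, 32, 32)
fused = ResolutionFusionModule(32, 64)(high, low)
print(fused.shape)  # torch.Size([1, 32, 128, 128])
```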
2
MUSE-XAE: MUtational Signature Extraction with eXplainable AutoEncoder enhances tumour types classification. BIOINFORMATICS (OXFORD, ENGLAND) 2024:btae320. [PMID: 38754097 DOI: 10.1093/bioinformatics/btae320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 04/08/2024] [Accepted: 05/15/2024] [Indexed: 05/18/2024]
Abstract
MOTIVATION Mutational signatures are a critical component in deciphering the genetic alterations that underlie cancer development and have become a valuable resource to understand the genomic changes during tumorigenesis. Therefore, it is essential to employ precise and accurate methods for their extraction to ensure that the underlying patterns are reliably identified and can be effectively utilized in new strategies for diagnosis, prognosis and treatment of cancer patients. RESULTS We present MUSE-XAE, a novel method for mutational signature extraction from cancer genomes using an explainable autoencoder. Our approach employs a hybrid architecture consisting of a nonlinear encoder that can capture nonlinear interactions among features, and a linear decoder which ensures the interpretability of the active signatures. We evaluated and compared MUSE-XAE with other available tools on both synthetic and real cancer datasets and demonstrated that it achieves superior performance in terms of precision and sensitivity in recovering mutational signature profiles. MUSE-XAE extracts highly discriminative mutational signature profiles by enhancing the classification of primary tumour types and subtypes in real world settings. This approach could facilitate further research in this area, with neural networks playing a critical role in advancing our understanding of cancer genomics. AVAILABILITY MUSE-XAE software is freely available at https://github.com/compbiomed-unito/MUSE-XAE. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
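The hybrid encoder-decoder idea can be sketched in a few lines; the layer sizes, Softplus activation, and weight clamping below are illustrative assumptions rather than the published MUSE-XAE architecture.

```python
import torch
import torch.nn as nn

class SignatureAutoencoder(nn.Module):
    def __init__(self, n_channels: int = 96, n_signatures: int = 10):
        super().__init__()
        # nonlinear encoder: mutation catalogue -> non-negative exposures
        self.encoder = nn.Sequential(
            nn.Linear(n_channels, 128), nn.ReLU(),
            nn.Linear(128, n_signatures), nn.Softplus(),
        )
        # linear, bias-free decoder: its weight matrix plays the role of the
        # signature profiles used to reconstruct the catalogue
        self.decoder = nn.Linear(n_signatures, n_channels, bias=False)

    def forward(self, x: torch.Tensor):
        exposures = self.encoder(x)
        with torch.no_grad():
            self.decoder.weight.clamp_(min=0.0)   # keep profiles non-negative
        return self.decoder(exposures), exposures

model = SignatureAutoencoder()
catalogue = torch.rand(32, 96)                    # 32 samples x 96 mutation types
reconstruction, exposures = model(catalogue)
loss = ((reconstruction - catalogue) ** 2).mean() # simple reconstruction loss
signatures = model.decoder.weight.detach()        # (96, n_signatures) profiles
```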
3
The crucial role of explainability in healthcare AI. Eur J Radiol 2024; 176:111507. [PMID: 38761444 DOI: 10.1016/j.ejrad.2024.111507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 05/13/2024] [Indexed: 05/20/2024]
4
Automated annotation of disease subtypes. J Biomed Inform 2024; 154:104650. [PMID: 38701887 DOI: 10.1016/j.jbi.2024.104650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 03/28/2024] [Accepted: 04/29/2024] [Indexed: 05/05/2024]
Abstract
BACKGROUND Distinguishing diseases into distinct subtypes is crucial for study and effective treatment strategies. The Open Targets Platform (OT) integrates biomedical, genetic, and biochemical datasets to empower disease ontologies, classifications, and potential gene targets. Nevertheless, many disease annotations are incomplete, requiring laborious expert medical input. This challenge is especially pronounced for rare and orphan diseases, where resources are scarce. METHODS We present a machine learning approach to identifying diseases with potential subtypes, using the approximately 23,000 diseases documented in OT. We derive novel features for predicting diseases with subtypes using direct evidence. Machine learning models were applied to analyze feature importance and evaluate predictive performance for discovering both known and novel disease subtypes. RESULTS Our model achieves a high (89.4%) ROC AUC (Area Under the Receiver Operating Characteristic Curve) in identifying known disease subtypes. We integrated pre-trained deep-learning language models and showed their benefits. Moreover, we identify 515 disease candidates predicted to possess previously unannotated subtypes. CONCLUSIONS Our models can partition diseases into distinct subtypes. This methodology enables a robust, scalable approach for improving knowledge-based annotations and a comprehensive assessment of disease ontology tiers. Our candidates are attractive targets for further study and personalized medicine, potentially aiding in the unveiling of new therapeutic indications for sought-after targets.
5
An innovative artificial intelligence-based method to compress complex models into explainable, model-agnostic and reduced decision support systems with application to healthcare (NEAR). Artif Intell Med 2024; 151:102841. [PMID: 38658130 DOI: 10.1016/j.artmed.2024.102841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 02/29/2024] [Accepted: 03/11/2024] [Indexed: 04/26/2024]
Abstract
BACKGROUND AND OBJECTIVE In everyday clinical practice, medical decisions are currently based on clinical guidelines which are often static and rigid, and do not account for population variability, whereas individualized, patient-oriented decisions and treatments are the paradigm change necessary to enter the era of precision medicine. Most of the limitations of a guideline-based system could be overcome through the adoption of Clinical Decision Support Systems (CDSSs) based on Artificial Intelligence (AI) algorithms. However, the black-box nature of AI algorithms has hampered the large-scale adoption of AI-based CDSSs in clinical practice. In this study, an innovative AI-based method to compress AI-based prediction models into explainable, model-agnostic, and reduced decision support systems (NEAR) with application to healthcare is presented and validated. METHODS NEAR is based on the Shapley Additive Explanations framework and can be applied to complex input models to obtain the contributions of each input feature to the output. Technically, the simplified NEAR models approximate contributions from input features using a custom library and merge them to determine the final output. Finally, NEAR estimates the confidence error associated with each single input feature contributing to the final score, making the result more interpretable. Here, NEAR is evaluated on a clinical real-world use case, mortality prediction in patients who experienced Acute Coronary Syndrome (ACS), applying three different Machine Learning/Deep Learning models as implementation examples. RESULTS NEAR, when applied to the ACS use case, exhibits performance comparable to that of the AI-based model from which it is derived, as in the case of the Adaptive Boosting classifier, whose Area Under the Curve is not statistically different from that of NEAR, despite the model's simplification. Moreover, NEAR comes with intrinsic explainability and modularity, as it can be tested on the developed web application platform (https://neardashboard.pythonanywhere.com/). CONCLUSIONS An explainable and reliable CDSS tailored to single-patient analysis has been developed. The proposed AI-based system has the potential to be used alongside the clinical guidelines currently employed in the medical setting, making them more personalized and dynamic and assisting doctors in making their everyday clinical decisions.
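The core compression idea — learn per-feature contribution functions from SHAP values of a complex model and sum them into a reduced, glass-box scorer — can be illustrated as follows. This is a generic sketch, not the authors' NEAR library, and the output shapes of the shap package vary with model type and version.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
complex_model = GradientBoostingClassifier(random_state=0).fit(X, y)

# per-sample, per-feature contributions of the complex model (log-odds scale)
explainer = shap.TreeExplainer(complex_model)
shap_values = explainer.shap_values(X)            # shape (n_samples, n_features)
base_value = explainer.expected_value

# one simple univariate approximator per feature: feature value -> contribution
approximators = [
    DecisionTreeRegressor(max_depth=3).fit(X[:, [j]], shap_values[:, j])
    for j in range(X.shape[1])
]

def reduced_score(x_row: np.ndarray) -> float:
    """Reduced model: base value plus the summed approximated contributions."""
    return float(base_value) + sum(
        a.predict([[x_row[j]]])[0] for j, a in enumerate(approximators)
    )

print(reduced_score(X[0]))   # compare against the complex model's raw score
```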
6
Predicting the conversion from clinically isolated syndrome to multiple sclerosis: An explainable machine learning approach. Mult Scler Relat Disord 2024; 86:105614. [PMID: 38642495 DOI: 10.1016/j.msard.2024.105614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/22/2024]
Abstract
INTRODUCTION Predicting the conversion of clinically isolated syndrome (CIS) to clinically definite multiple sclerosis (CDMS) is critical for personalized treatment planning and patient benefit. The aim of this study is to develop an explainable machine learning (ML) model for predicting this conversion based on demographic, clinical, and imaging data. METHOD The ML model, Extreme Gradient Boosting (XGBoost), was employed on a public dataset of 273 Mexican mestizo CIS patients with 10-year follow-up. The data was divided into a training set for cross-validation and feature selection, and a holdout test set for final testing. Feature importance was determined using the SHapley Additive exPlanations (SHAP) library. Then, two experiments were conducted to optimize the model's performance by selectively adding variables and selecting the most contributive variables for the final model. RESULTS Nine variables were significant: age, gender, schooling, motor symptoms, infratentorial and periventricular lesions on imaging, oligoclonal bands in cerebrospinal fluid, and lesion and symptom types. The model achieved an accuracy of 83.6 %, AUC of 91.8 %, sensitivity of 83.9 %, and specificity of 83.4 % in cross-validation. In the final testing, the model achieved an accuracy of 78.3 %, AUC of 85.8 %, sensitivity of 75 %, and specificity of 81.1 %. Finally, a web-based demo of the model was created for testing purposes. CONCLUSION The model, focusing on feature selection and interpretability, effectively stratifies risk for treatment decisions and disability prevention in MS patients. It provides a numerical risk estimate for CDMS conversion, enhancing transparency in clinical decision-making and aiding in patient care.
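A compact sketch of this kind of workflow (cross-validated XGBoost, SHAP-based ranking, refit on the top-ranked variables) is shown below with synthetic data; the dataset, thresholds, and hyperparameters are placeholders, not those of the study.

```python
import numpy as np
import shap
from xgboost import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=273, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=3)
model.fit(X_train, y_train)

# rank features by mean absolute SHAP value on the training data
shap_values = shap.TreeExplainer(model).shap_values(X_train)
ranking = np.argsort(np.abs(shap_values).mean(axis=0))[::-1]
top9 = ranking[:9]                        # the study kept nine variables

# refit on the selected variables and check cross-validated AUC
selected = XGBClassifier(n_estimators=200, max_depth=3)
auc = cross_val_score(selected, X_train[:, top9], y_train,
                      cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC with 9 features: {auc:.3f}")
```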
7
Using test-time augmentation to investigate explainable AI: inconsistencies between method, model and human intuition. J Cheminform 2024; 16:39. [PMID: 38576047 PMCID: PMC10993590 DOI: 10.1186/s13321-024-00824-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/09/2024] [Indexed: 04/06/2024]
Abstract
Stakeholders of machine learning models desire explainable artificial intelligence (XAI) to produce human-understandable and consistent interpretations. In computational toxicity, augmentation of text-based molecular representations has been used successfully for transfer learning on downstream tasks. Augmentations of molecular representations can also be used at inference to compare differences between multiple representations of the same ground-truth. In this study, we investigate the robustness of eight XAI methods using test-time augmentation for a molecular-representation model in the field of computational toxicity prediction. We report significant differences between explanations for different representations of the same ground-truth, and show that randomized models have similar variance. We hypothesize that text-based molecular representations in this and past research reflect tokenization more than learned parameters. Furthermore, we see a greater variance between in-domain predictions than out-of-domain predictions, indicating that XAI measures something other than learned parameters. Finally, we investigate the relative importance given to expert-derived structural alerts and find similar importance regardless of applicability domain, randomization, and training procedure. We therefore caution future research against validating XAI methods solely through comparison to human intuition without further investigation. SCIENTIFIC CONTRIBUTION: In this research we critically investigate XAI through test-time augmentation, contrasting previous assumptions about using expert validation and showing inconsistencies within models for identical representations. SMILES augmentation has been used to increase model accuracy, but was here adapted from the field of image test-time augmentation to be used as an independent indication of the consistency within SMILES-based molecular representation models.
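The augmentation step itself is easy to illustrate: the same molecule is rendered as several valid SMILES strings, and any string-level score can then be compared across representations. In the sketch below the scorer is a toy stand-in for a real model or attribution method, used only to show that string-dependent quantities change under augmentation.

```python
import numpy as np
from rdkit import Chem

def randomized_smiles(smiles: str, n: int = 10):
    """Return n alternative (randomised atom order) SMILES for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [Chem.MolToSmiles(mol, canonical=False, doRandom=True) for _ in range(n)]

def toy_score(smiles: str) -> float:
    """Stand-in for a SMILES-based prediction or attribution: any quantity
    computed on the string (rather than the molecule) can shift under augmentation."""
    return float(sum(i for i, ch in enumerate(smiles) if ch == "N"))

variants = randomized_smiles("CCOc1ccc2nc(S(N)(=O)=O)sc2c1", n=10)
scores = [toy_score(s) for s in variants]
print("same molecule, spread of the string-level score:", np.std(scores))
```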
8
Using generative AI to investigate medical imagery models and datasets. EBioMedicine 2024; 102:105075. [PMID: 38565004 PMCID: PMC10993140 DOI: 10.1016/j.ebiom.2024.105075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 03/05/2024] [Accepted: 03/06/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND AI models have shown promise in performing many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust of doctors in AI-based models, especially in domains where AI prediction capabilities surpass those of humans. Moreover, such explanations could enable novel scientific discovery by uncovering signals in the data that aren't yet known to experts. METHODS In this paper, we present a workflow for generating hypotheses to understand which visual signals in images are correlated with a classification model's predictions for a given task. This approach leverages an automatic visual explanation algorithm followed by interdisciplinary expert review. We propose the following 4 steps: (i) Train a classifier to perform a given task to assess whether the imagery indeed contains signals relevant to the task; (ii) Train a StyleGAN-based image generator with an architecture that enables guidance by the classifier ("StylEx"); (iii) Automatically detect, extract, and visualize the top visual attributes that the classifier is sensitive towards. For visualization, we independently modify each of these attributes to generate counterfactual visualizations for a set of images (i.e., what the image would look like with the attribute increased or decreased); (iv) Formulate hypotheses for the underlying mechanisms, to stimulate future research. Specifically, present the discovered attributes and corresponding counterfactual visualizations to an interdisciplinary panel of experts so that hypotheses can account for social and structural determinants of health (e.g., whether the attributes correspond to known patho-physiological or socio-cultural phenomena, or could be novel discoveries). FINDINGS To demonstrate the broad applicability of our approach, we present results on eight prediction tasks across three medical imaging modalities-retinal fundus photographs, external eye photographs, and chest radiographs. We showcase examples where many of the automatically-learned attributes clearly capture clinically known features (e.g., types of cataract, enlarged heart), and demonstrate automatically-learned confounders that arise from factors beyond physiological mechanisms (e.g., chest X-ray underexposure is correlated with the classifier predicting abnormality, and eye makeup is correlated with the classifier predicting low hemoglobin levels). We further show that our method reveals a number of physiologically plausible, previously-unknown attributes based on the literature (e.g., differences in the fundus associated with self-reported sex, which were previously unknown). INTERPRETATION Our approach enables hypotheses generation via attribute visualizations and has the potential to enable researchers to better understand, improve their assessment, and extract new knowledge from AI-based models, as well as debug and design better datasets. Though not designed to infer causality, importantly, we highlight that attributes generated by our framework can capture phenomena beyond physiology or pathophysiology, reflecting the real world nature of healthcare delivery and socio-cultural factors, and hence interdisciplinary perspectives are critical in these investigations. Finally, we will release code to help researchers train their own StylEx models and analyze their predictive tasks of interest, and use the methodology presented in this paper for responsible interpretation of the revealed attributes. 
FUNDING Google.
9
Artificial intelligence and explanation: How, why, and when to explain black boxes. Eur J Radiol 2024; 173:111393. [PMID: 38417186 DOI: 10.1016/j.ejrad.2024.111393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Accepted: 02/22/2024] [Indexed: 03/01/2024]
Abstract
Artificial intelligence (AI) is taking nearly all fields of science by storm. One notorious property that AI algorithms bring is their so-called black box character. In particular, they are said to be inherently unexplainable algorithms. Of course, such characteristics would pose a problem for the medical world, including radiology. The patient journey is filled with explanations along the way, from diagnoses to treatment, follow-up, and more. If we were to replace part of these steps with non-explanatory algorithms, we could lose our grip on vital aspects such as finding mistakes, patient trust, and even the creation of new knowledge. In this article, we argue that, even for the darkest of black boxes, there is hope of understanding them. In particular, we compare the situation of understanding black box models to that of understanding the laws of nature in physics. In the case of physics, we are given a 'black box' law of nature, about which there is no upfront explanation. However, as current physical theories show, we can learn plenty about them. During this discussion, we present the process by which we make such explanations and the human role therein, keeping a solid focus on radiological AI situations. We will outline the AI developers' roles in this process, but also the critical role fulfilled by the practitioners, the radiologists, in providing a healthy system of continuous improvement of AI models. Furthermore, we explore the role of the explainable AI (XAI) research program in the broader context we describe.
10
Explanatory subgraph attacks against Graph Neural Networks. Neural Netw 2024; 172:106097. [PMID: 38286098 DOI: 10.1016/j.neunet.2024.106097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 12/20/2023] [Accepted: 01/01/2024] [Indexed: 01/31/2024]
Abstract
Graph Neural Networks (GNNs) are often viewed as black boxes due to their lack of transparency, which hinders their application in critical fields. Many explanation methods have been proposed to address the interpretability issue of GNNs. These explanation methods reveal explanatory information about graphs from different perspectives. However, the explanatory information may also pose an attack risk to GNN models. In this work, we explore this problem from the explanatory subgraph perspective. To this end, we utilize a powerful GNN explanation method called SubgraphX and deploy it locally to obtain explanatory subgraphs from given graphs. Then we propose methods for conducting evasion attacks and backdoor attacks based on the local explainer. In evasion attacks, the attacker obtains explanatory subgraphs of test graphs from the local explainer and replaces them with an explanatory subgraph of another label, causing the target model to misclassify the test graphs. In backdoor attacks, the attacker employs the local explainer to select an explanatory trigger and locate suitable injection locations. We validate the effectiveness of our proposed attacks on state-of-the-art GNN models and different datasets. The results also demonstrate that our proposed backdoor attack is more efficient, adaptable, and concealed than previous backdoor attacks.
11
Corruption depth: Analysis of DNN depth for misclassification. Neural Netw 2024; 172:106013. [PMID: 38354665 DOI: 10.1016/j.neunet.2023.11.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 10/11/2023] [Accepted: 11/14/2023] [Indexed: 02/16/2024]
Abstract
Many large and complex deep neural networks have been shown to provide higher performance on various computer vision tasks. However, very little is known about the relationship between the complexity of the input data along with the type of noise and the depth needed for correct classification. Existing studies do not address the issue of common corruptions adequately, especially in understanding what impact these corruptions leave on the individual part of a deep neural network. Therefore, we can safely assume that the classification (or misclassification) might be happening at a particular layer(s) of a network that accumulates to draw a final correct or incorrect prediction. In this paper, we introduce a novel concept of corruption depth, which identifies the location of the network layer/depth until the misclassification persists. We assert that the identification of such layers will help in better designing the network by pruning certain layers in comparison to the purification of the entire network which is computationally heavy. Through our extensive experiments, we present a coherent study to understand the processing of examples through the network. Our approach also illustrates different philosophies of example memorization and a one-dimensional view of sample or query difficulty. We believe that the understanding of the corruption depth can open a new dimension of model explainability and model compression, where in place of just visualizing the attention map, the classification progress can be seen throughout the network.
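One way to make the idea concrete is to compare layer-wise activations of a clean input and its corrupted copy and see how deep they remain similar. The snippet below uses cosine similarity of ResNet-18 stage outputs as a simplified proxy; it illustrates the intuition rather than reproducing the paper's exact definition of corruption depth.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
activations = {}

def save_to(store_name):
    def hook(module, inputs, output):
        activations[store_name] = output.flatten(1)
    return hook

for name, module in model.named_children():
    if name.startswith("layer"):                  # residual stages layer1..layer4
        module.register_forward_hook(save_to(name))

clean = torch.randn(1, 3, 224, 224)
corrupted = clean + 0.5 * torch.randn_like(clean)  # stand-in for a common corruption

with torch.no_grad():
    model(clean)
    clean_acts = dict(activations)
    model(corrupted)
    corrupted_acts = dict(activations)

for name in sorted(clean_acts):
    sim = F.cosine_similarity(clean_acts[name], corrupted_acts[name]).item()
    print(f"{name}: clean-vs-corrupted similarity = {sim:.3f}")
# the stage at which similarity drops sharply hints at the depth where the
# corruption starts to dominate the representation
```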
12
Reinforcement learning for intensive care medicine: actionable clinical insights from novel approaches to reward shaping and off-policy model evaluation. Intensive Care Med Exp 2024; 12:32. [PMID: 38526681 PMCID: PMC10963714 DOI: 10.1186/s40635-024-00614-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 03/07/2024] [Indexed: 03/27/2024]
Abstract
BACKGROUND Reinforcement learning (RL) holds great promise for intensive care medicine given the abundant availability of data and frequent sequential decision-making. However, despite the emergence of promising algorithms, RL-driven bedside clinical decision support is still far from reality. Major challenges include trust and safety. To help address these issues, we introduce cross off-policy evaluation and policy restriction and show how detailed policy analysis may increase clinical interpretability. As an example, we apply these in the setting of RL to optimise ventilator settings in intubated COVID-19 patients. METHODS With data from the Dutch ICU Data Warehouse and using an exhaustive hyperparameter grid search, we identified an optimal set of Dueling Double-Deep Q Network RL models. The state space comprised ventilator, medication, and clinical data. The action space focused on positive end-expiratory pressure (PEEP) and fraction of inspired oxygen (FiO2). We used gas exchange indices as interim rewards, and mortality and state duration as final rewards. We designed a novel evaluation method called cross off-policy evaluation (OPE) to assess the efficacy of models under varying weightings between the interim and terminal reward components. In addition, we implemented policy restriction to prevent potentially hazardous model actions. We also introduce delta-Q to compare physician versus policy action quality, and in-depth policy inspection using visualisations. RESULTS We created trajectories for 1118 intensive care unit (ICU) admissions and trained 69,120 models using 8 model architectures with 128 hyperparameter combinations. For each model, policy restrictions were applied. In the first evaluation step, 17,182/138,240 policies had good performance, but cross-OPE revealed suboptimal performance for 44% of those by varying the reward function used for evaluation. Clinical policy inspection facilitated assessment of action decisions for individual patients, including identification of action space regions that may benefit most from optimisation. CONCLUSION Cross-OPE can serve as a robust evaluation framework for safe RL model implementation by identifying policies with good generalisability. Policy restriction helps prevent potentially unsafe model recommendations. Finally, the novel delta-Q metric can be used to operationalise RL models in clinical practice. Our findings offer a promising pathway towards application of RL in intensive care medicine and beyond.
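The delta-Q idea — scoring the gap between the value of the action a trained policy prefers and the value of the action the clinician actually took — can be sketched in a few lines. The Q-network below is an untrained placeholder standing in for the trained Dueling DDQN, and the state/action encodings are assumptions.

```python
import torch
import torch.nn as nn

n_state, n_actions = 20, 9          # e.g. 3 PEEP bins x 3 FiO2 bins
q_net = nn.Sequential(nn.Linear(n_state, 64), nn.ReLU(),
                      nn.Linear(64, n_actions))   # stands in for a trained DDQN

def delta_q(state: torch.Tensor, physician_action: int) -> float:
    """Positive values mean the policy's preferred action is valued higher
    than the clinician's charted action in this state."""
    with torch.no_grad():
        q = q_net(state)
    return (q.max() - q[physician_action]).item()

state = torch.randn(n_state)
print(delta_q(state, physician_action=4))
```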
13
AI-enhanced chemical paradigm: From molecular graphs to accurate prediction and mechanism. JOURNAL OF HAZARDOUS MATERIALS 2024; 465:133355. [PMID: 38198864 DOI: 10.1016/j.jhazmat.2023.133355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 12/19/2023] [Accepted: 12/21/2023] [Indexed: 01/12/2024]
Abstract
The development of accurate and interpretable models for predicting reaction constants of organic compounds with hydroxyl radicals is vital for advancing quantitative structure-activity relationships (QSAR) in pollutant degradation. Methods like molecular descriptors, molecular fingerprinting, and group contribution methods have limitations, as traditional machine learning struggles to capture all intramolecular information simultaneously. To address this, we established an integrated graph neural network (GNN) with approximately 12 million learnable parameters. The GNN represents atoms as nodes and chemical bonds as edges, transforming molecules into graph structures that capture microscopic properties while depicting atom connectivity in non-Euclidean space. Our dataset comprises 1401 pollutants; using Bayesian optimization, the integrated GNN model achieves root mean square errors of 0.165, 0.172, and 0.189 on the training, validation, and test sets, respectively. Furthermore, we assess molecular structure similarity using molecular fingerprints to enhance the model's applicability. Finally, we propose a gradient weight mapping method for model explainability that uncovers the key functional groups in these chemical reactions from an artificial intelligence perspective, which could advance chemistry by harnessing the computational power of artificial intelligence.
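A toy graph-regression model of this kind can be written with PyTorch Geometric as below; the feature sizes, layer widths, and choice of GCN operator are illustrative assumptions, and the sketch is far smaller than the ~12-million-parameter model described above.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool
from torch_geometric.data import Data

class RateConstantGNN(nn.Module):
    def __init__(self, n_atom_features: int = 16, hidden: int = 64):
        super().__init__()
        self.conv1 = GCNConv(n_atom_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.readout = nn.Linear(hidden, 1)        # predicts e.g. log k(OH)

    def forward(self, data: Data) -> torch.Tensor:
        x = torch.relu(self.conv1(data.x, data.edge_index))
        x = torch.relu(self.conv2(x, data.edge_index))
        x = global_mean_pool(x, data.batch)        # one vector per molecule
        return self.readout(x).squeeze(-1)

# toy molecule: 5 atoms, 4 undirected bonds (stored in both directions)
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 4],
                           [1, 0, 2, 1, 3, 2, 4, 3]])
data = Data(x=torch.randn(5, 16), edge_index=edge_index,
            batch=torch.zeros(5, dtype=torch.long))
print(RateConstantGNN()(data))
```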
14
Ethics and artificial intelligence. Rev Clin Esp 2024; 224:178-186. [PMID: 38355097 DOI: 10.1016/j.rceng.2024.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
The relationship between ethics and artificial intelligence (AI) in medicine is a crucial and complex topic that sits within the broader debate on the ethics of AI. Ethics in medical AI involves ensuring that technologies are safe, fair, and respect patient privacy. This includes concerns about the accuracy of diagnoses provided by AI, fairness in patient treatment, and protection of personal health data. Advances in AI can significantly improve healthcare, from more accurate diagnoses to personalized treatments. However, it is essential that developments in medical AI are carried out with strong ethical consideration, involving healthcare professionals, AI experts, patients, and ethics specialists to guide and oversee their implementation. Finally, transparency in AI algorithms and ongoing training for medical professionals are fundamental.
15
AI in medical diagnosis: AI prediction & human judgment. Artif Intell Med 2024; 149:102769. [PMID: 38462271 DOI: 10.1016/j.artmed.2024.102769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 12/02/2023] [Accepted: 01/14/2024] [Indexed: 03/12/2024]
Abstract
AI has long been regarded as a panacea for decision-making and many other aspects of knowledge work; as something that will help humans overcome their shortcomings. We believe that AI can be a useful asset to support decision-makers, but not that it should replace decision-makers. Decision-making uses algorithmic analysis, but it is not solely algorithmic analysis; it also involves other factors, many of which are very human, such as creativity, intuition, emotions, feelings, and value judgments. We have conducted semi-structured open-ended research interviews with 17 dermatologists to understand what they expect an AI application to deliver in medical diagnosis. We have found four aggregate dimensions along which the thinking of dermatologists can be described: the ways in which our participants chose to interact with AI, responsibility, 'explainability', and the new way of thinking (mindset) needed for working with AI. We believe that our findings will help physicians who might consider using AI in their diagnosis to understand how to use AI beneficially. Our findings will also be useful to AI vendors in improving their understanding of how medics want to use AI in diagnosis. Further research will be needed to examine if our findings have relevance in the wider medical field and beyond.
16
NeuroIGN: Explainable Multimodal Image-Guided System for Precise Brain Tumor Surgery. J Med Syst 2024; 48:25. [PMID: 38393660 DOI: 10.1007/s10916-024-02037-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Accepted: 02/03/2024] [Indexed: 02/25/2024]
Abstract
Precise neurosurgical guidance is critical for successful brain surgeries and plays a vital role in all phases of image-guided neurosurgery (IGN). Neuronavigation software enables real-time tracking of surgical tools, ensuring their presentation with high precision in relation to a virtual patient model. Therefore, this work focuses on the development of a novel multimodal IGN system, leveraging deep learning and explainable AI to enhance brain tumor surgery outcomes. The study establishes the clinical and technical requirements of the system for brain tumor surgeries. NeuroIGN adopts a modular architecture, including brain tumor segmentation, patient registration, and explainable output prediction, and integrates open-source packages into an interactive neuronavigational display. The NeuroIGN system components underwent validation and evaluation in both laboratory and simulated operating room (OR) settings. Experimental results demonstrated its accuracy in tumor segmentation and the success of ExplainAI in increasing the trust of medical professionals in deep learning. The proposed system was successfully assembled and set up within 11 min in a pre-clinical OR setting with a tracking accuracy of 0.5 (± 0.1) mm. NeuroIGN was also evaluated as highly useful, with a high frame rate (19 FPS) and real-time ultrasound imaging capabilities. In conclusion, this paper describes not only the development of an open-source multimodal IGN system but also demonstrates the innovative application of deep learning and explainable AI algorithms in enhancing neuronavigation for brain tumor surgeries. By seamlessly integrating pre- and intra-operative patient image data with cutting-edge interventional devices, our experiments underscore the potential for deep learning models to improve the surgical treatment of brain tumors and long-term post-operative outcomes.
17
An explainable unsupervised risk early warning framework based on the empirical cumulative distribution function: Application to dairy safety. Food Res Int 2024; 178:113933. [PMID: 38309904 DOI: 10.1016/j.foodres.2024.113933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 12/25/2023] [Accepted: 01/02/2024] [Indexed: 02/05/2024]
Abstract
Efficient food safety risk assessment significantly affects food safety supervision. However, food detection data of different types and batches show different feature distributions, resulting in unstable results from most risk assessment models, a lack of interpretability of risk classification, and insufficient risk traceability. This study aims to explore an efficient food safety risk assessment model that takes into account robustness, interpretability, and traceability. Therefore, the Explainable unsupervised risk Warning Framework based on the Empirical cumulative Distribution function (EWFED) was proposed. First, the underlying distribution of the detection data is estimated non-parametrically by calculating each testing indicator's empirical cumulative distribution function. Next, the tail probabilities of each testing indicator are estimated based on these distributions and summarized to obtain the sample risk value. Finally, the "3σ Rule" is used to achieve explainable risk classification of qualified samples, and the reasons for unqualified samples are tracked according to the risk score of each testing indicator. Experiments with the EWFED model on two types of dairy product detection data from real application scenarios verified its effectiveness, achieving interpretable risk division and risk tracing of unqualified samples. Therefore, this study provides a more robust and systematic food safety risk assessment method to promote precise management and control of food safety risks effectively.
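The scoring pipeline described here maps naturally onto a few lines of NumPy: per-indicator ECDF scores, a summed sample risk value, a 3σ cut-off, and per-indicator traceback for flagged samples. The sketch below assumes larger readings mean higher risk and weights indicators equally, which are illustrative choices rather than details given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.lognormal(size=(500, 4))        # toy detection data: 500 samples x 4 indicators
X[:3] = X.max(axis=0) * 2               # inject a few clearly anomalous samples

def ecdf_scores(values: np.ndarray) -> np.ndarray:
    """Empirical-CDF score of every reading within its own indicator column
    (assumes larger readings mean higher risk)."""
    n = values.shape[0]
    ranks = np.argsort(np.argsort(values, axis=0), axis=0) + 1   # 1..n per column
    return ranks / (n + 1)

indicator_risk = ecdf_scores(X)                  # per-indicator score in (0, 1)
sample_risk = indicator_risk.sum(axis=1)         # summarised sample risk value

# "3-sigma rule" on the risk values gives an explainable cut-off
mu, sigma = sample_risk.mean(), sample_risk.std()
flagged = np.where(sample_risk > mu + 3 * sigma)[0]

# traceability: rank the indicators driving each flagged sample
for i in flagged:
    order = np.argsort(indicator_risk[i])[::-1]
    print(f"sample {i}: risk={sample_risk[i]:.2f}, suspicious indicators -> {order.tolist()}")
```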
18
Multi-style spatial attention module for cortical cataract classification in AS-OCT image with supervised contrastive learning. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 244:107958. [PMID: 38070390 DOI: 10.1016/j.cmpb.2023.107958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 10/30/2023] [Accepted: 11/27/2023] [Indexed: 01/26/2024]
Abstract
BACKGROUND AND OBJECTIVE Precise cortical cataract (CC) classification plays a significant role in early cataract intervention and surgery. Anterior segment optical coherence tomography (AS-OCT) images have shown excellent potential in cataract diagnosis. However, due to the complex opacity distributions of CC, automatic AS-OCT-based CC classification has rarely been studied. In this paper, we aim to exploit the opacity distribution characteristics of CC as a clinical prior to enhance the representational capability of deep convolutional neural networks (CNNs) in CC classification tasks. METHODS We propose a novel architectural unit, the Multi-style Spatial Attention module (MSSA), which recalibrates intermediate feature maps by exploiting diverse clinical contexts. MSSA first extracts the clinical style context features with Group-wise Style Pooling (GSP), then refines the clinical style context features with Local Transform (LT), and finally executes group-wise feature map recalibration via Style Feature Recalibration (SFR). MSSA can be easily integrated into modern CNNs with negligible overhead. RESULTS Extensive experiments on a CASIA2 AS-OCT dataset and two public ophthalmic datasets demonstrate the superiority of MSSA over state-of-the-art attention methods. Visualization analysis and an ablation study were conducted to improve the explainability of MSSA in the decision-making process. CONCLUSIONS Our proposed MSSANet utilizes the opacity distribution characteristics of CC to enhance the representational power and explainability of deep CNNs and improve CC classification performance. The proposed method has potential for early clinical CC diagnosis.
19
ShapeAXI: Shape Analysis Explainability and Interpretability. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2024; 12931:1293116. [PMID: 38736903 PMCID: PMC11085013 DOI: 10.1117/12.3007053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2024]
Abstract
ShapeAXI represents a cutting-edge framework for shape analysis that leverages a multi-view approach, capturing 3D objects from diverse viewpoints and subsequently analyzing them via 2D Convolutional Neural Networks (CNNs). We implement an automatic N-fold cross-validation process and aggregate the results across all folds. This ensures insightful explainability heat-maps for each class across every shape, enhancing interpretability and contributing to a more nuanced understanding of the underlying phenomena. We demonstrate the versatility of ShapeAXI through two targeted classification experiments. The first experiment categorizes condyles into healthy and degenerative states. The second, more intricate experiment, engages with shapes extracted from CBCT scans of cleft patients, efficiently classifying them into four severity classes. This innovative application not only aligns with existing medical research but also opens new avenues for specialized cleft patient analysis, holding considerable promise for both scientific exploration and clinical practice. The rich insights derived from ShapeAXI's explainability images reinforce existing knowledge and provide a platform for fresh discovery in the fields of condyle assessment and cleft patient severity classification. As a versatile and interpretative tool, ShapeAXI sets a new benchmark in 3D object interpretation and classification, and its groundbreaking approach hopes to make significant contributions to research and practical applications across various domains. ShapeAXI is available in our GitHub repository https://github.com/DCBIA-OrthoLab/ShapeAXI.
20
CT-based radiomics: predicting early outcomes after percutaneous transluminal renal angioplasty in patients with severe atherosclerotic renal artery stenosis. Vis Comput Ind Biomed Art 2024; 7:1. [PMID: 38212451 PMCID: PMC10784441 DOI: 10.1186/s42492-023-00152-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 12/27/2023] [Indexed: 01/13/2024]
Abstract
This study aimed to comprehensively evaluate non-contrast computed tomography (CT)-based radiomics for predicting early outcomes in patients with severe atherosclerotic renal artery stenosis (ARAS) after percutaneous transluminal renal angioplasty (PTRA). A total of 52 patients were retrospectively recruited, and their clinical characteristics and pretreatment CT images were collected. During a median follow-up period of 3.7 mo, 18 patients were confirmed to have benefited from the treatment, defined as a 20% improvement from baseline in the estimated glomerular filtration rate. A deep learning network trained via self-supervised learning was used to enhance the imaging phenotype characteristics. Radiomics features, comprising 116 handcrafted features and 78 deep learning features, were extracted from the affected renal and perirenal adipose regions. More features from the latter were correlated with early outcomes, as determined by univariate analysis, and were visually represented in radiomics heatmaps and volcano plots. After using consensus clustering and the least absolute shrinkage and selection operator method for feature selection, five machine learning models were evaluated. Logistic regression yielded the highest leave-one-out cross-validation accuracy of 0.780 (95%CI: 0.660-0.880) for the renal signature, while the support vector machine achieved 0.865 (95%CI: 0.769-0.942) for the perirenal adipose signature. SHapley Additive exPlanations was used to visually interpret the prediction mechanism, and a histogram feature and a deep learning feature were identified as the most influential factors for the renal signature and perirenal adipose signature, respectively. Multivariate analysis revealed that both signatures served as independent predictive factors. When combined, they achieved an area under the receiver operating characteristic curve of 0.888 (95%CI: 0.784-0.992), indicating that the imaging phenotypes from both regions complemented each other. In conclusion, non-contrast CT-based radiomics can be leveraged to predict the early outcomes of PTRA, thereby assisting in identifying patients with ARAS suitable for this treatment, with perirenal adipose tissue providing added predictive value.
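As an illustration of the evaluation scheme, the snippet below runs leave-one-out cross-validation of a logistic-regression signature on a synthetic feature matrix; the consensus-clustering and LASSO feature-selection steps, and the real radiomics features, are abstracted away.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(52, 5))            # 52 patients, 5 selected signature features
y = (X[:, 0] + 0.5 * rng.normal(size=52) > 0).astype(int)   # toy 'benefit' label

clf = LogisticRegression(max_iter=1000)
proba = cross_val_predict(clf, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]

print("LOOCV accuracy:", accuracy_score(y, proba > 0.5))
print("LOOCV AUC:", roc_auc_score(y, proba))
```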
21
Deep-GA-Net for Accurate and Explainable Detection of Geographic Atrophy on OCT Scans. OPHTHALMOLOGY SCIENCE 2023; 3:100311. [PMID: 37304045 PMCID: PMC10251072 DOI: 10.1016/j.xops.2023.100311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 06/13/2023]
Abstract
Objective To propose Deep-GA-Net, a 3-dimensional (3D) deep learning network with 3D attention layer, for the detection of geographic atrophy (GA) on spectral domain OCT (SD-OCT) scans, explain its decision making, and compare it with existing methods. Design Deep learning model development. Participants Three hundred eleven participants from the Age-Related Eye Disease Study 2 Ancillary SD-OCT Study. Methods A dataset of 1284 SD-OCT scans from 311 participants was used to develop Deep-GA-Net. Cross-validation was used to evaluate Deep-GA-Net, where each testing set contained no participant from the corresponding training set. En face heatmaps and important regions at the B-scan level were used to visualize the outputs of Deep-GA-Net, and 3 ophthalmologists graded the presence or absence of GA in them to assess the explainability (i.e., understandability and interpretability) of its detections. Main Outcome Measures Accuracy, area under receiver operating characteristic curve (AUC), area under precision-recall curve (APR). Results Compared with other networks, Deep-GA-Net achieved the best metrics, with accuracy of 0.93, AUC of 0.94, and APR of 0.91, and received the best gradings of 0.98 and 0.68 on the en face heatmap and B-scan grading tasks, respectively. Conclusions Deep-GA-Net was able to detect GA accurately from SD-OCT scans. The visualizations of Deep-GA-Net were more explainable, as suggested by 3 ophthalmologists. The code and pretrained models are publicly available at https://github.com/ncbi/Deep-GA-Net. Financial Disclosures The author(s) have no proprietary or commercial interest in any materials discussed in this article.
22
A scoping review of interpretability and explainability concerning artificial intelligence methods in medical imaging. Eur J Radiol 2023; 169:111159. [PMID: 37976760 DOI: 10.1016/j.ejrad.2023.111159] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/25/2023] [Revised: 09/26/2023] [Accepted: 10/19/2023] [Indexed: 11/19/2023]
Abstract
PURPOSE To review eXplainable Artificial Intelligence (XAI) methods available for medical imaging (MI). METHOD A scoping review was conducted following the Joanna Briggs Institute's methodology. The search was performed on PubMed, Embase, CINAHL, Web of Science, BioRxiv, MedRxiv, and Google Scholar. Studies published in French and English after 2017 were included. Keyword combinations and descriptors related to explainability and MI modalities were employed. Two independent reviewers screened abstracts, titles and full text, resolving differences through discussion. RESULTS 228 studies met the criteria. XAI publications are increasing, targeting MRI (n = 73), radiography (n = 47), and CT (n = 46). Lung (n = 82) and brain (n = 74) pathologies, COVID-19 (n = 48), Alzheimer's disease (n = 25), and brain tumors (n = 15) are the main pathologies explained. Explanations are presented visually (n = 186), numerically (n = 67), rule-based (n = 11), textually (n = 11), and example-based (n = 6). Commonly explained tasks include classification (n = 89), prediction (n = 47), diagnosis (n = 39), detection (n = 29), segmentation (n = 13), and image quality improvement (n = 6). The most frequently provided explanations were local (78.1 %), 5.7 % were global, and 16.2 % combined both local and global approaches. Post-hoc approaches were predominantly employed. The terminology used varied, with explainable (n = 207), interpretable (n = 187), understandable (n = 112), transparent (n = 61), reliable (n = 31), and intelligible (n = 3) sometimes used interchangeably. CONCLUSION The number of XAI publications in medical imaging is increasing, primarily focusing on applying XAI techniques to MRI, CT, and radiography for classifying and predicting lung and brain pathologies. Visual and numerical output formats are predominantly used. Terminology standardisation remains a challenge, as terms like "explainable" and "interpretable" are sometimes used interchangeably. Future XAI development should consider user needs and perspectives.
23
Explainable survival analysis with uncertainty using convolution-involved vision transformer. Comput Med Imaging Graph 2023; 110:102302. [PMID: 37839216 DOI: 10.1016/j.compmedimag.2023.102302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 09/20/2023] [Accepted: 09/20/2023] [Indexed: 10/17/2023]
Abstract
Image-based precision medicine research can help doctors make better treatment decisions. Among medical images, the Whole Slide Image (WSI) is a special form used for diagnosing patients with cancer, whose high resolution can enable more accurate survival prediction. However, one unique challenge of WSI-based prediction models is processing the gigabyte-size or even terabyte-size WSIs, which would make most models computationally infeasible. Because existing models mostly use a pre-selected subset of key patches or patch clusters as input, they might discard some important morphology information, making the predictions inferior. Another challenge is improving the prediction models' explainability, which is crucial to help doctors understand the predictions given by the models and make faithful decisions with high confidence. To address the above two challenges, in this work, we propose a novel explainable survival prediction model based on the Vision Transformer. Specifically, we adopt dual-channel convolutional layers to utilize the complete WSIs for more accurate predictions. We also introduce aleatoric uncertainty into our model to understand its limitations and avoid overconfidence in using the prediction results. Additionally, we present a post-hoc explainable method to identify the most salient patches and distinct morphology features as supporting evidence for predictions. Evaluations on two large cancer datasets show that our proposed model is able to make survival predictions more effectively and has better explainability for cancer diagnosis.
24
Development of a novel machine learning model based on laboratory and imaging indices to predict acute cardiac injury in cancer patients with COVID-19 infection: a retrospective observational study. J Cancer Res Clin Oncol 2023; 149:17039-17050. [PMID: 37747525 DOI: 10.1007/s00432-023-05417-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 09/07/2023] [Indexed: 09/26/2023]
Abstract
PURPOSE Due to the increased risk of acute cardiac injury (ACI) and poor prognosis in cancer patients with COVID-19 infection, our aim was to develop a novel and interpretable model for predicting ACI occurrence in cancer patients with COVID-19 infection. METHODS This retrospective observational study screened 740 cancer patients with COVID-19 infection from December 2022 to April 2023. The least absolute shrinkage and selection operator (LASSO) regression was used for the preliminary screening of the indices. To enhance the model accuracy, we introduced an alpha index to further screen and rank the indices based on their significance. Random forest (RF) was used to construct the prediction model. The SHapley Additive exPlanation (SHAP) and Local Interpretable Model-Agnostic Explanation (LIME) methods were utilized to explain the model. RESULTS According to the inclusion criteria, 201 cancer patients with COVID-19, with 36 variable indices, were included in the analysis. The top eight indices (albumin, lactate dehydrogenase, cystatin C, neutrophil count, creatine kinase isoenzyme, red blood cell distribution width, D-dimer and chest computed tomography) for predicting the occurrence of ACI in cancer patients with COVID-19 infection were included in the RF model. The model achieved an area under the curve (AUC) of 0.940, an accuracy of 0.866, a sensitivity of 0.750 and a specificity of 0.900. The calibration curve and decision curve analysis showed good calibration and clinical practicability. SHAP results demonstrated that albumin was the most important index for predicting the occurrence of ACI. LIME results showed that the model could predict the probability of ACI in each cancer patient infected with COVID-19 individually. CONCLUSION We developed a novel machine-learning model that demonstrates high explainability and accuracy in predicting the occurrence of ACI in cancer patients with COVID-19 infection, using laboratory and imaging indices.
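The per-patient explanation step can be illustrated with a random-forest classifier and LIME as below; the feature names, synthetic data, and model settings are placeholders rather than the study's actual indices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
feature_names = ["albumin", "LDH", "cystatin_C", "neutrophils",
                 "CK_MB", "RDW", "D_dimer", "chest_CT_score"]
X = rng.normal(size=(201, len(feature_names)))
y = (X[:, 0] - X[:, 1] + rng.normal(size=201) < 0).astype(int)   # toy ACI label

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["no ACI", "ACI"],
                                 mode="classification")
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=8)
print(explanation.as_list())   # per-patient feature contributions
```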
25
The promise of explainable deep learning for omics data analysis: Adding new discovery tools to AI. N Biotechnol 2023; 77:1-11. [PMID: 37329982 DOI: 10.1016/j.nbt.2023.06.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/27/2023] [Revised: 06/01/2023] [Accepted: 06/14/2023] [Indexed: 06/19/2023]
Abstract
Deep learning has already revolutionised the way a wide range of data is processed in many areas of daily life. The ability to learn abstractions and relationships from heterogeneous data has provided impressively accurate prediction and classification tools to handle increasingly big datasets. This has a significant impact on the growing wealth of omics datasets, with the unprecedented opportunity for a better understanding of the complexity of living organisms. While this revolution is transforming the way these data are analyzed, explainable deep learning is emerging as an additional tool with the potential to change the way biological data is interpreted. Explainability addresses critical issues such as transparency, which is especially important when computational tools are introduced in clinical environments. Moreover, it empowers artificial intelligence with the capability to provide new insights into the input data, thus adding an element of discovery to these already powerful resources. In this review, we provide an overview of the transformative effects explainable deep learning is having on multiple sectors, ranging from genome engineering and genomics to radiomics, drug design, and clinical trials. We offer life scientists a perspective to better understand the potential of these tools and a motivation to implement them in their research, suggesting learning resources they can use to take their first steps in this field.
26
A practical guide to the implementation of AI in orthopaedic research - part 1: opportunities in clinical application and overcoming existing challenges. J Exp Orthop 2023; 10:117. [PMID: 37968370 PMCID: PMC10651597 DOI: 10.1186/s40634-023-00683-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 10/21/2023] [Indexed: 11/17/2023]
Abstract
Artificial intelligence (AI) has the potential to transform medical research by improving disease diagnosis, clinical decision-making, and outcome prediction. Despite the rapid adoption of AI and machine learning (ML) in other domains and industry, deployment in medical research and clinical practice poses several challenges due to the inherent characteristics and barriers of the healthcare sector. Therefore, researchers aiming to perform AI-intensive studies require a fundamental understanding of the key concepts, biases, and clinical safety concerns associated with the use of AI. Through the analysis of large, multimodal datasets, AI has the potential to revolutionize orthopaedic research, with new insights regarding the optimal diagnosis and management of patients affected by musculoskeletal injury and disease. This article is the first in a series introducing fundamental concepts and best practices to guide healthcare professionals and researchers interested in performing AI-intensive orthopaedic research studies. The vast potential of AI in orthopaedics is illustrated through examples involving disease- or injury-specific outcome prediction, medical image analysis, clinical decision support systems and digital twin technology. Furthermore, it is essential to address the role of human involvement in training unbiased, generalizable AI models, their explainability in high-risk clinical settings, and the implementation of expert oversight and clinical safety measures in case of failure. In conclusion, the opportunities and challenges of AI in medicine are presented to ensure the safe and ethical deployment of AI models for orthopaedic research and clinical application. Level of evidence IV.
Collapse
|
27
|
Cardiometabolic risk estimation using exposome data and machine learning. Int J Med Inform 2023; 179:105209. [PMID: 37729839 DOI: 10.1016/j.ijmedinf.2023.105209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2023] [Revised: 08/11/2023] [Accepted: 08/30/2023] [Indexed: 09/22/2023]
Abstract
BACKGROUND The human exposome encompasses all exposures that individuals encounter throughout their lifetime. It is now widely acknowledged that health outcomes are influenced not only by genetic factors but also by the interactions between these factors and various exposures. Consequently, the exposome has emerged as a significant contributor to the overall risk of developing major diseases, such as cardiovascular disease (CVD) and diabetes. Therefore, personalized early risk assessment based on exposome attributes might be a promising tool for identifying high-risk individuals and improving disease prevention. OBJECTIVE Develop and evaluate a novel and fair machine learning (ML) model for CVD and type 2 diabetes (T2D) risk prediction based on a set of readily available exposome factors. We evaluated our model using internal and external validation groups from a multi-center cohort. To be considered fair, the model was required to demonstrate consistent performance across different sub-groups of the cohort. METHODS From the UK Biobank, we identified 5,348 and 1,534 participants who within 13 years from the baseline visit were diagnosed with CVD and T2D, respectively. An equal number of participants who did not develop these pathologies were randomly selected as the control group. 109 readily available exposure variables from six different categories (physical measures, environmental, lifestyle, mental health events, sociodemographics, and early-life factors) from the participant's baseline visit were considered. We adopted the XGBoost ensemble model to predict individuals at risk of developing the diseases. The model's performance was compared to that of an integrative ML model which is based on a set of biological, clinical, physical, and sociodemographic variables, and, additionally for CVD, to the Framingham risk score. Moreover, we assessed the proposed model for potential bias related to sex, ethnicity, and age. Lastly, we interpreted the model's results using SHAP, a state-of-the-art explainability method. RESULTS The proposed ML model presents a comparable performance to the integrative ML model despite using solely exposome information, achieving a ROC-AUC of 0.78±0.01 and 0.77±0.01 for CVD and T2D, respectively. Additionally, for CVD risk prediction, the exposome-based model presents an improved performance over the traditional Framingham risk score. No bias in terms of key sensitive variables was identified. CONCLUSIONS We identified exposome factors that play an important role in identifying patients at risk of CVD and T2D, such as naps during the day, age completed full-time education, past tobacco smoking, frequency of tiredness/unenthusiasm, and current work status. Overall, this work demonstrates the potential of exposome-based machine learning as a fair CVD and T2D risk assessment tool.
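The workflow described above (gradient-boosted trees on tabular exposome variables, compared by ROC-AUC and interpreted with SHAP) translates directly into a few lines of Python. The snippet below is a minimal sketch, not the authors' pipeline: the feature matrix X, the outcome vector y, and the hyperparameters are hypothetical placeholders.

```python
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# X: (n_participants, n_exposome_variables) DataFrame; y: 1 = developed CVD/T2D, 0 = control
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=0)

model = xgb.XGBClassifier(n_estimators=400, max_depth=4, learning_rate=0.05, eval_metric="auc")
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
print("ROC-AUC:", roc_auc_score(y_test, proba))

# SHAP attributes each prediction to individual exposome factors (e.g. daytime naps, smoking history)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
```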
Collapse
|
28
|
Pathological changes or technical artefacts? The problem of the heterogenous databases in COVID-19 CXR image analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 240:107684. [PMID: 37356354 PMCID: PMC10278898 DOI: 10.1016/j.cmpb.2023.107684] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/20/2023] [Revised: 06/11/2023] [Accepted: 06/18/2023] [Indexed: 06/27/2023]
Abstract
BACKGROUND When the COVID-19 pandemic commenced in 2020, scientists assisted medical specialists with diagnostic algorithm development. One scientific research area related to COVID-19 diagnosis was medical imaging and its potential to support molecular tests. Unfortunately, several systems reported high accuracy in development but did not fare well in clinical application. The reason was poor generalization, a long-standing issue in AI development. Researchers found many causes of this issue and decided to refer to them as confounders, meaning a set of artefacts and methodological errors associated with the method. We aim to contribute to this effort by highlighting a previously undiscussed confounder related to image resolution. METHODS 20,216 chest X-ray images (CXR) from centres worldwide were analyzed. The CXRs were bijectively projected into the 2D domain by performing Uniform Manifold Approximation and Projection (UMAP) embedding on the radiomic features (rUMAP) or CNN-based neural features (nUMAP) from the pre-last layer of the pre-trained classification neural network. An additional 44,339 thorax CXRs were used for validation. A comprehensive analysis of the multimodality of the density distribution in the rUMAP/nUMAP domains and its relation to the original image properties was used to identify the main confounders. RESULTS nUMAP revealed a hidden bias of neural networks towards image resolution, which the regular up-sampling procedure cannot compensate for. The issue appears regardless of the network architecture and is not observed in a high-resolution dataset. The impact of the resolution heterogeneity can be partially diminished by applying advanced deep-learning-based super-resolution networks. CONCLUSIONS rUMAP and nUMAP are valuable tools for image homogeneity analysis and bias discovery, as demonstrated by applying them to COVID-19 image data. Moreover, nUMAP can be applied to any type of data for which a deep neural network can be constructed. Advanced image super-resolution solutions are needed to reduce the impact of resolution diversity on the classification network's decisions.
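The nUMAP construction (a 2D UMAP embedding of penultimate-layer features from a pre-trained classifier, coloured by native image resolution) can be approximated as below. This is a hedged sketch: the ResNet-50 backbone, the batch of preprocessed CXR tensors `images`, and the `native_resolutions` list are assumptions, not the paper's exact setup.

```python
import torch
import umap
import matplotlib.pyplot as plt
from torchvision import models

# Pre-trained classifier truncated just before the final fully connected layer
backbone = models.resnet50(weights="IMAGENET1K_V1")
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

with torch.no_grad():                        # images: (N, 3, 224, 224) preprocessed CXR tensors
    feats = feature_extractor(images).flatten(1).numpy()

# nUMAP: 2D embedding of the neural features
embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(feats)

# Colour points by the native resolution of each source image to expose resolution-driven clusters
plt.scatter(embedding[:, 0], embedding[:, 1], c=native_resolutions, s=4, cmap="viridis")
plt.colorbar(label="native image resolution (px)")
plt.show()
```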
Collapse
|
29
|
An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2023; 14221:723-733. [PMID: 37982132 PMCID: PMC10657737 DOI: 10.1007/978-3-031-43895-0_68] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2023]
Abstract
One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.
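One building block of this approach, representing each subject's functional connectome as a symmetric positive definite (SPD) matrix and mapping it to a flat tangent space, can be sketched as follows. This is only an illustrative fragment under simple assumptions (correlation plus identity shrinkage, matrix logarithm); it does not reproduce the graph attention network itself.

```python
import numpy as np

def spd_connectome(ts, eps=1e-6):
    """ts: (T, R) ROI time series -> shrunken correlation matrix (R, R), guaranteed SPD."""
    c = np.corrcoef(ts, rowvar=False)
    c = (c + c.T) / 2.0                        # enforce exact symmetry
    return c + eps * np.eye(c.shape[0])        # shrinkage toward identity makes it positive definite

def log_map(spd):
    """Matrix logarithm via eigendecomposition: maps an SPD matrix into a Euclidean tangent space."""
    w, v = np.linalg.eigh(spd)
    return (v * np.log(w)) @ v.T

# Hypothetical rs-fMRI ROI time series for one subject: 200 time points, 90 regions
ts = np.random.randn(200, 90)
tangent_features = log_map(spd_connectome(ts))[np.triu_indices(90)]  # input for a downstream model
```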
Collapse
|
30
|
Abstract
BACKGROUND The cause and symptoms of long COVID are poorly understood. It is challenging to predict whether a given COVID-19 patient will develop long COVID in the future. METHODS We used electronic health record (EHR) data from the National COVID Cohort Collaborative to predict the incidence of long COVID. We trained two machine learning (ML) models - logistic regression (LR) and random forest (RF). Features used to train predictors included symptoms and drugs ordered during acute infection, measures of COVID-19 treatment, pre-COVID comorbidities, and demographic information. We assigned the 'long COVID' label to patients diagnosed with the U09.9 ICD10-CM code. The cohorts included patients with (a) EHRs reported from data partners using U09.9 ICD10-CM code and (b) at least one EHR in each feature category. We analysed three cohorts: all patients (n = 2,190,579; diagnosed with long COVID = 17,036), inpatients (149,319; 3,295), and outpatients (2,041,260; 13,741). FINDINGS LR and RF models yielded median AUROC of 0.76 and 0.75, respectively. Ablation study revealed that drugs had the highest influence on the prediction task. The SHAP method identified age, gender, cough, fatigue, albuterol, obesity, diabetes, and chronic lung disease as explanatory features. Models trained on data from one N3C partner and tested on data from the other partners had average AUROC of 0.75. INTERPRETATION ML-based classification using EHR information from the acute infection period is effective in predicting long COVID. SHAP methods identified important features for prediction. Cross-site analysis demonstrated the generalizability of the proposed methodology. FUNDING NCATS U24 TR002306, NCATS UL1 TR003015, Axle Informatics Subcontract: NCATS-P00438-B, NIH/NIDDK/OD, PSR2015-1720GVALE_01, G43C22001320007, and Director, Office of Science, Office of Basic Energy Sciences of the U.S. Department of Energy Contract No. DE-AC02-05CH11231.
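The two baseline classifiers and the evaluation metric reported above map directly onto scikit-learn. The sketch below assumes a tabular feature matrix X (symptoms, drugs, comorbidities, demographics) and a binary label y (U09.9 long-COVID diagnosis); these names and the hyperparameters are placeholders, not the N3C pipeline.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=42)

lr = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=500, class_weight="balanced", n_jobs=-1).fit(X_train, y_train)

for name, model in [("LR", lr), ("RF", rf)]:
    auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name} AUROC: {auroc:.2f}")        # the paper reports medians around 0.75-0.76
```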
Collapse
|
31
|
Black-box assisted medical decisions: AI power vs. ethical physician care. MEDICINE, HEALTH CARE, AND PHILOSOPHY 2023; 26:285-292. [PMID: 37273041 PMCID: PMC10425517 DOI: 10.1007/s11019-023-10153-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 04/08/2023] [Indexed: 06/06/2023]
Abstract
I raise an ethical problem with physicians using "black box" medical AI algorithms, arguing that their use would compromise proper patient care. Even if AI results are reliable, my contention is that without being able to explain medical decisions to patients, physicians' use of black box AIs would erode the effective and respectful care they provide. In addition, I argue that physicians should use AI black boxes only for patients in dire straits, or when physicians use AI as a "co-pilot" (analogous to a spellchecker) whose accuracy they can independently confirm. Lastly, I sharpen the argument by addressing Alex John London's objection that physicians already sometimes prescribe treatments, such as lithium drugs, even though neither researchers nor doctors can explain why they work.
Collapse
|
32
|
GSM-Net: A global sequence modelling network for the segmentation of short axis CINE MRI images. Comput Med Imaging Graph 2023; 108:102266. [PMID: 37385047 DOI: 10.1016/j.compmedimag.2023.102266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Revised: 05/04/2023] [Accepted: 06/11/2023] [Indexed: 07/01/2023]
Abstract
Atrial fibrillation (AF) is a disease in which the atria fail to contract properly and quiver instead, due to abnormal electrical activity of the atrial tissue. In AF patients, anatomical and functional parameters of the left atrium (LA) differ substantially from those of healthy people because of LA remodelling, which in many cases continues after catheter ablation treatment. Therefore, it is important to follow up with AF patients to detect any recurrence. LA segmentation masks obtained from short-axis CINE MRI images are used as the gold standard for the quantification of LA parameters. Thick slices in CINE MRI hinder the use of 3D networks for segmentation, while 2D architectures often fail to model inter-slice dependencies. This study presents GSM-Net, which approximates 3D networks by effectively modelling inter-slice similarities through two new modules: a global slice sequence encoder (GSSE) and a sequence-dependent channel attention module (SdCAt). In contrast to previous work modelling only local inter-slice similarities, GSSE also models global spatial dependencies across slices. SdCAt generates a distribution of attention weights over MRI slices per channel, to better trace characteristic changes in the size of the LA or other structures across slices. We found that GSM-Net outperforms previous methods on LA segmentation and helps identify patients with AF recurrence. We believe that GSM-Net can be used as an automatic tool to estimate LA parameters such as ejection fraction to identify AF, and to follow up with patients after treatment to detect any recurrence.
Collapse
|
33
|
A novel systematic pipeline for increased predictability and explainability of growth patterns in children using trajectory features. Int J Med Inform 2023; 177:105143. [PMID: 37473656 DOI: 10.1016/j.ijmedinf.2023.105143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Revised: 06/28/2023] [Accepted: 07/05/2023] [Indexed: 07/22/2023]
Abstract
OBJECTIVE Longitudinal patterns of growth in early childhood are associated with health conditions throughout life. Knowledge of such patterns and the ability to predict them can lead to better prevention and improved health promotion in adulthood. However, growth analyses are characterized by significant variability, and pattern detection is affected by the method applied. Moreover, pattern labelling is typically performed with ad hoc methods, such as visualizations or clinical experience. Here, we propose a novel pipeline that uses features extracted from growth trajectories with mathematical, statistical and machine-learning approaches to predict growth patterns and label them in a systematic and unequivocal manner. METHODS We extracted mathematical and clinical features from the growth trajectories of 9577 children, together with machine-learning predictions of the growth patterns. We experimented with two sets of features (CAnonical Time-series Characteristics and trajectory features specific to growth), different developmental periods, and six machine-learning classifiers. Clinical experts provided labels for the detected patterns, and decision rules were created to associate the features with the labelled patterns. The predictive capacity of the extracted features was validated on two heterogeneous populations (The Applied Research Group for Kids and the 2004 Pelotas Birth Cohort, based in Canada and Brazil, respectively). RESULTS The features' predictive ability, measured by accuracy and F1 score, was ≥80% and ≥0.76, respectively, in both cohorts. A small number of features (n = 74) was sufficient to distinguish between growth patterns in both cohorts. The slope and intercept of the trajectory, age at peak value, start value and change of the growth measure were among the top identified features. CONCLUSION Growth features can be reliably used as predictors of growth patterns and provide an unbiased understanding of these patterns. They can be used as a tool to reduce repeated analyses and the variability associated with anthropometric measures, time points and analytical methods, in the context of the same or similar populations.
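The trajectory features highlighted in the results (slope, intercept of the trajectory, age at peak value, start value, change of the growth measure) are simple to compute; a minimal sketch is given below. The input structures and the random-forest classifier are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trajectory_features(ages, values):
    """ages, values: 1D arrays for one child (e.g. BMI z-score measured at several ages)."""
    slope, intercept = np.polyfit(ages, values, deg=1)
    return np.array([
        slope,
        intercept,
        ages[np.argmax(values)],      # age at peak value
        values[0],                    # start value
        values[-1] - values[0],       # overall change of the growth measure
    ])

# trajectories: hypothetical list of (ages, values) pairs, one per child; y: expert-assigned pattern labels
X = np.vstack([trajectory_features(a, v) for a, v in trajectories])
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
```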
Collapse
|
34
|
Explainable machine learning framework to predict personalized physiological aging. Aging Cell 2023; 22:e13872. [PMID: 37300327 PMCID: PMC10410015 DOI: 10.1111/acel.13872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Revised: 04/17/2023] [Accepted: 05/03/2023] [Indexed: 06/12/2023] Open
Abstract
Attaining personalized healthy aging requires accurate monitoring of physiological changes and the identification of subclinical markers that predict accelerated or delayed aging. Classic biostatistical methods mostly rely on supervised variables to estimate physiological aging and do not capture the full complexity of inter-parameter interactions. Machine learning (ML) is promising, but its black box nature precludes direct understanding, substantially limiting physician confidence and clinical usage. Using a broad population dataset from the National Health and Nutrition Examination Survey (NHANES) study including routine biological variables, and after selecting XGBoost as the most appropriate algorithm, we created an innovative explainable ML framework to determine a personalized physiological age (PPA). PPA predicted both chronic disease and mortality independently of chronological age. Twenty-six variables were sufficient to predict PPA. Using SHapley Additive exPlanations (SHAP), we implemented a precise quantitative metric for each variable explaining physiological (i.e., accelerated or delayed) deviations from age-specific normative data. Among these variables, glycated hemoglobin (HbA1c) carries a major relative weight in the estimation of PPA. Finally, clustering profiles with identical contextualized explanations reveals different aging trajectories, opening opportunities for specific clinical follow-up. These data show that PPA is a robust, quantitative and explainable ML-based metric for monitoring personalized health status. Our approach also provides a complete framework applicable to different datasets or variables, allowing precise estimation of physiological age.
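One common way to realize such a framework is to regress chronological age on routine biomarkers, read the prediction as the personalized physiological age, and use SHAP values to explain each individual's deviation. The sketch below follows that generic recipe under hypothetical variable names; it is not the authors' exact NHANES implementation.

```python
import shap
import xgboost as xgb

# X: DataFrame of routine biological variables (e.g. HbA1c, lipids, blood counts); age: chronological age in years
model = xgb.XGBRegressor(n_estimators=600, max_depth=4, learning_rate=0.03)
model.fit(X, age)

ppa = model.predict(X)                  # personalized physiological age
age_gap = ppa - age                     # positive = accelerated aging, negative = delayed

# Per-variable explanation of one individual's deviation from the expected value
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.plots.waterfall(shap.Explanation(values=shap_values[0],
                                      base_values=explainer.expected_value,
                                      data=X.iloc[0], feature_names=list(X.columns)))
```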
Collapse
|
35
|
An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment. ARXIV 2023:arXiv:2307.13108v1. [PMID: 37547656 PMCID: PMC10402187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 08/08/2023]
Abstract
One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.
Collapse
|
36
|
DeepMiCa: Automatic segmentation and classification of breast MIcroCAlcifications from mammograms. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 235:107483. [PMID: 37030174 DOI: 10.1016/j.cmpb.2023.107483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/26/2022] [Revised: 02/04/2023] [Accepted: 03/12/2023] [Indexed: 05/08/2023]
Abstract
BACKGROUND AND OBJECTIVE Breast cancer is the world's most prevalent form of cancer. Survival rates have increased in recent years mainly due to factors such as screening programs for early detection, new insights into the disease mechanisms, and personalised treatments. Microcalcifications are the earliest detectable sign of breast cancer, and the timing of diagnosis is strongly related to the chances of survival. Nevertheless, detecting microcalcifications and classifying them as benign or malignant lesions is still a challenging clinical task, and their malignancy can only be proven by a biopsy. We propose DeepMiCa, a fully automated and visually explainable deep-learning based pipeline for the analysis of raw mammograms with microcalcifications. Our aim is to provide a reliable decision support system able to guide the diagnosis and help clinicians better inspect borderline, difficult cases. METHODS DeepMiCa is composed of three main steps: (1) preprocessing of the raw scans; (2) automatic patch-based semantic segmentation using a UNet-based network with a custom loss function specifically designed to deal with extremely small lesions; (3) classification of the detected lesions with a deep transfer-learning approach. Finally, state-of-the-art explainable AI methods are used to produce maps for a visual interpretation of the classification results. Each step of DeepMiCa is designed to address the main limitations of previously proposed works, resulting in a novel, automated and accurate pipeline that is easily customisable to meet radiologists' needs. RESULTS The proposed segmentation and classification algorithms achieve an area under the ROC curve of 0.95 and 0.89, respectively. Compared to previously proposed works, this method does not require high-performance computational resources and provides a visual explanation of the final classification results. CONCLUSION We designed a novel, fully automated pipeline for the detection and classification of breast microcalcifications. We believe that the proposed system has the potential to provide a second opinion in the diagnostic process, giving clinicians the opportunity to quickly visualise and inspect relevant imaging characteristics. In clinical practice, the proposed decision support system could help reduce the rate of misclassified lesions and, consequently, the number of unnecessary biopsies.
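The custom segmentation loss for extremely small lesions is not specified in the abstract; a common recipe for this kind of class imbalance is to combine a soft Dice term with a positively weighted cross-entropy, as in the hedged PyTorch sketch below. The weighting is illustrative, not DeepMiCa's actual loss.

```python
import torch
import torch.nn.functional as F

def small_lesion_loss(logits, target, pos_weight=50.0, smooth=1.0, alpha=0.5):
    """Weighted BCE + soft Dice; up-weighting the rare positive pixels helps with tiny lesions."""
    bce = F.binary_cross_entropy_with_logits(
        logits, target, pos_weight=torch.tensor(pos_weight, device=logits.device))
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum()
    dice = 1 - (2 * inter + smooth) / (probs.sum() + target.sum() + smooth)
    return alpha * bce + (1 - alpha) * dice
```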
Collapse
|
37
|
BolT: Fused window transformers for fMRI time series analysis. Med Image Anal 2023; 88:102841. [PMID: 37224718 DOI: 10.1016/j.media.2023.102841] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Revised: 02/07/2023] [Accepted: 05/10/2023] [Indexed: 05/26/2023]
Abstract
Deep-learning models have enabled performance leaps in analysis of high-dimensional functional MRI (fMRI) data. Yet, many previous methods are suboptimally sensitive for contextual representations across diverse time scales. Here, we present BolT, a blood-oxygen-level-dependent transformer model, for analyzing multi-variate fMRI time series. BolT leverages a cascade of transformer encoders equipped with a novel fused window attention mechanism. Encoding is performed on temporally-overlapped windows within the time series to capture local representations. To integrate information temporally, cross-window attention is computed between base tokens in each window and fringe tokens from neighboring windows. To gradually transition from local to global representations, the extent of window overlap and thereby number of fringe tokens are progressively increased across the cascade. Finally, a novel cross-window regularization is employed to align high-level classification features across the time series. Comprehensive experiments on large-scale public datasets demonstrate the superior performance of BolT against state-of-the-art methods. Furthermore, explanatory analyses to identify landmark time points and regions that contribute most significantly to model decisions corroborate prominent neuroscientific findings in the literature.
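The first stage of such a windowed transformer, cutting the multi-variate BOLD series into temporally overlapping windows that are later attended across, can be sketched as below; the window length and stride are hypothetical choices, not BolT's published settings.

```python
import numpy as np

def overlapping_windows(bold, win_len=20, stride=10):
    """bold: (T, R) ROI time series -> (n_windows, win_len, R) temporally overlapped windows."""
    starts = range(0, bold.shape[0] - win_len + 1, stride)
    return np.stack([bold[s:s + win_len] for s in starts])

windows = overlapping_windows(np.random.randn(300, 400))   # e.g. 300 TRs, 400 ROIs
print(windows.shape)                                        # (29, 20, 400); each window feeds one encoder pass
```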
Collapse
|
38
|
A clinically motivated self-supervised approach for content-based image retrieval of CT liver images. Comput Med Imaging Graph 2023; 107:102239. [PMID: 37207397 DOI: 10.1016/j.compmedimag.2023.102239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 05/02/2023] [Accepted: 05/02/2023] [Indexed: 05/21/2023]
Abstract
Deep learning-based approaches for content-based image retrieval (CBIR) of computed tomography (CT) liver images constitute an active field of research, but they suffer from some critical limitations. First, they are heavily reliant on labeled data, which can be challenging and costly to acquire. Second, they lack transparency and explainability, which limits the trustworthiness of deep CBIR systems. We address these limitations by (1) proposing a self-supervised learning framework that incorporates domain knowledge into the training procedure, and (2) providing the first representation-learning explainability analysis in the context of CBIR of CT liver images, which reveals new insights into the feature extraction process. Results demonstrate improved performance compared to the standard self-supervised approach across several metrics, as well as improved generalization across datasets. Lastly, we perform a case study with cross-examination CBIR that demonstrates the usability of our proposed framework. We believe that the proposed framework could play a vital role in creating trustworthy deep CBIR systems that can successfully take advantage of unlabeled data.
Collapse
|
39
|
Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit. BMC Med Res Methodol 2023; 23:102. [PMID: 37095430 PMCID: PMC10124049 DOI: 10.1186/s12874-023-01921-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2022] [Accepted: 04/13/2023] [Indexed: 04/26/2023] Open
Abstract
BACKGROUND The use of machine learning is becoming increasingly popular in many disciplines, but there is still an implementation gap of machine learning models in clinical settings. Lack of trust in models is one of the issues that need to be addressed in an effort to close this gap. No models are perfect, and it is crucial to know in which use cases we can trust a model and for which cases it is less reliable. METHODS Four different algorithms are trained on the eICU Collaborative Research Database using similar features as the APACHE IV severity-of-disease scoring system to predict hospital mortality in the ICU. The training and testing procedure is repeated 100 times on the same dataset to investigate whether predictions for single patients change with small changes in the models. Features are then analysed separately to investigate potential differences between patients consistently classified correctly and incorrectly. RESULTS A total of 34 056 patients (58.4%) are classified as true negative, 6 527 patients (11.3%) as false positive, 3 984 patients (6.8%) as true positive, and 546 patients (0.9%) as false negatives. The remaining 13 108 patients (22.5%) are inconsistently classified across models and rounds. Histograms and distributions of feature values are compared visually to investigate differences between groups. CONCLUSIONS It is impossible to distinguish the groups using single features alone. Considering a combination of features, the difference between the groups is clearer. Incorrectly classified patients have features more similar to patients with the same prediction rather than the same outcome.
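The repeated train/test procedure used to separate consistently correct, consistently incorrect, and inconsistently classified patients can be emulated as below; the classifier, split ratio, and consistency thresholds are assumptions for illustration, not the study's exact protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# X: APACHE-style severity features, y: in-hospital mortality (hypothetical arrays)
n_rounds = 100
correct = np.zeros(len(y))    # how often each patient was classified correctly
counted = np.zeros(len(y))    # how often each patient appeared in a test split

for seed in range(n_rounds):
    tr, te = train_test_split(np.arange(len(y)), test_size=0.3, stratify=y, random_state=seed)
    clf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X[tr], y[tr])
    correct[te] += (clf.predict(X[te]) == y[te])
    counted[te] += 1

rate = correct / np.maximum(counted, 1)
consistently_correct = rate >= 0.95
consistently_wrong = rate <= 0.05
inconsistent = ~(consistently_correct | consistently_wrong)   # patients whose label flips across rounds
```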
Collapse
|
40
|
Explainable hybrid word representations for sentiment analysis of financial news. Neural Netw 2023; 164:115-123. [PMID: 37148607 DOI: 10.1016/j.neunet.2023.04.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2021] [Revised: 01/30/2023] [Accepted: 04/10/2023] [Indexed: 05/08/2023]
Abstract
Due to the increasing interest of people in the stock and financial markets, sentiment analysis of news and texts related to the sector is of utmost importance. It helps potential investors decide which company to invest in and what the long-term benefits might be. However, it is challenging to analyze the sentiment of texts related to the financial domain, given the enormous amount of information available. Existing approaches are unable to capture complex attributes of language such as word usage, including semantics and syntax across the context, and polysemy within the context. Furthermore, these approaches fail to make the models' predictions interpretable, leaving them obscure to humans. Model interpretability that justifies predictions has remained largely unexplored, yet it is important for engendering users' trust by providing insight into how a prediction is made. Accordingly, in this paper we present an explainable hybrid word representation that first augments the data to address the class imbalance issue and then integrates three embeddings to capture polysemy in context, semantics, and syntax in context. We then feed the proposed word representation to a convolutional neural network (CNN) with attention to capture the sentiment. The experimental results show that our model outperforms several baselines, including both classic classifiers and combinations of various word embedding models, in the sentiment analysis of financial news. The results also show that the proposed model outperforms several baselines of word embeddings and contextual embeddings when they are separately fed to a neural network model. Further, we demonstrate the explainability of the proposed method by presenting visualizations that explain the reason for a prediction in the sentiment analysis of financial news.
Collapse
|
41
|
The goal of explaining black boxes in EEG seizure prediction is not to explain models' decisions. Epilepsia Open 2023. [PMID: 37073831 DOI: 10.1002/epi4.12748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Accepted: 04/18/2023] [Indexed: 04/20/2023] Open
Abstract
Many state-of-the-art methods for seizure prediction using the electroencephalogram are based on machine learning models that are black boxes, weakening clinicians' trust in them for high-risk decisions. Seizure prediction is a multidimensional time-series problem addressed through continuous sliding-window analysis and classification. In this work, we critically review which explanations increase trust in models' decisions for predicting seizures. We developed three machine learning methodologies to explore their explainability potential. They offer different levels of model transparency: a logistic regression, an ensemble of fifteen support vector machines, and an ensemble of three convolutional neural networks. For each methodology, we evaluated performance quasi-prospectively in 40 patients (testing data comprised 2055 hours and 104 seizures). We selected patients with good and poor performance to explain the models' decisions. Then, using Grounded Theory, we evaluated how these explanations helped specialists (data scientists and clinicians working in epilepsy) understand the obtained model dynamics. We derived four lessons for better communication between data scientists and clinicians. We found that the goal of explainability is not to explain the system's decisions but to improve the system itself. Model transparency is not the most significant factor in explaining a model decision for seizure prediction. Even when intuitive and state-of-the-art features are used, it is hard to understand brain dynamics and their relationship with the developed models. We achieve a better understanding by developing, in parallel, several systems that explicitly deal with changes in signal dynamics, which helps arrive at a more complete problem formulation.
Collapse
|
42
|
Insights into geospatial heterogeneity of landslide susceptibility based on the SHAP-XGBoost model. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2023; 332:117357. [PMID: 36731409 DOI: 10.1016/j.jenvman.2023.117357] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/19/2022] [Revised: 01/05/2023] [Accepted: 01/22/2023] [Indexed: 06/18/2023]
Abstract
The spatial heterogeneity of landslide influencing factors is the main reason for the poor generalizability of susceptibility evaluation models. This study aimed to construct a comprehensive explanatory framework for landslide susceptibility evaluation models based on the SHAP (SHapley Additive exPlanation)-XGBoost (eXtreme Gradient Boosting) algorithm, to analyze the regional characteristics and spatial heterogeneity of landslide influencing factors, and to discuss the heterogeneity of the models' generalizability under different landscapes. First, we selected different regions in a typical mountainous and hilly area and constructed a geospatial database containing 12 landslide influencing factors, such as elevation, annual average rainfall, slope, lithology, and NDVI, through field surveys, satellite images, and a literature review. Subsequently, the landslide susceptibility evaluation model was constructed based on the XGBoost algorithm and the spatial database, and the model's predictions were explained with the SHAP algorithm in terms of regional topography, geology, and hydrology. Finally, the model was applied to regions with both similar and very different topography, geology, meteorology, and vegetation, to explore the spatial heterogeneity of its generalizability. The following conclusions were drawn: the spatial distribution of landslides is heterogeneous and complex, and the contribution of each influencing factor to the occurrence of landslides shows clear regional characteristics and spatial heterogeneity. The generalizability of the landslide susceptibility evaluation model is also spatially heterogeneous, and the model generalizes better to regions with similar regional characteristics. Explaining the XGBoost landslide susceptibility model with the SHAP method allows quantitative analysis of how spatial heterogeneity changes the contributions of the various factors to disasters, from the perspective of both global and local evaluation units. In summary, the integrated explanatory framework based on the SHAP-XGBoost model can quantify the contribution of influencing factors to landslide occurrence at both global and local levels, which supports the construction and refinement of influencing-factor systems for landslide susceptibility in different regions. It can also provide a reference for predicting potential landslide hazard-prone areas and for Explainable Artificial Intelligence (XAI) research.
Collapse
|
43
|
TT-Net: Tensorized Transformer Network for 3D medical image segmentation. Comput Med Imaging Graph 2023; 107:102234. [PMID: 37075619 DOI: 10.1016/j.compmedimag.2023.102234] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2022] [Revised: 02/09/2023] [Accepted: 03/24/2023] [Indexed: 04/21/2023]
Abstract
Accurate segmentation of organs, tissues and lesions is essential for computer-assisted diagnosis. Previous works have achieved success in the field of automatic segmentation, but two limitations remain. (1) They are still challenged by complex conditions, for example when the segmentation target varies in location, size and shape, especially across imaging modalities. (2) Existing transformer-based networks suffer from high parametric complexity. To address these limitations, we propose a new Tensorized Transformer Network (TT-Net). In this paper, (1) a multi-scale transformer with layer fusion is proposed to faithfully capture context-interaction information; (2) a Cross Shared Attention (CSA) module based on pHash similarity fusion (pSF) is designed to extract global multi-variate dependency features; and (3) a Tensorized Self-Attention (TSA) module is proposed to deal with the large number of parameters, and it can also be easily embedded into other models. In addition, TT-Net gains good explainability through visualization of its transformer layers. The proposed method is evaluated on three widely used public datasets and one clinical dataset, covering different imaging modalities. Comprehensive results show that TT-Net outperforms other state-of-the-art methods on the four segmentation tasks. Moreover, the compression module, which can be easily embedded into other transformer-based methods, achieves lower computation with comparable segmentation performance.
Collapse
|
44
|
A new xAI framework with feature explainability for tumors decision-making in Ultrasound data: comparing with Grad-CAM. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 235:107527. [PMID: 37086704 DOI: 10.1016/j.cmpb.2023.107527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 03/13/2023] [Accepted: 04/02/2023] [Indexed: 05/03/2023]
Abstract
BACKGROUND AND OBJECTIVE The value of applying artificial intelligence (AI) to ultrasound screening for thyroid cancer has been acknowledged, with numerous early studies suggesting that AI might help physicians reach more accurate diagnoses. However, the black box nature of AI's decision-making process makes it difficult for users to grasp the basis of AI's predictions. Furthermore, explainability is related not only to AI performance but also to responsibility and risk in medical diagnosis. In this paper, we present Explainer, an intrinsically explainable framework that can categorize images and create heatmaps highlighting the regions on which its prediction is based. METHODS A dataset of 19,341 thyroid ultrasound images with pathological results and physician-annotated TI-RADS features is used to train and test the robustness of the proposed framework. We then conducted a benign-malignant classification study to determine whether physicians perform better with the assistance of the Explainer than they do alone or with Gradient-weighted Class Activation Mapping (Grad-CAM). RESULTS Reader studies show that the Explainer achieves more accurate diagnoses while producing explanatory heatmaps, and that physicians' performance improves when assisted by the Explainer. A case study confirms that the Explainer locates more reasonable and feature-related regions than Grad-CAM. CONCLUSIONS The Explainer offers physicians a tool to understand the basis of AI predictions and evaluate their reliability, which has the potential to unbox the "black box" of medical imaging AI.
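For reference, the Grad-CAM baseline against which the Explainer is compared can be implemented in a few lines of PyTorch. This is a generic, minimal sketch (hooks on a chosen convolutional layer), not the study's code.

```python
import torch

def grad_cam(model, image, target_layer, class_idx=None):
    """Minimal Grad-CAM: weight the target layer's activations by its average-pooled gradients."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

    logits = model(image.unsqueeze(0))                      # image: (C, H, W), e.g. an ultrasound frame
    idx = int(logits.argmax()) if class_idx is None else class_idx
    model.zero_grad()
    logits[0, idx].backward()
    h1.remove(); h2.remove()

    weights = grads["g"].mean(dim=(2, 3), keepdim=True)     # channel-wise importance from gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1)).squeeze(0)
    return cam / (cam.max() + 1e-8)                         # upsample to the input size before overlaying
```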
Collapse
|
45
|
Explainability of deep learning models in medical video analysis: a survey. PeerJ Comput Sci 2023; 9:e1253. [PMID: 37346619 PMCID: PMC10280416 DOI: 10.7717/peerj-cs.1253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2022] [Accepted: 01/20/2023] [Indexed: 06/23/2023]
Abstract
Deep learning methods have proven effective for multiple diagnostic tasks in medicine and have performed significantly better than other traditional machine learning methods. However, the black-box nature of deep neural networks has restricted their use in real-world applications, especially in healthcare. Therefore, the explainability of machine learning models, which focuses on providing comprehensible explanations of model outputs, may affect the likelihood that such models are adopted in clinical use. Various studies have reviewed approaches to explainability across multiple domains. This article reviews current approaches and applications of explainable deep learning for a specific area of medical data analysis: medical video processing tasks. The article introduces the field of explainable AI and summarizes the most important requirements for explainability in medical applications. Subsequently, we provide an overview of existing methods and evaluation metrics, focusing on those that can be applied to analytical tasks involving the processing of video data in the medical domain. Finally, we identify some of the open research issues in the analysed area.
Collapse
|
46
|
Few-shot learning using explainable Siamese twin network for the automated classification of blood cells. Med Biol Eng Comput 2023; 61:1549-1563. [PMID: 36800155 DOI: 10.1007/s11517-023-02804-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Accepted: 02/06/2023] [Indexed: 02/18/2023]
Abstract
Automated classification of blood cells from microscopic images is an interesting research area owing to advances in efficient neural network models. Existing deep learning methods rely on large amounts of data for network training, and generating such data can be time-consuming. Further, explainability, for example via class activation mapping, is required for a better understanding of the model predictions. Therefore, we developed a Siamese twin network (STN) model based on contrastive learning that trains on relatively few images for the classification of healthy peripheral blood cells, using EfficientNet-B3 as the base model. In this study, a total of 17,092 publicly accessible cell histology images were analyzed, of which 6% were used for STN training, 6% for few-shot validation, and the remaining 88% for few-shot testing. The proposed architecture demonstrates percent accuracies of 97.00, 98.78, 94.59, 95.70, 98.86, 97.09, 99.71, and 96.30 during 8-way 5-shot testing for the classification of basophils, eosinophils, immature granulocytes, erythroblasts, lymphocytes, monocytes, platelets, and neutrophils, respectively. Further, we propose a novel class activation mapping scheme that highlights the important regions in the test image, improving the interpretability of the STN model. Overall, the proposed framework could be used for fully automated, self-exploratory classification of healthy peripheral blood cells. The framework covers Siamese twin network training and 8-way k-shot testing, where the network's output values indicate the degree of dissimilarity between the two input images.
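The core of such a Siamese twin network, a shared EfficientNet-B3 encoder producing embeddings whose distance is trained with a contrastive loss, can be sketched as follows. The embedding size, margin, and pretrained-weight handling are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class SiameseTwin(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        backbone = models.efficientnet_b3(weights="IMAGENET1K_V1")
        backbone.classifier = nn.Identity()              # keep the 1536-d pooled features
        self.encoder = backbone
        self.head = nn.Linear(1536, embed_dim)

    def forward(self, x1, x2):
        z1 = self.head(self.encoder(x1))
        z2 = self.head(self.encoder(x2))
        return torch.nn.functional.pairwise_distance(z1, z2)   # dissimilarity score per pair

def contrastive_loss(dist, same_class, margin=1.0):
    """same_class: 1 for pairs of the same cell type, 0 otherwise."""
    pos = same_class * dist.pow(2)                       # pull same-class pairs together
    neg = (1 - same_class) * torch.clamp(margin - dist, min=0).pow(2)  # push different classes apart
    return (pos + neg).mean()
```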
Collapse
|
47
|
Explaining sentiment analysis results on social media texts through visualization. MULTIMEDIA TOOLS AND APPLICATIONS 2023; 82:22613-22629. [PMID: 36747895 PMCID: PMC9892668 DOI: 10.1007/s11042-023-14432-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Revised: 02/23/2022] [Accepted: 01/22/2023] [Indexed: 06/01/2023]
Abstract
Today, Artificial Intelligence ("AI") achieves prodigious real-time performance thanks to growing data volumes and computational power. However, little is known about what the results of such systems actually convey; they are therefore susceptible to bias, and with AI now rooted in almost every domain, even a minuscule bias can result in excessive damage. Efforts towards making AI interpretable have been made to address fairness, accountability, and transparency concerns. This paper proposes two methods for understanding a system's decisions, aided by visualization of the results. In this study, interpretability is applied to Natural Language Processing-based sentiment analysis using data from social media sites such as Twitter, Facebook, and Reddit. With the Valence Aware Dictionary for Sentiment Reasoning ("VADER"), heatmaps are generated that provide a visual justification of the result and increase comprehensibility. Furthermore, Locally Interpretable Model-Agnostic Explanations ("LIME") are used to provide in-depth insight into the predictions. Experiments show that the proposed system can surpass several contemporary systems designed to provide interpretability.
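The two ingredients named in the abstract, VADER polarity scores and a LIME text explanation, can be combined as in the hedged sketch below. Wrapping VADER's compound score as a pseudo-probabilistic classifier for LIME is an assumption made for illustration, not necessarily the authors' exact setup.

```python
import numpy as np
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from lime.lime_text import LimeTextExplainer

analyzer = SentimentIntensityAnalyzer()

def vader_proba(texts):
    """Map VADER compound scores in [-1, 1] to pseudo-probabilities over (negative, positive) for LIME."""
    out = []
    for t in texts:
        p_pos = (analyzer.polarity_scores(t)["compound"] + 1) / 2
        out.append([1 - p_pos, p_pos])
    return np.array(out)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
post = "The new update is awful, support was useless but at least refunds were quick"
exp = explainer.explain_instance(post, vader_proba, num_features=6)
print(vader_proba([post])[0], exp.as_list())   # per-word contributions usable for a heatmap overlay
```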
Collapse
|
48
|
An explainable autoencoder with multi-paradigm fMRI fusion for identifying differences in dynamic functional connectivity during brain development. Neural Netw 2023; 159:185-197. [PMID: 36580711 DOI: 10.1016/j.neunet.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 10/19/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]
Abstract
Multi-paradigm deep learning models show great potential for dynamic functional connectivity (dFC) analysis by integrating complementary information. However, many of them cannot use information from different paradigms effectively and have poor explainability, that is, limited ability to identify the significant features that contribute to decision making. In this paper, we propose a multi-paradigm fusion-based explainable deep sparse autoencoder (MF-EDSAE) to address these issues. For explainability, the MF-EDSAE is built on a deep sparse autoencoder (DSAE). For effective information integration, the MF-EDSAE contains a nonlinear fusion layer and multi-paradigm hypergraph regularization. We apply the model to the Philadelphia Neurodevelopmental Cohort and demonstrate that it achieves better performance than the single-paradigm DSAE in detecting dFC patterns that differ significantly during brain development. The experimental results show that children have more dispersive dFC patterns than adults. Brain function transitions from undifferentiated systems to specialized networks during development. Meanwhile, for a given task, adults have stronger connectivity between task-related functional networks than children. As the brain develops, the global dFC patterns change more quickly when stimulated by a task.
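The basic building block referred to above, a sparse autoencoder whose hidden activations are kept sparse by an L1 penalty, can be sketched in PyTorch as below; the multi-paradigm fusion layer and hypergraph regularization of MF-EDSAE are not reproduced here, and the dimensions are placeholders.

```python
import torch
import torch.nn as nn

class SparseAE(nn.Module):
    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.dec = nn.Linear(hid_dim, in_dim)

    def forward(self, x):
        h = self.enc(x)                      # sparse hidden code (e.g. dFC components)
        return self.dec(h), h

def sparse_ae_loss(x, x_hat, h, l1=1e-3):
    """Reconstruction error plus an L1 penalty that keeps hidden activations sparse."""
    return nn.functional.mse_loss(x_hat, x) + l1 * h.abs().mean()
```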
Collapse
|
49
|
Gender and sex bias in COVID-19 epidemiological data through the lens of causality. Inf Process Manag 2023; 60:103276. [PMID: 36647369 PMCID: PMC9834203 DOI: 10.1016/j.ipm.2023.103276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Revised: 12/08/2022] [Accepted: 01/09/2023] [Indexed: 01/13/2023]
Abstract
The COVID-19 pandemic has spurred a large number of experimental and observational studies reporting a clear correlation between the risk of developing severe COVID-19 (or dying from it) and whether the individual is male or female. This paper is an attempt to explain the supposed male vulnerability to COVID-19 using a causal approach. We proceed by identifying a set of confounding and mediating factors, based on a review of the epidemiological literature and an analysis of sex-disaggregated data. Those factors are then taken into consideration to produce explainable and fair prediction and decision models from observational data. The paper outlines how non-causal models can motivate discriminatory policies, such as biased allocation of the limited resources in intensive care units (ICUs). The objective is to anticipate and avoid disparate impact and discrimination by considering causal knowledge and causal-based techniques to complement the collection and analysis of observational big data. The hope is to contribute to a more careful use of health-related information access systems when developing fair and robust predictive models.
Collapse
|
50
|
Explaining deep reinforcement learning decisions in complex multiagent settings: towards enabling automation in air traffic flow management. APPL INTELL 2023; 53:4063-4098. [PMID: 35694685 PMCID: PMC9169601 DOI: 10.1007/s10489-022-03605-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/03/2022] [Indexed: 02/04/2023]
Abstract
With the objective of enhancing human performance and maximizing engagement during the performance of tasks, we aim to advance automation for decision making in complex and large-scale multi-agent settings. Towards these goals, this paper presents a deep multi-agent reinforcement learning method for resolving demand-capacity imbalances in real-world Air Traffic Management settings with thousands of agents. The agents comprising the system are able to jointly decide on the measures to be applied to resolve imbalances, while providing explanations for their decisions; this information is rendered and explored via appropriate visual analytics tools. The paper presents how the major challenges of scalability and complexity are addressed, and provides results from evaluation tests showing that the models deliver high-quality solutions and high-fidelity explanations.
Collapse
|