1
Shao Y, Zamrini EY, Ahmed A, Cheng Y, Nelson SJ, Kokkinos P, Zeng-Treitler Q. A Novel Explainable AI Method to Assess Associations between Temporal Patterns in Patient Trajectories and Adverse Outcome Risks: Analyzing Fitness as a Risk Factor of ADRD. medRxiv 2024:2024.05.17.24307541. [PMID: 38798505 PMCID: PMC11118636 DOI: 10.1101/2024.05.17.24307541]
Abstract
We present a novel explainable artificial intelligence (XAI) method to assess the associations between temporal patterns in patient trajectories recorded in longitudinal clinical data and adverse outcome risks, through explanations for a type of deep neural network called the Hybrid Value-Aware Transformer (HVAT). HVAT models can learn jointly from longitudinal and non-longitudinal clinical data and, in particular, can leverage the time-varying numerical values associated with clinical codes or concepts within the longitudinal data for outcome prediction. The key component of the XAI method is the definition of two derived variables, the temporal mean and the temporal slope, for clinical concepts with associated time-varying numerical values. The two variables represent, respectively, the overall level and the rate of change over time of the trajectory formed by the values associated with the clinical concept. Two operations on the original values are designed to change the values of the two derived variables separately. The effects of the two variables on the outcome risks learned by the HVAT model are calculated in terms of impact scores and impacts, which can be interpreted similarly to odds ratios. We applied the XAI method to the study of cardiorespiratory fitness (CRF) as a risk factor for Alzheimer's disease and related dementias (ADRD). Using a retrospective case-control study design, we found that each one-unit increase in the overall CRF level is associated with a 5% reduction in ADRD risk, while each one-unit increase in the rate of change of CRF over time is associated with a 1% reduction.
A closer investigation revealed that the association between the rate of change of CRF and ADRD risk is nonlinear: it is approximately piecewise linear, with separate linear segments for negative and positive rates of change.
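The two derived variables lend themselves to a simple numerical sketch. Assuming (plausibly, though the paper's exact definitions may differ) that the temporal mean is the average of a concept's recorded values and the temporal slope is the least-squares slope of those values against time:

```python
import numpy as np

def temporal_mean(times, values):
    """Overall level of a trajectory: the average of its recorded values."""
    return float(np.mean(values))

def temporal_slope(times, values):
    """Rate of change over time: the least-squares slope of values vs. time."""
    slope, _intercept = np.polyfit(times, values, deg=1)
    return float(slope)

# Hypothetical CRF trajectory: measurements at four yearly visits.
t = np.array([0.0, 1.0, 2.0, 3.0])
v = np.array([8.0, 8.5, 9.0, 9.5])

print(temporal_mean(t, v))   # 8.75
print(temporal_slope(t, v))  # 0.5 per year
```

The two perturbation operations described in the abstract would then shift all values equally (changing only the mean) or tilt them around their mean (changing only the slope).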
2
Sengupta S, Anastasio MA. A Test Statistic Estimation-Based Approach for Establishing Self-Interpretable CNN-Based Binary Classifiers. IEEE Transactions on Medical Imaging 2024; 43:1753-1765. [PMID: 38163307 PMCID: PMC11065575 DOI: 10.1109/tmi.2023.3348699]
Abstract
Interpretability is highly desired for deep neural network-based classifiers, especially when addressing high-stakes decisions in medical imaging. Commonly used post-hoc interpretability methods have the limitation that they can produce plausible but different interpretations of a given model, leading to ambiguity about which one to choose. To address this problem, a novel decision-theory-inspired approach is investigated to establish a self-interpretable model, given a pre-trained deep binary black-box medical image classifier. This approach utilizes a self-interpretable encoder-decoder model in conjunction with a single-layer fully connected network with unity weights. The model is trained to estimate the test statistic of the given trained black-box deep binary classifier, so as to maintain similar accuracy. The decoder output, referred to as an equivalency map, is a transformed version of the to-be-classified image that, when processed by the fixed fully connected layer, produces the same test statistic value as the original classifier. The equivalency map provides a visualization of the transformed image features that directly contribute to the test statistic value and, moreover, permits quantification of their relative contributions. Unlike traditional post-hoc interpretability methods, the proposed method is self-interpretable and quantitative. Detailed quantitative and qualitative analyses were performed on three different medical image binary classification tasks.
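The key structural trick is easy to see in miniature: a fully connected layer with unity weights simply sums its input, so the equivalency map's pixels are, by construction, additive contributions to the test statistic. A minimal sketch (the map here is random, standing in for a trained decoder's output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the decoder output: a 4x4 "equivalency map" for one image.
# (In the paper this is produced by a trained encoder-decoder.)
eq_map = rng.normal(size=(4, 4))

# The fixed single-layer fully connected network with unity weights simply
# sums the map, so each pixel is an additive contribution to the statistic.
test_statistic = float(eq_map.sum())

# Regional contributions can be read off directly, e.g. per quadrant:
top_left = float(eq_map[:2, :2].sum())
print(test_statistic, top_left)
```

This is why the method is self-interpretable: no separate attribution step is needed once the map is produced.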
3
Ersavas T, Smith MA, Mattick JS. Novel applications of Convolutional Neural Networks in the age of Transformers. Sci Rep 2024; 14:10000. [PMID: 38693215 PMCID: PMC11063149 DOI: 10.1038/s41598-024-60709-z]
Abstract
Convolutional Neural Networks (CNNs) have been central to the Deep Learning revolution and played a key role in initiating the new age of Artificial Intelligence. However, in recent years newer architectures such as Transformers have dominated both research and practical applications. While CNNs still play critical roles in many of the newer developments such as Generative AI, they are far from being thoroughly understood and utilised to their full potential. Here we show that CNNs can recognise patterns in images with scattered pixels, and can be used to analyse complex, high-dimensional datasets by transforming them into pseudo-images with minimal processing, representing a more general approach to the application of CNNs to datasets such as those in molecular biology, text, and speech. We introduce a pipeline called DeepMapper, which allows analysis of very high-dimensional datasets without intermediate filtering and dimension reduction, thus preserving the full texture of the data and enabling detection of small variations normally deemed 'noise'. We demonstrate that DeepMapper can identify very small perturbations in large datasets with mostly random variables, and that it is superior in speed and on par in accuracy with prior work in processing large datasets with large numbers of features.
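The pseudo-image idea can be sketched with the simplest possible mapping: zero-pad a 1-D feature vector to the next perfect square and reshape it. (This is purely illustrative; DeepMapper's actual mapping may differ.)

```python
import numpy as np

def to_pseudo_image(features):
    """Map a 1-D feature vector to a square 2-D 'pseudo image' by
    zero-padding to the next perfect square and reshaping.
    (Illustrative; DeepMapper's actual mapping may differ.)"""
    n = len(features)
    side = int(np.ceil(np.sqrt(n)))       # smallest square that fits n values
    padded = np.zeros(side * side, dtype=float)
    padded[:n] = features                  # unused tail stays zero
    return padded.reshape(side, side)

img = to_pseudo_image(np.arange(10, dtype=float))
print(img.shape)  # (4, 4)
```

Once features live on a 2-D grid, any standard image CNN can consume them; the abstract's point is that CNNs can exploit such grids even when neighbouring pixels carry no spatial meaning.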
Affiliation(s)
- Tansel Ersavas
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, 2052, Australia.
- Martin A Smith
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, 2052, Australia
- Department of Biochemistry and Molecular Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, H3C 3J7, Canada
- CHU Sainte-Justine Research Centre, Montreal, Canada
- UNSW RNA Institute, UNSW Sydney, Australia
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, 2052, Australia.
4
Shafiabady N, Hadjinicolaou N, Hettikankanamage N, MohammadiSavadkoohi E, Wu RMX, Vakilian J. eXplainable Artificial Intelligence (XAI) for improving organisational regility. PLoS One 2024; 19:e0301429. [PMID: 38656983 PMCID: PMC11042710 DOI: 10.1371/journal.pone.0301429]
Abstract
Since the pandemic started, organisations have been actively seeking ways to improve their organisational agility and resilience (regility), and have turned to Artificial Intelligence (AI) as a critical enabler of these goals. AI empowers organisations by analysing large data sets quickly and accurately, enabling faster decision-making and building agility and resilience. This strategic use of AI gives businesses a competitive advantage and allows them to adapt to rapidly changing environments. Failure to prioritise agility and responsiveness can result in increased costs, missed opportunities, loss of competitiveness, reputational damage, and ultimately loss of customers, revenue, profitability, and market share. eXplainable Artificial Intelligence (XAI) techniques support this prioritisation by illuminating how AI models make decisions, rendering them transparent, interpretable, and understandable. Building on previous research on using AI to predict organisational agility, this study focuses on integrating XAI techniques, such as Shapley Additive Explanations (SHAP), into the study of organisational agility and resilience. By identifying the importance of the different features that affect organisational agility prediction, this study aims to demystify the decision-making processes of the prediction model. This is essential for the ethical deployment of AI, fostering trust and transparency in these systems. Recognising the key features in organisational agility prediction can guide companies in determining which areas to concentrate on in order to improve their agility and resilience.
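SHAP approximates Shapley values from cooperative game theory; for a handful of features they can be computed exactly by enumerating feature subsets, which makes the idea concrete. A self-contained sketch with a hypothetical toy "agility score" model (the model and baseline here are assumptions, not the paper's):

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for a small number of features, by enumerating
    all feature subsets. 'Absent' features take their baseline value."""
    n = len(x)
    def value(subset):
        z = list(baseline)
        for i in subset:
            z[i] = x[i]
        return model(z)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for s in combinations(others, k):
                # classic Shapley weight |S|! (n-|S|-1)! / n!
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[i] += w * (value(s + (i,)) - value(s))
    return phi

# Hypothetical score with an interaction between features 0 and 1.
model = lambda z: 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[0] * z[1]
phi = shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print(phi)  # [2.25, 1.25, 0.0]: the 0.5 interaction credit is split equally
```

Libraries like `shap` estimate these values efficiently for real models; the attributions always sum to the difference between the model's output at `x` and at the baseline.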
Affiliation(s)
- Niusha Shafiabady
- Faculty of Science and Technology, Charles Darwin University, Haymarket, New South Wales, Australia
- Nick Hadjinicolaou
- Adelaide Institute of Higher Education, Adelaide, South Australia, Australia
- Robert M. X. Wu
- Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, New South Wales, Australia
- James Vakilian
- Faculty of Science and Technology, Charles Darwin University, Haymarket, New South Wales, Australia
5
Tappan I, Lindbeck EM, Nichols JA, Harley JB. Explainable AI Elucidates Musculoskeletal Biomechanics: A Case Study Using Wrist Surgeries. Ann Biomed Eng 2024; 52:498-509. [PMID: 37943340 DOI: 10.1007/s10439-023-03394-9]
Abstract
As datasets increase in size and complexity, biomechanists have turned to artificial intelligence (AI) to aid their analyses. This paper explores how explainable AI (XAI) can enhance the interpretability of biomechanics data derived from musculoskeletal simulations. We use machine learning to classify simulated lateral pinch data as belonging to models with healthy wrists or one of two types of surgically altered wrists. This simulation-based classification task is analogous to using biomechanical movement and force data to clinically diagnose a pathological state. The XAI describes which musculoskeletal features best explain the classifications, and in turn the pathological states, at both the local (individual decision) and global (entire algorithm) levels. We demonstrate that these descriptions agree with assessments in the literature and additionally identify blind spots that can be missed with traditional statistical techniques.
Affiliation(s)
- Isaly Tappan
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA
- Erica M Lindbeck
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA
- Jennifer A Nichols
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, 32611, USA
- Joel B Harley
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA.
6
Iqbal T, Khalid A, Ullah I. Explaining decisions of a light-weight deep neural network for real-time coronary artery disease classification in magnetic resonance imaging. Journal of Real-Time Image Processing 2024; 21:31. [PMID: 38348346 PMCID: PMC10858933 DOI: 10.1007/s11554-023-01411-7]
Abstract
In certain healthcare settings, such as emergency or critical care units, where quick and accurate real-time analysis and decision-making are required, the healthcare system can leverage the power of artificial intelligence (AI) models to support decision-making and prevent complications. This paper investigates the optimization of healthcare AI models with respect to time complexity, hyper-parameter tuning, and XAI for a classification task. The paper highlights the significance of a lightweight convolutional neural network (CNN) for analysing and classifying Magnetic Resonance Imaging (MRI) in real time, compared with a CNN-RandomForest (CNN-RF) ensemble. The role of hyper-parameter tuning is also examined in finding optimal configurations that enhance the model's performance while efficiently utilizing limited computational resources. Finally, the benefits of incorporating XAI techniques (e.g. GradCAM and Layer-wise Relevance Propagation) in providing transparency and interpretable explanations of AI model predictions, fostering trust, and detecting errors/biases are explored. Our inference time on a MacBook laptop for 323 test images of size 100×100 is only 2.6 s, merely 8 milliseconds per image, while providing classification accuracy comparable to the ensemble CNN-RF classifier. Using the proposed model, clinicians/cardiologists can achieve accurate and reliable results while ensuring patients' safety and meeting the transparency requirements imposed by the General Data Protection Regulation (GDPR). The proposed investigative study will advance the understanding and acceptance of AI systems in connected healthcare settings.
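Grad-CAM, one of the XAI techniques mentioned, reduces to a few array operations once a framework has supplied the convolutional activations and their gradients: weight each feature map by its spatially averaged gradient, sum, and apply a ReLU. A minimal sketch with synthetic arrays standing in for those framework outputs:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM: weight each feature map by its spatially averaged
    gradient, sum over channels, and apply ReLU. Inputs have shape
    (channels, H, W); here synthetic arrays stand in for a framework's
    forward activations and backpropagated gradients."""
    alphas = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(alphas, activations, axes=1)  # weighted sum of maps
    return np.maximum(cam, 0.0)                      # ReLU keeps positive evidence

acts = np.stack([np.ones((3, 3)), np.eye(3)])        # two 3x3 feature maps
grads = np.stack([np.full((3, 3), 0.2), np.full((3, 3), -1.0)])
cam = grad_cam(acts, grads)
print(cam)
```

In practice the resulting map is upsampled to the input resolution and overlaid on the MRI slice as a heatmap.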
Affiliation(s)
- Talha Iqbal
- Insight SFI Research Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland
- Aaleen Khalid
- School of Computer Science, University of Galway, Galway, H91 TK33 Ireland
- Ihsan Ullah
- Insight SFI Research Centre for Data Analytics, University of Galway, Galway, H91 TK33 Ireland
- School of Computer Science, University of Galway, Galway, H91 TK33 Ireland
7
Seo H, Lee S, Yun S, Leem S, So S, Han DH. RenseNet: A Deep Learning Network Incorporating Residual and Dense Blocks with Edge Conservative Module to Improve Small-Lesion Classification and Model Interpretation. Cancers (Basel) 2024; 16:570. [PMID: 38339320 PMCID: PMC10854971 DOI: 10.3390/cancers16030570]
Abstract
Deep learning has become an essential tool in medical image analysis owing to its remarkable performance. Target classification and model interpretability are key applications of deep learning in medical image analysis, and hence many deep learning-based algorithms have emerged. Many of these include pooling operations, a type of subsampling used to enlarge the receptive field. However, from a signal-processing standpoint, pooling degrades image detail, which particularly affects small objects in an image. Therefore, in this study, we designed a Rense block and an edge conservative module to effectively manipulate previous feature information in the feed-forward learning process. Specifically, the Rense block, a design that incorporates the skip connections of residual and dense blocks, was shown to be optimal through mathematical analysis. Furthermore, we avoid blurring of the features caused by the pooling operation through a compensation path in the edge conservative module. Two independent CT datasets of kidney stones and lung tumors, whose images often include small lesions, were used to verify the proposed RenseNet. The classification results and explanation heatmaps show that the proposed RenseNet provides the best inference and interpretation compared with current state-of-the-art methods. The proposed RenseNet can contribute significantly to efficient diagnosis and treatment because it is effective for small lesions that might otherwise be misclassified or misinterpreted.
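The intuition behind a compensation path can be shown in isolation: average pooling discards high-frequency detail, but the residual between the input and the upsampled pooled output carries exactly that detail, so keeping it on a parallel path loses nothing. This is an illustrative sketch of the idea, not the paper's exact module design:

```python
import numpy as np

def pool_with_edge_path(x):
    """2x2 average pooling plus a compensation path: the residual between
    the input and the upsampled pooled output carries the fine detail
    (edges) that pooling discards. (Illustrative sketch only.)"""
    h, w = x.shape
    pooled = x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    upsampled = pooled.repeat(2, axis=0).repeat(2, axis=1)
    edges = x - upsampled   # high-frequency detail lost by pooling
    return pooled, edges

x = np.array([[0., 0., 1., 1.],
              [0., 0., 1., 1.],
              [0., 1., 1., 0.],
              [0., 1., 1., 0.]])
pooled, edges = pool_with_edge_path(x)
# Adding the edge path back to the upsampled pooled map reconstructs x:
print(np.allclose(pooled.repeat(2, axis=0).repeat(2, axis=1) + edges, x))
```

Small lesions live almost entirely in the `edges` term, which is why a network that drops it can misclassify them.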
Affiliation(s)
- Hyunseok Seo
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
- Seokjun Lee
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
- Sojin Yun
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
- Saebom Leem
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
- Seohee So
- Bionics Research Center, Biomedical Research Division, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; (S.L.); (S.Y.); (S.L.); (S.S.)
- Deok Hyun Han
- Department of Urology, Samsung Medical Center (SMC), Seoul 06351, Republic of Korea;
8
Charlton PH, Allen J, Bailón R, Baker S, Behar JA, Chen F, Clifford GD, Clifton DA, Davies HJ, Ding C, Ding X, Dunn J, Elgendi M, Ferdoushi M, Franklin D, Gil E, Hassan MF, Hernesniemi J, Hu X, Ji N, Khan Y, Kontaxis S, Korhonen I, Kyriacou PA, Laguna P, Lázaro J, Lee C, Levy J, Li Y, Liu C, Liu J, Lu L, Mandic DP, Marozas V, Mejía-Mejía E, Mukkamala R, Nitzan M, Pereira T, Poon CCY, Ramella-Roman JC, Saarinen H, Shandhi MMH, Shin H, Stansby G, Tamura T, Vehkaoja A, Wang WK, Zhang YT, Zhao N, Zheng D, Zhu T. The 2023 wearable photoplethysmography roadmap. Physiol Meas 2023; 44:111001. [PMID: 37494945 PMCID: PMC10686289 DOI: 10.1088/1361-6579/acead2]
Abstract
Photoplethysmography is a key sensing technology which is used in wearable devices such as smartwatches and fitness trackers. Currently, photoplethysmography sensors are used to monitor physiological parameters including heart rate and heart rhythm, and to track activities like sleep and exercise. Yet, wearable photoplethysmography has potential to provide much more information on health and wellbeing, which could inform clinical decision making. This Roadmap outlines directions for research and development to realise the full potential of wearable photoplethysmography. Experts discuss key topics within the areas of sensor design, signal processing, clinical applications, and research directions. Their perspectives provide valuable guidance to researchers developing wearable photoplethysmography technology.
Affiliation(s)
- Peter H Charlton
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, United Kingdom
- Research Centre for Biomedical Engineering, City, University of London, London, EC1V 0HB, United Kingdom
- John Allen
- Research Centre for Intelligent Healthcare, Coventry University, Coventry, CV1 5RW, United Kingdom
- Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, United Kingdom
- Raquel Bailón
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) Group, Aragon Institute of Engineering Research (I3A), IIS Aragon, University of Zaragoza, E-50018 Zaragoza, Spain
- CIBER-BBN, Instituto de Salud Carlos III, C/Monforte de Lemos 3-5, E-28029 Madrid, Spain
- Stephanie Baker
- College of Science and Engineering, James Cook University, Cairns, 4878 Queensland, Australia
- Joachim A Behar
- Faculty of Biomedical Engineering, Technion Israel Institute of Technology, Haifa, 3200003, Israel
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, People’s Republic of China
- Gari D Clifford
- Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, United States of America
- Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States of America
- David A Clifton
- Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, United Kingdom
- Harry J Davies
- Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Cheng Ding
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, United States of America
- Department of Biomedical Engineering, Emory University, Atlanta, GA 30322, United States of America
- Xiaorong Ding
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, People’s Republic of China
- Jessilyn Dunn
- Department of Biomedical Engineering, Duke University, Durham, NC 27708-0187, United States of America
- Department of Biostatistics & Bioinformatics, Duke University, Durham, NC 27708-0187, United States of America
- Duke Clinical Research Institute, Durham, NC 27705-3976, United States of America
- Mohamed Elgendi
- Biomedical and Mobile Health Technology Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, 8008, Switzerland
- Munia Ferdoushi
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, United States of America
- The Institute for Technology and Medical Systems (ITEMS), Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, United States of America
- Daniel Franklin
- Institute of Biomedical Engineering, Translational Biology & Engineering Program, Ted Rogers Centre for Heart Research, University of Toronto, Toronto, M5G 1M1, Canada
- Eduardo Gil
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) Group, Aragon Institute of Engineering Research (I3A), IIS Aragon, University of Zaragoza, E-50018 Zaragoza, Spain
- CIBER-BBN, Instituto de Salud Carlos III, C/Monforte de Lemos 3-5, E-28029 Madrid, Spain
- Md Farhad Hassan
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, United States of America
- The Institute for Technology and Medical Systems (ITEMS), Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, United States of America
- Jussi Hernesniemi
- Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33720, Finland
- Tampere Heart Hospital, Wellbeing Services County of Pirkanmaa, Tampere, 33520, Finland
- Xiao Hu
- Nell Hodgson Woodruff School of Nursing, Emory University, Atlanta, GA 30322, United States of America
- Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322, United States of America
- Department of Computer Sciences, College of Arts and Sciences, Emory University, Atlanta, GA 30322, United States of America
- Nan Ji
- Hong Kong Center for Cerebrocardiovascular Health Engineering (COCHE), Hong Kong Science and Technology Park, Hong Kong, 999077, People’s Republic of China
- Yasser Khan
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, United States of America
- The Institute for Technology and Medical Systems (ITEMS), Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, United States of America
- Spyridon Kontaxis
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) Group, Aragon Institute of Engineering Research (I3A), IIS Aragon, University of Zaragoza, E-50018 Zaragoza, Spain
- CIBER-BBN, Instituto de Salud Carlos III, C/Monforte de Lemos 3-5, E-28029 Madrid, Spain
- Ilkka Korhonen
- Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33720, Finland
- Panicos A Kyriacou
- Research Centre for Biomedical Engineering, City, University of London, London, EC1V 0HB, United Kingdom
- Pablo Laguna
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) Group, Aragon Institute of Engineering Research (I3A), IIS Aragon, University of Zaragoza, E-50018 Zaragoza, Spain
- CIBER-BBN, Instituto de Salud Carlos III, C/Monforte de Lemos 3-5, E-28029 Madrid, Spain
- Jesús Lázaro
- Biomedical Signal Interpretation and Computational Simulation (BSICoS) Group, Aragon Institute of Engineering Research (I3A), IIS Aragon, University of Zaragoza, E-50018 Zaragoza, Spain
- CIBER-BBN, Instituto de Salud Carlos III, C/Monforte de Lemos 3-5, E-28029 Madrid, Spain
- Chungkeun Lee
- Digital Health Devices Division, Medical Device Evaluation Department, National Institute of Food and Drug Safety Evaluation, Ministry of Food and Drug Safety, Cheongju, 28159, Republic of Korea
- Jeremy Levy
- Faculty of Biomedical Engineering, Technion Israel Institute of Technology, Haifa, 3200003, Israel
- Faculty of Electrical and Computer Engineering, Technion Institute of Technology, Haifa, 3200003, Israel
- Yumin Li
- State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing 210096, People’s Republic of China
- Chengyu Liu
- State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing 210096, People’s Republic of China
- Jing Liu
- Analog Devices Inc, San Jose, CA 95124, United States of America
- Lei Lu
- Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, United Kingdom
- Danilo P Mandic
- Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, United Kingdom
- Vaidotas Marozas
- Department of Electronics Engineering, Kaunas University of Technology, 44249 Kaunas, Lithuania
- Biomedical Engineering Institute, Kaunas University of Technology, 44249 Kaunas, Lithuania
- Elisa Mejía-Mejía
- Research Centre for Biomedical Engineering, City, University of London, London, EC1V 0HB, United Kingdom
- Ramakrishna Mukkamala
- Department of Bioengineering and Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Meir Nitzan
- Department of Physics/Electro-Optic Engineering, Lev Academic Center, 91160 Jerusalem, Israel
- Tania Pereira
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, Porto, 4200-465, Portugal
- Faculty of Engineering, University of Porto, Porto, 4200-465, Portugal
- Jessica C Ramella-Roman
- Department of Biomedical Engineering and Herbert Wertheim College of Medicine, Florida International University, Miami, FL 33174, United States of America
- Harri Saarinen
- Tampere Heart Hospital, Wellbeing Services County of Pirkanmaa, Tampere, 33520, Finland
- Md Mobashir Hasan Shandhi
- Department of Biomedical Engineering, Duke University, Durham, NC 27708-0187, United States of America
- Hangsik Shin
- Department of Digital Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, 05505, Republic of Korea
- Gerard Stansby
- Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, United Kingdom
- Northern Vascular Centre, Freeman Hospital, Newcastle upon Tyne, NE7 7DN, United Kingdom
- Toshiyo Tamura
- Future Robotics Organization, Waseda University, Tokyo, 1698050, Japan
- Antti Vehkaoja
- Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33720, Finland
- PulseOn Ltd, Espoo, 02150, Finland
- Will Ke Wang
- Department of Biomedical Engineering, Duke University, Durham, NC 27708-0187, United States of America
- Yuan-Ting Zhang
- Hong Kong Center for Cerebrocardiovascular Health Engineering (COCHE), Hong Kong Science and Technology Park, Hong Kong, 999077, People’s Republic of China
- Department of Biomedical Engineering, City University of Hong Kong, Hong Kong, 999077, People’s Republic of China
- Ni Zhao
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong
- Dingchang Zheng
- Research Centre for Intelligent Healthcare, Coventry University, Coventry, CV1 5RW, United Kingdom
- Tingting Zhu
- Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, United Kingdom
9
Ventura F, Greco S, Apiletti D, Cerquitelli T. Explaining deep convolutional models by measuring the influence of interpretable features in image classification. Data Min Knowl Discov 2023. [DOI: 10.1007/s10618-023-00915-x]
Abstract
The accuracy and flexibility of Deep Convolutional Neural Networks (DCNNs) have been extensively validated over the past years. However, their intrinsic opaqueness still affects their reliability and limits their application in critical production systems, where black-box behavior is difficult to accept. This work proposes EBAnO, an innovative explanation framework able to analyze the decision-making process of DCNNs in image classification by providing prediction-local and class-based model-wise explanations through the unsupervised mining of knowledge contained in multiple convolutional layers. EBAnO provides detailed visual and numerical explanations thanks to two specific indexes that measure the features’ influence and their influence precision in the decision-making process. The framework has been experimentally evaluated, both quantitatively and qualitatively, by (i) analyzing its explanations with four state-of-the-art DCNN architectures, (ii) comparing its results with three state-of-the-art explanation strategies, and (iii) assessing its effectiveness and ease of understanding through human judgment, by means of an online survey. EBAnO has been released as open-source code and is freely available online.
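The core of a feature-influence index can be sketched as a perturbation test: replace an interpretable feature (here a pixel region; EBAnO actually operates on features mined from convolutional layers) with a baseline and measure the drop in the class score. The toy "classifier" below is a hypothetical stand-in:

```python
import numpy as np

def influence(predict, image, mask, baseline=0.0):
    """Influence of an interpretable feature (a region given by `mask`) on a
    model's class score: the drop in the score when the region is replaced
    by a baseline value. (A simplified stand-in for EBAnO's influence
    indexes, which operate on convolutional-layer features.)"""
    perturbed = image.copy()
    perturbed[mask] = baseline
    return predict(image) - predict(perturbed)

# Hypothetical "classifier": score is the mean brightness of the top half.
predict = lambda img: float(img[:2, :].mean())

img = np.ones((4, 4))
top_mask = np.zeros((4, 4), dtype=bool); top_mask[:2, :] = True
bottom_mask = np.zeros((4, 4), dtype=bool); bottom_mask[2:, :] = True

print(influence(predict, img, top_mask))     # 1.0: region drives the score
print(influence(predict, img, bottom_mask))  # 0.0: region is irrelevant
```

A high influence marks a feature the model relies on; a precision-style index additionally asks how specific that reliance is to the predicted class.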
10
Shao Y, Ahmed A, Zamrini EY, Cheng Y, Goulet JL, Zeng-Treitler Q. Enhancing Clinical Data Analysis by Explaining Interaction Effects between Covariates in Deep Neural Network Models. J Pers Med 2023; 13:217. [PMID: 36836451 PMCID: PMC9967882 DOI: 10.3390/jpm13020217]
Abstract
Deep neural networks (DNNs) are a powerful technology utilized by a growing number and range of research projects, including disease risk prediction models. One of the key strengths of DNNs is their ability to model non-linear relationships, which include covariate interactions. We developed a novel method called interaction scores for measuring the covariate interactions captured by DNN models. As the method is model-agnostic, it can also be applied to other types of machine learning models. It is designed to be a generalization of the coefficient of the interaction term in a logistic regression; hence, its values are easily interpretable. The interaction score can be calculated at both the individual and population levels. The individual-level score provides an individualized explanation for covariate interactions. We applied this method to two simulated datasets and a real-world clinical dataset on Alzheimer's disease and related dementias (ADRD), and applied two existing interaction measurement methods to the same datasets for comparison. The results on the simulated datasets showed that the interaction score method can explain the underlying interaction effects, that there are strong correlations between the population-level interaction scores and the ground truth values, and that the individual-level interaction scores vary when the interaction was designed to be non-uniform. A further validation of the new method is that the interactions discovered from the ADRD data included both known and novel relationships.
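The logistic-regression connection can be made concrete: for two covariates, the interaction coefficient equals a difference-in-differences of log-odds, and applying the same formula to any model's predicted probabilities yields an interaction measure in the same units. A sketch under that (simplified) reading of the paper's score; the coefficients below are invented for illustration:

```python
import math

def interaction_score(prob, a_vals=(0.0, 1.0), b_vals=(0.0, 1.0)):
    """Difference-in-differences of log-odds across two covariates. For a
    logistic regression this recovers the interaction term's coefficient
    exactly; applied to any model (e.g. a DNN) it measures the captured
    interaction. (A simplified reading of the paper's interaction score.)"""
    logit = lambda p: math.log(p / (1.0 - p))
    a0, a1 = a_vals
    b0, b1 = b_vals
    return (logit(prob(a1, b1)) - logit(prob(a1, b0))) \
         - (logit(prob(a0, b1)) - logit(prob(a0, b0)))

# Logistic model with known coefficients: logit p = -1 + 0.5a + 0.8b + 0.3ab
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
prob = lambda a, b: sigmoid(-1.0 + 0.5 * a + 0.8 * b + 0.3 * a * b)

print(round(interaction_score(prob), 6))  # 0.3, the interaction coefficient
```

Evaluating the same expression at one patient's covariate values (rather than averaging over a cohort) gives the individual-level analogue described in the abstract.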
Affiliation(s)
- Yijun Shao (Correspondence): Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Washington DC VA Medical Center, Washington, DC 20422, USA
- Ali Ahmed: Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Washington DC VA Medical Center, Washington, DC 20422, USA; Department of Medicine, School of Medicine, Georgetown University, Washington, DC 20057, USA
- Edward Y. Zamrini: Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Washington DC VA Medical Center, Washington, DC 20422, USA; Department of Neurology, School of Medicine, University of Utah, Salt Lake City, UT 84108, USA; Irvine Clinical Research, Irvine, CA 92614, USA; Cognitive Neurology Consulting, Newport Beach, CA 92614, USA
- Yan Cheng: Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Washington DC VA Medical Center, Washington, DC 20422, USA
- Joseph L. Goulet: VA Connecticut Healthcare System, New Haven, CT 06516, USA; Department of Emergency Medicine, Yale School of Medicine, Yale University, New Haven, CT 06516, USA
- Qing Zeng-Treitler: Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA; Washington DC VA Medical Center, Washington, DC 20422, USA
11
Kwon HJ, Koo HI, Soh JW, Cho NI. Inverse-Based Approach to Explaining and Visualizing Convolutional Neural Networks. IEEE Trans Neural Netw Learn Syst 2022;33:7318-7329. [PMID: 34138716] [DOI: 10.1109/tnnls.2021.3084757]
Abstract
This article presents a new method for understanding and visualizing convolutional neural networks (CNNs). Most existing approaches focus on a well-defined global score and evaluate the pixelwise contribution of the inputs to that score; despite their success on image classification tasks, the analysis of CNNs for multilabeled outputs or regression has not yet been considered in the literature. To address this problem, we propose a new inverse-based approach that computes the inverse of a feedforward pass to identify activations of interest in lower layers. We developed a layerwise inverse procedure based on two observations: 1) inverse results should have internal activations consistent with the original forward pass, and 2) a small amount of activation in the inverse results is desirable for human interpretability. Experimental results show that the proposed method allows CNNs for classification and regression to be analyzed in the same framework. We demonstrated that our method successfully finds attributions in the inputs for image classification, with performance comparable to state-of-the-art methods. To compare various methods, we developed a novel plot that visualizes the tradeoff between the amount of activation and the rate of class reidentification. In the case of regression, our method showed that conventional CNNs for single-image super-resolution overlook a portion of the frequency bands, which may result in performance degradation.
12
Shi W, Huang G, Song S, Wu C. Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning. IEEE Trans Pattern Anal Mach Intell 2022;44:10222-10235. [PMID: 34882545] [DOI: 10.1109/tpami.2021.3133717]
Abstract
Deep reinforcement learning (RL) agents are becoming increasingly proficient at a range of complex control tasks. However, an agent's behavior is usually difficult to interpret because of the black-box functions involved, which makes it difficult to earn users' trust. Although there are some interesting interpretation methods for vision-based RL, most of them cannot uncover temporal causal information, raising questions about their reliability. To address this problem, we present a temporal-spatial causal interpretation (TSCI) model for understanding an agent's long-term behavior, which is essential for sequential decision-making. The TSCI model builds on a formulation of temporal causality that reflects the temporal causal relations between the sequential observations and decisions of an RL agent. A separate causal discovery network is then employed to identify temporal-spatial causal features, which are constrained to satisfy the temporal causality. The TSCI model is applicable to recurrent agents and, once trained, can be used to discover causal features with high efficiency. The empirical results show that the TSCI model can produce high-resolution, sharp attention masks that highlight the task-relevant temporal-spatial information constituting most of the evidence about how vision-based RL agents make sequential decisions. In addition, we demonstrate that our method can provide valuable causal interpretations of vision-based RL agents from the temporal perspective.
13
Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond. Knowl Inf Syst 2022. [DOI: 10.1007/s10115-022-01756-8]
14
Meddage DPP, Ekanayake IU, Herath S, Gobirahavan R, Muttil N, Rathnayake U. Predicting Bulk Average Velocity with Rigid Vegetation in Open Channels Using Tree-Based Machine Learning: A Novel Approach Using Explainable Artificial Intelligence. Sensors (Basel) 2022;22:4398. [PMID: 35746184] [PMCID: PMC9229711] [DOI: 10.3390/s22124398]
Abstract
Predicting the bulk-average velocity (UB) in open channels with rigid vegetation is complicated by the non-linear nature of the governing parameters. Existing regression models, despite their accuracy, fail to convey the feature importance or causality behind their predictions. We therefore propose a method to predict UB and the friction factor in the surface layer (fS) using tree-based machine learning (ML) models (decision tree, extra trees, and XGBoost), and use SHapley Additive exPlanations (SHAP) to interpret the ML predictions. The comparison shows that the XGBoost model is superior in predicting UB (R = 0.984) and fS (R = 0.92) relative to the existing regression models. SHAP reveals the reasoning underlying the predictions, the dependence structure of the predictions, and the feature importance. Notably, the SHAP explanations adhere to what is generally observed in complex flow behavior, improving trust in the predictions.
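SHAP attributes a prediction to individual features via Shapley values. As an illustrative sketch only (the study uses the `shap` library's efficient tree explainer on XGBoost models, which is not reproduced here), the Shapley value of a feature can be computed by brute force for a small model by averaging its marginal contribution over all subsets of the other features; the `toy_velocity` predictor below is hypothetical:

```python
from itertools import combinations
from math import factorial

def shapley_value(f, baseline, x, i):
    """Exact Shapley value of feature i for the prediction f(x),
    relative to a baseline input. Features outside a subset S are
    held at their baseline values. Exponential cost: toy use only."""
    n = len(x)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for size in range(n):
        for S in combinations(others, size):
            with_i = [x[j] if j in S or j == i else baseline[j] for j in range(n)]
            without_i = [x[j] if j in S else baseline[j] for j in range(n)]
            # Shapley kernel weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            total += weight * (f(with_i) - f(without_i))
    return total

# Hypothetical toy predictor: additive, so each Shapley value equals
# the per-feature effect, and the values sum to f(x) - f(baseline).
def toy_velocity(x):
    return 2.0 * x[0] - 0.5 * x[1] + 0.1

baseline, x = [0.0, 0.0], [1.0, 2.0]
phi = [shapley_value(toy_velocity, baseline, x, i) for i in range(2)]
# Efficiency property: contributions sum to the prediction difference.
assert abs(sum(phi) - (toy_velocity(x) - toy_velocity(baseline))) < 1e-9
```

Tree explainers such as the one used in the study compute the same quantity in polynomial time by exploiting the tree structure, which is what makes SHAP practical for XGBoost models.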
Affiliation(s)
- D. P. P. Meddage: Department of Civil Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka
- I. U. Ekanayake: Department of Computer Engineering, University of Peradeniya, Galaha 20400, Sri Lanka
- Sumudu Herath: Department of Civil Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka
- R. Gobirahavan: Department of Civil and Environmental Engineering, University of Ruhuna, Matara 81000, Sri Lanka
- Nitin Muttil: Institute for Sustainable Industries & Liveable Cities, Victoria University, P.O. Box 14428, Melbourne, VIC 8001, Australia; College of Engineering and Science, Victoria University, P.O. Box 14428, Melbourne, VIC 8001, Australia
- Upaka Rathnayake: Department of Civil Engineering, Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka
15
Shi W, Huang G, Song S, Wang Z, Lin T, Wu C. Self-Supervised Discovering of Interpretable Features for Reinforcement Learning. IEEE Trans Pattern Anal Mach Intell 2022;44:2712-2724. [PMID: 33186101] [DOI: 10.1109/tpami.2020.3037898]
Abstract
Deep reinforcement learning (RL) has recently led to many breakthroughs on a range of complex control tasks. However, the agent's decision-making process is generally not transparent, and this lack of interpretability hinders the applicability of RL in safety-critical scenarios. While several methods have attempted to interpret vision-based RL, most come without a detailed explanation of the agent's behavior. In this paper, we propose a self-supervised interpretable framework that can discover interpretable features, making RL agents easy to understand even for non-experts. Specifically, a self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks that highlight task-relevant information, which constitutes most of the evidence for the agent's decisions. We verify and evaluate our method on several Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment. The results show that our method renders empirical evidence about how the agent makes decisions and why it performs well or badly, especially when transferred to novel scenes. Overall, our method provides valuable insight into the internal decision-making process of vision-based RL. In addition, our method does not use any external labelled data, demonstrating that high-quality masks can be learned in a self-supervised manner, which may shed light on new paradigms for label-free vision learning such as self-supervised segmentation and detection.
16
Tavanapong W, Oh J, Riegler MA, Khaleel M, Mittal B, de Groen PC. Artificial Intelligence for Colonoscopy: Past, Present, and Future. IEEE J Biomed Health Inform 2022;26:3950-3965. [PMID: 35316197] [PMCID: PMC9478992] [DOI: 10.1109/jbhi.2022.3160098]
Abstract
During the past decades, many automated image analysis methods have been developed for colonoscopy. Real-time implementation of the most promising methods during colonoscopy has been tested in clinical trials, including several recent multi-center studies. All trials have shown results that may contribute to prevention of colorectal cancer. We summarize the past and present development of colonoscopy video analysis methods, focusing on two categories of artificial intelligence (AI) technologies used in clinical trials. These are (1) analysis and feedback for improving colonoscopy quality and (2) detection of abnormalities. Our survey includes methods that use traditional machine learning algorithms on carefully designed hand-crafted features as well as recent deep-learning methods. Lastly, we present the gap between current state-of-the-art technology and desirable clinical features and conclude with future directions of endoscopic AI technology development that will bridge the current gap.
17
Michel A, Jha SK, Ewetz R. A survey on the vulnerability of deep neural networks against adversarial attacks. Prog Artif Intell 2022. [DOI: 10.1007/s13748-021-00269-9]
18
Zhang Q, Wang X, Cao R, Wu YN, Shi F, Zhu SC. Extraction of an Explanatory Graph to Interpret a CNN. IEEE Trans Pattern Anal Mach Intell 2021;43:3863-3877. [PMID: 32386138] [DOI: 10.1109/tpami.2020.2992207]
Abstract
This paper introduces an explanatory graph representation that reveals the object parts encoded inside the convolutional layers of a CNN. In a pre-trained CNN, each filter in a conv-layer usually represents a mixture of object parts. We develop a simple yet effective method to learn an explanatory graph, which automatically disentangles the object parts from each filter without any part annotations. Specifically, given the feature map of a filter, we mine neural activations from the feature map that correspond to different object parts. The explanatory graph organizes each mined part as a graph node; each edge connects two nodes whose corresponding object parts usually co-activate and keep a stable spatial relationship. Experiments show that each graph node consistently represented the same object part across different images, which boosted the transferability of CNN features. The explanatory graph transferred features of object parts to the task of part localization, and our method significantly outperformed other approaches.
19
Morilla I. Repairing the human with artificial intelligence in oncology. Artif Intell Cancer 2021;2:60-68. [DOI: 10.35713/aic.v2.i5.60]
Abstract
Artificial intelligence is a groundbreaking tool for learning and analysing higher-order features extracted from any dataset at large scale. This ability makes it ideal for facing the complex problems that arise in the biomedical domain in general, and in oncology in particular. In this work, we aim to provide a global vision of the outgrowth of this mathematical discipline by linking it to related subdomains such as transfer, reinforcement, and federated learning. In addition, we introduce the recently popular method of topological data analysis, which improves the performance of learning models.
Affiliation(s)
- Ian Morilla: Laboratoire Analyse, Géométrie et Applications - Institut Galilée, Sorbonne Paris Nord University, Paris 75006, France
20
Classification of Explainable Artificial Intelligence Methods through Their Output Formats. Mach Learn Knowl Extr 2021. [DOI: 10.3390/make3030032]
Abstract
Machine and deep learning have proven their utility for generating data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret, and consequently scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension: the output format. The reviewed scientific papers were retrieved through an initial search on Google Scholar with the keywords "explainable artificial intelligence", "explainable machine learning", and "interpretable machine learning", followed by an iterative search through the bibliographies of these articles. The addition of the explanation-format dimension makes the proposed classification system a practical tool for scholars, supporting them in selecting the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, existing XAI methods provide several solutions to requirements that differ considerably across users, problems, and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, hence the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the explanation formats and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields and by new regulations.
21
Dutta A, Singh KK, Anand A. SpliceViNCI: Visualizing the splicing of non-canonical introns through recurrent neural networks. J Bioinform Comput Biol 2021;19:2150014. [PMID: 34088258] [DOI: 10.1142/s0219720021500141]
Abstract
Most current computational models for splice junction prediction are based on the identification of canonical splice junctions. However, junctions lacking the consensus dimers GT and AG also undergo splicing. Identifying such junctions, called non-canonical splice junctions, is also essential for a comprehensive understanding of the splicing phenomenon. This work focuses on the identification of non-canonical splice junctions through the application of a bidirectional long short-term memory (BLSTM) network. Furthermore, we apply a back-propagation-based (integrated gradients) and a perturbation-based (occlusion) visualization technique to extract the non-canonical splicing features learned by the model. The extracted features are validated against existing knowledge from the literature. Integrated gradients extracts features comprising contiguous nucleotides, whereas occlusion extracts features that are individual nucleotides distributed across the sequence.
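Of the two visualization techniques, occlusion is the simpler to sketch: each position (or window) of the input is masked in turn, and the drop in the model's score indicates how much the prediction relied on that position. The minimal sketch below substitutes a hypothetical scoring function for the trained BLSTM; the `N` masking symbol and the window size are assumptions, not details from the paper:

```python
def occlusion_importance(score, sequence, mask_char="N", window=1):
    """Per-position importance: drop in the model's score when a
    window starting at that position is replaced by a mask symbol."""
    base = score(sequence)
    importances = []
    for start in range(len(sequence) - window + 1):
        occluded = (sequence[:start] + mask_char * window
                    + sequence[start + window:])
        importances.append(base - score(occluded))
    return importances

# Hypothetical stand-in for the trained BLSTM: scores a sequence by
# whether it contains the canonical donor dimer "GT".
def toy_score(seq):
    return 1.0 if "GT" in seq else 0.0

imp = occlusion_importance(toy_score, "AAGTCC")
# Masking position 2 or 3 destroys the "GT" dimer, so those two
# positions carry all of the importance.
```

Because each position is perturbed independently, occlusion naturally surfaces isolated influential nucleotides, which matches the abstract's observation that it extracts individually distributed nucleotides rather than contiguous motifs.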
Affiliation(s)
- Aparajita Dutta: Department of CSE, Indian Institute of Technology, Guwahati, India
- Ashish Anand: Department of CSE, Indian Institute of Technology, Guwahati, India
22
Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00343-w]
23
Hu L, Chen J, Vaughan J, Aramideh S, Yang H, Wang K, Sudjianto A, Nair VN. Supervised Machine Learning Techniques: An Overview with Applications to Banking. Int Stat Rev 2021. [DOI: 10.1111/insr.12448]
Affiliation(s)
- Linwei Hu: Wells Fargo & Company, Charlotte, North Carolina, USA
- Jie Chen: Wells Fargo & Company, Charlotte, North Carolina, USA
- Joel Vaughan: Wells Fargo & Company, Charlotte, North Carolina, USA
- Hanyu Yang: Wells Fargo & Company, Charlotte, North Carolina, USA
- Kelly Wang: Wells Fargo & Company, Charlotte, North Carolina, USA
24
Few-shot learning creates predictive models of drug response that translate from high-throughput screens to individual patients. Nat Cancer 2021;2:233-244. [PMID: 34223192] [DOI: 10.1038/s43018-020-00169-2]
Abstract
Cell-line screens create expansive datasets for learning predictive markers of drug response, but models built on them do not readily translate to the clinic, with its diverse contexts and limited data. In the present study, we apply a recently developed technique, few-shot machine learning, to train a versatile neural network model on cell lines that can be tuned to new contexts using few additional samples. The model adapts quickly when switching among different tissue types and when moving from cell-line models to clinical contexts, including patient-derived tumor cells and patient-derived xenografts. It can also be interpreted to identify the molecular features most important to a drug response, highlighting critical roles for RB1 and SMAD4 in the response to CDK inhibition and for RNF8 and CHD4 in the response to ATM inhibition. The few-shot learning framework provides a bridge from the many samples surveyed in high-throughput screens (n-of-many) to the distinctive contexts of individual patients (n-of-one).
25
Explaining the black-box model: A survey of local interpretation methods for deep neural networks. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.08.011]
26
Apicella A, Isgrò F, Prevete R, Tamburrini G. Middle-Level Features for the Explanation of Classification Systems by Sparse Dictionary Methods. Int J Neural Syst 2020;30:2050040. [PMID: 32727317] [DOI: 10.1142/s0129065720500409]
Abstract
Machine learning (ML) systems are affected by a pervasive lack of transparency. The eXplainable Artificial Intelligence (XAI) research area addresses this problem and the related issue of explaining the behavior of ML systems in terms that are understandable to human beings. In many XAI approaches, the outputs of ML systems are explained in terms of low-level features of their inputs. However, these approaches leave a substantial explanatory burden on human users, who are required to map low-level properties onto more salient and readily understandable parts of the input. To alleviate this cognitive burden, an alternative model-agnostic framework is proposed here. The framework is instantiated to address explanation problems in the context of ML image classification systems, without relying on pixel relevance maps and other low-level features of the input. More specifically, sparse dictionary learning techniques are applied to obtain sets of perceptually salient middle-level properties of the classification inputs, and these middle-level properties are used as building blocks for explanations of image classifications. The resulting explanations are parsimonious, because they rely on a limited set of middle-level image properties, and they can be contrastive, because the set of middle-level image properties can be used to explain why the system advanced the proposed classification over antagonist classifications. In view of its model-agnostic character, the proposed framework is adaptable to a variety of other ML systems and explanation problems.
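At the heart of this approach is sparse coding: each input is approximated as a sparse combination of a few dictionary atoms, and those atoms become the middle-level building blocks of the explanation. A minimal sketch of the sparse-coding step, using plain matching pursuit over a fixed toy dictionary (the paper's actual dictionary-learning procedure and image features are not reproduced here; the dictionary `D` and input `x` are hypothetical), is:

```python
import numpy as np

def matching_pursuit(x, dictionary, n_atoms):
    """Greedy sparse coding: approximate x as a combination of at most
    n_atoms columns of `dictionary` (columns assumed unit-norm).
    Returns the sparse coefficient vector."""
    residual = x.astype(float).copy()
    coef = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        projections = dictionary.T @ residual
        k = int(np.argmax(np.abs(projections)))
        coef[k] += projections[k]
        residual = residual - projections[k] * dictionary[:, k]
    return coef

# Toy dictionary: 4 unit-norm atoms in R^3. In the paper's setting the
# atoms would be perceptually salient middle-level image properties.
D = np.array([[1.0, 0.0, 0.0, 0.6],
              [0.0, 1.0, 0.0, 0.8],
              [0.0, 0.0, 1.0, 0.0]])
x = np.array([2.0, 0.0, 3.0])
code = matching_pursuit(x, D, n_atoms=2)
# The non-zero coefficients identify the few atoms that "explain" x.
```

The sparse code is what makes the explanation parsimonious: only the handful of atoms with non-zero coefficients need to be shown to the user.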
Affiliation(s)
- A. Apicella: Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, 80125 Napoli, Italy
- F. Isgrò: Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, 80125 Napoli, Italy
- R. Prevete: Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, 80125 Napoli, Italy
- G. Tamburrini: Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, 80125 Napoli, Italy
27
Wataya T, Nakanishi K, Suzuki Y, Kido S, Tomiyama N. Introduction to deep learning: minimum essence required to launch a research. Jpn J Radiol 2020;38:907-921. [DOI: 10.1007/s11604-020-00998-2]
28
Abstract
Fatal accidents are a major issue hindering the wide acceptance of safety-critical systems that employ machine learning and deep learning models, such as automated driving vehicles. To use machine learning in a safety-critical system, it is necessary to demonstrate the safety and security of the system through engineering processes, but no widely accepted engineering concepts or frameworks have yet been established for such systems. The key to using a machine learning model in a deductively engineered system is to decompose the data-driven training of the model into requirement, design, and verification, particularly for models used in safety-critical systems. At the same time, the open problems and the relevant technical fields are not organized in a manner that enables researchers to select a theme and work on it. In this study, we identify, classify, and explore the open problems in engineering (safety-critical) machine learning systems, that is, in the requirement, design, and verification of machine learning models and systems, and we discuss related work and research directions, using automated driving vehicles as an example. Our results show that machine learning models are characterized by a lack of requirements specification, a lack of design specification, a lack of interpretability, and a lack of robustness. We also perform a gap analysis between the conventional system quality standard SQuaRE and the characteristics of machine learning models in order to study quality models for machine learning systems, and we find that the lack of requirements specification and the lack of robustness have the greatest impact on conventional quality models.
30
Geete K, Pandey M. A noise-based stabilizer for convolutional neural networks. J Stat Comput Simul 2019. [DOI: 10.1080/00949655.2019.1610883]
Affiliation(s)
- Kanu Geete: Maulana Azad National Institute of Technology, Bhopal, MP, India
- Manish Pandey: Maulana Azad National Institute of Technology, Bhopal, MP, India
31
Kuwajima H, Tanaka M, Okutomi M. Improving transparency of deep neural inference process. Prog Artif Intell 2019. [DOI: 10.1007/s13748-019-00179-x]