1
Metsch JM, Hauschild AC. BenchXAI: Comprehensive benchmarking of post-hoc explainable AI methods on multi-modal biomedical data. Comput Biol Med 2025; 191:110124. [PMID: 40239236 DOI: 10.1016/j.compbiomed.2025.110124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2024] [Revised: 02/13/2025] [Accepted: 03/31/2025] [Indexed: 04/18/2025]
Abstract
The increasing digitalization of multi-modal data in medicine and novel artificial intelligence (AI) algorithms open up a large number of opportunities for predictive models. In particular, deep learning models show great performance in the medical field. A major limitation of such powerful but complex models originates from their 'black-box' nature. Recently, a variety of explainable AI (XAI) methods have been introduced to address this lack of transparency and trust in medical AI. However, the majority of such methods have solely been evaluated on single data modalities. Meanwhile, with the increasing number of XAI methods, integrative XAI frameworks and benchmarks are essential to compare their performance on different tasks. For that reason, we developed BenchXAI, a novel XAI benchmarking package supporting comprehensive evaluation of fifteen XAI methods, investigating their robustness, suitability, and limitations in biomedical data. We employed BenchXAI to validate these methods in three common biomedical tasks, namely clinical data, medical image and signal data, and biomolecular data. Our newly designed sample-wise normalization approach for post-hoc XAI methods enables the statistical evaluation and visualization of performance and robustness. We found that the XAI methods Integrated Gradients, DeepLift, DeepLiftShap, and GradientShap performed well over all three tasks, while methods like Deconvolution, Guided Backpropagation, and LRP-α1-β0 struggled for some tasks. With acts such as the EU AI Act, the application of XAI in the biomedical domain becomes more and more essential. Our evaluation study represents a first step towards verifying the suitability of different XAI methods for various medical domains.
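The abstract does not state how the sample-wise normalization is computed. As a rough illustration of the general idea only, the sketch below assumes one plausible form: each sample's attribution vector is divided by its own largest absolute value, so that scores from different methods and samples fall in [-1, 1] and can be compared. The function name `normalize_samplewise` and the toy values are hypothetical and are not taken from BenchXAI.

```python
def normalize_samplewise(attributions):
    """Scale each sample's attribution vector by its largest absolute value.

    Assumption for illustration: max-abs scaling per sample. The paper's
    actual normalization may differ.
    """
    normalized = []
    for row in attributions:
        scale = max((abs(v) for v in row), default=0.0)
        if scale == 0.0:
            # An all-zero (or empty) row has nothing to rescale.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([v / scale for v in row])
    return normalized


# Two samples whose raw attribution scores live on very different scales.
raw = [[0.2, -0.4, 0.1], [30.0, 10.0, -60.0]]
scaled = normalize_samplewise(raw)
print(scaled)
```

After scaling, the largest-magnitude feature of every sample has absolute value 1, which is what makes per-feature statistics across samples meaningful under this assumption.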
Affiliation(s)
- Anne-Christin Hauschild
- Institute for Medical Informatics, University Medical Center Göttingen, Germany; Institute for Predictive Deep Learning in Medicine and Healthcare, Justus-Liebig University, Gießen, Germany
2
Predicting the Kidney Graft Survival Using Optimized African Buffalo-Based Artificial Neural Network. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:6503714. [PMID: 35607394 PMCID: PMC9124117 DOI: 10.1155/2022/6503714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Revised: 03/31/2022] [Accepted: 04/11/2022] [Indexed: 12/02/2022]
Abstract
A variety of recipient and donor characteristics influence long- and short-term kidney graft survival. It is critical to predict the effectiveness of kidney transplantation to optimise organ allocation. This would allow patients to choose the best accessible kidney donor and the optimal immunosuppressive medication. Several studies have attempted to identify factors that predispose to graft rejection, but the results have been contradictory. As a result, the goal of this paper is to use the African buffalo-based artificial neural network (AB-ANN) approach to uncover predictive risk variables related to kidney graft. These two feature selection approaches combine to provide a novel hybrid feature selection technique that could select the most important elements to improve prediction accuracy. The feature analysis revealed that clinical features have varied effects on transplant survival. The collected data is processed in both training and testing methods. The prediction model's performance, in terms of accuracy, precision, recall, and F-measure, was examined, and the results were compared with those of other existing systems, including naive Bayesian, random forest, and J48 classifier. The results suggest that the proposed approach can forecast graft survival in kidney recipients' next visits in a creative manner and with more accuracy compared with other classifiers. This proposed method is more efficient for predicting kidney graft survival. Incorporating those clinical tools into outpatient clinics' everyday workflows could help physicians make better and more personalised decisions.
3
Abdollahi J, Nouri-Moghaddam B. A hybrid method for heart disease diagnosis utilizing feature selection based ensemble classifier model generation. IRAN JOURNAL OF COMPUTER SCIENCE 2022; 5:229-246. [PMCID: PMC9081959 DOI: 10.1007/s42044-022-00104-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/03/2021] [Accepted: 04/19/2022] [Indexed: 09/29/2023]
Abstract
Heart disease is one of the most complicated diseases, and it affects a large number of individuals throughout the world. In healthcare, particularly cardiology, early and accurate detection of cardiac disease is critical. The Heart Disease Data Set-UCI repository collects data on heart disease. The search space and complexity of the classification models are increased by this raw dataset, which contains redundant and inconsistent data. We need to eliminate the redundant and unnecessary elements from the data to improve classification accuracy. As a consequence, feature selection approaches might be useful for reducing the cost of diagnosis by identifying the most important qualities. This research developed an ensemble classification model based on a feature selection approach in which selected features play a role in classification. Accordingly, a classification approach was introduced using ensemble learning with a genetic algorithm, feature selection, and biomedical test values to diagnose heart disease. Based on the results, it is deduced that the benefits of using the feature selection method vary depending on the utilized machine learning technique. However, the best-proposed model based on the combination of genetic algorithm and the ensemble learning model has achieved an accuracy of 97.57% on the considered datasets. The suggested diagnosis system achieved better accuracy than previously proposed methods and can easily be implemented in healthcare to identify heart disease.
Affiliation(s)
- Jafar Abdollahi
- Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
- Babak Nouri-Moghaddam
- Department of Computer Engineering, Ardabil Branch, Islamic Azad University, Ardabil, Iran
4
Zhenya Q, Zhang Z. A hybrid cost-sensitive ensemble for heart disease prediction. BMC Med Inform Decis Mak 2021; 21:73. [PMID: 33632225 PMCID: PMC7905907 DOI: 10.1186/s12911-021-01436-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/04/2020] [Accepted: 02/11/2021] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Heart disease is the primary cause of morbidity and mortality in the world. It includes numerous problems and symptoms. The diagnosis of heart disease is difficult because there are too many factors to analyze. What's more, the misclassification cost could be very high. METHODS A cost-sensitive ensemble method was proposed to improve the efficiency of diagnosis and reduce the misclassification cost. The proposed method contains five heterogeneous classifiers: random forest, logistic regression, support vector machine, extreme learning machine and k-nearest neighbor. T-test was used to investigate if the performance of the ensemble was better than individual classifiers and the contribution of Relief algorithm. RESULTS The best performance was achieved by the proposed method according to ten-fold cross validation. The statistical tests demonstrated that the performance of the proposed ensemble was significantly superior to individual classifiers, and the efficiency of classification was distinctively improved by Relief algorithm. CONCLUSIONS The proposed ensemble gained significantly better results compared with individual classifiers and previous studies, which implies that it can be used as a promising alternative tool in medical decision making for heart disease diagnosis.
Affiliation(s)
- Qi Zhenya
- College of Management and Economics, Tianjin University, Nankai District, Tianjin, 300072 People’s Republic of China
- Zuoru Zhang
- School of Mathematical Science, Hebei Normal University, Yuhua District, Shijiazhuang, 050024 People’s Republic of China
5
Inan O, Uzer MS. A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2021. [DOI: 10.1007/s13369-020-04972-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/23/2022]
6
Physician-Friendly Machine Learning: A Case Study with Cardiovascular Disease Risk Prediction. J Clin Med 2019; 8:jcm8071050. [PMID: 31323843 PMCID: PMC6678298 DOI: 10.3390/jcm8071050] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/18/2019] [Revised: 07/14/2019] [Accepted: 07/15/2019] [Indexed: 11/29/2022] Open
Abstract
Machine learning is often perceived as a sophisticated technology accessible only by highly trained experts. This prevents many physicians and biologists from using this tool in their research. The goal of this paper is to eliminate this out-dated perception. We argue that the recent development of auto machine learning techniques enables biomedical researchers to quickly build competitive machine learning classifiers without requiring in-depth knowledge about the underlying algorithms. We study the case of predicting the risk of cardiovascular diseases. To support our claim, we compare auto machine learning techniques against a graduate student using several important metrics, including the total amounts of time required for building machine learning models and the final classification accuracies on unseen test datasets. In particular, the graduate student manually builds multiple machine learning classifiers and tunes their parameters for one month using scikit-learn library, which is a popular machine learning library to obtain ones that perform best on two given, publicly available datasets. We run an auto machine learning library called auto-sklearn on the same datasets. Our experiments find that automatic machine learning takes 1 h to produce classifiers that perform better than the ones built by the graduate student in one month. More importantly, building this classifier only requires a few lines of standard code. Our findings are expected to change the way physicians see machine learning and encourage wide adoption of Artificial Intelligence (AI) techniques in clinical domains.
7
Polato M, Aiolli F. Boolean kernels for rule based interpretation of support vector machines. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.11.094] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/26/2022]
8
Săcară AM, Indolean C, Cristea VM, Mureşan LM. Application of adaptive neuro-fuzzy interference system on biosorption of malachite green using fir (Abies nordmanniana) cones biomass. CHEM ENG COMMUN 2019. [DOI: 10.1080/00986445.2018.1555531] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
Affiliation(s)
- Ana Maria Săcară
- Department of Chemical Engineering, Faculty of Chemistry and Chemical Engineering, Babeş-Bolyai University, Cluj-Napoca, Romania
- Cerasella Indolean
- Department of Chemical Engineering, Faculty of Chemistry and Chemical Engineering, Babeş-Bolyai University, Cluj-Napoca, Romania
- Vasile-Mircea Cristea
- Department of Chemical Engineering, Faculty of Chemistry and Chemical Engineering, Babeş-Bolyai University, Cluj-Napoca, Romania
- Liana Maria Mureşan
- Department of Chemical Engineering, Faculty of Chemistry and Chemical Engineering, Babeş-Bolyai University, Cluj-Napoca, Romania
9
10
A Comparison Study on Rule Extraction from Neural Network Ensembles, Boosted Shallow Trees, and SVMs. APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING 2018. [DOI: 10.1155/2018/4084850] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
One way to make the knowledge stored in an artificial neural network more intelligible is to extract symbolic rules. However, producing rules from Multilayer Perceptrons (MLPs) is an NP-hard problem. Many techniques have been introduced to generate rules from single neural networks, but very few were proposed for ensembles. Moreover, experiments were rarely assessed by 10-fold cross-validation trials. In this work, based on the Discretized Interpretable Multilayer Perceptron (DIMLP), experiments were performed on 10 repetitions of stratified 10-fold cross-validation trials over 25 binary classification problems. The DIMLP architecture allowed us to produce rules from DIMLP ensembles, boosted shallow trees (BSTs), and Support Vector Machines (SVM). The complexity of rulesets was measured with the average number of generated rules and average number of antecedents per rule. From the 25 used classification problems, the most complex rulesets were generated from BSTs trained by “gentle boosting” and “real boosting.” Moreover, we clearly observed that the less complex the rules were, the better their fidelity was. In fact, rules generated from decision stumps trained by modest boosting were, for almost all the 25 datasets, the simplest with the highest fidelity. Finally, in terms of average predictive accuracy and average ruleset complexity, the comparison of some of our results to those reported in the literature proved to be competitive.
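The evaluation protocol named in this abstract, 10 repetitions of stratified 10-fold cross-validation, can be sketched in plain Python as follows. This is a minimal illustration of the splitting scheme only, not the authors' code; the function name `repeated_stratified_kfold`, the seed, and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, test_indices), keeping class proportions per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    for r in range(repeats):
        folds = [[] for _ in range(k)]
        offset = 0
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)  # a fresh shuffle for every repetition
            # Deal indices round-robin so each fold gets a near-equal share
            # of this class; `offset` keeps overall fold sizes balanced.
            for pos, idx in enumerate(shuffled):
                folds[(offset + pos) % k].append(idx)
            offset += len(shuffled)
        for f, test in enumerate(folds):
            yield r, f, sorted(test)


# Toy binary problem: 60 negatives and 40 positives.
labels = [0] * 60 + [1] * 40
splits = list(repeated_stratified_kfold(labels))
print(len(splits))  # 10 repetitions x 10 folds
```

Within each repetition every sample lands in exactly one test fold, and the training set for a fold is simply the complement of its test indices.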
11
Shinde S, Kulkarni U. Extended fuzzy hyperline-segment neural network with classification rule extraction. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.03.036] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022]
12
Alkhasawneh MS, Tay LT. A Hybrid Intelligent System Integrating the Cascade Forward Neural Network with Elman Neural Network. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2017. [DOI: 10.1007/s13369-017-2833-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
13
Paul AK, Shill PC, Rabin MRI, Murase K. Adaptive weighted fuzzy rule-based system for the risk level assessment of heart disease. APPL INTELL 2017. [DOI: 10.1007/s10489-017-1037-6] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 10/18/2022]
14
Synergy effects between grafting and subdivision in Re-RX with J48graft for the diagnosis of thyroid disease. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.06.011] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/20/2022]
15
Bologna G, Hayashi Y. Characterization of Symbolic Rules Embedded in Deep DIMLP Networks: A Challenge to Transparency of Deep Learning. JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH 2017. [DOI: 10.1515/jaiscr-2017-0019] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Indexed: 11/15/2022] Open
Abstract
Rule extraction from neural networks is an active research topic. In the last 20 years many authors have presented techniques showing how to extract symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few were related to ensembles of neural networks and even fewer to networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLP), and DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state of the art results. Finally, in the last classification problem on digit recognition, generated rules from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity in terms of number of generated rules and number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we showed that it is possible to determine the feature detectors created by neural networks and also that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.
Affiliation(s)
- Guido Bologna
- Department of Computer Science, University of Applied Science of Western Switzerland, Rue de la Prairie 4, Geneva 1202, Switzerland
- Yoichi Hayashi
- Department of Computer Science, Meiji University, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
16
An enhanced fuzzy min–max neural network with ant colony optimization based-rule-extractor for decision making. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.02.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
17
Liu X, Wang X, Su Q, Zhang M, Zhu Y, Wang Q, Wang Q. A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2017; 2017:8272091. [PMID: 28127385 PMCID: PMC5239990 DOI: 10.1155/2017/8272091] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 03/19/2016] [Revised: 07/12/2016] [Accepted: 08/01/2016] [Indexed: 11/17/2022]
Abstract
Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques.
Affiliation(s)
- Xiao Liu
- School of Economics and Management, Tongji University, Shanghai, China
- Xiaoli Wang
- School of Economics and Management, Tongji University, Shanghai, China
- Qiang Su
- School of Economics and Management, Tongji University, Shanghai, China
- Mo Zhang
- School of Economics and Management, Shanghai Maritime University, Shanghai, China
- Yanhong Zhu
- Department of Scientific Research, Shanghai General Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
- Qiugen Wang
- Trauma Center, Shanghai General Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
- Qian Wang
- Trauma Center, Shanghai General Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
18
Bimba AT, Idris N, Al-Hunaiyyan A, Mahmud RB, Abdelaziz A, Khan S, Chang V. Towards knowledge modeling and manipulation technologies: A survey. INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT 2016. [DOI: 10.1016/j.ijinfomgt.2016.05.022] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022]
19
Peker M. A decision support system to improve medical diagnosis using a combination of k-medoids clustering based attribute weighting and SVM. J Med Syst 2016; 40:116. [DOI: 10.1007/s10916-016-0477-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 07/13/2015] [Accepted: 03/15/2016] [Indexed: 11/28/2022]
20
Shinde S, Kulkarni U. Extracting classification rules from modified fuzzy min–max neural network for data with mixed attributes. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2015.10.032] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
21
22
Use of the recursive-rule extraction algorithm with continuous attributes to improve diagnostic accuracy in thyroid disease. INFORMATICS IN MEDICINE UNLOCKED 2015. [DOI: 10.1016/j.imu.2015.12.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 12/22/2022] Open
23
Use of a Recursive-Rule eXtraction algorithm with J48graft to achieve highly accurate and concise rule extraction from a large breast cancer dataset. INFORMATICS IN MEDICINE UNLOCKED 2015. [DOI: 10.1016/j.imu.2015.12.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 01/16/2023] Open
24
Yilmaz N, Inan O, Uzer MS. A New Data Preparation Method Based on Clustering Algorithms for Diagnosis Systems of Heart and Diabetes Diseases. J Med Syst 2014; 38:48. [DOI: 10.1007/s10916-014-0048-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 10/23/2013] [Accepted: 03/27/2014] [Indexed: 12/20/2022]
25
Neuro-Fuzzy System Based Kernel for Classification with Support Vector Machines. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING 2014. [DOI: 10.1007/978-3-319-02309-0_45] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/09/2023]
26
Jemai O, Zaied M, Ben Amar C, Alimi MA. Fast learning algorithm of wavelet network based on fast wavelet transform. INT J PATTERN RECOGN 2012. [DOI: 10.1142/s0218001411009111] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Indexed: 11/18/2022]
Abstract
In this paper, a novel learning algorithm for wavelet networks based on the Fast Wavelet Transform (FWT) is proposed. It has several advantages over other algorithms. In previous work, the weights from the hidden layer to the output layer were determined either by applying the backpropagation algorithm or by a direct solution that requires a matrix inversion, which can be computationally intensive when the learning data set is large. The new algorithm instead computes the connection weights by iterative application of the FWT. Furthermore, we have extended the novel learning algorithm by using the Levenberg–Marquardt method to optimize the learning functions. The experimental results demonstrate that our model outperforms some previously established models in terms of both speed and efficiency.
Affiliation(s)
- Olfa Jemai
- Higher Institute of Computer Sciences, Road El Jorf, Km 22.5 — 4119 Medenine, Tunisia
- REGIM: REsearch Group on Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENIS), BP 1173, Sfax, 3038, Tunisia
- Mourad Zaied
- REGIM: REsearch Group on Intelligent Machines, National Engineering School of Sfax (ENIS), Tunisia
- Chokri Ben Amar
- REGIM: REsearch Group on Intelligent Machines, National Engineering School of Sfax (ENIS), Tunisia
- Mohamed Adel Alimi
- REGIM: REsearch Group on Intelligent Machines, National Engineering School of Sfax (ENIS), Tunisia
27
Blachnik M, Duch W. LVQ algorithm with instance weighting for generation of prototype-based rules. Neural Netw 2011; 24:824-30. [PMID: 21726977 DOI: 10.1016/j.neunet.2011.05.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 11/11/2010] [Revised: 04/27/2011] [Accepted: 05/26/2011] [Indexed: 10/18/2022]
Abstract
Crisp and fuzzy-logic rules are used for comprehensible representation of data, but rules based on similarity to prototypes are equally useful and much less known. Similarity-based methods belong to the most accurate data mining approaches. A large group of such methods is based on instance selection and optimization, with the Learning Vector Quantization (LVQ) algorithm being a prominent example. Accuracy of LVQ depends highly on proper initialization of prototypes and the optimization mechanism. This paper introduces prototype initialization based on context dependent clustering and modification of the LVQ cost function that utilizes additional information about class-dependent distribution of training vectors. This approach is illustrated on several benchmark datasets, finding simple and accurate models of data in the form of prototype-based rules.
Affiliation(s)
- Marcin Blachnik
- Department of Management and Informatics, Silesian University of Technology, Katowice, Krasinskiego 8, Poland.
28
Castro F, Nebot À, Mugica F. On the extraction of decision support rules from fuzzy predictive models. Appl Soft Comput 2011. [DOI: 10.1016/j.asoc.2011.01.018] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]
29
Huynh TQ, Reggia JA. Guiding hidden layer representations for improved rule extraction from neural networks. IEEE Trans Neural Netw 2010; 22:264-75. [PMID: 21138801 DOI: 10.1109/tnn.2010.2094205] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
The production of relatively large and opaque weight matrices by error backpropagation learning has inspired substantial research on how to extract symbolic human-readable rules from trained networks. While considerable progress has been made, the results at present are still relatively limited, in part due to the large numbers of symbolic rules that can be generated. Most past work to address this issue has focused on progressively more powerful methods for rule extraction (RE) that try to minimize the number of weights and/or improve rule expressiveness. In contrast, here we take a different approach in which we modify the error backpropagation training process so that it learns a different hidden layer representation of input patterns than would normally occur. Using five publicly available datasets, we show via computational experiments that the modified learning method helps to extract fewer rules without increasing individual rule complexity and without decreasing classification accuracy. We conclude that modifying error backpropagation so that it more effectively separates learned pattern encodings in the hidden layer is an effective way to improve contemporary RE methods.
Affiliation(s)
- Thuan Q Huynh
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA.
30
Rivero D, Dorado J, Rabuñal J, Pazos A. Generation and simplification of Artificial Neural Networks by means of Genetic Programming. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2010.05.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 01/17/2023]
31
Al-Batah MS, Mat Isa NA, Zamli KZ, Azizli KA. Modified Recursive Least Squares algorithm to train the Hybrid Multilayered Perceptron (HMLP) network. Appl Soft Comput 2010. [DOI: 10.1016/j.asoc.2009.06.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 10/20/2022]
32
33
34
Odajima K, Hayashi Y, Tianxia G, Setiono R. Greedy rule generation from discrete data and its use in neural network rule extraction. Neural Netw 2008; 21:1020-8. [DOI: 10.1016/j.neunet.2008.01.003] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 03/08/2007] [Revised: 01/25/2008] [Accepted: 01/25/2008] [Indexed: 11/29/2022]
35
Automatic Design of ANNs by Means of GP for Data Mining Tasks: Iris Flower Classification Problem. ADAPTIVE AND NATURAL COMPUTING ALGORITHMS 2007. [DOI: 10.1007/978-3-540-71618-1_31] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/12/2023]
36
Duch W. Towards Comprehensive Foundations of Computational Intelligence. CHALLENGES FOR COMPUTATIONAL INTELLIGENCE 2007. [DOI: 10.1007/978-3-540-71984-7_11] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 02/11/2023]
37
38
Abdel-Aal RE. GMDH-based feature ranking and selection for improved classification of medical data. J Biomed Inform 2005; 38:456-68. [PMID: 16337569 DOI: 10.1016/j.jbi.2005.03.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Received: 02/07/2005] [Revised: 03/29/2005] [Accepted: 03/30/2005] [Indexed: 11/17/2022]
Abstract
Medical applications are often characterized by a large number of disease markers and a relatively small number of data records. We demonstrate that complete feature ranking followed by selection can lead to appreciable reductions in data dimensionality, with significant improvements in the implementation and performance of classifiers for medical diagnosis. We describe a novel approach for ranking all features according to their predictive quality using properties unique to learning algorithms based on the group method of data handling (GMDH). An abductive network training algorithm is repeatedly used to select groups of optimum predictors from the feature set at gradually increasing levels of model complexity specified by the user. Groups selected earlier are better predictors. The process is then repeated to rank features within individual groups. The resulting full feature ranking can be used to determine the optimum feature subset by starting at the top of the list and progressively including more features until the classification error rate on an out-of-sample evaluation set starts to increase due to overfitting. The approach is demonstrated on two medical diagnosis datasets (breast cancer and heart disease) and comparisons are made with other feature ranking and selection methods. Receiver operating characteristics (ROC) analysis is used to compare classifier performance. At default model complexity, dimensionality reduction of 22 and 54% could be achieved for the breast cancer and heart disease data, respectively, leading to improvements in the overall classification performance. For both datasets, considerable dimensionality reduction introduced no significant reduction in the area under the ROC curve. GMDH-based feature selection results have also proved effective with neural network classifiers.
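The subset-selection step described in this abstract (walk down the ranked list, adding features until the out-of-sample error starts to rise) can be sketched as follows. This is a minimal illustration and not the GMDH-based ranking itself: the ranking is assumed to be given, and the function name `select_prefix`, the feature names, and the error values are hypothetical.

```python
from typing import Callable, Sequence


def select_prefix(ranked: Sequence[str],
                  error_of: Callable[[Sequence[str]], float]) -> list[str]:
    """Return the longest ranked prefix reached before the error first increases.

    `ranked` is assumed to be ordered best-first. `error_of` is a hypothetical
    callback that trains a classifier on the given features and returns its
    error rate on an out-of-sample evaluation set.
    """
    if not ranked:
        return []
    best = [ranked[0]]
    best_err = error_of(best)
    for k in range(2, len(ranked) + 1):
        candidate = list(ranked[:k])
        err = error_of(candidate)
        if err > best_err:
            # Error starts to increase: treat as the onset of overfitting.
            break
        best, best_err = candidate, err
    return best


# Toy usage with made-up error rates indexed by subset size.
toy_errors = {1: 0.30, 2: 0.22, 3: 0.18, 4: 0.21, 5: 0.19}
chosen = select_prefix(["f1", "f2", "f3", "f4", "f5"],
                       lambda feats: toy_errors[len(feats)])
print(chosen)  # ['f1', 'f2', 'f3']
```

Stopping at the first increase follows the wording of the abstract; a variant that scans the whole list and keeps the global minimum would have to evaluate every prefix.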
Affiliation(s)
- R E Abdel-Aal
- Physics Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.
|
39
|
Abdel-Aal RE. Improved classification of medical data using abductive network committees trained on different feature subsets. Comput Methods Programs Biomed 2005; 80:141-53. [PMID: 16169631 DOI: 10.1016/j.cmpb.2005.08.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 01/17/2005] [Revised: 07/30/2005] [Accepted: 08/01/2005] [Indexed: 05/04/2023]
Abstract
This paper demonstrates the use of abductive network classifier committees trained on different features for improving classification accuracy in medical diagnosis. In an earlier publication, committee members were trained on different subsets of the training set to ensure enough diversity for improved committee performance. In situations characterized by high data dimensionality, i.e. a large number of features and relatively few training examples, it may be more advantageous to split the feature set rather than the training set. We describe a novel approach for tentatively ranking the features and forming subsets of uniform predictive quality for training individual members. The abductive network training algorithm is used to select optimum predictors from the feature set at various levels of model complexity specified by the user. Using the resulting tentative ranking, the features are grouped into mutually exclusive subsets of approximately equal predictive power for training the members. The approach is demonstrated on three standard medical diagnosis datasets (breast cancer, heart disease, and diabetes). Three-member committees trained on different feature subsets and using simple output combination methods reduce classification errors by up to 20% compared to the best single model developed with the full feature set. Results are compared with those reported previously with members trained through splitting the training set. Training abductive committee members on feature subsets of approximately equal predictive power achieves both diversity and quality for improved committee performance. Ensemble feature subset selection can be performed using GMDH-based learning algorithms. The approach should be advantageous in situations characterized by high data dimensionality.
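One way to form mutually exclusive feature subsets of roughly equal predictive power from a ranked list is to deal the features out in a serpentine order. The abstract does not state how the grouping is done, so the dealing scheme, the names `split_ranked_features` and `committee_vote`, and the averaging rule below are assumptions made for illustration only.

```python
from typing import List, Sequence


def split_ranked_features(ranked: Sequence[str], n_subsets: int) -> List[List[str]]:
    """Deal ranked features into n_subsets groups in serpentine order
    (0, 1, 2, 2, 1, 0, ...) so each group mixes strong and weak features."""
    subsets: List[List[str]] = [[] for _ in range(n_subsets)]
    for i, feature in enumerate(ranked):
        round_idx, pos = divmod(i, n_subsets)
        # Reverse direction on every other pass through the groups
        idx = pos if round_idx % 2 == 0 else n_subsets - 1 - pos
        subsets[idx].append(feature)
    return subsets


def committee_vote(member_outputs: Sequence[float], cutoff: float = 0.5) -> int:
    """Simple output combination: average the members' outputs and threshold."""
    return int(sum(member_outputs) / len(member_outputs) >= cutoff)


subsets = split_ranked_features(["f1", "f2", "f3", "f4", "f5", "f6"], 3)
decision = committee_vote([0.9, 0.4, 0.6])
```

Each subset would then train one committee member, and `committee_vote` stands in for the "simple output combination methods" mentioned in the abstract.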
Affiliation(s)
- R E Abdel-Aal
- Department of Computer Engineering, King Fahd University of Petroleum and Minerals, P.O. Box 1759, KFUPM, Dhahran 31261, Saudi Arabia.
|
40
|
Abstract
The probability that a crisp logical rule applied to imprecise input data is true may be computed using a fuzzy membership function (MF). All reasonable assumptions about input uncertainty distributions lead to MFs of sigmoidal shape. Convolution of several inputs with uniform uncertainty leads to bell-shaped Gaussian-like uncertainty functions. Relations between input uncertainties and fuzzy rules are systematically explored and several new types of MFs discovered. Multilayered perceptron (MLP) networks are shown to be a particular implementation of hierarchical sets of fuzzy threshold logic rules based on sigmoidal MFs. They are equivalent to crisp logical networks applied to input data with uncertainty. Leaving fuzziness on the input side makes the networks or the rule systems easier to understand. Practical applications of these ideas are presented for analysis of questionnaire data and gene expression data.
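To make the link between input uncertainty and membership functions concrete, the sketch below computes the probability that the crisp rule "X > threshold" holds when the measured value x carries Gaussian or uniform uncertainty. The function names and parameters are hypothetical; the formulas are the standard cumulative distribution results for these two distributions rather than anything quoted from the paper.

```python
import math


def prob_rule_true_gaussian(x: float, threshold: float, sigma: float) -> float:
    """P(X > threshold) when the true input X ~ Normal(x, sigma^2).
    The result is a smooth sigmoid (erf-shaped) function of x."""
    return 0.5 * (1.0 + math.erf((x - threshold) / (sigma * math.sqrt(2.0))))


def prob_rule_true_uniform(x: float, threshold: float, half_width: float) -> float:
    """P(X > threshold) when X ~ Uniform(x - half_width, x + half_width).
    The result is a linear ramp clipped to [0, 1]."""
    p = (x + half_width - threshold) / (2.0 * half_width)
    return min(1.0, max(0.0, p))
```

Both functions rise monotonically from 0 to 1 as x crosses the threshold, which is the sigmoidal behaviour the abstract attributes to membership functions derived from input uncertainty.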
Affiliation(s)
- Włodzisław Duch
- Department of Informatics, Nicolaus Copernicus University, 87-100 Toruń, Poland.
|
41
|
Velayutham CS, Kumar S. Asymmetric subsethood-product fuzzy neural inference system (ASuPFuNIS). IEEE Trans Neural Netw 2005; 16:160-74. [PMID: 15732396 DOI: 10.1109/tnn.2004.836202] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022]
Abstract
This paper presents an asymmetric subsethood-product fuzzy neural inference system (ASuPFuNIS) that directly extends the SuPFuNIS model by permitting signal and weight fuzzy sets to be modeled by asymmetric Gaussian membership functions. The asymmetric subsethood-product network admits both numeric and linguistic inputs. Input nodes, which act as tunable feature fuzzifiers, fuzzify numeric inputs with asymmetric Gaussian fuzzy sets, while linguistic inputs are presented as is. The antecedent and consequent labels of standard fuzzy if-then rules are represented as asymmetric Gaussian fuzzy connection weights of the network. The model uses mutual subsethood-based activation spread and a product aggregation operator that works in conjunction with volume defuzzification in a gradient descent learning framework. Despite the increase in the number of free parameters, the proposed model performs better than SuPFuNIS on various benchmarking problems, in terms of both accuracy and architectural economy, and compares very well with other existing models, outperforming most of them.
Affiliation(s)
- C Shunmuga Velayutham
- Department of Physics and Computer Science, Faculty of Science, Dayalbagh Educational Institute, Dayalbagh, Agra 282005, India.
|
42
|
|
43
|
Fuzzy logic and evolutionary algorithm—two techniques in rule extraction from neural networks. Neurocomputing 2005. [DOI: 10.1016/j.neucom.2004.04.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
|
44
|
|
45
|
Rabuñal JR, Dorado J, Pazos A, Pereira J, Rivero D. A new approach to the extraction of ANN rules and to their generalization capacity through GP. Neural Comput 2004; 16:1483-523. [PMID: 15165398 DOI: 10.1162/089976604323057461] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.3] [Indexed: 11/04/2022]
Abstract
Various techniques for the extraction of ANN rules have been used, but most of them have focused on certain types of networks and their training. There are very few methods that deal with ANN rule extraction as systems that are independent of their architecture, training, and internal distribution of weights, connections, and activation functions. This article proposes a methodology for the extraction of ANN rules, regardless of their architecture, and based on genetic programming. The strategy is based on the previous algorithm and aims at achieving the generalization capacity that is characteristic of ANNs by means of symbolic rules that are understandable to human beings.
Affiliation(s)
- Juan R Rabuñal
- Department of Information and Communications Technologies, University of A Coruña, Facultad de Informática, Campus Elviña s/n, 15192 A Coruña, Spain.
|
46
|
Abstract
The inherent black-box nature of neural networks is an important drawback with respect to the problem of explanation of neural network responses. Although several articles have tackled the problem of rule extraction from a single neural network, just a few papers have investigated rule extraction from several combined neural networks. In this article we describe how to translate symbolic rules into the Discretized Interpretable Multi-Layer Perceptron (DIMLP) and how to extract rules from one or several combined neural networks. Our approach consists of characterizing discriminant hyperplane frontiers. Unordered rules are extracted in polynomial time with respect to the size of the problem and the size of the network. Moreover, the degree of matching between extracted rules and neural network responses is 100% on training examples. We applied single DIMLP networks to 17 data sets related to medical diagnosis and medical prognosis problems. Results based on 10-fold cross-validation showed that the DIMLP model was on average as accurate as standard multi-layer perceptrons (MLP). Furthermore, DIMLP networks were significantly more accurate than CN2 on eight problems, whereas CN2 was better than DIMLP on only one problem. Finally, a non-Hodgkin lymphoma diagnosis problem based on classification of electrophoresis gels was defined. It turned out that ensembles of DIMLP networks were significantly more accurate than CN2 (96.1% ± 1.4 versus 82.7% ± 4.0). In addition, symbolic rules revealed the presence of five important spots for the discrimination of the class of Lymphocyte Leukemia/Chronic Lymphoid Leukemia (Lc/LLc), and the class of Centrocytic Lymphoma (Cc).
Affiliation(s)
- G Bologna
- Swiss Institute of Bioinformatics, Rue Michel Servet 1, 1211 Geneva, Switzerland.
|
47
|
Abstract
We propose a new scheme for extending generalized learning vector quantization (GLVQ) with weighting factors for the input dimensions. The factors allow an appropriate scaling of the input dimensions according to their relevance. They are adapted automatically during training according to the specific classification task, whereby training can be interpreted as stochastic gradient descent on an appropriate error function. This method leads to a more powerful classifier and to an adaptive metric with little extra cost compared to standard GLVQ. Moreover, the size of the weighting factors indicates the relevance of the input dimensions. This suggests a scheme for automatically pruning irrelevant input dimensions. The algorithm is verified on artificial data sets and the iris data from the UCI repository. Afterwards, the method is compared to several well-known algorithms which determine the intrinsic data dimension on real-world satellite image data.
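The role of the per-dimension weighting factors can be illustrated with a relevance-weighted squared Euclidean distance used for nearest-prototype classification, plus a threshold-based pruning step. This is a hedged sketch only: the weighted metric is assumed here as the form of the adaptive distance, the training updates are omitted entirely, and all function names, the threshold, and the toy prototypes are hypothetical.

```python
from typing import List, Sequence, Tuple


def weighted_sq_distance(
    x: Sequence[float], w: Sequence[float], relevance: Sequence[float]
) -> float:
    """Squared Euclidean distance with one relevance factor per input dimension."""
    return sum(r * (xi - wi) ** 2 for r, xi, wi in zip(relevance, x, w))


def classify(
    x: Sequence[float],
    prototypes: Sequence[Tuple[Sequence[float], str]],
    relevance: Sequence[float],
) -> str:
    """Return the label of the closest prototype under the weighted metric."""
    nearest = min(prototypes, key=lambda p: weighted_sq_distance(x, p[0], relevance))
    return nearest[1]


def prune_dimensions(relevance: Sequence[float], threshold: float) -> List[int]:
    """Keep only the indices of dimensions whose relevance reaches the threshold."""
    return [i for i, r in enumerate(relevance) if r >= threshold]


# Toy prototypes (illustration only): the second dimension is pure noise
prototypes = [([0.0, 0.0], "a"), ([1.0, 5.0], "b")]
sample = [0.9, 0.2]
label_uniform = classify(sample, prototypes, [0.5, 0.5])
label_learned = classify(sample, prototypes, [1.0, 0.0])
kept = prune_dimensions([0.9, 0.02, 0.08], threshold=0.05)
```

With uniform weights the noisy second dimension dominates and the sample is assigned to "a"; once that dimension's relevance is driven to zero, the same sample is assigned to "b", which is the effect an adaptive metric is meant to achieve.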
Affiliation(s)
- Barbara Hammer
- Department of Mathematics and Computer Science, University of Osnabrück, Germany.
|
48
|
|
49
|
|
50
|
Paul S, Kumar S. Subsethood-product fuzzy neural inference system (SuPFuNIS). IEEE Trans Neural Netw 2002; 13:578-99. [DOI: 10.1109/tnn.2002.1000126] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.0] [Indexed: 11/10/2022]
|