1
Chakraborty M. Rule extraction from convolutional neural networks for heart disease prediction. Biomed Eng Lett 2024; 14:649-661. [PMID: 38946810 PMCID: PMC11208388 DOI: 10.1007/s13534-024-00358-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 01/21/2024] [Accepted: 01/30/2024] [Indexed: 07/02/2024]
Abstract
The accurate prediction of heart disease is crucial in the field of medicine. While convolutional neural networks have shown remarkable precision in heart disease prediction, they are often perceived as opaque models due to their complex internal workings. This paper introduces a novel method, named Extraction of Classification Rules from Convolutional Neural Network (ECRCNN), aimed at extracting rules from convolutional neural networks to enhance interpretability in heart disease prediction. The ECRCNN algorithm analyses updated kernels to derive understandable rules from convolutional neural networks, providing valuable insights into the contributing factors of heart disease. The algorithm's performance is assessed using the Statlog (Heart) dataset from the University of California, Irvine's repository. Experimental results underscore the effectiveness of the ECRCNN algorithm in predicting heart disease and extracting meaningful rules. The extracted rules can assist healthcare professionals in making precise diagnoses and formulating targeted treatment plans. In summary, the proposed method bridges the gap between the high accuracy of convolutional neural networks and the interpretability necessary for informed decision-making in heart disease prediction.
Affiliation(s)
- Manomita Chakraborty
- School of Computer Science and Engineering, VIT-AP University, Amaravati, Andhra Pradesh 522237 India
2
Zhang Y, Tino P, Leonardis A, Tang K. A Survey on Neural Network Interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence 2021. [DOI: 10.1109/tetci.2021.3100641] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 11/08/2022]
3
Optimal artificial neural network-based data mining technique for stress prediction in working employees. Soft Comput 2021. [DOI: 10.1007/s00500-021-06058-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
4
Dattachaudhuri A, Biswas SK, Chakraborty M, Sarkar S. A transparent rule-based expert system using neural network. Soft Comput 2021. [DOI: 10.1007/s00500-020-05547-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
5
Yu J, Liu G. RETRACTED: Knowledge-based deep belief network for machining roughness prediction and knowledge discovery. Comput Ind 2020. [DOI: 10.1016/j.compind.2020.103262] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 10/24/2022]
6
Yu J, Liu G. Knowledge extraction and insertion to deep belief network for gearbox fault diagnosis. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105883] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Indexed: 12/21/2022]
7
Rule extraction from neural network trained using deep belief network and back propagation. Knowl Inf Syst 2020. [DOI: 10.1007/s10115-020-01473-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/24/2022]
8
Tran SN, d'Avila Garcez AS. Deep Logic Networks: Inserting and Extracting Knowledge From Deep Belief Networks. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:246-258. [PMID: 27845678 DOI: 10.1109/tnnls.2016.2603784] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 06/06/2023]
Abstract
Developments in deep learning have seen the use of layerwise unsupervised learning combined with supervised learning for fine-tuning. With this layerwise approach, a deep network can be seen as a more modular system that lends itself well to learning representations. In this paper, we investigate whether such modularity can be useful to the insertion of background knowledge into deep networks, whether it can improve learning performance when it is available, and to the extraction of knowledge from trained deep networks, and whether it can offer a better understanding of the representations learned by such networks. To this end, we use a simple symbolic language (a set of logical rules that we call confidence rules) and show that it is suitable for the representation of quantitative reasoning in deep networks. We show by knowledge extraction that confidence rules can offer a low-cost representation for layerwise networks (or restricted Boltzmann machines). We also show that layerwise extraction can produce an improvement in the accuracy of deep belief networks. Furthermore, the proposed symbolic characterization of deep networks provides a novel method for the insertion of prior knowledge and training of deep networks. With the use of this method, a deep neural-symbolic system is proposed and evaluated, with the experimental results indicating that modularity through the use of confidence rules and knowledge insertion can be beneficial to network performance.
9
A Comparison Study on Rule Extraction from Neural Network Ensembles, Boosted Shallow Trees, and SVMs. Applied Computational Intelligence and Soft Computing 2018. [DOI: 10.1155/2018/4084850] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
One way to make the knowledge stored in an artificial neural network more intelligible is to extract symbolic rules. However, producing rules from Multilayer Perceptrons (MLPs) is an NP-hard problem. Many techniques have been introduced to generate rules from single neural networks, but very few were proposed for ensembles. Moreover, experiments were rarely assessed by 10-fold cross-validation trials. In this work, based on the Discretized Interpretable Multilayer Perceptron (DIMLP), experiments were performed on 10 repetitions of stratified 10-fold cross-validation trials over 25 binary classification problems. The DIMLP architecture allowed us to produce rules from DIMLP ensembles, boosted shallow trees (BSTs), and Support Vector Machines (SVM). The complexity of rulesets was measured with the average number of generated rules and average number of antecedents per rule. From the 25 used classification problems, the most complex rulesets were generated from BSTs trained by “gentle boosting” and “real boosting.” Moreover, we clearly observed that the less complex the rules were, the better their fidelity was. In fact, rules generated from decision stumps trained by modest boosting were, for almost all the 25 datasets, the simplest with the highest fidelity. Finally, in terms of average predictive accuracy and average ruleset complexity, the comparison of some of our results to those reported in the literature proved to be competitive.
10
Shinde S, Kulkarni U. Extended fuzzy hyperline-segment neural network with classification rule extraction. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.03.036] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/15/2022]
11
Bologna G, Hayashi Y. Characterization of Symbolic Rules Embedded in Deep DIMLP Networks: A Challenge to Transparency of Deep Learning. Journal of Artificial Intelligence and Soft Computing Research 2017. [DOI: 10.1515/jaiscr-2017-0019] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Indexed: 11/15/2022]
Abstract
Rule extraction from neural networks is an active research topic. In the last 20 years many authors have presented techniques for extracting symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few addressed ensembles of neural networks, and even fewer addressed networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLP), and DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem on digit recognition, rules generated from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity in terms of number of generated rules and number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we showed that it is possible to determine the feature detectors created by neural networks and also that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.
Affiliation(s)
- Guido Bologna
- Department of Computer Science, University of Applied Science of Western Switzerland, Rue de la Prairie 4, Geneva 1202, Switzerland
- Yoichi Hayashi
- Department of Computer Science, Meiji University, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
12
Chan V, Chan C. Towards Developing the Piece-Wise Linear Neural Network Algorithm for Rule Extraction. International Journal of Cognitive Informatics and Natural Intelligence 2017. [DOI: 10.4018/ijcini.2017040104] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Abstract
This paper discusses the development and application of a decompositional neural network rule extraction algorithm for nonlinear regression problems. The algorithm is called the piece-wise linear artificial neural network or PWL-ANN algorithm. The objective of the algorithm is to “open up” the black box of a neural network model so that rules in the form of linear equations are generated by approximating the sigmoid activation functions of the hidden neurons in an artificial neural network (ANN). The preliminary results showed that the algorithm gives high fidelity and satisfactory results on sixteen of the nineteen tested datasets. By analyzing the values of R² given by the PWL approximation on the hidden neurons and the overall output, it is evident that, in addition to accurate approximation of each individual node of a given ANN model, there are more factors affecting the fidelity of the PWL-ANN algorithm. Nevertheless, the algorithm shows promising potential for domains where a better understanding of the problem is needed.
Affiliation(s)
- Christine Chan
- University of Regina, Faculty of Engineering and Applied Science, Regina, Canada
13
Biswas SK, Chakraborty M, Purkayastha B, Roy P, Thounaojam DM. Rule Extraction from Training Data Using Neural Network. Int J Artif Intell T 2016. [DOI: 10.1142/s0218213017500063] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
Data mining is a powerful technology that helps organizations concentrate on their most important data by extracting useful information from large databases. One of the most commonly used techniques in data mining is the Artificial Neural Network, owing to its high performance in many application domains. Despite its many advantages, one of the main drawbacks of the Artificial Neural Network is its inherent black box nature, which is the main obstacle to its use in data mining. Therefore, this paper proposes an algorithm that extracts rules from a neural network using both classified and misclassified data, converting the black box of the Artificial Neural Network into a white box. The proposed algorithm is a modification of an existing algorithm, Rule Extraction by Reverse Engineering (RxREN). It extracts rules from a trained neural network for datasets with mixed mode attributes using a pedagogical approach. Its innovation is the use of both classified and misclassified data to find the data ranges of significant attributes in the respective classes. The experimental results clearly show that the performance of the proposed algorithm is superior to that of existing algorithms.
Affiliation(s)
- Saroj Kumar Biswas
- Computer Science and Engineering Department, National Institute of Technology Silchar-788010, Assam, India
- Manomita Chakraborty
- Computer Science and Engineering Department, National Institute of Technology Silchar-788010, Assam, India
- Biswajit Purkayastha
- Computer Science and Engineering Department, National Institute of Technology Silchar-788010, Assam, India
- Pinki Roy
- Computer Science and Engineering Department, National Institute of Technology Silchar-788010, Assam, India
- Dalton Meitei Thounaojam
- Computer Science and Engineering Department, National Institute of Technology Silchar-788010, Assam, India
14
Pessa E. Neural Network Models. Nature-Inspired Computing 2016:368-395. [DOI: 10.4018/978-1-5225-0788-8.ch015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2025]
Abstract
Artificial Neural Network (ANN) models gained wide popularity owing to a number of claimed advantages, such as biological plausibility, tolerance of errors or noise in the input data, and a learning ability that allows adaptation to environmental constraints. Although most of these advantages are not unique to ANNs, engineers, psychologists and neuroscientists have made extensive use of ANN models in a large number of scientific investigations. In most cases, however, these models have been introduced in order to provide optimization tools more useful than the ones commonly used by traditional Optimization Theory. Unfortunately, the very success of ANN models in optimization tasks produced a widespread neglect of the true – and important – objectives pursued by the first promoters of these models. These objectives can be briefly summarized by the manifesto of connectionist psychology, stating that mental processes are nothing but macroscopic phenomena, emergent from the cooperative interaction of a large number of microscopic knowledge units. This statement – wholly in line with the goal of statistical mechanics – can be readily extended to other processes, beyond the mental ones, including social, economic, and, in general, organizational ones. Therefore this chapter has been designed in order to answer a number of related questions, such as: are ANN models able to account for the occurrence of this sort of emergence? How can the occurrence of this emergence be empirically detected? How can the emergence produced by ANN models be controlled? In what sense could ANN emergence offer a new paradigm for the explanation of macroscopic phenomena? Answering these questions leads the chapter to focus on less popular ANNs, such as the recurrent ones, while neglecting more popular models, such as perceptrons, and on less used units, such as spiking neurons, rather than on McCulloch-Pitts neurons.
Moreover, the chapter must mention a number of strategies of emergence detection, useful for researchers performing computer simulations of ANN behaviours. Among these strategies it is possible to quote the reduction of ANN models to continuous models, such as the neural field models or the neural mass models, the recourse to the methods of Network Theory and the employment of techniques borrowed from Statistical Physics, like the one based on the Renormalization Group. Of course, owing to space (and mathematical expertise) requirements, most mathematical details of the proposed arguments are neglected, and, to gain more information, the reader is referred to the quoted literature.
15
Use of a Recursive-Rule eXtraction algorithm with J48graft to achieve highly accurate and concise rule extraction from a large breast cancer dataset. Informatics in Medicine Unlocked 2015. [DOI: 10.1016/j.imu.2015.12.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 01/16/2023]
16
de Avila e Silva S, Forte F, T S Sartor I, Andrighetti T, J L Gerhardt G, Longaray Delamare AP, Echeverrigaray S. DNA duplex stability as discriminative characteristic for Escherichia coli σ(54)- and σ(28)- dependent promoter sequences. Biologicals 2013; 42:22-8. [PMID: 24172230 DOI: 10.1016/j.biologicals.2013.10.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 08/28/2013] [Accepted: 10/01/2013] [Indexed: 11/17/2022]
Abstract
The advent of modern high-throughput sequencing has made it possible to generate vast quantities of genomic sequence data. However, the processing of this volume of information, including prediction of gene-coding and regulatory sequences, remains an important bottleneck in bioinformatics research. In this work, we integrated DNA duplex stability into the repertoire of a Neural Network (NN) capable of predicting promoter regions with augmented accuracy, specificity and sensitivity. We took our method beyond a simplistic analysis based on a single sigma subunit of RNA polymerase, incorporating the six main sigma-subunits of Escherichia coli. The methodology successfully rediscovered known promoter sequences recognized by E. coli RNA polymerase subunits σ(24), σ(28), σ(32), σ(38), σ(54) and σ(70), with the highest accuracies for σ(28)- and σ(54)- dependent promoter sequences (values obtained were 80% and 78.8%, respectively). Furthermore, the discrimination of promoters according to the σ factor made it possible to extract functional commonalities for the genes expressed by each type of promoter. DNA duplex stability emerges as a distinctive feature which improves the recognition and classification of σ(28)- and σ(54)- dependent promoter sequences. The findings presented in this report underscore the usefulness of including DNA biophysical parameters in NN learning algorithms to increase accuracy, specificity and sensitivity in promoter prediction beyond what is accomplished based on sequence alone.
Affiliation(s)
- Scheila de Avila e Silva
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Franciele Forte
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Ivaine T S Sartor
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Tahila Andrighetti
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Günther J L Gerhardt
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Ana Paula Longaray Delamare
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
- Sergio Echeverrigaray
- Universidade de Caxias do Sul, Instituto de Biotecnologia, Rua Francisco Getúlio Vargas, 1130, CEP 95070-560 Caxias do Sul, RS, Brazil.
17
A novel classification model for cotton yarn quality based on trained neural network using genetic algorithm. Knowl Based Syst 2013. [DOI: 10.1016/j.knosys.2012.10.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022]
18
Augasta MG, Kathirvalavakumar T. Reverse Engineering the Neural Networks for Rule Extraction in Classification Problems. Neural Process Lett 2011. [DOI: 10.1007/s11063-011-9207-8] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 11/30/2022]
19
Pedrycz A, Dong F, Hirota K. Representation of neural networks through their multi-linearization. Neurocomputing 2011. [DOI: 10.1016/j.neucom.2011.03.038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
20
A new data mining scheme using artificial neural networks. Sensors 2011; 11:4622-47. [PMID: 22163866 PMCID: PMC3231400 DOI: 10.3390/s110504622] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 02/11/2011] [Revised: 04/11/2011] [Accepted: 04/14/2011] [Indexed: 11/16/2022]
Abstract
Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are often regarded as black boxes, i.e., their predictions cannot be explained. To make ANNs easier to explain, this paper proposes a novel algorithm to extract symbolic rules from ANNs. ANN methods have not been effectively utilized for data mining tasks because the way classifications are made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise, easily explainable symbolic rules with high accuracy can be extracted from trained ANNs. The extracted rules are comparable with those of other methods in terms of number of rules, average number of conditions per rule, and accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems.
21
Shi J, Su Q, Zhang C, Huang G, Zhu Y. An intelligent decision support algorithm for diagnosis of colorectal cancer through serum tumor markers. Computer Methods and Programs in Biomedicine 2010; 100:97-107. [PMID: 20346535 DOI: 10.1016/j.cmpb.2010.03.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/19/2009] [Revised: 02/25/2010] [Accepted: 03/01/2010] [Indexed: 05/29/2023]
Abstract
Nowadays, a wide range of serum tumor markers can be applied in the diagnosis of colorectal cancer. The type and number of routinely used markers vary widely, so that patients may sometimes receive redundant or insufficient checks. Furthermore, the traditional single cutoff point hinders the efficient use of the continuous check value of a tumor marker. In order to improve the diagnostic accuracy (DA) and decrease the cost, it is necessary to optimize the check combinations and exploit the check values fully. To this end, focusing on colorectal cancer (CRC), an artificial intelligence algorithm entitled DS-STM (diagnosis strategy of serum tumor markers) is developed in this paper. DS-STM can provide decision support for physicians on the usage of different tumor markers and the diagnosis of CRC. The study demonstrates that, instead of five or more tumor markers, two markers are enough for diagnosis in most CRC patients. The experimental study shows that, compared to the traditional serial test, DS-STM can improve DA from 67.53% to 73.87% on the same validation dataset. In addition, a significant cost reduction can be achieved with the newly developed diagnosis strategy.
Affiliation(s)
- Jinghua Shi
- Department of Industrial Engineering and Logistics Management, Shanghai Jiao Tong University, Dong Chuan Road 800, Minhang District, Shanghai 200240, China