1
|
Ensemble of Networks for Multilabel Classification. SIGNALS 2022. [DOI: 10.3390/signals3040054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Multilabel learning goes beyond standard supervised learning models by associating a sample with more than one class label. Among the many techniques developed in the last decade to handle multilabel learning best approaches are those harnessing the power of ensembles and deep learners. This work proposes merging both methods by combining a set of gated recurrent units, temporal convolutional neural networks, and long short-term memory networks trained with variants of the Adam optimization approach. We examine many Adam variants, each fundamentally based on the difference between present and past gradients, with step size adjusted for each parameter. We also combine Incorporating Multiple Clustering Centers and a bootstrap-aggregated decision trees ensemble, which is shown to further boost classification performance. In addition, we provide an ablation study for assessing the performance improvement that each module of our ensemble produces. Multiple experiments on a large set of datasets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state-of-the-art.
Collapse
|
2
|
Nanni L, Lumini A, Brahnam S. Neural networks for anatomical therapeutic chemical (ATC) classification. APPLIED COMPUTING AND INFORMATICS 2022. [DOI: 10.1108/aci-11-2021-0301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
PurposeAutomatic anatomical therapeutic chemical (ATC) classification is progressing at a rapid pace because of its potential in drug development. Predicting an unknown compound's therapeutic and chemical characteristics in terms of how it affects multiple organs and physiological systems makes automatic ATC classification a vital yet challenging multilabel problem. The aim of this paper is to experimentally derive an ensemble of different feature descriptors and classifiers for ATC classification that outperforms the state-of-the-art.Design/methodology/approachThe proposed method is an ensemble generated by the fusion of neural networks (i.e. a tabular model and long short-term memory networks (LSTM)) and multilabel classifiers based on multiple linear regression (hMuLab). All classifiers are trained on three sets of descriptors. Features extracted from the trained LSTMs are also fed into hMuLab. Evaluations of ensembles are compared on a benchmark data set of 3883 ATC-coded pharmaceuticals taken from KEGG, a publicly available drug databank.FindingsExperiments demonstrate the power of the authors’ best ensemble, EnsATC, which is shown to outperform the best methods reported in the literature, including the state-of-the-art developed by the fast.ai research group. The MATLAB source code of the authors’ system is freely available to the public at https://github.com/LorisNanni/Neural-networks-for-anatomical-therapeutic-chemical-ATC-classification.Originality/valueThis study demonstrates the power of extracting LSTM features and combining them with ATC descriptors in ensembles for ATC classification.
Collapse
|
3
|
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Khanna NN, Ruzsa Z, Sharma AM, Saxena S, Faa G, Laird JR, Johri AM, Kalra MK, Paraskevas KI, Saba L. A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review. Diagnostics (Basel) 2022; 12:722. [PMID: 35328275 PMCID: PMC8947682 DOI: 10.3390/diagnostics12030722] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022] [Indexed: 12/16/2022] Open
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to its popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We review comprehensively these three methods using attributes, such as architecture, applications, pro-and-cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: Most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction had shown promising results. Ground truth (GT) selection for AI-based training along with scientific and clinical validation is very important for CVD stratification to avoid RoB. It was observed that the most popular classification paradigm is multiclass followed by the ensemble, and multi-label. The use of deep learning techniques in CVD risk stratification is in a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent the RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
Collapse
Affiliation(s)
- Jasjit S. Suri
- Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA 95661, USA
| | - Mrinalini Bhagawati
- Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India; (M.B.); (S.P.)
| | - Sudip Paul
- Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India; (M.B.); (S.P.)
| | - Athanasios D. Protogerou
- Research Unit Clinic, Laboratory of Pathophysiology, Department of Cardiovascular Prevention, National and Kapodistrian University of Athens, 11527 Athens, Greece;
| | - Petros P. Sfikakis
- Rheumatology Unit, National Kapodistrian University of Athens, 11527 Athens, Greece;
| | - George D. Kitas
- Arthritis Research UK Centre for Epidemiology, Manchester University, Manchester 46962, UK;
| | - Narendra N. Khanna
- Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi 110020, India;
| | - Zoltan Ruzsa
- Department of Internal Medicines, Invasive Cardiology Division, University of Szeged, 6720 Szeged, Hungary;
| | - Aditya M. Sharma
- Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA 22903, USA;
| | - Sanjay Saxena
- Department of CSE, International Institute of Information Technology, Bhubaneswar 751003, India;
| | - Gavino Faa
- Department of Pathology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy;
| | - John R. Laird
- Cardiology Department, St. Helena Hospital, St. Helena, CA 94574, USA;
| | - Amer M. Johri
- Department of Medicine, Division of Cardiology, Queen’s University, Kingston, ON K7L 3N6, Canada;
| | - Manudeep K. Kalra
- Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA;
| | - Kosmas I. Paraskevas
- Department of Vascular Surgery, Central Clinic of Athens, N. Iraklio, 14122 Athens, Greece;
| | - Luca Saba
- Department of Radiology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy;
| |
Collapse
|
5
|
Wang L, Qi S, Liu Y, Lou H, Zuo X. Bagging k-dependence Bayesian network classifiers. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-205125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Bagging has attracted much attention due to its simple implementation and the popularity of bootstrapping. By learning diverse classifiers from resampled datasets and averaging the outcomes, bagging investigates the possibility of achieving substantial classification performance of the base classifier. Diversity has been recognized as a very important characteristic in bagging. This paper presents an efficient and effective bagging approach, that learns a set of independent Bayesian network classifiers (BNCs) from disjoint data subspaces. The number of bits needed to describe the data is measured in terms of log likelihood, and redundant edges are identified to optimize the topologies of the learned BNCs. Our extensive experimental evaluation on 54 publicly available datasets from the UCI machine learning repository reveals that the proposed algorithm achieves a competitive classification performance compared with state-of-the-art BNCs that use or do not use bagging procedures, such as tree-augmented naive Bayes (TAN), k-dependence Bayesian classifier (KDB), bagging NB or bagging TAN.
Collapse
Affiliation(s)
- Limin Wang
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
| | - Sikai Qi
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, China
| | - Yang Liu
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
| | - Hua Lou
- Department of Software and Big Data, Changzhou College of Information Technology, Changzhou, Jiangsu, China
| | - Xin Zuo
- School of Foreign Languages, Changchun University of Technology, Changchun, Jilin, China
| |
Collapse
|
6
|
Rectified-Linear-Unit-Based Deep Learning for Biomedical Multi-label Data. Interdiscip Sci 2016; 9:419-422. [PMID: 27837428 DOI: 10.1007/s12539-016-0196-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2016] [Revised: 10/03/2016] [Accepted: 10/25/2016] [Indexed: 12/15/2022]
Abstract
Disease diagnosis is one of the major data mining questions by the clinicians. The current diagnosis models usually have a strong assumption that one patient has only one disease, i.e. a single-label data mining problem. But the patients, especially when at the late stages, may have more than one disease and require a multi-label diagnosis. The multi-label data mining is much more difficult than a single-label one, and very few algorithms have been developed for this situation. Deep learning is a data mining algorithm with highly dense inner structure and has achieved many successful applications in the other areas. We propose a hypothesis that rectified-linear-unit-based deep learning algorithm may also be good at the clinical questions, by revising the last layer as a multi-label output. The proof-of-concept experimental data support the hypothesis, and the community may be interested in trying more applications.
Collapse
|