1
Vallejo-Mancero B, Faci-Lázaro S, Zapata M, Soriano J, Madrenas J. Real-time hardware emulation of neural cultures: A comparative study of in vitro, in silico and in duris silico models. Neural Netw 2024; 179:106593. [PMID: 39142177 DOI: 10.1016/j.neunet.2024.106593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 06/20/2024] [Accepted: 07/31/2024] [Indexed: 08/16/2024]
Abstract
Biological neural networks are well known for their capacity to process information with extremely low power consumption. Fields with high computational costs, such as Artificial Intelligence, are seeking alternatives inspired by biological systems. One promising alternative is to implement hardware architectures that replicate the behavior of biological neurons while retaining the programming flexibility of an electronic device, all at a relatively low operational cost. To advance in this quest, here we analyze the capacity of the HEENS hardware architecture to operate in a manner similar to an in vitro neuronal network grown in the laboratory. To that end, we considered data of spontaneous activity in living neuronal cultures of about 400 neurons and compared their collective dynamics and functional behavior with those obtained from direct numerical simulations (in silico) and hardware implementations (in duris silico). The results show that HEENS is capable of mimicking both the in vitro and in silico systems with a high efficiency-to-cost ratio and across different network topological designs. Our work shows that compact low-cost hardware implementations are feasible, opening new avenues for future, highly efficient neuromorphic devices and advanced human-machine interfacing.
Affiliation(s)
- Bernardo Vallejo-Mancero
- Department of Electronic Engineering, Universitat Politecnica de Catalunya, Jordi Girona, 1-3, edif. C4, Barcelona, 08034, Catalunya, Spain
- Sergio Faci-Lázaro
- Department of Condensed Matter Physics, University of Zaragoza, C. de Pedro Cerbuna, 12, Zaragoza, 50009, Spain; GOTHAM Lab, Institute of Biocomputation and Physics of Complex Systems, University of Zaragoza, C. de Pedro Cerbuna, 12, Zaragoza, 50009, Spain
- Mireya Zapata
- Department of Electronic Engineering, Universitat Politecnica de Catalunya, Jordi Girona, 1-3, edif. C4, Barcelona, 08034, Catalunya, Spain; Centro de Investigación en Mecatrónica y Sistemas Interactivos - MIST, Universidad Indoamérica, Machala y Sabanilla, Quito, 170103, Ecuador
- Jordi Soriano
- Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, Barcelona, 08028, Spain; Universitat de Barcelona Institute of Complex Systems (UBICS), Gran Via Corts Catalanes 585, Barcelona, 08007, Spain
- Jordi Madrenas
- Department of Electronic Engineering, Universitat Politecnica de Catalunya, Jordi Girona, 1-3, edif. C4, Barcelona, 08034, Catalunya, Spain
2
Wang X, Wang Y, Ma Z, Wong KC, Li X. Exhaustive Exploitation of Nature-Inspired Computation for Cancer Screening in an Ensemble Manner. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1366-1379. [PMID: 38578856 DOI: 10.1109/tcbb.2024.3385402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024]
Abstract
Accurate screening of cancer types is crucial for effective cancer detection and precise treatment selection. However, the association between gene expression profiles and tumors is often limited to a small number of biomarker genes. While computational methods using nature-inspired algorithms have shown promise in selecting predictive genes, existing techniques are limited by inefficient search and poor generalization across diverse datasets. This study presents a framework termed Evolutionary Optimized Diverse Ensemble Learning (EODE) to improve ensemble learning for cancer classification from gene expression data. The EODE methodology combines an intelligent grey wolf optimization algorithm for selective feature space reduction, guided random injection modeling for ensemble diversity enhancement, and subset model optimization for synergistic classifier combinations. Extensive experiments were conducted across 35 gene expression benchmark datasets encompassing varied cancer types. Results demonstrated that EODE obtained significantly improved screening accuracy over individual and conventionally aggregated models. The integrated optimization of advanced feature selection, directed specialized modeling, and cooperative classifier ensembles helps address key challenges in current nature-inspired approaches. This provides an effective framework for robust and generalized ensemble learning with gene expression biomarkers.
3
Varzaneh ZA, Hosseini S. An improved equilibrium optimization algorithm for feature selection problem in network intrusion detection. Sci Rep 2024; 14:18696. [PMID: 39134565 PMCID: PMC11319621 DOI: 10.1038/s41598-024-67488-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 07/11/2024] [Indexed: 08/15/2024] Open
Abstract
In this paper, an enhanced version of equilibrium optimization (EO), named Levy-opposition-equilibrium optimization (LOEO), is proposed to select effective features in network intrusion detection systems (IDSs). The algorithm applies opposition-based learning (OBL) to improve the diversity of the population and uses the Levy flight method to escape local optima. A binary version of the algorithm, called BLOEO, is then applied to feature selection in IDSs. One of the main challenges in IDSs is the high-dimensional feature space, with many irrelevant or redundant features. The BLOEO algorithm is designed to select the most informative subset of features. The empirical findings on the NSL-KDD, UNSW-NB15, and CIC-IDS2017 datasets demonstrate the effectiveness of the BLOEO algorithm: it reduces the number of data features while maintaining a high intrusion detection accuracy of over 95%. Specifically, on the UNSW-NB15 dataset, BLOEO selected only 10.8 features on average, achieving an accuracy of 97.6% and a precision of 100%.
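The two ingredients named in this abstract, opposition-based learning and Levy flight, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of Mantegna's algorithm for the Levy step, and the sigmoid-threshold binarization in `to_mask` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
import random


def opposite(x, lower, upper):
    """Opposition-based learning: mirror each coordinate within its bounds."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def levy_sigma(beta=1.5):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length (Mantegna's algorithm)."""
    u = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def to_mask(x):
    """Hypothetical binarization: sigmoid transfer, then threshold at 0.5."""
    return [1 if 1.0 / (1.0 + math.exp(-xi)) > 0.5 else 0 for xi in x]


rng = random.Random(0)
candidate = [0.25, 0.75]
mirrored = opposite(candidate, [0.0, 0.0], [1.0, 1.0])
step = levy_step(rng)
mask = to_mask([-1.0, 2.0])
```

In a feature-selection setting, a mask entry of 1 would mean "keep this feature"; occasional large Levy steps are what let a search leave a local optimum.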
Affiliation(s)
- Zahra Asghari Varzaneh
- Department of Computer Science, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
- Soodeh Hosseini
- Department of Computer Science, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
4
Li H, Liao B, Li J, Li S. A Survey on Biomimetic and Intelligent Algorithms with Applications. Biomimetics (Basel) 2024; 9:453. [PMID: 39194432 DOI: 10.3390/biomimetics9080453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Revised: 07/12/2024] [Accepted: 07/22/2024] [Indexed: 08/29/2024] Open
Abstract
The question "How does it work" has motivated many scientists. Through the study of natural phenomena and behaviors, many intelligence algorithms have been proposed to solve various optimization problems. This paper aims to offer an informative guide for researchers who are interested in tackling optimization problems with intelligence algorithms. First, a special neural network called the zeroing neural network (ZNN), which is especially intended for solving time-varying optimization problems, is comprehensively discussed, including its origin, basic principles, operation mechanism, model variants, and applications. This paper presents a new classification method based on the performance index of ZNNs. Then, two classic bio-inspired algorithms, a genetic algorithm and a particle swarm algorithm, are outlined as representatives, including their origin, design process, basic principles, and applications. Finally, to emphasize the applicability of intelligence algorithms, three practical domains are introduced: gene feature extraction, intelligent communication, and image processing.
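As background for the zeroing neural network mentioned in this abstract, the commonly cited ZNN design formula is sketched below. The symbols (error function e(t), gain γ, activation Φ) and the linear-equation example are standard textbook notation chosen for illustration, not taken from the survey itself.

```latex
% ZNN design formula: drive a time-varying error e(t) to zero.
\dot{e}(t) = -\gamma\,\Phi\bigl(e(t)\bigr), \qquad \gamma > 0,
% where \Phi is a monotonically increasing odd activation function.
% With the linear activation \Phi(e) = e the error decays exponentially:
e(t) = e(0)\,\exp(-\gamma t).
% Example: for the time-varying linear equation A(t)\,x(t) = b(t), take
% e(t) = A(t)\,x(t) - b(t), which gives the implicit dynamics
A(t)\,\dot{x}(t) = -\dot{A}(t)\,x(t) + \dot{b}(t) - \gamma\,\Phi\bigl(A(t)\,x(t) - b(t)\bigr).
```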
Affiliation(s)
- Hao Li
- College of Computer Science and Engineering, Jishou University, Jishou 416000, China
- School of Communication and Electronic Engineering, Jishou University, Jishou 416000, China
- Bolin Liao
- College of Computer Science and Engineering, Jishou University, Jishou 416000, China
- Jianfeng Li
- College of Computer Science and Engineering, Jishou University, Jishou 416000, China
- Shuai Li
- College of Computer Science and Engineering, Jishou University, Jishou 416000, China
5
Zhu D, Bu Q, Zhu Z, Zhang Y, Wang Z. Advancing autonomy through lifelong learning: a survey of autonomous intelligent systems. Front Neurorobot 2024; 18:1385778. [PMID: 38644905 PMCID: PMC11027131 DOI: 10.3389/fnbot.2024.1385778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 03/25/2024] [Indexed: 04/23/2024] Open
Abstract
The combination of lifelong learning algorithms with autonomous intelligent systems (AIS) is gaining popularity due to its ability to enhance AIS performance, but existing summaries of the field are insufficient. It is therefore necessary to systematically analyze the research on lifelong learning algorithms for autonomous intelligent systems in order to better understand current progress in this field. This paper presents a thorough review and analysis of the relevant work on the integration of lifelong learning algorithms and autonomous intelligent systems. Specifically, we investigate the diverse applications of lifelong learning algorithms in AIS domains such as autonomous driving, anomaly detection, robots, and emergency management, while assessing their impact on enhancing AIS performance and reliability. The challenging problems encountered in lifelong learning for AIS are summarized based on the literature review. Advanced and innovative developments of lifelong learning algorithms for autonomous intelligent systems are discussed to offer valuable insights and guidance to researchers in this rapidly evolving field.
Affiliation(s)
- Dekang Zhu
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
- Qianyi Bu
- College of Science and Engineering, University of Glasgow, Glasgow, United Kingdom
- Zhongpan Zhu
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
- College of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Yujie Zhang
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
- Zhipeng Wang
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
6
Zheng B, Li Y, Xiong G. Establishment and analysis of artificial neural network diagnosis model for coagulation-related molecular subgroups in coronary artery disease. Front Genet 2024; 15:1351774. [PMID: 38495669 PMCID: PMC10941628 DOI: 10.3389/fgene.2024.1351774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 02/20/2024] [Indexed: 03/19/2024] Open
Abstract
Background: Coronary artery disease (CAD) is the most common type of cardiovascular disease and causes significant morbidity and mortality. An abnormal coagulation cascade is one of the high-risk factors in CAD patients, but understanding of the molecular mechanism of coagulation in CAD is still limited. Methods: We clustered and categorized 352 CAD patients based on the expression patterns of coagulation-related genes (CRGs), and then we explored the molecular and immunological variations across the subgroups to reveal the underlying biological characteristics of CAD patients. The feature genes between CRG-subgroups were further identified using a random forest model (RF) and least absolute shrinkage and selection operator (LASSO) regression, and an artificial neural network prediction model was constructed. Results: CAD patients could be divided into the C1 and C2 CRG-subgroups, with the C1 subgroup highly enriched in immune-related signaling pathways. The differentially expressed genes between the two CRG-subgroups (DE-CRGs) were primarily enriched in signaling pathways connected to signal transduction and energy metabolism. Subsequently, 10 feature DE-CRGs were identified by RF and LASSO. We constructed a novel artificial neural network model using these 10 genes and evaluated and validated its diagnostic performance on a public dataset. Conclusion: Diverse molecular subgroups of CAD patients may each have a unique gene expression pattern. We may identify subgroups using a few feature genes, providing a theoretical basis for the precise treatment of CAD patients with different molecular subgroups.
Affiliation(s)
- Biwei Zheng
- Department of Cardiology, Dongguan Hospital of Integrated Chinese and Western Medicine Affiliated to Guangzhou University of Traditional Chinese Medicine, Dongguan, China
- Yujing Li
- Shenzhen Traditional Chinese Medicine Hospital, The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, China
- Beijing University of Chinese Medicine Shenzhen Hospital (Longgang), Shenzhen, China
- Guoliang Xiong
- Shenzhen Traditional Chinese Medicine Hospital, The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, China
7
Liang J, Wang C, Zhang D, Xie Y, Zeng Y, Li T, Zuo Z, Ren J, Zhao Q. VSOLassoBag: a variable-selection oriented LASSO bagging algorithm for biomarker discovery in omic-based translational research. J Genet Genomics 2023; 50:151-162. [PMID: 36608930 DOI: 10.1016/j.jgg.2022.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/25/2022] [Accepted: 12/26/2022] [Indexed: 01/04/2023]
Abstract
Screening biomolecular markers from high-dimensional biological data is one of the long-standing tasks for biomedical translational research. With its advantages in both feature shrinkage and biological interpretability, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is one of the most popular methods for clinical biomarker development. However, in practice, applying LASSO to omics-based data with high dimensions and low sample size often results in an excess number of predictive variables, leading to overfitting of the model. Here, we present VSOLassoBag, a wrapped LASSO approach that integrates an ensemble learning strategy to help select efficient and stable variables with high confidence from omics-based data. Using a bagging strategy in combination with a parametric method or inflection point search method, VSOLassoBag can integrate and vote on variables generated from multiple LASSO models to determine the optimal candidates. The application of VSOLassoBag to both simulation datasets and real-world datasets shows that the algorithm can effectively identify markers for either case-control binary classification or prognosis prediction. In addition, compared with multiple existing algorithms, VSOLassoBag shows comparable performance under different scenarios while selecting fewer features. In summary, VSOLassoBag, which is available at https://seqworld.com/VSOLassoBag/ under the GPL v3 license, provides an alternative strategy for selecting reliable biomarkers from high-dimensional omics data. For users' convenience, VSOLassoBag is implemented as an R package that provides multithreading computing configurations.
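The general idea of bagging LASSO models and voting on the selected variables can be sketched in pure Python as follows. This is an illustrative sketch, not the VSOLassoBag package: the coordinate-descent solver, the function names, the synthetic data, and especially the fixed `min_freq` cutoff are assumptions. The abstract states that the actual cutoff is determined by a parametric method or an inflection point search, which is not reproduced here.

```python
import random


def soft_threshold(value, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X @ beta because beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue
            # Covariance of column j with the partial residual (feature j removed).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta


def bagged_selection(X, y, lam, n_bags, min_freq, rng):
    """Fit LASSO on bootstrap resamples and keep frequently selected features."""
    n, p = len(X), len(X[0])
    votes = [0] * p
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                votes[j] += 1
    freq = [v / n_bags for v in votes]
    return [j for j, f in enumerate(freq) if f >= min_freq], freq


# Synthetic data (hypothetical): only features 0 and 1 carry signal.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.1) for row in X]
selected, freq = bagged_selection(X, y, lam=0.5, n_bags=30, min_freq=0.8, rng=rng)
print(selected, freq)
```

Voting across resamples is what provides stability: a noise variable that slips into one LASSO fit rarely recurs often enough to pass the frequency cutoff.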
Affiliation(s)
- Jiaqi Liang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong 510060, China; State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong 510275, China
- Chaoye Wang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong 510060, China
- Di Zhang
- Department of Coloproctology Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangdong Institute of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510655, China
- Yubin Xie
- Precision Medicine Institute, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510060, China
- Yanru Zeng
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong 510275, China
- Tianqin Li
- Computer Science Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
- Zhixiang Zuo
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong 510060, China
- Jian Ren
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong 510060, China
- Qi Zhao
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong 510060, China
8
Liu J, Feng H, Tang Y, Zhang L, Qu C, Zeng X, Peng X. A novel hybrid algorithm based on Harris Hawks for tumor feature gene selection. PeerJ Comput Sci 2023; 9:e1229. [PMID: 37346505 PMCID: PMC10280456 DOI: 10.7717/peerj-cs.1229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 01/09/2023] [Indexed: 06/23/2023]
Abstract
Background: Gene expression data are often used to classify cancer genes. In such high-dimensional datasets, however, only a few feature genes are closely related to tumors. Therefore, it is important to accurately select a subset of feature genes with high contributions to cancer classification. Methods: In this article, a new three-stage hybrid gene selection method is proposed that combines a variance filter, extremely randomized trees, and Harris Hawks optimization (VEH). In the first stage, we evaluated each gene in the dataset with the variance filter and selected the feature genes that met the variance threshold. In the second stage, we used extremely randomized trees to further eliminate irrelevant genes. Finally, we used the Harris Hawks algorithm to search the genes retained by the previous two stages for the optimal feature gene subset. Results: We evaluated the proposed method using three different classifiers on eight published microarray gene expression datasets. The results showed a 100% classification accuracy for VEH in gastric cancer, acute lymphoblastic leukemia and ovarian cancer, and an average classification accuracy of 95.33% across a variety of other cancers. Compared with other advanced feature selection algorithms, VEH has obvious advantages when measured by many evaluation criteria.
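The first stage described above, the variance filter, is simple enough to sketch. The snippet below is a minimal illustration using only the Python standard library; the function name, the toy expression matrix, and the threshold value are hypothetical, and the later stages (extremely randomized trees and the Harris Hawks search) are not shown.

```python
from statistics import pvariance


def variance_filter(samples, threshold):
    """Return the indices of genes whose variance across samples exceeds threshold."""
    n_genes = len(samples[0])
    keep = []
    for g in range(n_genes):
        column = [row[g] for row in samples]  # expression of gene g in every sample
        if pvariance(column) > threshold:
            keep.append(g)
    return keep


# Toy expression matrix (hypothetical): rows are samples, columns are genes.
expr = [
    [5.0, 1.0, 2.0],
    [5.1, 3.0, 2.0],
    [4.9, 5.0, 2.0],
    [5.0, 7.0, 2.0],
]
selected = variance_filter(expr, threshold=0.5)
print(selected)
```

Genes that barely vary across samples cannot discriminate between classes, so removing them first shrinks the search space for the more expensive later stages.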
Affiliation(s)
- Junjian Liu
- Department of Statistics, Hunan Normal University College of Mathematics and Statistics, Changsha, Hunan, China
- Huicong Feng
- Department of Pathology and Pathophysiology, Hunan Normal University School of Medicine, Changsha, Hunan, China
- Yifan Tang
- Department of Pathology and Pathophysiology, Hunan Normal University School of Medicine, Changsha, Hunan, China
- Lupeng Zhang
- Department of Biochemistry and Molecular Biology, Jishou University School of Medicine, Jishou, Hunan, China
- Chiwen Qu
- Department of Statistics, Hunan Normal University College of Mathematics and Statistics, Changsha, Hunan, China
- Xiaomin Zeng
- Department of Epidemiology and Health Statistics, Xiangya Public Health School, Central South University, Changsha, Hunan, China
- Xiaoning Peng
- Department of Statistics, Hunan Normal University College of Mathematics and Statistics, Changsha, Hunan, China
- Department of Pathology and Pathophysiology, Hunan Normal University School of Medicine, Changsha, Hunan, China
9
Abed-alguni BH, Alawad NA, Al-Betar MA, Paul D. Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection. Appl Intell 2022; 53:13224-13260. [PMID: 36247211 PMCID: PMC9547101 DOI: 10.1007/s10489-022-04201-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 09/21/2022] [Indexed: 12/03/2022]
Abstract
This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm established to emulate a model based on sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three cumulative improved variants are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3), of which the last has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measurements have been used to test the IBSCA versions: classification accuracy, number of features, and fitness values. Furthermore, the performance of the last variant, IBSCA3, is compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values. At the same time, it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms.
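For readers unfamiliar with the continuous SCA that these binary variants build on, the standard position update from the original SCA formulation is reproduced below as background. The notation (position X, destination P, random numbers r_1 to r_4, constant a, iteration budget T) follows the usual presentation and is not taken from this abstract; the binarization and the OBL, VNS, and RL extensions of the IBSCA variants are not shown.

```latex
% Standard SCA position update for solution i at iteration t:
X_i^{t+1} =
\begin{cases}
X_i^{t} + r_1 \sin(r_2)\,\bigl| r_3 P_i^{t} - X_i^{t} \bigr|, & r_4 < 0.5,\\[4pt]
X_i^{t} + r_1 \cos(r_2)\,\bigl| r_3 P_i^{t} - X_i^{t} \bigr|, & r_4 \ge 0.5,
\end{cases}
% with r_2 \in [0, 2\pi], r_3 \in [0, 2], r_4 \in [0, 1] drawn at random, and
r_1 = a - t\,\frac{a}{T},
% so the step amplitude shrinks linearly from a to 0 over T iterations,
% shifting the search from exploration to exploitation.
```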
Affiliation(s)
- Mohammed Azmi Al-Betar
- Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates
- David Paul
- School of Science and Technology, University of New England, Armidale, Australia
10
Qin X, Zhang S, Yin D, Chen D, Dong X. Two-stage feature selection for classification of gene expression data based on an improved Salp Swarm Algorithm. Mathematical Biosciences and Engineering: MBE 2022; 19:13747-13781. [PMID: 36654066 DOI: 10.3934/mbe.2022641] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/17/2023]
Abstract
Microarray technology has developed rapidly in recent years, producing large amounts of ultra-high-dimensional gene expression data. However, because the number of genes far exceeds the number of samples, screening important genes from gene expression data is very challenging. For small-sample, high-dimensional biomedical data, this paper proposes a two-stage feature selection framework combining wrapper, embedded, and filter methods to avoid the curse of dimensionality. The proposed framework uses weighted gene co-expression network analysis (WGCNA), random forest, and minimal redundancy maximal relevance (mRMR) for first-stage feature selection. In the second stage, a new gene selection method based on an improved binary Salp Swarm Algorithm is proposed, which combines machine learning methods to adaptively select feature subsets suitable for classification algorithms. Finally, the classification accuracy is evaluated using six methods: LightGBM, RF, SVM, XGBoost, MLP, and KNN. To verify the performance of the framework and the effectiveness of the proposed algorithm, the number of genes selected and the classification accuracy were compared with those of five other intelligent optimization algorithms. The results show that the proposed framework achieves an accuracy equal to or higher than that of other advanced intelligent algorithms, exceeding 97.6% on all 10 datasets. This shows that the proposed method can solve the feature selection problem for high-dimensional data; the framework is not restricted to a particular dataset and can be applied to other fields involving feature selection.
Affiliation(s)
- Xiwen Qin
- School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Shuang Zhang
- School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Dongmei Yin
- School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Dongxue Chen
- School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Xiaogang Dong
- School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
11
Akinola OO, Ezugwu AE, Agushaka JO, Zitar RA, Abualigah L. Multiclass feature selection with metaheuristic optimization algorithms: a review. Neural Comput Appl 2022; 34:19751-19790. [PMID: 36060097 PMCID: PMC9424068 DOI: 10.1007/s00521-022-07705-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/19/2022] [Accepted: 08/02/2022] [Indexed: 11/24/2022]
Abstract
Selecting relevant feature subsets is vital in machine learning, and multiclass feature selection is harder to perform since most classifications are binary. The feature selection problem aims at reducing the dimension of the feature set while maintaining model accuracy. Datasets can be classified using various methods. Nevertheless, metaheuristic algorithms attract substantial attention for solving different optimization problems. For this reason, this paper presents a systematic survey of the literature on solving multiclass feature selection problems with metaheuristic algorithms, which can help classifiers select optimal or near-optimal features faster and more accurately. Metaheuristic algorithms are presented in four primary behavior-based categories, i.e., evolutionary-based, swarm-intelligence-based, physics-based, and human-based, even though some works in the literature present more categories. Further, lists of metaheuristic algorithms are introduced for each of these categories. Only articles from 2000 to 2022 on metaheuristic algorithms used for multiclass feature selection problems were reviewed, with respect to their categories and detailed descriptions. We considered some application areas for some of the metaheuristic algorithms applied to multiclass feature selection, along with their variations. Popular multiclass classifiers for feature selection were also examined. Moreover, we presented the challenges of metaheuristic algorithms for feature selection and identified gaps for further research.
Affiliation(s)
- Olatunji O. Akinola
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Absalom E. Ezugwu
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Jeffrey O. Agushaka
- School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201 KwaZulu-Natal South Africa
- Raed Abu Zitar
- Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, 38044 Abu Dhabi, United Arab Emirates
- Laith Abualigah
- Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, 19328 Jordan
- Faculty of Information Technology, Middle East University, Amman, 11831 Jordan
12
Abstract
The Harris hawk optimizer (HHO) is a recent population-based metaheuristic algorithm that simulates the hunting behavior of hawks. This swarm-based optimizer performs the optimization procedure using a novel approach to exploration and exploitation and multiple search phases. In this review, we focus on the applications and developments of HHO, a well-established, robust optimizer and one of the most popular swarm-based techniques of 2020. Moreover, several experiments were carried out to demonstrate the power and effectiveness of HHO compared with nine other state-of-the-art algorithms using the Congress on Evolutionary Computation test suites CEC2005 and CEC2017. The review also offers insight into possible future directions and ideas worth investigating regarding new variants of the HHO algorithm and its widespread applications.
13
Su Y, Du K, Wang J, Wei JM, Liu J. Multi-variable AUC for sifting complementary features and its biomedical application. Brief Bioinform 2022; 23:6536295. [PMID: 35212712 DOI: 10.1093/bib/bbac029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 11/13/2022] Open
Abstract
Although sifting functional genes has been discussed for years, traditional selection methods tend to be ineffective in capturing potential specific genes. First, typical methods focus on finding features (genes) that are relevant to the class while irrelevant to each other. However, the features that offer rich discriminative information are more likely to be complementary ones. Second, almost all existing methods assess feature relations in pairs, yielding an inaccurate local estimation and lacking a global exploration. In this paper, we introduce a multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing the Area Above the receiver operating characteristic Curve (AAC). With AAC, the class-relevant information newly provided by a candidate feature and that preserved by the selected features can be obtained beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. In addition, we provide a gene set for prostate cancer and discuss its potential biological significance from the machine learning perspective and based on existing biomedical findings for some individual genes.
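As background for the quantities named in this abstract, the sketch below computes the ordinary single-feature AUC from its pairwise-ranking definition and takes the AAC as its complement, assuming a unit-square ROC plot. It does not implement the paper's multi-variable AUC or its selection algorithm, whose definitions are not given in the abstract; the function name and the toy scores are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy feature values (hypothetical) with binary class labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
area_under = auc(scores, labels)
area_above = 1.0 - area_under  # AAC: the part of the unit square above the ROC curve
print(area_under, area_above)
```

Under this reading, the AAC measures the class information a feature still fails to capture, which is the quantity a complementary feature would be expected to reduce.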
Affiliation(s)
- Yue Su
- College of Computer Science at Nankai University, China
- Keyu Du
- College of Computer Science at Nankai University, China
- Jun Wang
- College of Mathematics and Statistics Science at Ludong University, China
- Jin-Mao Wei
- College of Computer Science at Nankai University, China
- Jian Liu
- College of Computer Science at Nankai University, China