1
Zhou Y, Lin P, Xia L, Heidari AA, Chen Y, Liu L, Chen H, Li C, Li Y. An enhancing diagnostic pulmonary diseases diagnostic method for differentiating talaromycosis from tuberculosis. iScience 2025; 28:111867. [PMID: 40034117 PMCID: PMC11872622 DOI: 10.1016/j.isci.2025.111867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2024] [Revised: 11/27/2024] [Accepted: 01/20/2025] [Indexed: 03/05/2025] Open
Abstract
Talaromycosis (TSM) affects immunocompromised individuals, particularly those with human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS), causing varied pulmonary abnormalities on chest computed tomography (CT). These features overlap with pulmonary tuberculosis, making accurate differentiation essential for appropriate treatment. This study utilized real patient data from the First Affiliated Hospital of Wenzhou Medical University. A machine learning model, termed bIPCACO-FKNN, was developed, integrating an ant colony optimization (ACO) algorithm with a fuzzy k-nearest neighbors (FKNNs) classifier. This model introduces an incremental proportional-integral-derivative control strategy to enhance the search efficiency of ACO. Comparative analysis with several algorithms in the CEC 2017 benchmark functions confirms the superior performance of the IPCACO. Applying the bIPCACO-FKNN model for the prediction of pulmonary TSM achieved a prediction accuracy of 98.196% and a specificity of 99.500%, thus demonstrating its significant efficacy in accurately distinguishing between pulmonary TSM and tuberculosis. This provides an efficient and reliable machine learning tool for the differentiation between pulmonary TSM and tuberculosis.
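For readers unfamiliar with the fuzzy k-nearest neighbors classifier named in this abstract, the following is a minimal sketch of the general fuzzy k-NN decision rule with crisp training labels. It is not the authors' bIPCACO-FKNN implementation: the ant colony optimization and PID components are omitted, and the function name `fuzzy_knn_memberships`, the defaults `k=3` and `m=2.0`, and the toy data in the usage note are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_knn_memberships(X_train, y_train, x, k=3, m=2.0, eps=1e-12):
    """Return (classes, memberships) for query x under a fuzzy k-NN rule.

    Hypothetical helper for illustration; assumes crisp (0/1) training labels.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)

    # Euclidean distance from the query to every training sample.
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]

    # Inverse-distance weights; the fuzzifier m > 1 sets how fast influence decays.
    weights = 1.0 / (dists[nearest] ** (2.0 / (m - 1.0)) + eps)

    # Membership in each class is the normalized weight of neighbors from that class.
    memberships = np.array(
        [weights[y_train[nearest] == c].sum() for c in classes]
    )
    return classes, memberships / weights.sum()
```

Under these assumptions, calling `fuzzy_knn_memberships([[0, 0], [0, 1], [5, 5], [5, 6]], [0, 0, 1, 1], [0.2, 0.1])` returns memberships that sum to 1, with the larger share assigned to class 0. Unlike plain k-NN, the output is a graded membership per class rather than a single vote.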
Affiliation(s)
- Ying Zhou
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Pengchen Lin
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Lijing Xia
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Ali Asghar Heidari
  - School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Yi Chen
  - Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou 325035, China
- Lei Liu
  - College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
- Huiling Chen
  - College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325035, P.R. China
- Chengye Li
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Yuping Li
  - Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
2
Borah K, Das HS, Seth S, Mallick K, Rahaman Z, Mallik S. A review on advancements in feature selection and feature extraction for high-dimensional NGS data analysis. Funct Integr Genomics 2024; 24:139. [PMID: 39158621 DOI: 10.1007/s10142-024-01415-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Revised: 07/30/2024] [Accepted: 08/01/2024] [Indexed: 08/20/2024]
Abstract
Recent advancements in biomedical technologies and the proliferation of high-dimensional Next Generation Sequencing (NGS) datasets have led to significant growth in the bulk and density of data. High-dimensional NGS data, characterized by a large number of genomics, transcriptomics, proteomics, and metagenomics features relative to the number of biological samples, presents significant challenges for reducing feature dimensionality. The high dimensionality of NGS data also poses significant challenges for data analysis, including increased computational burden, potential overfitting, and difficulty in interpreting results. Feature selection and feature extraction are two pivotal techniques employed to address these challenges by reducing the dimensionality of the data, thereby enhancing model performance, interpretability, and computational efficiency. Both can be categorized into statistical and machine learning methods. The present study conducts a comprehensive and comparative review of various statistical, machine learning, and deep learning-based feature selection and extraction techniques specifically tailored for the interpretation of human NGS and microarray data. A thorough literature search was performed to gather information on these techniques, focusing on array-based and NGS data analysis. Various techniques, including deep learning architectures, machine learning algorithms, and statistical methods, have been explored for the microarray, bulk RNA-Seq, and single-cell RNA-Seq (scRNA-Seq) datasets surveyed here. The study provides an overview of these techniques, highlighting their applications, advantages, and limitations in the context of high-dimensional NGS data. This review provides better insights for readers to apply feature selection and feature extraction techniques to enhance the performance of predictive models, uncover underlying biological patterns, and gain deeper insights into massive and complex NGS and microarray data.
Affiliation(s)
- Kasmika Borah
  - Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India
- Himanish Shekhar Das
  - Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India
- Soumita Seth
  - Department of Computer Science and Engineering, Future Institute of Engineering and Management, Narendrapur, Kolkata, 700150, West Bengal, India
- Koushik Mallick
  - Department of Computer Science and Engineering, RCC Institute of Information Technology, Canal S Rd, Beleghata, Kolkata, 700015, West Bengal, India
- Saurav Mallik
  - Department of Environmental Health, Harvard T H Chan School of Public Health, Boston, MA, 02115, USA
  - Department of Pharmacology & Toxicology, University of Arizona, Tucson, AZ, 85721, USA
3
Liang HW, Ameri R, Band S, Chen HS, Ho SY, Zaidan B, Chang KC, Chang A. Fall risk classification with posturographic parameters in community-dwelling older adults: a machine learning and explainable artificial intelligence approach. J Neuroeng Rehabil 2024; 21:15. [PMID: 38287415 PMCID: PMC10826018 DOI: 10.1186/s12984-024-01310-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 01/24/2024] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND: Computerized posturography obtained in standing conditions has been applied to classify fall risk for older adults or disease groups. Combining machine learning (ML) approaches is superior to traditional regression analysis for its ability to handle complex data regarding its characteristics of being high-dimensional, non-linear, and highly correlated. The study goal was to use ML algorithms to classify fall risks in community-dwelling older adults with the aid of an explainable artificial intelligence (XAI) approach to increase interpretability.
METHODS: A total of 215 participants were included for analysis. The input information included personal metrics and posturographic parameters obtained from a tracker-based posturography of four standing postures. Two classification criteria were used: with a previous history of falls and the timed-up-and-go (TUG) test. We used three meta-heuristic methods for feature selection to handle the large numbers of parameters and improve efficacy, and the SHapley Additive exPlanations (SHAP) method was used to display the weights of the selected features on the model.
RESULTS: The results showed that posturographic parameters could classify the participants with TUG scores higher or lower than 10 s but were less effective in classifying fall risk according to previous fall history. Feature selections improved the accuracy with the TUG as the classification label, and the Slime Mould Algorithm had the best performance (accuracy: 0.72 to 0.77, area under the curve: 0.80 to 0.90). In contrast, feature selection did not improve the model performance significantly with the previous fall history as a classification label. The SHAP values also helped to display the importance of different features in the model.
CONCLUSION: Posturographic parameters in standing can be used to classify fall risks with high accuracy based on the TUG scores in community-dwelling older adults. Using feature selection improves the model's performance. The results highlight the potential utility of ML algorithms and XAI to provide guidance for developing more robust and accurate fall classification models.
TRIAL REGISTRATION: Not applicable.
Affiliation(s)
- Huey-Wen Liang
  - Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital and College of Medicine, Taipei, Taiwan, ROC
- Rasoul Ameri
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
- Shahab Band
  - International Graduate School of Artificial Intelligence, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
  - Future Technology Research Center, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
- Hsin-Shui Chen
  - Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital Yulin Branch, Douliu, Taiwan, ROC
- Sung-Yu Ho
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
- Bilal Zaidan
  - International Graduate School of Artificial Intelligence, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
  - SP Jain School of Global Management, Sydney, Australia
- Kai-Chieh Chang
  - Department of Neurology, National Taiwan University Hospital Yulin Branch, Douliu, Taiwan, ROC
- Arthur Chang
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu, Taiwan, ROC
4
Wei Y, Othman Z, Daud KM, Luo Q, Zhou Y. Advances in Slime Mould Algorithm: A Comprehensive Survey. Biomimetics (Basel) 2024; 9:31. [PMID: 38248605 PMCID: PMC10813181 DOI: 10.3390/biomimetics9010031] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2023] [Revised: 10/15/2023] [Accepted: 10/16/2023] [Indexed: 01/23/2024] Open
Abstract
The slime mould algorithm (SMA) is a new swarm intelligence algorithm inspired by the oscillatory behavior of slime moulds during foraging. Numerous researchers have applied the SMA and its variants in various domains and demonstrated its value in the literature. In this paper, a comprehensive review of the SMA is presented, based on 130 articles obtained from Google Scholar between 2022 and 2023. In this study, firstly, the SMA theory is described. Secondly, the improved SMA variants are provided and categorized according to the approach used to apply them. Finally, we also discuss the main application domains of the SMA, such as engineering optimization, energy optimization, machine learning, network, scheduling optimization, and image segmentation. This review presents some research suggestions for researchers interested in this algorithm, such as conducting additional research on multi-objective and discrete SMAs and extending this to neural networks and extreme learning machines.
Affiliation(s)
- Yuanfei Wei
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
  - Xiangsihu College, Guangxi Minzu University, Nanning 530225, China
- Zalinda Othman
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Kauthar Mohd Daud
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Qifang Luo
  - College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
  - Guangxi Key Laboratories of Hybrid Computation and IC Design Analysis, Nanning 530006, China
- Yongquan Zhou
  - Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
  - Xiangsihu College, Guangxi Minzu University, Nanning 530225, China
  - College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China
5
Chen Z, Xinxian L, Guo R, Zhang L, Dhahbi S, Bourouis S, Liu L, Wang X. Dispersed differential hunger games search for high dimensional gene data feature selection. Comput Biol Med 2023; 163:107197. [PMID: 37390761 DOI: 10.1016/j.compbiomed.2023.107197] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/08/2023] [Revised: 06/08/2023] [Accepted: 06/19/2023] [Indexed: 07/02/2023]
Abstract
The realms of modern medicine and biology have provided substantial data sets of genetic origin that exhibit high dimensionality. Clinical practice and associated processes are primarily dependent on data-driven decision-making. However, the high dimensionality of the data in these domains increases the complexity and size of processing. It can be challenging to determine representative genes while reducing the data's dimensionality. A successful gene selection will serve to mitigate the computing costs and refine the accuracy of the classification by eliminating superfluous or duplicative features. To address this concern, this research suggests a wrapper gene selection approach based on the HGS, combined with a dispersed foraging strategy and a differential evolution strategy, to form a new algorithm named DDHGS. Introducing the DDHGS algorithm to the global optimization field and its binary derivative bDDHGS to the feature selection problem is anticipated to refine the existing search balance between explorative and exploitative cores. We assess and confirm the efficacy of our proposed method, DDHGS, by comparing it with DE and HGS combined with a single strategy, seven classic algorithms, and ten advanced algorithms on the IEEE CEC 2017 test suite. Furthermore, to evaluate DDHGS's performance, we compare it with several CEC winners and DE-based techniques of great efficiency on 23 popular optimization functions and the IEEE CEC 2014 benchmark test suite. The experiments showed that the bDDHGS approach was able to surpass bHGS and a variety of existing methods when applied to fourteen feature selection datasets from the UCI repository. The metrics measured (classification accuracy, the number of selected features, fitness scores, and execution time) all showed marked improvements with the use of bDDHGS. Considering all results, it can be concluded that bDDHGS is an optimal optimizer and an effective feature selection tool in the wrapper mode.
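The wrapper-mode feature selection described in this abstract can be illustrated with a generic fitness function that a binary optimizer would minimize. The sketch below is not the bDDHGS objective from the paper: the weighting `alpha=0.99`, the leave-one-out 1-nearest-neighbor classifier, and the name `wrapper_fitness` are assumptions chosen only to show how classification error and subset size are commonly traded off.

```python
import numpy as np


def wrapper_fitness(mask, X, y, alpha=0.99):
    """Score a binary feature mask; lower is better.

    Hypothetical objective: alpha * error + (1 - alpha) * fraction of features kept.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 1.0  # an empty subset cannot classify; treat it as the worst case

    Xs = np.asarray(X, dtype=float)[:, mask]
    y = np.asarray(y)

    # Pairwise squared distances on the selected features only.
    d = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(axis=2)
    # Leave-one-out: a sample must not be its own nearest neighbor.
    np.fill_diagonal(d, np.inf)

    error = np.mean(y[d.argmin(axis=1)] != y)
    return alpha * error + (1.0 - alpha) * mask.mean()
```

With a toy matrix whose first column separates the classes and whose second column is noise, the mask `[1, 0]` scores lower than `[0, 1]`, which is the signal a binary metaheuristic would follow when searching over masks.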
Affiliation(s)
- Zhiqing Chen
  - School of Intelligent Manufacturing, Wenzhou Polytechnic, Wenzhou, 325035, China
- Li Xinxian
  - Wenzhou Vocational College of Science and Technology, Wenzhou, 325006, China
- Ran Guo
  - Cyberspace Institute Advanced Technology, Guangzhou University, Guangzhou, 510006, China
- Lejun Zhang
  - Cyberspace Institute Advanced Technology, Guangzhou University, Guangzhou, 510006, China
  - College of Information Engineering, Yangzhou University, Yangzhou, 225127, China
  - Research and Development Center for E-Learning, Ministry of Education, Beijing, 100039, China
- Sami Dhahbi
  - Department of Computer Science, College of Science and Art at Mahayil, King Khalid University, Muhayil, Aseer, 62529, Saudi Arabia
- Sami Bourouis
  - Department of Information Technology, College of Computers and Information Technology, Taif University, P.O.Box 11099, Taif, 21944, Saudi Arabia
- Lei Liu
  - College of Computer Science, Sichuan University, Chengdu, Sichuan, 610065, China
- Xianchuan Wang
  - Information Technology Center, Wenzhou Medical University, Wenzhou, 325035, China