1. Malik S, Patro SGK, Mahanty C, Kumar S, Lasisi A, Naveed QN, Kulkarni A, Buradi A, Emma AF, Kraiem N. Hybrid metaheuristic optimization for detecting and diagnosing noncommunicable diseases. Sci Rep 2025;15:7816. PMID: 40050658; PMCID: PMC11885463; DOI: 10.1038/s41598-025-91136-3.
Abstract
In our data-driven world, the healthcare sector faces significant challenges in the early detection and management of Non-Communicable Diseases (NCDs). The COVID-19 pandemic has further emphasized the need for effective tools to predict and treat NCDs, especially in individuals at risk. This research addresses these pressing concerns by proposing a comprehensive framework that combines advanced data mining techniques, feature selection, and meta-heuristic optimization. The proposed framework introduces novel hybrid algorithms, including the Hierarchical Genetic Multiple Reduct Selection Algorithm (H-GMRA) and the Customized Function-based Particle Swarm Optimization with Rough Set Theory for NCD Feature Selection (CPSO-RST-NFS). These algorithms aim to address the challenges of feature selection, computational complexity, and disease classification accuracy. H-GMRA outperforms traditional methods by identifying minimal feature sets with high dependency ratios. CPSO-RST-NFS combines meta-heuristic optimization with feature selection, resulting in improved efficiency and accuracy. Through extensive experimentation on diverse NCD datasets, this research demonstrates the framework's ability to select informative features, improve classification accuracy, and contribute to better patient outcomes. By bridging the gap between computational efficiency and disease classification accuracy, this work offers valuable insights for healthcare practitioners and data analysts, ultimately advancing the field of NCD research. The proposed framework presents a significant step towards enhancing the early detection and management of NCDs, offering hope for more precise clinical predictions and improved patient care.
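The abstract describes hybridizing swarm optimization with rough set theory for feature selection but does not reproduce the algorithms themselves. The rough-set quantity such hybrids typically optimize is the dependency degree γ_B(D): the fraction of objects whose equivalence class under a candidate feature subset B is consistent with the decision label (the positive region). A minimal illustrative sketch of that fitness function in Python — not the paper's H-GMRA or CPSO-RST-NFS, and the toy data below is hypothetical:

```python
from collections import defaultdict

def dependency_degree(rows, labels, feature_idx):
    """Rough-set dependency gamma_B(D): the fraction of rows whose
    equivalence class under the features in feature_idx is label-consistent
    (i.e. lies in the positive region of the decision attribute)."""
    groups = defaultdict(set)
    for row, lab in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        groups[key].add(lab)
    consistent = sum(
        1 for row, lab in zip(rows, labels)
        if len(groups[tuple(row[i] for i in feature_idx)]) == 1
    )
    return consistent / len(rows)

# Hypothetical 4-object, 2-attribute decision table.
rows = [(0, 1), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
print(dependency_degree(rows, labels, [0]))  # attribute 0 alone → 1.0
print(dependency_degree(rows, labels, [1]))  # attribute 1 alone → 0.25
```

A metaheuristic such as PSO then searches over subsets `feature_idx`, rewarding high dependency with few attributes.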
Affiliation(s)
- Saleem Malik
- Department of Computer Science and Engineering, P A College of Engineering, Mangalore, Karnataka, India.
- Chandrakanta Mahanty
- Department of Computer Science & Engineering, GITAM School of Technology, GITAM Deemed to Be University, Visakhapatnam, 530045, India.
- Saravanapriya Kumar
- Department of MCA, Sacred Heart College (Autonomous), Tirupattur, 635601, Tamil Nadu, India.
- Ayodele Lasisi
- Department of Computer Science, College of Computer Science, King Khalid University, Abha, Saudi Arabia.
- Quadri Noorulhasan Naveed
- Department of Computer Science, College of Computer Science, King Khalid University, Abha, Saudi Arabia.
- Anjanabhargavi Kulkarni
- Department of Computer Science and Engineering, Visvesvaraya Technological University, Belagavi, India.
- Addisu Frinjo Emma
- College of Engineering and Technology, School of Mechanical and Automotive Engineering, Dilla University, Gedeo Zone, South Ethiopia Regional State, Po Box 419, Dilla, Ethiopia.
- Naoufel Kraiem
- Department of Computer Science, College of Computer Science, King Khalid University, Abha, Saudi Arabia.
2. Luo C, Wang S, Li T, Chen H, Lv J, Yi Z. Large-Scale Meta-Heuristic Feature Selection Based on BPSO Assisted Rough Hypercuboid Approach. IEEE Transactions on Neural Networks and Learning Systems 2023;34:10889-10903. PMID: 35552142; DOI: 10.1109/TNNLS.2022.3171614.
Abstract
The selection of prominent features for building more compact and efficient models is an important data preprocessing task in the field of data mining. The rough hypercuboid approach is an emerging technique that can be applied to eliminate irrelevant and redundant features, especially for the inexactness problem in approximate numerical classification. By integrating the meta-heuristic-based evolutionary search technique, a novel global search method for numerical feature selection is proposed in this article based on the hybridization of the rough hypercuboid approach and binary particle swarm optimization (BPSO) algorithm, namely RH-BPSO. To further alleviate the issue of high computational cost when processing large-scale datasets, parallelization approaches for calculating the hybrid feature evaluation criteria are presented by decomposing and recombining hypercuboid equivalence partition matrix via horizontal data partitioning. A distributed meta-heuristic optimized rough hypercuboid feature selection (DiRH-BPSO) algorithm is thus developed and embedded in the Apache Spark cloud computing model. Extensive experimental results indicate that RH-BPSO is promising and can significantly outperform the other representative feature selection algorithms in terms of classification accuracy, the cardinality of the selected feature subset, and execution efficiency. Moreover, experiments on distributed-memory multicore clusters show that DiRH-BPSO is significantly faster than its sequential counterpart and is perfectly capable of completing large-scale feature selection tasks that fail on a single node due to memory constraints. Parallel scalability and extensibility analysis also demonstrate that DiRH-BPSO could scale out and extend well with the growth of computational nodes and the volume of data.
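The search component of RH-BPSO is binary particle swarm optimization, in which each particle's bit vector encodes a feature subset. The paper's exact parameterization is not given here; what follows is the generic BPSO bit update with the standard sigmoid transfer function, with illustrative inertia and acceleration values assumed:

```python
import math
import random

def bpso_step(position, velocity, pbest, gbest,
              w=0.7, c1=1.5, c2=1.5, rng=random):
    """One binary-PSO update. Velocities are pulled toward the personal
    best (pbest) and global best (gbest) bit vectors; the sigmoid of each
    velocity gives the probability that the corresponding feature bit
    becomes 1 in the new position."""
    new_x, new_v = [], []
    for x, v, pb, gb in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        v = max(-6.0, min(6.0, v))  # clamp so the sigmoid stays responsive
        new_v.append(v)
        new_x.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0)
    return new_x, new_v

rng = random.Random(0)
x, v = bpso_step([0, 1, 0], [0.0, 0.0, 0.0], [1, 1, 0], [1, 0, 1], rng=rng)
```

In RH-BPSO the fitness evaluated for each bit vector comes from the rough hypercuboid criteria; the distributed variant parallelizes that evaluation over horizontally partitioned data in Spark.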
3. Topological reduction algorithm for relation systems. Soft Comput 2022. DOI: 10.1007/s00500-022-07431-y.
4. A Distributed Attribute Reduction Algorithm for High-Dimensional Data under the Spark Framework. Int J Comput Int Sys 2022. DOI: 10.1007/s44196-022-00076-7.
Abstract
Attribute reduction is an important issue in rough set theory. However, rough set theory-based attribute reduction algorithms need to be improved to handle high-dimensional data, and a distributed version of the attribute reduction algorithm is necessary to process big data effectively. The partitioning of the attribute space is an important research direction. In this paper, a distributed attribute reduction algorithm based on cosine similarity (DARCS) for high-dimensional data pre-processing under the Spark framework is proposed. First, to avoid repeated calculation over similar attributes, the algorithm gathers similar attributes into clusters based on a similarity measure. One attribute is then selected at random from each cluster as its representative, and these representatives form a candidate attribute subset for the subsequent reduction operation. At the same time, to improve computing efficiency, an improved method is introduced to calculate the attribute dependency in the divided sub-attribute space. Experiments on eight datasets show that, while avoiding the loss of critical information, DARCS improves reduction ability by 0.32% to 39.61% and computing efficiency by 31.32% to 93.79% compared to a distributed attribute reduction algorithm based on random partitioning of the attribute space.
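The clustering step the abstract describes — grouping attributes by cosine similarity and keeping one representative per cluster — can be sketched compactly. This is an illustrative single-pass greedy variant, not DARCS itself (the paper runs on Spark); the similarity threshold and the choice of the first cluster member as representative are assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two attribute columns."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_attributes(columns, threshold=0.95):
    """Greedy single-pass clustering: each attribute joins the first
    cluster whose representative (its first member) it matches above
    `threshold`; otherwise it starts a new cluster. The representatives
    form the candidate attribute subset for reduction."""
    clusters = []  # each cluster is a list of column indices
    for j, col in enumerate(columns):
        for cl in clusters:
            if cosine(col, columns[cl[0]]) >= threshold:
                cl.append(j)
                break
        else:
            clusters.append([j])
    return [cl[0] for cl in clusters], clusters

# Columns 0 and 1 are proportional (cosine 1.0), column 2 is orthogonal.
cols = [[1, 2, 3], [2, 4, 6], [3, 0, -1]]
reps, clusters = cluster_attributes(cols)  # reps == [0, 2]
```

The payoff is that only the representatives enter the expensive dependency computation, which is where the reported efficiency gains come from.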
5. Using Rough Set Theory to Find Minimal Log with Rule Generation. Symmetry (Basel) 2021. DOI: 10.3390/sym13101906.
Abstract
Data pre-processing is a major difficulty in the knowledge discovery process, especially feature selection on large amounts of data. Various approaches have been suggested in the literature to overcome this difficulty. Unlike most approaches, Rough Set Theory (RST) can discover data dependency and reduce the attributes without requiring further information. In RST, the discernibility matrix is the mathematical foundation for computing such reducts. Although it has proved efficient for feature selection, it is unfortunately computationally expensive on high-dimensional data: the complexity comes from the search for the minimal subset of attributes, which requires evaluating an exponential number of possible subsets. Many RST enhancements have been proposed to overcome this limitation. In contrast to recent methods, this paper implements RST concepts in an iterated manner using the R language. First, the dataset was partitioned into a small number of subsets, and each subset was processed independently to generate its own minimal attribute set. Within the iterations, only minimal elements in the discernibility matrix were considered. Finally, the iterated outputs were compared, and the attributes common to all reducts formed the minimal one (the core attributes). A comparison with another recently proposed algorithm was performed on three benchmark datasets. The proposed approach showed its efficiency by calculating the same minimal attribute sets in less execution time.
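The discernibility matrix and core attributes the abstract relies on are standard RST constructions: each matrix entry lists the attributes that distinguish a pair of objects with different decisions, and any singleton entry names an indispensable (core) attribute, since that attribute alone separates some pair. The paper works in R; a compact Python sketch of the same idea on a hypothetical decision table:

```python
from itertools import combinations

def discernibility_matrix(rows, labels):
    """One entry per pair of objects with different decision labels:
    the set of attribute indices on which the two objects differ."""
    n_attr = len(rows[0])
    entries = []
    for (r1, l1), (r2, l2) in combinations(zip(rows, labels), 2):
        if l1 != l2:
            entries.append(frozenset(a for a in range(n_attr)
                                     if r1[a] != r2[a]))
    return entries

def core_attributes(entries):
    """Singleton entries mark indispensable attributes: each belongs to
    every reduct, so together they form the core."""
    return {next(iter(e)) for e in entries if len(e) == 1}

# Hypothetical 3-object, 2-attribute decision table.
rows = [(0, 0), (0, 1), (1, 1)]
labels = [0, 1, 1]
matrix = discernibility_matrix(rows, labels)
print(core_attributes(matrix))  # {1}: attribute 1 alone separates rows 0 and 1
```

The paper's iterated scheme applies this per data partition, keeps only minimal matrix elements, and intersects the per-partition reducts to obtain the core.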
6. Determination of the risk propagation path of cascading faults in chemical material networks based on complex networks. Can J Chem Eng 2021. DOI: 10.1002/cjce.24011.