1
Khan M, Hooda BK, Gaur A, Singh V, Jindal Y, Tanwar H, Sharma S, Sheoran S, Vishwakarma DK, Khalid M, Albakri GS, Alreshidi MA, Choi JR, Yadav KK. Ensemble and optimization algorithm in support vector machines for classification of wheat genotypes. Sci Rep 2024; 14:22728. [PMID: 39349934 PMCID: PMC11442772 DOI: 10.1038/s41598-024-72056-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Accepted: 09/03/2024] [Indexed: 10/04/2024]
Abstract
This study aimed to classify wheat genotypes using support vector machines (SVMs) improved with ensemble algorithms and optimization techniques. Data from 302 wheat genotypes and 14 morphological attributes were used to evaluate six SVM kernels: linear, radial basis function (RBF), sigmoid, and polynomial of degrees 1-3. Various optimization methods, including grid search, random search, genetic algorithms, differential evolution, and particle swarm optimization, were used. The RBF kernel achieved the highest accuracy at 93.2%, and the weighted accuracy ensemble further improved it to 94.9%. This study shows the effectiveness of these methods in agricultural research and crop improvement. Notably, optimization-based SVM classification, particularly with particle swarm optimization, produced a significant accuracy gain of 1.7 percentage points on the test set, reaching 94.9% accuracy. These findings underscore the efficacy of RBF kernels and optimization techniques in improving wheat genotype classification accuracy and highlight the potential of SVMs in agricultural research and crop improvement endeavors.
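The abstract does not spell out how the weighted accuracy ensemble combines the kernels. One common reading, sketched here as an assumption rather than as the authors' method, is a vote in which each classifier's label counts in proportion to its validation accuracy. The function names `weighted_vote` and `ensemble_predict` and the sample labels are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, accuracies):
    """Pick one label for a single sample from several models' votes.

    Assumption: each model's vote is weighted by its validation accuracy.
    """
    if len(predictions) != len(accuracies):
        raise ValueError("need exactly one accuracy per model")
    scores = defaultdict(float)
    for label, accuracy in zip(predictions, accuracies):
        scores[label] += accuracy
    # Iterate labels in sorted order so that ties resolve deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


def ensemble_predict(model_predictions, accuracies):
    """Combine predictions for many samples.

    model_predictions: one list of labels per model, all of equal length.
    """
    return [
        weighted_vote(sample_votes, accuracies)
        for sample_votes in zip(*model_predictions)
    ]


if __name__ == "__main__":
    # Hypothetical outputs of three SVM kernels on three test samples.
    per_model = [
        ["A", "B", "C"],  # e.g. RBF kernel
        ["A", "C", "C"],  # e.g. linear kernel
        ["B", "C", "B"],  # e.g. sigmoid kernel
    ]
    print(ensemble_predict(per_model, [0.93, 0.90, 0.85]))
```

With these hypothetical inputs the combined labels are `['A', 'C', 'C']`. Weighting by accuracy lets a single strong model outvote two weak ones, which a plain majority vote cannot do.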
Affiliation(s)
- Mujahid Khan
  - Agricultural Research Station (SKNAU, Jobner), Fatehpur-Shekhawati, Sikar, 332301, India
  - Department of Mathematics and Statistics, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- B K Hooda
  - Department of Mathematics and Statistics, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Arpit Gaur
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
  - ICAR-Indian Institute of Wheat and Barley, Karnal, Haryana, 132001, India
- Vikram Singh
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Yogesh Jindal
  - Department of Genetics and Plant Breeding, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Hemender Tanwar
  - Department of Seed Science and Technology, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Sushma Sharma
  - Department of Seed Science and Technology, Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Sonia Sheoran
  - ICAR-Indian Institute of Wheat and Barley, Karnal, Haryana, 132001, India
- Dinesh Kumar Vishwakarma
  - Department of Irrigation and Drainage Engineering, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Udham Singh Nagar, Uttarakhand, 263145, India
- Mohammad Khalid
  - Department of Pharmaceutics, College of Pharmacy, King Khalid University, 61421, Abha, Asir, Saudi Arabia
- Ghadah Shukri Albakri
  - Department of Teaching and Learning, College of Education and Human Development, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia
- Jeong Ryeol Choi
  - School of Electronic Engineering, Kyonggi University, Yeongtong-gu, Suwon, Gyeonggi-do, 16227, Republic of Korea
- Krishna Kumar Yadav
  - Department of Environmental Science, Parul Institute of Applied Sciences, Parul University, Vadodara, Gujarat, 391760, India
  - Environmental and Atmospheric Sciences Research Group, Scientific Research Center, Al-Ayen University, Nasiriyah, Thi-Qar, 64001, Iraq
2
Yasmin A, Haider Butt W, Daud A. Ensemble effort estimation with metaheuristic hyperparameters and weight optimization for achieving accuracy. PLoS One 2024; 19:e0300296. [PMID: 38573895 PMCID: PMC10994292 DOI: 10.1371/journal.pone.0300296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 02/24/2024] [Indexed: 04/06/2024]
Abstract
Software development effort estimation (SDEE) is recognized as a vital activity for effective project management, since under- or overestimating can lead to unsuccessful utilization of project resources. Machine learning (ML) algorithms contribute substantially to the SDEE domain; in particular, ensemble effort estimation (EEE) works well in rectifying the bias and subjectivity of solo ML learners. The performance of EEE depends significantly on the hyperparameter composition as well as the weight assignment mechanism of the solo learners. However, in the EEE domain, few researchers have explored the impact of optimization in terms of hyperparameter tuning and weight assignment. This study aims to improve SDEE performance by incorporating metaheuristic hyperparameter and weight optimization in EEE, which brings accuracy and diversity to the ensemble model. The study proposes the Metaheuristic-optimized Multi-dimensional bagging scheme and Weighted Ensemble (MoMdbWE) approach. This is achieved through a proposed search space division and hyperparameter optimization method named Multi-dimensional bagging (Mdb). The metaheuristic algorithm considered for this work is the Firefly algorithm (FFA), used to obtain the best hyperparameters of three base ML algorithms (Random Forest, Support Vector Machine, and Deep Neural Network), since FFA has shown promising fitness results in terms of MAE. Further enhancement in performance is achieved by incorporating FFA-based weight optimization to construct a Metaheuristic-optimized weighted ensemble (MoWE) of individual multi-dimensional bagging schemes. The proposed scheme is implemented on eight frequently utilized effort estimation datasets, and results are evaluated by 5 error metrics (MAE, RMSE, MMRE, MdMRE, Pred), standard accuracy, and effect size, along with the Wilcox statistical test. The findings confirmed that the use of FFA optimization for hyperparameters (with search space sub-division) and for ensemble weights significantly enhanced performance in comparison with individual base algorithms as well as other homogeneous and heterogeneous EEE techniques.
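The five error metrics named in the abstract have standard definitions in the effort-estimation literature, and a weighted ensemble is a weighted average of the base learners' estimates. The sketch below illustrates both under stated assumptions: the weights are supplied by hand, whereas the study searches for them with FFA (not implemented here); Pred is taken at the conventional 25% level; and all actual efforts are assumed to be positive. The function names and the sample numbers are hypothetical.

```python
import math
from statistics import mean, median


def weighted_ensemble(model_predictions, weights):
    """Weighted average of per-model estimates; weights are normalized to sum to 1."""
    total = sum(weights)
    if total <= 0 or len(weights) != len(model_predictions):
        raise ValueError("need one positive-sum weight per model")
    norm = [w / total for w in weights]
    return [
        sum(w * p for w, p in zip(norm, sample))
        for sample in zip(*model_predictions)
    ]


def effort_metrics(actual, predicted, pred_level=0.25):
    """MAE, RMSE, MMRE, MdMRE and Pred(pred_level) for effort estimates."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    # Magnitude of relative error; assumes every actual effort is > 0.
    mre = [e / a for e, a in zip(abs_err, actual)]
    return {
        "MAE": mean(abs_err),
        "RMSE": math.sqrt(mean([e * e for e in abs_err])),
        "MMRE": mean(mre),
        "MdMRE": median(mre),
        "Pred": sum(1 for m in mre if m <= pred_level) / len(mre),
    }


if __name__ == "__main__":
    actual = [100.0, 200.0, 400.0]          # hypothetical efforts
    model_a = [110.0, 180.0, 400.0]         # hypothetical base learner
    model_b = [90.0, 220.0, 300.0]          # hypothetical base learner
    combined = weighted_ensemble([model_a, model_b], [3.0, 1.0])
    print(combined)
    print(effort_metrics(actual, combined))
```

In a metaheuristic setting, a function like `effort_metrics(...)["MAE"]` would serve as the fitness that the weight search tries to minimize.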
Affiliation(s)
- Anum Yasmin
  - Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Wasi Haider Butt
  - Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Ali Daud
  - Faculty of Resilience, Rabdan Academy, Abu Dhabi, United Arab Emirates
3
Yang W, Nie Q, Sun Y, Zou D, Tang J, Wang M. Early prediction of atherosclerosis diagnosis with medical ambient intelligence. Front Physiol 2023; 14:1225636. [PMID: 37546535 PMCID: PMC10398961 DOI: 10.3389/fphys.2023.1225636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 06/30/2023] [Indexed: 08/08/2023]
Abstract
Atherosclerosis is a chronic vascular disease that poses a significant threat to human health. Common diagnostic methods mainly rely on active screening, which often misses the opportunity for early detection. To overcome this problem, this paper presents a novel medical ambient intelligence system for the early detection of atherosclerosis by leveraging clinical data from medical records. The system architecture includes clinical data extraction, transformation, normalization, feature selection, medical ambient computation, and predictive generation. However, the heterogeneity of examination items from different patients can degrade prediction performance. To enhance prediction performance, the "SEcond-order Classifier (SEC)" is proposed to undertake the medical ambient computation task. The first-order component and the second-order cross-feature component are then combined and applied to the selected feature matrix to learn the associations among the physical examination data. Finally, the prediction is produced by aggregating the representations. Extensive experimental results reveal that the proposed method's diagnostic prediction performance is superior to other state-of-the-art methods. Specifically, the Vitamin B12 indicator exhibits the strongest correlation with the early stage of atherosclerosis, while several known relevant biomarkers also demonstrate significant correlation in experimental data. The method proposed in this paper is a standalone tool, and its source code will be released in the future.
Affiliation(s)
- Jinmo Tang (*correspondence)
- Min Wang (*correspondence)
4
Neshat M, Lee S, Momin MM, Truong B, van der Werf JHJ, Lee SH. An effective hyper-parameter can increase the prediction accuracy in a single-step genetic evaluation. Front Genet 2023; 14:1104906. [PMID: 37359380 PMCID: PMC10285379 DOI: 10.3389/fgene.2023.1104906] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Accepted: 05/23/2023] [Indexed: 06/28/2023]
Abstract
The H-matrix best linear unbiased prediction (HBLUP) method has been widely used in livestock breeding programs. It can integrate all information, including pedigree, genotypes, and phenotypes on both genotyped and non-genotyped individuals into one single evaluation that can provide reliable predictions of breeding values. The existing HBLUP method requires hyper-parameters that should be adequately optimised as otherwise the genomic prediction accuracy may decrease. In this study, we assess the performance of HBLUP using various hyper-parameters such as blending, tuning, and scale factor in simulated and real data on Hanwoo cattle. In both simulated and cattle data, we show that blending is not necessary, indicating that the prediction accuracy decreases when using a blending hyper-parameter <1. The tuning process (adjusting genomic relationships accounting for base allele frequencies) improves prediction accuracy in the simulated data, confirming previous studies, although the improvement is not statistically significant in the Hanwoo cattle data. We also demonstrate that a scale factor, α, which determines the relationship between allele frequency and per-allele effect size, can improve the HBLUP accuracy in both simulated and real data. Our findings suggest that an optimal scale factor should be considered to increase prediction accuracy, in addition to blending and tuning processes, when using HBLUP.
Affiliation(s)
- Mehdi Neshat
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
- Soohyun Lee
  - Division of Animal Breeding and Genetics, National Institute of Animal Science (NIAS), Cheonan, Republic of Korea
- Md. Moksedul Momin
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
  - Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chattogram Veterinary and Animal Sciences University (CVASU), Chattogram, Bangladesh
- Buu Truong
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - Cardiovascular Research Centre, Massachusetts General Hospital, Boston, MA, United States
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States
  - Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, United States
- S. Hong Lee
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia
  - UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
  - South Australian Health and Medical Research Institute (SAHMRI), Adelaide, SA, Australia
5
6
Chen W, Yang K, Yu Z, Zhang W. Double-kernel based class-specific broad learning system for multiclass imbalance learning. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
7
Abstract
Onboard traction transformers are vital equipment in high-speed train power supply systems, and their failure affects the safe and stable operation of the trains. To diagnose faults in onboard traction transformers quickly and accurately, this paper proposes a hybrid optimization method that uses support vector machines (SVMs) as the fault diagnosis system, which can accurately locate and analyze faults. Considering the limitations of traditional approaches to identifying transformer faults, this study used kernel principal component analysis (KPCA) to analyze the feature quantity of dissolved gas analysis (DGA) data, electrical test data, and oil quality test data. The improved seagull optimization algorithm (ISOA) was used to optimize the SVM, and a Henon chaotic map was introduced to initialize the population. Combined with differential evolution (DE) based on the adaptive formula, the foraging formula of the seagull optimization algorithm (SOA) was improved to increase the diversity of the algorithm and enhance its ability to find the optimal parameters of the SVM, which made the simulation results more accurate. Finally, the KPCA–ADESOA–SVM model was constructed and applied to fault diagnosis for the traction transformer. The example analysis compared the diagnosis results of the proposed model with those of the traditional diagnosis model, showing further optimization of the feature quantity and improvements in diagnosis accuracy. This indicates that the proposed diagnosis model has high generalization performance and can effectively increase the accuracy and speed of fault diagnosis for traction transformers.
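The Henon-map initialization mentioned in the abstract can be illustrated with a short sketch. It assumes the classic map parameters a = 1.4 and b = 0.3, a start at the origin, a burn-in of 100 iterations, and min-max rescaling of the chaotic sequence into each search bound; none of these choices is given in the abstract. The function name `henon_population` and the example bounds for the SVM penalty `C` and kernel width `gamma` are hypothetical.

```python
def henon_population(pop_size, bounds, a=1.4, b=0.3, burn_in=100):
    """Build an initial population from a Henon chaotic sequence.

    bounds: list of (low, high) pairs, one per search dimension.
    Assumption: classic Henon parameters and min-max rescaling.
    """
    if pop_size < 1 or not bounds:
        raise ValueError("need pop_size >= 1 and at least one dimension")
    dim = len(bounds)
    x, y = 0.0, 0.0
    # Discard early iterates so the samples lie on the attractor.
    for _ in range(burn_in):
        x, y = 1.0 - a * x * x + y, b * x
    sequence = []
    for _ in range(pop_size * dim):
        x, y = 1.0 - a * x * x + y, b * x
        sequence.append(x)
    lowest, highest = min(sequence), max(sequence)
    span = highest - lowest
    # Rescale to [0, 1]; fall back to the midpoint for a degenerate sequence.
    unit = [(v - lowest) / span if span > 0 else 0.5 for v in sequence]
    population = []
    for i in range(pop_size):
        individual = []
        for j, (low, high) in enumerate(bounds):
            value = low + unit[i * dim + j] * (high - low)
            # Clamp to guard against floating-point overshoot at the edges.
            individual.append(min(high, max(low, value)))
        population.append(individual)
    return population


if __name__ == "__main__":
    # Hypothetical search ranges for the SVM parameters C and gamma.
    svm_bounds = [(0.01, 100.0), (0.0001, 10.0)]
    for candidate in henon_population(5, svm_bounds):
        print(candidate)
```

Unlike pseudo-random initialization, the chaotic sequence is fully deterministic for fixed parameters, so repeated runs start from the same candidates.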