1
|
Isla-Cernadas D, Fernandez-Delgado M, Cernadas E, Sirsat MS, Maarouf H, Barro S. Closed-Form Gaussian Spread Estimation for Small and Large Support Vector Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:4336-4344. [PMID: 40031077 DOI: 10.1109/tnnls.2024.3377370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
The support vector machine (SVM) with Gaussian kernel often achieves state-of-the-art performance in classification problems, but requires the tuning of the kernel spread. Most optimization methods for spread tuning require training, which makes them slow and unsuited to large-scale datasets. We formulate an analytic expression to calculate, directly from data without iterative search, the spread minimizing the difference between Gaussian and ideal kernel matrices. The proposed direct gamma tuning (DGT) equals the performance of the state-of-the-art approaches on 30 small datasets and is one to two orders of magnitude faster. Combined with random sampling of training patterns, it also runs on large classification problems. Our method is very efficient in experiments with 20 large datasets of up to 31 million patterns: it is faster and performs significantly better than linear SVM, and it is also faster than iterative minimization. Code is available upon paper acceptance from this link: https://persoal.citius.usc.es/manuel.fernandez.delgado/papers/dgt/index.html and from CodeOcean: https://codeocean.com/capsule/4271163/tree/v1.
|
2
|
Azzam SM, Emam OE, Abolaban AS. An improved Differential evolution with Sailfish optimizer (DESFO) for handling feature selection problem. Sci Rep 2024; 14:13517. [PMID: 38866847 PMCID: PMC11169489 DOI: 10.1038/s41598-024-63328-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 05/28/2024] [Indexed: 06/14/2024] Open
Abstract
As a preprocessing step for machine learning and data mining, feature selection plays an important role. Feature selection aims to streamline high-dimensional data by eliminating irrelevant and redundant features, which reduces the potential curse of dimensionality of a given large dataset. When working with datasets containing many features, algorithms that aim to identify the most valuable features to improve dataset accuracy may encounter difficulties because of local optima. Many studies have been conducted to solve this problem. One of the solutions is to use meta-heuristic techniques. This paper presents a combination of the Differential Evolution and the Sailfish Optimizer algorithms (DESFO) to tackle the feature selection problem. To assess the effectiveness of the proposed algorithm, a comparison between Differential Evolution, Sailfish Optimizer, and nine other modern algorithms, including different optimization algorithms, is presented. The evaluation used Random Forest and k-Nearest Neighbors as quality measures. The experimental results show that the proposed algorithm is superior to the others. It achieves high classification accuracy: 85.7% with the Random Forest classifier and 100% with the k-Nearest Neighbors classifier across 14 multi-scale benchmarks. According to fitness values, it gained 71% with the Random Forest and 85.7% with the k-Nearest Neighbors classifiers.
Affiliation(s)
- Safaa M Azzam
  - Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, P.O. Box 11795, Helwan, Egypt
- O E Emam
  - Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, P.O. Box 11795, Helwan, Egypt
- Ahmed Sabry Abolaban
  - Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, P.O. Box 11795, Helwan, Egypt
|
3
|
Jalloul R, Chethan HK, Alkhatib R. A Review of Machine Learning Techniques for the Classification and Detection of Breast Cancer from Medical Images. Diagnostics (Basel) 2023; 13:2460. [PMID: 37510204 PMCID: PMC10378151 DOI: 10.3390/diagnostics13142460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/02/2023] [Revised: 07/17/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023] Open
Abstract
Cancer is an incurable disease based on unregulated cell division. Breast cancer is the most prevalent cancer in women worldwide, and early detection can lower death rates. Medical images provide the best information for locating, identifying, and diagnosing breast cancer. This paper reviews the history of the discipline and examines how deep learning and machine learning are applied to detect breast cancer. It covers the classification of breast cancer using several medical imaging modalities. The classification systems of numerous medical imaging modalities for tumors, non-tumors, and dense masses are thoroughly explained. The differences between various medical image types are first examined using a variety of study datasets. Numerous machine learning and deep learning methods for diagnosing and classifying breast cancer are then presented. Finally, this review addresses the challenges of categorization and detection and the best results of different approaches.
Affiliation(s)
- Reem Jalloul
  - Maharaja Research Foundation, University of Mysore, Mysuru 570005, India
- H K Chethan
  - Department of Computer Science and Engineering, Maharaja Research Foundation, Maharaja Institute of Technology, Mysuru 570004, India
- Ramez Alkhatib
  - Biomaterial Bank Nord, Research Center Borstel Leibniz Lung Center, Parkallee 35, 23845 Borstel, Germany
|
4
|
Abd El-Mageed AA, Abohany AA, Elashry A. Effective Feature Selection Strategy for Supervised Classification based on an Improved Binary Aquila Optimization Algorithm. COMPUTERS & INDUSTRIAL ENGINEERING 2023; 181:109300. [DOI: 10.1016/j.cie.2023.109300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
|
5
|
Chen W, Wang Y, Tang X, Yan P, Liu X, Lin L, Shi G, Robert E, Huang F. A specific fine-grained identification model for plasma-treated rice growth using multiscale shortcut convolutional neural network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:10223-10243. [PMID: 37322930 DOI: 10.3934/mbe.2023448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
As an agricultural innovation, low-temperature plasma technology is an environmentally friendly green technology that increases crop quality and productivity. However, there is a lack of research on the identification of plasma-treated rice growth. Although traditional convolutional neural networks (CNN) can automatically share convolution kernels and extract features, the outputs are only suitable for entry-level categorization. Indeed, shortcuts from the bottom layers to fully connected layers can be established feasibly in order to utilize spatial and local information from the bottom layers, which contain small distinctions necessary for fine-grain identification. In this work, 5000 original images which contain the basic growth information of rice (including plasma treated rice and the control rice) at the tillering stage were collected. An efficient multiscale shortcut CNN (MSCNN) model utilizing key information and cross-layer features was proposed. The results show that MSCNN outperforms the mainstream models in terms of accuracy, recall, precision and F1 score with 92.64%, 90.87%, 92.88% and 92.69%, respectively. Finally, the ablation experiment, comparing the average precision of MSCNN with and without shortcuts, revealed that the MSCNN with three shortcuts achieved the best performance with the highest precision.
Affiliation(s)
- Wenzhuo Chen
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Yuan Wang
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Xiaojiang Tang
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Pengfei Yan
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Xin Liu
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Lianfeng Lin
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Guannan Shi
  - College of Information and Electrical Engineering, China Agricultural University, Beijing, 10083, China
- Eric Robert
  - GREMI, UMR 7344, CNRS/Université d'Orléans, 45067 Orléans Cedex France
- Feng Huang
  - College of Science, China Agricultural University, Beijing 100083, China
  - GREMI, UMR 7344, CNRS/Université d'Orléans, 45067 Orléans Cedex France
  - LE STUDIUM Loire Valley Institute for Advanced Studies, Centre-Val de Loire region, France
|
6
|
Aero engines remaining useful life prediction based on enhanced adaptive guided differential evolution. EVOLUTIONARY INTELLIGENCE 2022. [DOI: 10.1007/s12065-022-00805-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
|
7
|
Haridasan A, Thomas J, Raj ED. Deep learning system for paddy plant disease detection and classification. ENVIRONMENTAL MONITORING AND ASSESSMENT 2022; 195:120. [PMID: 36399232 DOI: 10.1007/s10661-022-10656-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/08/2021] [Accepted: 05/15/2022] [Indexed: 06/16/2023]
Abstract
Automatic detection and analysis of rice crop diseases is widely required in the farming industry: it can help avoid squandering financial and other resources, reduce yield losses, and improve treatment efficiency, resulting in healthier crop output. An automated approach was proposed for accurately detecting and classifying diseases from a supplied photograph. The proposed system for the recognition of rice plant diseases adopts a computer vision-based approach that employs the techniques of image processing, machine learning, and deep learning, reducing the reliance on conventional methods to protect paddy crops from diseases like bacterial leaf blight, false smut, brown leaf spot, rice blast, and sheath rot, the five primary diseases that frequently plague Indian rice fields. Following image pre-processing, image segmentation is employed to determine the diseased section of the paddy plant, with the diseases listed above being identified purely on the basis of their visual contents. An integration of a support vector machine classifier and convolutional neural networks is used to recognize and classify specific varieties of paddy plant diseases. With ReLU and softmax functions, the suggested deep learning-based strategy attained the highest validation accuracy of 0.9145. Following recognition, a predictive remedy is recommended, which can assist agriculture-related individuals and organizations in taking suitable measures to combat these diseases.
Affiliation(s)
- Amritha Haridasan
  - Department of Computer Science and Engineering, Indian Institute of Information Technology, Kottayam, Kerala, India
- Jeena Thomas
  - Department of Computer Science and Engineering, Indian Institute of Information Technology, Kottayam, Kerala, India
- Ebin Deni Raj
  - Department of Computer Science and Engineering, Indian Institute of Information Technology, Kottayam, Kerala, India
|
8
|
Akram-Ali-Hammouri Z, Fernandez-Delgado M, Cernadas E, Barro S. Fast Support Vector Classification for Large-Scale Problems. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:6184-6195. [PMID: 34077354 DOI: 10.1109/tpami.2021.3085969] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
The support vector machine (SVM) is a very important machine learning algorithm with state-of-the-art performance on many classification problems. However, on large datasets it is very slow and requires much memory. To solve this deficiency, we propose the fast support vector classifier (FSVC) that includes: 1) an efficient closed-form training free of any numerical iterative procedure; 2) a small collection of class prototypes that avoids storing an excessive number of support vectors in memory; and 3) a fast method that selects the spread of the radial basis function kernel directly from data, without classifier execution or iterative hyper-parameter tuning. The memory requirements of FSVC are very low, and it spends on average only 6·10⁻⁷ s per pattern, input and class, processing datasets of up to 31 million patterns, 30,000 inputs and 131 classes in less than 1.5 hours (less than 3 hours with only 2 GB of RAM). On average, the FSVC is 10 times faster, requires 12 times less memory and achieves 4.7 percent more performance than Liblinear, which fails on the 4 largest datasets for lack of memory; it is also 100 times faster than Libsvm while achieving only 6.7 percent less performance. The time spent by FSVC depends only on the dataset size and thus it can be accurately estimated for new datasets, while Libsvm or Liblinear are much slower on "difficult" datasets, even if they are small. The FSVC adjusts its requirements to the available memory, classifying large datasets on computers with limited memory. Code for the proposed algorithm in the Octave scientific programming language is provided.
|
9
|
Stock Price Prediction based on Data Mining Combination Model. JOURNAL OF GLOBAL INFORMATION MANAGEMENT 2022. [DOI: 10.4018/jgim.296707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Predicting stock indexes is a common concern in the financial world. This work uses neural network, support vector machine (SVM), mixed data sampling (MIDAS), and other methods in data mining technology to predict the daily closing price of the next 20 days and the monthly average closing price of the future expected daily closing price on the basis of the market performance of stock prices. Additionally, by the mutual ratio of weighted mean square error, the study achieves the best prediction result. Combining value investment effectively with nonlinear models, a complete stock forecasting model is established, and empirical research is conducted on it. Results indicate that SVM and MIDAS give good results for stock price forecasting. Among them, MIDAS has a better mid-term forecast, which is approximately 10% higher than the forecast accuracy of the SVM model; meanwhile, SVM is more accurate in the short-term forecast.
|
10
|
Akram-Ali-Hammouri Z, Fernández-Delgado M, Albtoush A, Cernadas E, Barro S. Ideal kernel tuning: Fast and scalable selection of the radial basis kernel spread for support vector classification. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.03.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
11
|
Ramp loss KNN-weighted multi-class twin support vector machine. Soft comput 2022. [DOI: 10.1007/s00500-022-07040-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
12
|
Abd El-Mageed AA, Gad AG, Sallam KM, Munasinghe K, Abohany AA. Improved Binary Adaptive Wind Driven Optimization Algorithm-Based Dimensionality Reduction for Supervised Classification. COMPUTERS & INDUSTRIAL ENGINEERING 2022; 167:107904. [DOI: 10.1016/j.cie.2021.107904] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 09/02/2023]
|
13
|
Abstract
Feature Selection (FS) is an important preprocessing step that is involved in machine learning and data mining tasks for preparing data (especially high-dimensional data) by eliminating irrelevant and redundant features, thus reducing the potential curse of dimensionality of a given large dataset. Consequently, FS is arguably a combinatorial NP-hard problem in which the computational time increases exponentially with an increase in problem complexity. To tackle such a problem type, meta-heuristic techniques have been adopted by an increasing number of scholars. Herein, a novel meta-heuristic algorithm, called Sparrow Search Algorithm (SSA), is presented. The SSA still performs poorly on exploratory behavior and exploration-exploitation trade-off because it does not duly stimulate the search within feasible regions, and the exploitation process suffers noticeable stagnation. Therefore, we improve SSA by adopting: i) a strategy for Random Re-positioning of Roaming Agents (3RA); and ii) a novel Local Search Algorithm (LSA), which are algorithmically incorporated into the original SSA structure. For the FS problem, SSA is improved and cloned as a binary variant, namely, the improved Binary SSA (iBSSA), which would strive to select the optimal or near-optimal features from a given dataset while keeping the classification accuracy maximized. For binary conversion, the iBSSA was primarily validated against nine common S-shaped and V-shaped Transfer Functions (TFs), thus producing nine iBSSA variants. To verify the robustness of these variants, three well-known classification techniques, including k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest (RF) were adopted as fitness evaluators with the proposed iBSSA approach and many other competing algorithms, on 18 multifaceted, multi-scale benchmark datasets from the University of California Irvine (UCI) data repository.
Then, the overall best-performing iBSSA variant for each of the three classifiers was compared with binary variants of 12 different well-known meta-heuristic algorithms, including the original SSA (BSSA), Artificial Bee Colony (BABC), Particle Swarm Optimization (BPSO), Bat Algorithm (BBA), Grey Wolf Optimization (BGWO), Whale Optimization Algorithm (BWOA), Grasshopper Optimization Algorithm (BGOA), SailFish Optimizer (BSFO), Harris Hawks Optimization (BHHO), Bird Swarm Algorithm (BBSA), Atom Search Optimization (BASO), and Henry Gas Solubility Optimization (BHGSO). Based on a Wilcoxon’s non-parametric statistical test (α = 0.05), the superiority of iBSSA with the three classifiers was very evident against counterparts across the vast majority of the selected datasets, achieving a feature size reduction of up to 92% along with up to 100% classification accuracy on some of those datasets.
|
14
|
Yang J, Tang Y, Duan H. Application of Fuzzy Support Vector Machine in Short-Term Power Load Forecasting. JOURNAL OF CASES ON INFORMATION TECHNOLOGY 2022. [DOI: 10.4018/jcit.295248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
The realization of short-term load forecasting is the basis of system planning and decision-making, and it is an important index for evaluating the safety and economy of the power grid. In order to accurately predict the power load under the influence of many factors, a new short-term power load prediction method is proposed, which combines the fuzzy support vector machine with linear extrapolation of similar days. The method first selects similar days according to the effect of integrated weather and time on load. Then the fuzzy membership of the training samples is obtained by normalization, and the daily maximum and minimum loads are predicted by the fuzzy support vector machine. Finally, the load prediction value is obtained by combining the load trend curve obtained by the similar-day linear extrapolation method. This method is feasible and effective for short-term forecasting of power load.
Affiliation(s)
- Jie Yang
  - College of Information Engineering, Hunan University of Science and Engineering, China
- Yachun Tang
  - College of Information Engineering, Hunan University of Science and Engineering, China
- Huabin Duan
  - College of Information Engineering, Hunan University of Science and Engineering, China
|
15
|
Wang J, Luo J. A fast parameter optimization approach based on the inter-cluster induced distance in the feature space for support vector machines. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108519] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
|
16
|
Composite Multivariate Multi-Scale Permutation Entropy and Laplacian Score Based Fault Diagnosis of Rolling Bearing. ENTROPY 2022; 24:e24020160. [PMID: 35205457 PMCID: PMC8870813 DOI: 10.3390/e24020160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2021] [Revised: 01/11/2022] [Accepted: 01/14/2022] [Indexed: 02/06/2023]
Abstract
As a powerful tool for measuring complexity and randomness, multivariate multi-scale permutation entropy (MMPE) has been widely applied to the feature representation and extraction of multi-channel signals. However, MMPE still has some intrinsic shortcomings in its coarse-graining procedure, and it lacks a precise estimation of the entropy value. To address these issues, this paper proposes a novel non-linear dynamic method named composite multivariate multi-scale permutation entropy (CMMPE), which improves the insufficient coarse-graining process in MMPE and thus avoids the loss of information. Simulated signals are used to verify the validity of CMMPE by comparing it with the often-used MMPE method. An intelligent fault diagnosis method is then put forward on the basis of CMMPE, Laplacian score (LS), and bat optimization algorithm-based support vector machine (BA-SVM). Finally, the proposed fault diagnosis method is utilized to analyze the test data of rolling bearings and is then compared with the MMPE, multivariate multi-scale fuzzy entropy (MMFE), and multi-scale permutation entropy (MPE) based fault diagnosis methods. The results indicate that the proposed fault diagnosis method for rolling bearings can achieve effective identification of fault categories and is superior to the comparative methods.
|
17
|
A novel hybrid approach of ABC with SCA for the parameter optimization of SVR in blind image quality assessment. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06435-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
|
18
|
|
19
|
Oza P, Sharma P, Patel S, Bruno A. A Bottom-Up Review of Image Analysis Methods for Suspicious Region Detection in Mammograms. J Imaging 2021; 7:190. [PMID: 34564116 PMCID: PMC8466003 DOI: 10.3390/jimaging7090190] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/09/2021] [Revised: 09/09/2021] [Accepted: 09/14/2021] [Indexed: 11/17/2022] Open
Abstract
Breast cancer is one of the most common causes of death among women all over the world. Early detection of breast cancer plays a critical role in increasing the survival rate. Various imaging modalities, such as mammography, breast MRI, ultrasound and thermography, are used to detect breast cancer. Though mammography has had considerable success in biomedical imaging, detecting suspicious areas remains a challenge: because of manual examination and variations in shape, size and other mass morphological features, mammography accuracy changes with the density of the breast. Furthermore, going through the analysis of many mammograms per day can be a tedious task for radiologists and practitioners. One of the main objectives of biomedical imaging is to provide radiologists and practitioners with tools to help them identify all suspicious regions in a given image. Computer-aided mass detection in mammograms can serve as a second opinion tool to help radiologists avoid running into oversight errors. The scientific community has made much progress in this topic, and several approaches have been proposed along the way. Following a bottom-up narrative, this paper surveys different scientific methodologies and techniques to detect suspicious regions in mammograms, spanning from methods based on low-level image features to the most recent novelties in AI-based approaches. Both theoretical and practical grounds are provided across the paper sections to highlight the pros and cons of different methodologies. The paper's main scope is to let readers embark on a journey through a fully comprehensive description of techniques, strategies and datasets on the topic.
Affiliation(s)
- Parita Oza
  - Computer Science and Engineering Department, School of Technology, Pandit Deendayal Energy University, Gandhinagar 382007, India
- Paawan Sharma
  - Computer Science and Engineering Department, School of Technology, Pandit Deendayal Energy University, Gandhinagar 382007, India
- Samir Patel
  - Computer Science and Engineering Department, School of Technology, Pandit Deendayal Energy University, Gandhinagar 382007, India
- Alessandro Bruno
  - Department of Computing and Informatics, Bournemouth University, Poole, Dorset BH12 5BB, UK
|
20
|
Dudzik W, Nalepa J, Kawulok M. Evolving data-adaptive support vector machines for binary classification. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107221] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
|
21
|
Liu J, Li Q, Chen Y, Wang B, Li Y, Xin Y. Automatic identification of respiratory events based on nasal airflow and respiratory effort of the chest and abdomen. Physiol Meas 2021; 42. [PMID: 33887711 DOI: 10.1088/1361-6579/abfae5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Accepted: 04/22/2021] [Indexed: 11/11/2022]
Abstract
Objective. Disease may cause changes in an individual's respiratory pattern, which can be measured as parameters for disease evaluation, usually through manually annotated polysomnographic recordings. In this study, a machine learning model based on nasal airflow and respiratory effort of the chest and abdomen is proposed to automatically identify respiratory events, including normal breathing, hypopnea and apnea. Approach. The nasal airflow and chest-abdominal respiratory effort signals were collected by polysomnography (PSG). Time/frequency domain features, fractional Fourier transform features and sample entropy were calculated to obtain feature sets. Features selected through statistical analysis were used as input variables of the machine learning model. The performance of different input combinations on different models was studied and cross-validated. Main results. The dataset included PSG sleep records of 60 patients provided by the Chinese People's Liberation Army General Hospital. The extreme gradient boosting-based model (XGBoost) performed best among several models, with an accuracy of 0.807 and an F1 score of 0.807, depending on the combination of nasal airflow and two respiratory effort signals. The precision scores for normal breathing, hypopnea and apnea events were 0.764, 0.789 and 0.871, respectively. In addition, the recall scores were 0.833, 0.768 and 0.823 for normal breathing, hypopnea and apnea events, respectively. Moreover, it was found that the standard deviation and kurtosis of nasal airflow were the most important features of the respiratory event detection model. Significance. Since nasal airflow and respiratory effort of the chest and abdomen contain the characteristics of respiratory events, their combined use can improve the classification performance for identification of respiratory events.
With this method, respiratory events can be automatically detected and labeled from the PSG records, which can be used to screen for patients with sleep apnea-hypopnea syndrome.
Affiliation(s)
- Juan Liu
  - School of Life Science, Beijing Institute of Technology, Beijing, People's Republic of China
- Qin Li
  - School of Life Science, Beijing Institute of Technology, Beijing, People's Republic of China
- Yibing Chen
  - Department of Respiratory and Critical Medicine, Chinese People's Liberation Army General Hospital, Beijing, People's Republic of China
- Binhua Wang
  - Medical Innovation Research Division, Chinese People's Liberation Army General Hospital, Beijing, People's Republic of China
- Yuzhu Li
  - Department of Respiratory and Critical Medicine, Chinese People's Liberation Army General Hospital, Beijing, People's Republic of China
- Yi Xin
  - School of Life Science, Beijing Institute of Technology, Beijing, People's Republic of China
|
22
|
Accuracy Improvement of Transformer Faults Diagnostic Based on DGA Data Using SVM-BA Classifier. ENERGIES 2021. [DOI: 10.3390/en14102970] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
The main objective of the current work was to enhance the transformer fault diagnostic accuracy based on dissolved gas analysis (DGA) data with a proposed coupled system of support vector machine (SVM)-bat algorithm (BA) and Gaussian classifiers. Six electrical and thermal fault classes were categorized based on the IEC and IEEE standard rules. The concentration of five main combustible gases (hydrogen, methane, ethane, ethylene, and acetylene) was utilized as the input vector of the two classifiers. Two types of input vectors were tested: the first considered the five gases in ppm, and the second considered each gas as a percentage of the sum of the five gases. An extensive database of 481 data samples was used for the training and testing phases (321 data samples for training and 160 data samples for testing). The SVM model conditioning parameter “λ” and penalty margin parameter “C” were adjusted through the bat algorithm to maximize the accuracy rate. The SVM-BA and Gaussian classifiers’ accuracy was evaluated and compared with several DGA techniques in the literature.
|
23
|
Villa A, Mundanad Narayanan A, Van Huffel S, Bertrand A, Varon C. Utility metric for unsupervised feature selection. PeerJ Comput Sci 2021; 7:e477. [PMID: 33981839 PMCID: PMC8080425 DOI: 10.7717/peerj-cs.477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2020] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
Feature selection techniques are very useful approaches for dimensionality reduction in data analysis. They provide interpretable results by reducing the dimensions of the data to a subset of the original set of features. When the data lack annotations, unsupervised feature selectors are required for their analysis. Several algorithms for this aim exist in the literature, but despite their large applicability, they can be very inaccessible or cumbersome to use, mainly due to the need for tuning non-intuitive parameters and the high computational demands. In this work, a publicly available ready-to-use unsupervised feature selector is proposed, with comparable results to the state-of-the-art at a much lower computational cost. The suggested approach belongs to the methods known as spectral feature selectors. These methods generally consist of two stages: manifold learning and subset selection. In the first stage, the underlying structures in the high-dimensional data are extracted, while in the second stage a subset of the features is selected to replicate these structures. This paper suggests two contributions to this field, related to each of the stages involved. In the manifold learning stage, the effect of non-linearities in the data is explored, making use of a radial basis function (RBF) kernel, for which an alternative solution for the estimation of the kernel parameter is presented for cases with high-dimensional data. Additionally, the use of a backwards greedy approach based on the least-squares utility metric for the subset selection stage is proposed. The combination of these new ingredients results in the utility metric for unsupervised feature selection (U2FS) algorithm. The proposed U2FS algorithm succeeds in selecting the correct features in a simulation environment. In addition, the performance of the method on benchmark datasets is comparable to the state-of-the-art, while requiring less computational time. Moreover, unlike the state-of-the-art, U2FS does not require any tuning of parameters.
Affiliation(s)
- Amalia Villa
- STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Abhijith Mundanad Narayanan
- STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Sabine Van Huffel
- STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Alexander Bertrand
- STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Leuven.AI, KU Leuven Institute for AI, Leuven, Belgium
- Carolina Varon
- STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Circuits and Systems (CAS) Group, Delft University of Technology, Delft, The Netherlands
- e-Media Research Lab, Campus GroepT, KU Leuven, Leuven, Belgium
|
24
|
Improved manta ray foraging optimization for multi-level thresholding using COVID-19 CT images. Neural Comput Appl 2021; 33:16899-16919. [PMID: 34248291 PMCID: PMC8261821 DOI: 10.1007/s00521-021-06273-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 04/02/2021] [Accepted: 06/26/2021] [Indexed: 02/06/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is pervasive worldwide, posing a high risk to people's safety and health. Many algorithms have been developed to identify COVID-19. One way of identifying COVID-19 is by computed tomography (CT) images. Some segmentation methods have been proposed to extract regions of interest from COVID-19 CT images to improve the classification. In this paper, an efficient version of the recent manta ray foraging optimization (MRFO) algorithm based on opposition-based learning (OBL), called the MRFO-OBL algorithm, is proposed. The original MRFO algorithm can stagnate in local optima and requires further exploration with adequate exploitation. Thus, to improve the population variety in the search space, we applied OBL in the MRFO's initialization step. The MRFO-OBL algorithm can solve the image segmentation problem using multilevel thresholding. The proposed MRFO-OBL is evaluated using Otsu's method over the COVID-19 CT images and compared with six meta-heuristic algorithms: sine-cosine algorithm, moth flame optimization, equilibrium optimization, whale optimization algorithm, salp swarm algorithm, and the original MRFO algorithm. MRFO-OBL obtained useful and accurate results in quality, consistency, and evaluation metrics, such as peak signal-to-noise ratio and structural similarity index. Overall, MRFO-OBL was more robust for segmentation than all the other compared algorithms. The experimental results demonstrate that the proposed method outperforms the original MRFO and the other compared algorithms under Otsu's method for all the used metrics.
|
25
|
Dragonfly Algorithm and Its Applications in Applied Science Survey. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2019; 2019:9293617. [PMID: 31885533 PMCID: PMC6925939 DOI: 10.1155/2019/9293617] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 08/14/2019] [Revised: 10/24/2019] [Accepted: 11/13/2019] [Indexed: 11/30/2022]
Abstract
One of the most recently developed heuristic optimization algorithms is the dragonfly algorithm by Mirjalili. The dragonfly algorithm has shown its ability to optimize different real-world problems. It has three variants. In this work, an overview of the algorithm and its variants is presented. Moreover, the hybridization versions of the algorithm are discussed. Furthermore, the results of the applications that utilized the dragonfly algorithm in applied science are presented in the following areas: machine learning, image processing, wireless, and networking. It is then compared with some other metaheuristic algorithms. In addition, the algorithm is tested on the CEC-C06 2019 benchmark functions. The results prove that the algorithm has great exploration ability and its convergence rate is better than the other algorithms in the literature, such as PSO and GA. In general, in this survey, the strong and weak points of the algorithm are discussed. Furthermore, some future works that will help in improving the algorithm's weak points are recommended. This study is conducted with the hope of offering beneficial information about the dragonfly algorithm to researchers who want to study the algorithm.
|
26
|
DA-Based Parameter Optimization of Combined Kernel Support Vector Machine for Cancer Diagnosis. Processes (Basel) 2019. [DOI: 10.3390/pr7050263] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022] Open
Abstract
As is well known, the correct diagnosis of cancer is critical to saving patients’ lives. The support vector machine (SVM) has already made an important contribution to the field of cancer classification. However, different kernel function configurations and their parameters significantly affect the performance of the SVM classifier. To improve the classification accuracy of the SVM classifier for cancer diagnosis, this paper proposes a novel cancer classification algorithm based on the dragonfly algorithm and an SVM with a combined kernel function (DA-CKSVM), constructed from a radial basis function (RBF) kernel and a polynomial kernel. Experiments were performed on six cancer data sets from the University of California, Irvine (UCI) machine learning repository and two cancer data sets from Cancer Program Legacy Publication Resources to evaluate the validity of the proposed algorithm. Compared with four well-known algorithms, dragonfly algorithm-SVM (DA-SVM), particle swarm optimization-SVM (PSO-SVM), bat algorithm-SVM (BA-SVM), and genetic algorithm-SVM (GA-SVM), the proposed algorithm was able to find the optimal parameters of the SVM classifier and achieved better classification accuracy on cancer datasets.
|
27
|
Parameters optimization of support vector machines for imbalanced data using social ski driver algorithm. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04159-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 10/27/2022]
|
28
|
A Novel Bat Algorithm with Multiple Strategies Coupling for Numerical Optimization. MATHEMATICS 2019. [DOI: 10.3390/math7020135] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/17/2022]
Abstract
The bat algorithm (BA) is a heuristic algorithm that performs global optimization by imitating the echolocation behavior of bats. The BA is widely used in various optimization problems because of its excellent performance. In the bat algorithm, the global search capability is determined by the loudness and frequency parameters. However, experiments show that each operator in the algorithm can only improve the performance of the algorithm at a certain time. In this paper, a novel bat algorithm with multiple strategies coupling (mixBA) is proposed to solve this problem. To prove the effectiveness of the algorithm, we tested it on the CEC2013 benchmark test suite. Furthermore, the Wilcoxon and Friedman tests were conducted to distinguish the differences between it and other algorithms. The results prove that the proposed algorithm is significantly superior to others on the majority of benchmark functions.
|
29
|
|
30
|
Sayed GI, Tharwat A, Hassanien AE. Chaotic dragonfly algorithm: an improved metaheuristic algorithm for feature selection. APPL INTELL 2018. [DOI: 10.1007/s10489-018-1261-8] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Indexed: 10/28/2022]
|
31
|
Jayabarathi T, Raghunathan T, Gandomi AH. The Bat Algorithm, Variants and Some Practical Engineering Applications: A Review. NATURE-INSPIRED ALGORITHMS AND APPLIED OPTIMIZATION 2018. [DOI: 10.1007/978-3-319-67669-2_14] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 12/03/2022]
|
32
|
|
33
|
Hassanien AE, Tharwat A, Own HS. Computational model for vitamin D deficiency using hair mineral analysis. Comput Biol Chem 2017; 70:198-210. [PMID: 28923545 DOI: 10.1016/j.compbiolchem.2017.08.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/15/2017] [Revised: 08/09/2017] [Accepted: 08/22/2017] [Indexed: 01/03/2023]
Abstract
Vitamin D deficiency is prevalent in the Arabian Gulf region, especially among women. Recent studies show that vitamin D deficiency is associated with the mineral status of a patient. Therefore, it is important to assess the mineral status of the patient to reveal the hidden mineral imbalance associated with vitamin D deficiency. A well-known test such as the red blood cell test is fairly expensive, invasive, and less informative. On the other hand, a hair mineral analysis can be considered an accurate, excellent, highly informative tool to measure the mineral imbalance associated with vitamin D deficiency. In this study, 118 apparently healthy Kuwaiti women were assessed for their mineral levels and vitamin D status by a hair mineral analysis (HMA). This information was used to build a computerized model that would predict vitamin D deficiency based on its association with the levels and ratios of minerals. The first phase of the proposed model introduces a novel hybrid optimization algorithm, which can be considered an improvement of the Bat Algorithm (BA), to select the most discriminative features. The improvement uses the mutation process of the Genetic Algorithm (GA) to update the positions of bats with the aim of speeding up convergence, thus making the algorithm more feasible for a wider range of real-world applications. Due to the imbalanced class distribution in our dataset, in the second phase, different sampling methods such as Random Under-Sampling, Random Over-Sampling, and Synthetic Minority Oversampling Technique are used to address the class imbalance. In the third phase, an AdaBoost ensemble classifier is used to predict vitamin D deficiency. The results showed that the proposed model achieved good results in detecting vitamin D deficiency.
Affiliation(s)
- Aboul Ella Hassanien
- Faculty of Computers and Information, Cairo University, Egypt; Scientific Research Group in Egypt (SRGE), Egypt.
- Alaa Tharwat
- Faculty of Engineering, Suez Canal University, Egypt; Faculty of Computer Science and Engineering, Frankfurt University of Applied Sciences, 60318 Frankfurt am Main, Germany; Scientific Research Group in Egypt (SRGE), Egypt.
- Hala S Own
- Department of Solar and Space Research, National Research Institute of Astronomy and Geophysics, El-Marsad Street, P.O. Box 11421 Helwan, Egypt.
|