1
Ghasemkhani B, Balbal KF, Birant KU, Birant D. A Novel Classification Method: Neighborhood-Based Positive Unlabeled Learning Using Decision Tree (NPULUD). ENTROPY (BASEL, SWITZERLAND) 2024; 26:403. [PMID: 38785652 PMCID: PMC11120015 DOI: 10.3390/e26050403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Revised: 04/29/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024]
Abstract
In a standard binary supervised classification task, both negative and positive samples must be present in the training dataset to construct a classification model. However, this condition is not met in certain applications where only one class of samples is obtainable. To overcome this problem, a different classification method, which learns from positive and unlabeled (PU) data, must be incorporated. In this study, a novel method is presented: neighborhood-based positive unlabeled learning using decision tree (NPULUD). First, NPULUD uses the nearest neighborhood approach for the PU strategy and then employs a decision tree algorithm for the classification task by utilizing the entropy measure. Entropy plays a pivotal role in assessing the level of uncertainty in the training dataset as the decision tree is built for classification. We validated our method experimentally on 24 real-world datasets. The proposed method attained an average accuracy of 87.24%, while the traditional supervised learning approach obtained an average accuracy of 83.99% on the same datasets. Our method also obtained a statistically notable improvement (7.74% on average) over state-of-the-art peers.
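The entropy measure mentioned in the abstract can be illustrated with a minimal sketch of how an entropy-based decision tree scores a candidate split. This is not the authors' NPULUD implementation: the function names `entropy` and `information_gain` and the toy labels are assumptions made for illustration, and the neighborhood-based PU labeling step is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction obtained by partitioning `labels` into `groups`."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Toy data (hypothetical): two positive and two negative samples.
labels = ["pos", "pos", "neg", "neg"]
pure_split = [["pos", "pos"], ["neg", "neg"]]    # separates the classes
mixed_split = [["pos", "neg"], ["pos", "neg"]]   # tells us nothing

print(information_gain(labels, pure_split))
print(information_gain(labels, mixed_split))
```

A tree builder would evaluate every candidate split this way and keep the one with the largest gain, which is the sense in which entropy measures the uncertainty remaining in the training data.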
Affiliation(s)
- Bita Ghasemkhani
  - Graduate School of Natural and Applied Sciences, Dokuz Eylul University, Izmir 35390, Turkey
- Kokten Ulas Birant
  - Information Technologies Research and Application Center (DEBTAM), Dokuz Eylul University, Izmir 35390, Turkey
  - Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
- Derya Birant
  - Department of Computer Engineering, Dokuz Eylul University, Izmir 35390, Turkey
2
Berlin L, Galyaev A, Lysenko P. Comparison of Information Criteria for Detection of Useful Signals in Noisy Environments. SENSORS (BASEL, SWITZERLAND) 2023; 23:2133. [PMID: 36850735 PMCID: PMC9966083 DOI: 10.3390/s23042133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 02/08/2023] [Accepted: 02/12/2023] [Indexed: 06/18/2023]
Abstract
This paper considers how to detect indications of useful acoustic signals in a signal/noise mixture. Various information characteristics (information entropy, Jensen-Shannon divergence, spectral information divergence and statistical complexity) are investigated in the context of solving this problem. Both time and frequency domains are studied for the calculation of information entropy. The effectiveness of statistical complexity is shown in comparison with other information metrics for different signal-to-noise ratios. Two different approaches for statistical complexity calculations are also compared. In addition, analytical formulas for complexity and disequilibrium are obtained using entropy variation in the case of signal spectral distribution. The connection between the statistical complexity criterion and the Neyman-Pearson approach for hypothesis testing is discussed. The effectiveness of the proposed approach is shown for different types of acoustic signals and noise models, including colored noises, and for different signal-to-noise ratios, especially when the estimation of additional noise characteristics is impossible.
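To make the notions of complexity and disequilibrium concrete, the sketch below computes one common form of statistical complexity: normalized Shannon entropy multiplied by the squared Euclidean distance from the uniform distribution. The abstract does not state which definitions the paper uses, so this choice, the function names, and the toy spectra are all assumptions.

```python
import math


def normalized_entropy(p):
    """Shannon entropy of distribution p, scaled to [0, 1] by log(len(p))."""
    h = -sum(x * math.log(x) for x in p if x > 0)
    return h / math.log(len(p))


def disequilibrium(p):
    """Squared Euclidean distance between p and the uniform distribution."""
    n = len(p)
    return sum((x - 1.0 / n) ** 2 for x in p)


def statistical_complexity(p):
    """Product of normalized entropy and disequilibrium (assumed definition)."""
    return normalized_entropy(p) * disequilibrium(p)


# Hypothetical normalized power spectra over four frequency bins.
flat_spectrum = [0.25, 0.25, 0.25, 0.25]   # noise-like: maximal entropy
single_tone = [1.0, 0.0, 0.0, 0.0]         # fully ordered: zero entropy
tone_in_noise = [0.7, 0.1, 0.1, 0.1]       # structure plus noise

for name, p in [("flat", flat_spectrum), ("tone", single_tone), ("mix", tone_in_noise)]:
    print(name, statistical_complexity(p))
```

Under this definition the product vanishes for both pure noise and a pure tone and is positive only in between, which is why such a quantity can serve as an indicator that a structured signal is present in a noisy mixture.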
3
El-Gindy SAE, Ibrahim FE, Alabasy M, Abdelzaher HM, El-Refy M, Khalaf AAM, El-Dolil SM, El-Fishawy AS, Taha TE, El-Rabaie ESM, Dessouky MI, El-Dokany I, Oraby OA, N. Alotaiby T, Alshebeili SA, Abd El-Samie FE. Detection of Abnormal Activities from Various Signals Based on Statistical Analysis. WIRELESS PERSONAL COMMUNICATIONS 2022; 125:1013-1046. [DOI: 10.1007/s11277-022-09565-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/29/2022] [Indexed: 09/02/2023]
4
Masciadri A, Lin C, Comai S, Salice F. A Multi-Resident Number Estimation Method for Smart Homes. SENSORS (BASEL, SWITZERLAND) 2022; 22:4823. [PMID: 35808320 PMCID: PMC9269108 DOI: 10.3390/s22134823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
Population aging requires innovative solutions to increase the quality of life and preserve autonomous and independent living at home. A need of particular significance is the identification of behavioral drifts. A relevant behavioral drift concerns sociality: older people tend to isolate themselves. There is therefore a need for methodologies that identify if, when, and for how long the person is in the company of other people (possibly also considering their number). The challenge is to address this task in poorly sensorized apartments, with non-intrusive sensors that are typically wireless and can only provide local and simple information. The proposed method addresses technological issues, such as PIR (Passive InfraRed) blind times; topological issues, such as sensor interference due to the inability to separate detection areas; and algorithmic issues. The house is modeled as a graph to constrain transitions between adjacent rooms. Each room is associated with a set of values, one for each identified person. These values decay over time and represent the probability that each person is still in the room. Because the sensors used cannot determine the number of people, the approach is based on a multi-branch inference that, over time, differentiates the movements in the apartment and estimates the number of people. The proposed algorithm has been validated on real data, obtaining an accuracy of 86.8%.
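The two modeling ideas in the abstract, a room graph that constrains transitions and per-person presence values that decay over time, can be sketched as follows. The floor plan, the exponential decay with a 60-second half-life, and all function names are hypothetical; the abstract does not specify the decay law or the multi-branch inference, which is not reproduced here.

```python
# Hypothetical floor plan: rooms are nodes, doorways are edges.
HOUSE = {
    "bedroom": {"hallway"},
    "hallway": {"bedroom", "kitchen", "bathroom"},
    "kitchen": {"hallway"},
    "bathroom": {"hallway"},
}


def can_move(src, dst):
    """A direct transition is allowed only between adjacent rooms."""
    return dst in HOUSE[src]


def presence(initial, elapsed_s, half_life_s=60.0):
    """Presence value after `elapsed_s` seconds (exponential decay is assumed)."""
    return initial * 0.5 ** (elapsed_s / half_life_s)


def explains(last_room, fired_room):
    """Can a person last seen in `last_room` account for a sensor firing in `fired_room`?"""
    return fired_room == last_room or can_move(last_room, fired_room)


# A person was last detected in the bedroom; the kitchen sensor then fires.
# The kitchen is not adjacent to the bedroom, so this person cannot explain
# the event, which hints that another person may be present.
print(explains("bedroom", "kitchen"))
print(presence(1.0, 60.0))
```

An inference scheme built on these pieces would keep several hypotheses about who is where and discard those that require impossible transitions.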
Affiliation(s)
- Andrea Masciadri
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
- Changhong Lin
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
- Sara Comai
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
- Fabio Salice
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
5
Aung ST, Wongsawat Y. Prediction of epileptic seizures based on multivariate multiscale modified-distribution entropy. PeerJ Comput Sci 2021; 7:e744. [PMID: 34722874 PMCID: PMC8530096 DOI: 10.7717/peerj-cs.744] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/21/2021] [Accepted: 09/22/2021] [Indexed: 06/13/2023]
Abstract
Epilepsy is a common neurological disease that affects a wide range of the world population and is not limited by age. Moreover, seizures can occur anytime and anywhere because of the sudden abnormal discharge of brain neurons, leading to malfunction. The seizures of approximately 30% of epilepsy patients cannot be treated with medicines or surgery; hence these patients would benefit from a seizure prediction system to live normal lives. Thus, a system that can predict a seizure before its onset could improve not only these patients' social lives but also their safety. Numerous seizure prediction methods have already been proposed, but the performance measures of these methods are still inadequate for a complete prediction system. Here, a seizure prediction system is proposed by exploring the advantages of multivariate entropy, which can reflect the complexity of multivariate time series over multiple scales (frequencies), called multivariate multiscale modified-distribution entropy (MM-mDistEn), with an artificial neural network (ANN). The phase-space reconstruction and estimation of the probability density between vectors provide hidden complex information. The multivariate time series property of MM-mDistEn provides more understandable information within the multichannel data and makes it possible to predict epileptic seizures. Moreover, the proposed method was tested with two different analyses: simulation data analysis showed that the proposed method has strong consistency over the different parameter selections, and the results from experimental data analysis showed that the proposed entropy combined with an ANN obtains performance measures of 98.66% accuracy, 91.82% sensitivity, 99.11% specificity, and 0.84 area under the curve (AUC) value.
In addition, the seizure alarm system was applied as a postprocessing step for prediction purposes, and a false alarm rate of 0.014 per hour and an average prediction time of 26.73 min before seizure onset were achieved by the proposed method. Thus, the proposed entropy as a feature extraction method combined with an ANN can predict the ictal state of epilepsy, and the results show great potential for all epilepsy patients.
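As background for the entropy feature, the sketch below shows plain univariate distribution entropy together with the usual multiscale coarse-graining step. It is not the authors' MM-mDistEn: the modified, multivariate variant is not described in enough detail in the abstract to reproduce, and the embedding dimension `m`, the bin count `bins`, and the function names are assumptions.

```python
import math


def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale` (multiscale step)."""
    return [sum(x[i:i + scale]) / scale
            for i in range(0, len(x) - scale + 1, scale)]


def distribution_entropy(x, m=2, bins=8):
    """Normalized entropy of the histogram of distances between embedded vectors."""
    # Phase-space reconstruction: overlapping vectors of length m.
    vectors = [x[i:i + m] for i in range(len(x) - m + 1)]
    # Chebyshev distance between every pair of distinct vectors.
    dists = [max(abs(a - b) for a, b in zip(u, v))
             for i, u in enumerate(vectors) for v in vectors[i + 1:]]
    lo, hi = min(dists), max(dists)
    if hi == lo:
        return 0.0  # all distances equal: no spread to measure
    counts = [0] * bins
    for d in dists:
        idx = min(int((d - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(dists)
    h = -sum(c / total * math.log2(c / total) for c in counts if c)
    return h / math.log2(bins)


signal = [math.sin(0.3 * i) for i in range(50)]  # hypothetical single channel
for scale in (1, 2, 3):
    print(scale, distribution_entropy(coarse_grain(signal, scale)))
```

Values computed per scale (and, in the multivariate case, across channels) would then form the feature vector passed to a classifier such as an ANN.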
6
Smart Care Using a DNN-Based Approach for Activities of Daily Living (ADL) Recognition. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app11010010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]
Abstract
Health care for independently living elders is more important than ever. Automatic recognition of their Activities of Daily Living (ADL) is the first step to solving the health care issues faced by seniors in an efficient way. The paper describes a Deep Neural Network (DNN)-based recognition system aimed at facilitating smart care, which combines ADL recognition, image/video processing, movement calculation, and DNN. An algorithm is developed for processing skeletal data, filtering noise, and recognizing patterns to identify the 10 most common ADL: standing, bending, squatting, sitting, eating, hand holding, hand raising, sitting plus drinking, standing plus drinking, and falling. The evaluation results show that this DNN-based system is a suitable method for ADL recognition, with an accuracy rate of over 95%. The findings support the feasibility of this system, which is efficient enough for both practical and academic applications.
7
Lindstrom MR, Jung H, Larocque D. Functional Kernel Density Estimation: Point and Fourier Approaches to Time Series Anomaly Detection. ENTROPY (BASEL, SWITZERLAND) 2020; 22:e22121363. [PMID: 33266340 PMCID: PMC7759980 DOI: 10.3390/e22121363] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/16/2020] [Accepted: 11/27/2020] [Indexed: 06/12/2023]
Abstract
We present an unsupervised method to detect anomalous time series among a collection of time series. To do so, we extend traditional Kernel Density Estimation for estimating probability distributions in Euclidean space to Hilbert spaces. The estimated probability densities we derive can be obtained formally through treating each series as a point in a Hilbert space, placing a kernel at those points, and summing the kernels (a "point approach"), or through using Kernel Density Estimation to approximate the distributions of Fourier mode coefficients to infer a probability density (a "Fourier approach"). We refer to these approaches as Functional Kernel Density Estimation for Anomaly Detection as they both yield functionals that can score a time series for how anomalous it is. Both methods naturally handle missing data and apply to a variety of settings, performing well when compared with an outlyingness score derived from a boxplot method for functional data, with a Principal Component Analysis approach for functional data, and with the Functional Isolation Forest method. We illustrate the use of the proposed methods with aviation safety report data from the International Air Transport Association (IATA).
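The "point approach" described above can be sketched in a few lines: each series is treated as a single point, a kernel is placed on every other series, and the summed kernel values give a density score, with low scores flagging anomalies. The Gaussian kernel, the fixed bandwidth, the discretized L2 distance, the leave-one-out scoring, and the toy data are assumptions; the paper's handling of missing data and its Fourier approach are not shown.

```python
import math


def l2_distance(f, g):
    """Root-mean-square difference between two equally sampled series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) / len(f))


def density_scores(series, bandwidth=1.0):
    """Leave-one-out kernel density score for each series (higher = more typical)."""
    scores = []
    for i, f in enumerate(series):
        total = sum(math.exp(-0.5 * (l2_distance(f, g) / bandwidth) ** 2)
                    for j, g in enumerate(series) if j != i)
        scores.append(total / (len(series) - 1))
    return scores


# Hypothetical collection: three similar series and one that sits far away.
series = [
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1],
    [5.0, 5.0, 5.0, 5.0],
]
scores = density_scores(series)
most_anomalous = min(range(len(series)), key=scores.__getitem__)
print(scores, most_anomalous)
```

In practice the bandwidth controls how local the notion of "typical" is, so it would normally be tuned to the data instead of being fixed at 1.0.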
Affiliation(s)
- Hyuntae Jung
  - Global Aviation Data Management, International Air Transport Association (IATA), Montréal, QC H2Y 1C6, Canada
- Denis Larocque
  - Department of Decision Sciences, HEC Montréal, Montréal, QC H2Y 1C6, Canada